Abdullahi8852/Task-3

Task-3

Import libraries and load data

import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, StandardScaler
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

Load the dataset

df = pd.read_csv('Churn_Modelling.csv')

Display the first few rows of the dataset

print(df.head())

Drop unnecessary columns

df = df.drop(['RowNumber', 'CustomerId', 'Surname'], axis=1)

Check for missing values

print(df.isnull().sum())

No missing values in this dataset, so we can proceed

Encode categorical features

le = LabelEncoder()
df['Geography'] = le.fit_transform(df['Geography'])
df['Gender'] = le.fit_transform(df['Gender'])
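
LabelEncoder assigns integer codes in sorted order of the labels, and refitting a single encoder on Gender discards the mapping it learned for Geography. The sketch below is one possible variation, not the repository's method: it keeps one encoder per column so each mapping can be inspected or inverted later. The `sample` frame and its values are hypothetical and are not taken from Churn_Modelling.csv.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical sample rows; the values are made up for illustration.
sample = pd.DataFrame({
    "Geography": ["Spain", "France", "Germany", "France"],
    "Gender": ["Male", "Female", "Female", "Male"],
})

# One encoder per column, so every fitted mapping is preserved.
encoders = {}
for column in ["Geography", "Gender"]:
    encoders[column] = LabelEncoder()
    sample[column] = encoders[column].fit_transform(sample[column])

# classes_ lists the labels in code order (index 0 is code 0, and so on).
geography_classes = [str(c) for c in encoders["Geography"].classes_]
gender_classes = [str(c) for c in encoders["Gender"].classes_]
geography_codes = [int(v) for v in sample["Geography"]]
```

Because the codes follow sorted label order, they carry no real ranking between countries. A one-hot encoding such as `pd.get_dummies` may be worth considering if the model is later swapped for one that is sensitive to ordinal codes.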

Define features and target

X = df.drop(['Exited'], axis=1)
y = df['Exited']

Split data into training and testing sets

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
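
If the Exited classes are imbalanced (an assumption; this README does not report class counts), passing `stratify=y` keeps the class ratio the same in both splits. A minimal sketch on synthetic stand-in data, with hypothetical names `X_demo` and `y_demo`:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: 100 rows, 20 of them in the positive class.
X_demo = np.arange(200).reshape(100, 2)
y_demo = np.array([1] * 20 + [0] * 80)

# stratify=y_demo asks for the same class proportions in train and test.
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42, stratify=y_demo
)

# Positive-class rate in each split.
train_rate = float(y_tr.mean())
test_rate = float(y_te.mean())
```

Without stratification the split is purely random, so the positive rate in a small test set can drift away from the rate in the full data.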

Scale features

scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
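
The scaler is fitted on the training rows only and then reused on the test rows, so no test-set statistics leak into training. The sketch below illustrates that behaviour on synthetic data; the arrays `train` and `test` are hypothetical stand-ins, not the churn features.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in features drawn from the same distribution.
rng = np.random.default_rng(0)
train = rng.normal(loc=50.0, scale=10.0, size=(200, 3))
test = rng.normal(loc=50.0, scale=10.0, size=(50, 3))

scaler_demo = StandardScaler()
train_scaled = scaler_demo.fit_transform(train)  # learns mean and std from train
test_scaled = scaler_demo.transform(test)        # reuses the train statistics

# Training columns end up with zero mean and unit standard deviation;
# test columns are only approximately standardized.
train_means = train_scaled.mean(axis=0)
train_stds = train_scaled.std(axis=0)
test_means = test_scaled.mean(axis=0)
```

Note that `fit_transform` returns a NumPy array, so the column names are no longer attached to `X_train` after this step.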

Train a random forest classifier

model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

Make predictions

y_pred = model.predict(X_test)

Evaluate the model

accuracy = accuracy_score(y_test, y_pred)
print("Model Accuracy:", accuracy)

Classification report

print("Classification Report:")
print(classification_report(y_test, y_pred))

Confusion matrix

print("Confusion Matrix:")
print(confusion_matrix(y_test, y_pred))
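
For a binary target, scikit-learn lays out the confusion matrix as [[TN, FP], [FN, TP]], and the accuracy, precision and recall shown in the report can be derived from those four counts. A small worked sketch with made-up labels (`y_true_demo` and `y_pred_demo` are hypothetical, not model output):

```python
from sklearn.metrics import accuracy_score, confusion_matrix

# Made-up labels: 6 negatives followed by 4 positives.
y_true_demo = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
y_pred_demo = [0, 0, 0, 0, 0, 1, 1, 1, 0, 0]

# ravel() flattens [[TN, FP], [FN, TP]] into four scalars.
tn, fp, fn, tp = (int(v) for v in confusion_matrix(y_true_demo, y_pred_demo).ravel())

accuracy_demo = (tp + tn) / (tp + tn + fp + fn)  # share of correct predictions
precision_demo = tp / (tp + fp)                  # predicted churners that really churned
recall_demo = tp / (tp + fn)                     # actual churners that were caught

# Cross-check against scikit-learn's own accuracy.
sklearn_accuracy = accuracy_score(y_true_demo, y_pred_demo)
```

On an imbalanced target, recall for the positive class is often more informative than overall accuracy, which is why the classification report is worth reading alongside the single accuracy figure.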

Get feature importance

feature_importance = model.feature_importances_
feature_names = X.columns

Create a dataframe to display feature importance

feature_importance_df = pd.DataFrame({'Feature': feature_names, 'Importance': feature_importance})
feature_importance_df = feature_importance_df.sort_values(by='Importance', ascending=False)

Display feature importance

print(feature_importance_df)
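
The importances of a fitted random forest are impurity-based scores that sum to 1, so each value can be read as a share of the total. The sketch below shows this on synthetic data where only one column drives the target; the column names `signal`, `noise_a` and `noise_b` are hypothetical and unrelated to the churn dataset.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Synthetic data: the target depends on "signal" only.
rng = np.random.default_rng(42)
X_demo = pd.DataFrame({
    "signal": rng.normal(size=200),
    "noise_a": rng.normal(size=200),
    "noise_b": rng.normal(size=200),
})
y_demo = (X_demo["signal"] > 0).astype(int)

forest = RandomForestClassifier(n_estimators=50, random_state=42)
forest.fit(X_demo, y_demo)

# Pair each importance with its column name and sort, largest first.
importance = pd.Series(forest.feature_importances_, index=X_demo.columns)
importance = importance.sort_values(ascending=False)

top_feature = importance.index[0]
total_importance = float(importance.sum())
```

Impurity-based scores can favour features with many distinct values, so they are best treated as a rough ranking rather than a precise measure of effect.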
