Philippe Libioulle - Fab Futures - Data Science

Week 2: tools - "Loan approval" dataset

Context

  • Source: Kaggle
  • Description: a dataset of 50,000 loan applications covering credit cards, personal loans, and lines of credit. It includes customer demographics, financial profiles, credit behavior, and approval decisions based on real US and Canadian banking criteria.
  • Credit: Brian Risk on Kaggle

Load dataset

In [1]:
import numpy as np
import pandas as pd
import matplotlib
import matplotlib.pyplot as plt
import seaborn as sns

df = pd.read_csv("datasets/Loan_approval_data_2025.csv", delimiter=',', encoding='ascii')
numeric_cols = df.select_dtypes(include=[np.number]).columns.tolist()
categorical_cols = df.select_dtypes(exclude=[np.number]).columns.tolist()

# 🧾 Display dataset information
print("Dataset shape:", df.shape)
Dataset shape: (50000, 20)
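
Before exploring the content, it can help to check how each column was parsed, since the numeric/categorical split above depends entirely on the inferred dtypes. The sketch below is a minimal, hypothetical helper (`summarize_columns` is not part of the original notebook) that reports the dtype, missing-value count, and distinct-value count per column. To stay self-contained it runs on four columns from the first five rows of the dataset, as printed by df.head() in the next section, instead of reading the CSV.

```python
import pandas as pd


def summarize_columns(df):
    """Return one row per column: dtype, missing-value count, distinct-value count."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "n_missing": df.isna().sum(),
        "n_unique": df.nunique(),
    })


# Illustrative sample: four columns from the first five rows of the dataset
sample = pd.DataFrame({
    "customer_id": ["CUST100000", "CUST100001", "CUST100002", "CUST100003", "CUST100004"],
    "occupation_status": ["Employed", "Employed", "Student", "Student", "Employed"],
    "annual_income": [25579, 43087, 20840, 29147, 63657],
    "loan_status": [1, 0, 1, 1, 1],
})

summary = summarize_columns(sample)
print(summary)
```

On the full dataset the call would be `summarize_columns(df)`. Two things are worth looking for in the result: identifier columns such as customer_id land in the categorical list even though they carry no category information, and 0/1 columns such as loan_status land in the numeric list even though they are flags.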

Explore content

In [2]:
df.head()
Out[2]:
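  | customer_id | age | occupation_status | years_employed | annual_income | credit_score | credit_history_years | savings_assets | current_debt | defaults_on_file | delinquencies_last_2yrs | derogatory_marks | product_type | loan_intent | loan_amount | interest_rate | debt_to_income_ratio | loan_to_income_ratio | payment_to_income_ratio | loan_status
--|-------------|-----|-------------------|----------------|---------------|--------------|----------------------|----------------|--------------|------------------|-------------------------|------------------|---------------|--------------------|-------------|---------------|----------------------|----------------------|-------------------------|------------
0 | CUST100000 | 40 | Employed | 17.2 | 25579 | 692 | 5.3 | 895 | 10820 | 0 | 0 | 0 | Credit Card | Business | 600 | 17.02 | 0.423 | 0.023 | 0.008 | 1
1 | CUST100001 | 33 | Employed | 7.3 | 43087 | 627 | 3.5 | 169 | 16550 | 0 | 1 | 0 | Personal Loan | Home Improvement | 53300 | 14.10 | 0.384 | 1.237 | 0.412 | 0
2 | CUST100002 | 42 | Student | 1.1 | 20840 | 689 | 8.4 | 17 | 7852 | 0 | 0 | 0 | Credit Card | Debt Consolidation | 2100 | 18.33 | 0.377 | 0.101 | 0.034 | 1
3 | CUST100003 | 53 | Student | 0.5 | 29147 | 692 | 9.8 | 1480 | 11603 | 0 | 1 | 0 | Credit Card | Business | 2900 | 18.74 | 0.398 | 0.099 | 0.033 | 1
4 | CUST100004 | 32 | Employed | 12.5 | 63657 | 630 | 7.2 | 209 | 12424 | 0 | 0 | 0 | Personal Loan | Education | 99600 | 13.92 | 0.195 | 1.565 | 0.522 | 1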
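
Some columns look derived from others. On the five rows above, debt_to_income_ratio appears to equal current_debt / annual_income and loan_to_income_ratio appears to equal loan_amount / annual_income, both rounded to three decimals. These definitions are an inference from the displayed rows, not something the dataset description confirms, so the sketch below only tests the assumption. The helper name `ratio_matches` and the tolerance are hypothetical choices.

```python
import numpy as np
import pandas as pd

# Values copied from the five rows displayed by df.head()
sample = pd.DataFrame({
    "annual_income": [25579, 43087, 20840, 29147, 63657],
    "current_debt": [10820, 16550, 7852, 11603, 12424],
    "loan_amount": [600, 53300, 2100, 2900, 99600],
    "debt_to_income_ratio": [0.423, 0.384, 0.377, 0.398, 0.195],
    "loan_to_income_ratio": [0.023, 1.237, 0.101, 0.099, 1.565],
})


def ratio_matches(df, numerator, denominator, ratio_col, atol=0.001):
    """Flag rows where ratio_col agrees with numerator / denominator.

    atol is set to 0.001 (assumption) to absorb rounding to three decimals.
    """
    recomputed = df[numerator] / df[denominator]
    return pd.Series(np.isclose(recomputed, df[ratio_col], atol=atol), index=df.index)


dti_ok = ratio_matches(sample, "current_debt", "annual_income", "debt_to_income_ratio")
lti_ok = ratio_matches(sample, "loan_amount", "annual_income", "loan_to_income_ratio")
print("debt_to_income_ratio consistent:", bool(dti_ok.all()))
print("loan_to_income_ratio consistent:", bool(lti_ok.all()))
```

Running the same check on the full df would show whether the relationship holds beyond these five rows. If it does, the ratio columns add no independent information on top of the raw amounts, which matters when reading the histograms or building a model later.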

Plot the distribution of each numeric column

In [3]:
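# One histogram (with a KDE overlay) per numeric column
plt.figure(figsize=(12, 8))
for idx, col in enumerate(numeric_cols):
    # 4 x 4 grid: room for at most 16 numeric columns
    plt.subplot(4, 4, idx+1)
    sns.histplot(df[col], kde=True, bins=30)
    plt.title(col)
plt.tight_layout()
plt.show()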
[Figure: grid of histograms with KDE overlays, one panel per numeric column]
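
The cell above hardcodes a 4 x 4 grid, which works only while there are at most 16 numeric columns. A possible variation, sketched below under that observation, computes the grid from the number of columns and hides any unused panels. `grid_shape` and `plot_numeric_histograms` are hypothetical names, and the per-panel size of 3 x 2.5 inches is an arbitrary choice.

```python
import math


def grid_shape(n_plots, ncols=4):
    """Return (nrows, ncols) with enough cells for n_plots subplots."""
    if n_plots < 1:
        raise ValueError("n_plots must be at least 1")
    nrows = math.ceil(n_plots / ncols)
    return nrows, ncols


def plot_numeric_histograms(df, numeric_cols, ncols=4):
    """Draw one histogram with a KDE overlay per numeric column."""
    # Imported here so grid_shape stays usable without the plotting stack
    import matplotlib.pyplot as plt
    import seaborn as sns

    nrows, ncols = grid_shape(len(numeric_cols), ncols)
    fig, axes = plt.subplots(
        nrows, ncols, figsize=(3 * ncols, 2.5 * nrows), squeeze=False
    )
    flat_axes = axes.ravel()
    for ax, col in zip(flat_axes, numeric_cols):
        sns.histplot(df[col], kde=True, bins=30, ax=ax)
        ax.set_title(col)
    # Hide the cells left over when the grid is larger than the column list
    for ax in flat_axes[len(numeric_cols):]:
        ax.set_visible(False)
    fig.tight_layout()
    return fig


print(grid_shape(16))
print(grid_shape(17))
```

In the notebook, `plot_numeric_histograms(df, numeric_cols)` followed by `plt.show()` would stand in for the loop above. Passing each Axes to `sns.histplot` through its `ax` parameter avoids relying on the current-axes state that `plt.subplot` sets implicitly.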