Shweta Singh - Fab Futures - Data Science

In [1]:
# Week 1: Introduction

Assignment

This week's assignment covers dataset selection and plotting.

Dataset

Student Success Predictor: Understanding Factors Affecting Academic Performance

| Descriptor | Description |
| --- | --- |
| Age | The age of the student. |
| Gender | The gender of the student (M for Male, F for Female). |
| Parental_Education | The highest level of education attained by the student's parents. |
| Family_Income | The family income level. |
| Previous_Grades | The student's previous academic performance (A, B, or C grades). |
| Attendance | The percentage of attendance in classes. |
| Class_Participation | Participation level in class activities (Low, Medium, High). |
| Study_Hours | Average number of study hours per week. |
| Major | The student's major or field of study. |
| Uni_Type | Type of university attended (Public or Private). |
| Financial_Status | Financial status of the student (Low, Medium, High). |
| Parental_Involvement | Parental involvement level (Low, Medium, High). |
| Educational_Resources | Availability of educational resources at home (Yes or No). |
| Motivation | Level of motivation (Low, Medium, High). |
| Self_Esteem | Level of self-esteem (Low, Medium, High). |
| Stress_Levels | Level of stress experienced (Low, Medium, High). |
| School_Environment | Perception of school environment (Negative, Neutral, Positive). |
| Professor_Quality | Quality of professors (Low, Medium, High). |
| Class_Size | The size of the class. |
| Extracurricular_Activities | Participation in extracurricular activities (Yes or No). |
| Sleep_Patterns | Average hours of sleep per day. |
| Nutrition | Quality of nutrition (Unhealthy, Balanced, Healthy). |
| Physical_Activity | Level of physical activity (Low, Medium, High). |
| Screen_Time | Hours spent on screen-based activities per day. |
| Educational_Tech_Use | Use of educational technology (Yes or No). |
| Peer_Group | Peer group influence (Negative, Neutral, Positive). |
| Bullying | Experience of bullying (Yes or No). |
| Study_Space | Availability of a study space at home (Yes or No). |
| Learning_Style | Preferred learning style (Visual, Auditory, Kinesthetic). |
| Tutoring | Participation in tutoring programs (Yes or No). |
| Mentoring | Availability of mentoring support (Yes or No). |
| Lack_of_Interest | Interest level in academics (Low, Medium, High). |
| Time_Wasted_on_Social_Media | Time spent on social media platforms. |
| Sports_Participation | Participation level in sports (Low, Medium, High). |
| Grades | Final grades achieved (A, B, or C). |
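
Many of the descriptors above are categorical (Low/Medium/High, Yes/No), so a numeric-only correlation leaves them out. As a minimal sketch, assuming the Low/Medium/High columns can be treated as ordered levels, they could be mapped to integers before computing correlations. The sample rows, the `level_order` mapping, and the `encode_levels` helper are hypothetical illustrations and are not part of the dataset or the original notebook.

```python
import pandas as pd

# Hypothetical sample rows: column names follow the table above,
# but the values are made up for illustration.
sample = pd.DataFrame({
    "Motivation": ["Low", "High", "Medium", "High"],
    "Stress_Levels": ["High", "Low", "Medium", "Low"],
    "Study_Hours": [5, 20, 12, 18],
})

# Assumed ordering of the three levels (an assumption, not from the dataset docs).
level_order = {"Low": 0, "Medium": 1, "High": 2}


def encode_levels(frame, columns, mapping=level_order):
    """Return a copy of `frame` with the given columns mapped to integers."""
    encoded = frame.copy()
    for col in columns:
        encoded[col] = encoded[col].map(mapping)
    return encoded


encoded_sample = encode_levels(sample, ["Motivation", "Stress_Levels"])
print(encoded_sample)
```

The equal 0/1/2 spacing is itself an assumption; a rank-based measure such as `DataFrame.corr(method="spearman")` is less sensitive to that choice.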
In [2]:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

# Load the dataset
df = pd.read_csv('datasets/week1/dataset.csv')

# Select numeric columns only
numeric_df = df.select_dtypes(include=np.number)

# Calculate the correlation matrix
corr = numeric_df.corr()

# Set the plot size
plt.figure(figsize=(16, 12))
plt.imshow(corr, cmap='coolwarm', vmin=-1, vmax=1)  # Fix the color scale to [-1, 1] so that 0 maps to the neutral midpoint

# Add color bar to indicate the scale
plt.colorbar()

# Add labels to the x and y axes
labels = corr.columns
plt.xticks(np.arange(len(labels)), labels, rotation=90)  # Column names on the x-axis, rotated vertically to avoid overlap
plt.yticks(np.arange(len(labels)), labels)               # Set column names on y-axis

# Show the correlation value in each cell
for i in range(len(labels)):
    for j in range(len(labels)):
        plt.text(j, i, f"{corr.iloc[i, j]:.2f}", ha="center", va="center")

# Title and layout
plt.title("Correlation Heatmap of Factors Affecting Academic Performance")
plt.tight_layout()

# Save the image
#plt.savefig("correlation_heatmap.png")

# Show the plot
plt.show()
[Figure: correlation heatmap of the numeric columns, titled "Correlation Heatmap of Factors Affecting Academic Performance"]
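
An annotated heatmap can be hard to scan when there are many columns. One possible complement, sketched here under stated assumptions, is to list the strongest pairwise correlations as a table. The `top_correlations` helper and the `demo` data are hypothetical: the column names are borrowed from the dataset description, the values are invented, and none of this comes from the original notebook.

```python
import numpy as np
import pandas as pd


def top_correlations(corr, n=5):
    """List the n column pairs with the largest absolute correlation.

    Only the upper triangle (k=1) is used, so each pair appears once and
    the diagonal of self-correlations is skipped.
    """
    rows, cols = np.triu_indices(len(corr), k=1)
    pairs = pd.DataFrame({
        "feature_a": corr.index[rows],
        "feature_b": corr.columns[cols],
        "r": corr.to_numpy()[rows, cols],
    })
    # Sort by |r| so strong negative correlations rank as high as positive ones.
    pairs = pairs.sort_values("r", key=lambda s: s.abs(), ascending=False)
    return pairs.head(n).reset_index(drop=True)


# Made-up values for illustration only.
demo = pd.DataFrame({
    "Study_Hours": [1, 2, 3, 4, 5],
    "Screen_Time": [5, 4, 3, 2, 1],
    "Sleep_Patterns": [1, 3, 2, 5, 4],
})

top = top_correlations(demo.corr(), n=3)
print(top)
```

In the notebook this could be called as `top_correlations(corr)` on the matrix computed above.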