In [1]:
# Week 1: Introduction
Assignment
This week's assignment covers dataset selection and plotting.
Dataset
Student Success Predictor: Understanding Factors Affecting Academic Performance
| Descriptors | Description |
|---|---|
| Age | The age of the student. |
| Gender | The gender of the student (M for Male, F for Female). |
| Parental_Education | The highest level of education attained by the student's parents. |
| Family_Income | The family income level. |
| Previous_Grades | The student's previous academic performance (A, B, or C grades). |
| Attendance | The percentage of attendance in classes. |
| Class_Participation | Participation level in class activities (Low, Medium, High). |
| Study_Hours | Average number of study hours per week. |
| Major | The student's major or field of study. |
| Uni_Type | Type of University attended (Public or Private). |
| Financial_Status | Financial status of the student (Low, Medium, High). |
| Parental_Involvement | Parental involvement level (Low, Medium, High). |
| Educational_Resources | Availability of educational resources at home (Yes or No). |
| Motivation | Level of motivation (Low, Medium, High). |
| Self_Esteem | Level of self-esteem (Low, Medium, High). |
| Stress_Levels | Level of stress experienced (Low, Medium, High). |
| School_Environment | Perception of school environment (Negative, Neutral, Positive). |
| Professor_Quality | Quality of professors (Low, Medium, High). |
| Class_Size | The size of the class. |
| Extracurricular_Activities | Participation in extracurricular activities (Yes or No). |
| Sleep_Patterns | Average hours of sleep per day. |
| Nutrition | Quality of nutrition (Unhealthy, Balanced, Healthy). |
| Physical_Activity | Level of physical activity (Low, Medium, High). |
| Screen_Time | Hours spent on screen-based activities per day. |
| Educational_Tech_Use | Use of educational technology (Yes or No). |
| Peer_Group | Peer group influence (Negative, Neutral, Positive). |
| Bullying | Experience of bullying (Yes or No). |
| Study_Space | Availability of a study space at home (Yes or No). |
| Learning_Style | Preferred learning style (Visual, Auditory, Kinesthetic). |
| Tutoring | Participation in tutoring programs (Yes or No). |
| Mentoring | Availability of mentoring support (Yes or No). |
| Lack_of_Interest | Interest level in academics (Low, Medium, High). |
| Time_Wasted_on_Social_Media | Time spent on social media platforms. |
| Sports_Participation | Participation level in sports (Low, Medium, High). |
| Grades | Final grades achieved (A, B, or C). |
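Many of the descriptors above are ordinal categories (Low, Medium, High). If they are stored as text in the CSV, a numeric-only selection would leave them out of any correlation analysis. The following is a minimal sketch of one way to encode them as integers. The sample rows, the `LEVELS` mapping, and the `encode_ordinal` helper are illustrative assumptions, not part of the dataset or the assignment.

```python
import pandas as pd

# Hypothetical sample rows; the real dataset's exact values may differ.
sample = pd.DataFrame({
    "Motivation": ["Low", "High", "Medium"],
    "Stress_Levels": ["High", "Low", "Medium"],
    "Study_Hours": [5, 20, 12],
})

# Assumed ordering of the three levels listed in the descriptor table.
LEVELS = {"Low": 0, "Medium": 1, "High": 2}

def encode_ordinal(frame, columns, mapping=LEVELS):
    """Return a copy of `frame` with the given columns mapped to integer codes."""
    encoded = frame.copy()
    for col in columns:
        # Values missing from the mapping become NaN rather than raising.
        encoded[col] = encoded[col].map(mapping)
    return encoded

encoded = encode_ordinal(sample, ["Motivation", "Stress_Levels"])
print(encoded)
```

Integer codes assume the levels are evenly spaced, which is a modelling choice; a rank-based measure such as `corr(method="spearman")` may be a better fit for columns encoded this way.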
In [2]:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
# Load the dataset
df = pd.read_csv('datasets/week1/dataset.csv')
# Select numeric columns only
numeric_df = df.select_dtypes(include=np.number)
# Calculate the correlation matrix
corr = numeric_df.corr()
# Set the plot size
plt.figure(figsize=(16, 12))
# Display the correlation matrix as an image; fixing the range to [-1, 1] keeps the diverging colormap centered on zero
plt.imshow(corr, cmap='coolwarm', vmin=-1, vmax=1)
# Add color bar to indicate the scale
plt.colorbar()
# Add labels to the x and y axes
labels = corr.columns
plt.xticks(np.arange(len(labels)), labels, rotation=90) # Column names on the x-axis, rotated 90 degrees to avoid overlap
plt.yticks(np.arange(len(labels)), labels) # Set column names on y-axis
# Show the correlation value in each cell
for i in range(len(labels)):
    for j in range(len(labels)):
        plt.text(j, i, f"{corr.iloc[i, j]:.2f}", ha="center", va="center")
# Title and layout
plt.title("Correlation Heatmap of Factors Affecting Academic Performance")
plt.tight_layout()
# Save the image
#plt.savefig("correlation_heatmap.png")
# Show the plot
plt.show()
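A heatmap with many columns can be hard to read at a glance, so it may help to list the strongest pairs directly. The sketch below ranks column pairs by absolute correlation. The `top_correlations` helper and the toy data are hypothetical and only illustrate the idea; with the notebook's data you would pass in the `corr` matrix computed above.

```python
import numpy as np
import pandas as pd

def top_correlations(corr, n=5):
    """Return the n column pairs with the largest absolute correlation."""
    # Indices of the upper triangle; k=1 skips the diagonal so each pair appears once.
    rows, cols = np.triu_indices(len(corr), k=1)
    pairs = pd.Series(
        corr.to_numpy()[rows, cols],
        index=pd.MultiIndex.from_arrays([corr.index[rows], corr.columns[cols]]),
    )
    # Sort by magnitude while keeping the sign of each correlation.
    return pairs.sort_values(key=lambda s: s.abs(), ascending=False).head(n)

# Hypothetical toy data, not taken from the assignment dataset.
toy = pd.DataFrame({
    "a": [1, 2, 3, 4, 5],
    "b": [2, 4, 6, 8, 10],  # exact multiple of "a"
    "c": [5, 3, 4, 1, 2],
})

top = top_correlations(toy.corr(), n=2)
print(top)
```

Ranking by absolute value surfaces strong negative relationships as well as strong positive ones.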