Philippe Libioulle - Fab Futures - Data Science

Week 3: fitting - "AI impact jobs by 2030" dataset¶

Context¶

  • Source: Kaggle
  • Description: this dataset simulates the future of work in the age of artificial intelligence. It models how various professions, skills, and education levels might be impacted by AI-driven automation by the year 2030.

Load dataset¶

In [6]:
import pandas as pd 
import numpy as np
import matplotlib.pyplot as plt

df = pd.read_csv("datasets/AI_Impact_on_Jobs_2030.csv")
print("Dataset shape:", df.shape)
Dataset shape: (168, 18)

Explore content¶

In [7]:
df.head()
Out[7]:
Job_Title Average_Salary Years_Experience Education_Level AI_Exposure_Index Tech_Growth_Factor Automation_Probability_2030 Risk_Category Skill_1 Skill_2 Skill_3 Skill_4 Skill_5 Skill_6 Skill_7 Skill_8 Skill_9 Skill_10
0 Security Guard 45795 28 Master's 0.18 1.28 0.85 High 0.45 0.10 0.46 0.33 0.14 0.65 0.06 0.72 0.94 0.00
1 Research Scientist 133355 20 PhD 0.62 1.11 0.05 Low 0.02 0.52 0.40 0.05 0.97 0.23 0.09 0.62 0.38 0.98
2 Construction Worker 146216 2 High School 0.86 1.18 0.81 High 0.01 0.94 0.56 0.39 0.02 0.23 0.24 0.68 0.61 0.83
3 Software Engineer 136530 13 PhD 0.39 0.68 0.60 Medium 0.43 0.21 0.57 0.03 0.84 0.45 0.40 0.93 0.73 0.33
4 Financial Analyst 70397 22 High School 0.52 1.46 0.64 Medium 0.75 0.54 0.59 0.97 0.61 0.28 0.30 0.17 0.02 0.42

Display how automation probability is related (or not) to salary¶

In [8]:
x = df['Average_Salary']
y = df['Automation_Probability_2030']
plt.plot(x,y,'o')
plt.show()
[Figure: scatter plot of Automation_Probability_2030 against Average_Salary]

The graph shows no obvious link between salary and automation probability: the points are scattered with no visible trend.
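A visual impression like this can be backed by a number. The sketch below computes the Pearson correlation coefficient with NumPy; the helper name `pearson_r` is hypothetical and not part of the original notebook. A value close to 0 would be consistent with the absence of a linear link, while a value close to +1 or -1 would contradict it.

```python
import numpy as np

def pearson_r(x, y):
    """Return the Pearson correlation coefficient of two 1-D sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # np.corrcoef returns the 2x2 correlation matrix; the off-diagonal
    # entry is the correlation between x and y.
    return float(np.corrcoef(x, y)[0, 1])

# Assumed usage with the DataFrame loaded above (not executed here):
# r = pearson_r(df['Average_Salary'], df['Automation_Probability_2030'])
# print("Pearson r:", round(r, 3))
```

Note that Pearson's r only measures linear association, so a near-zero value does not rule out a non-linear relationship.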

Display how automation probability is related (or not) to years of experience¶

In [9]:
x = df['Years_Experience']
y = df['Automation_Probability_2030']
plt.plot(x,y,'o')
plt.show()
[Figure: scatter plot of Automation_Probability_2030 against Years_Experience]

Nothing useful here either... Let's try again and focus on job title.¶

In [12]:
x = df['Job_Title']
y = df['Automation_Probability_2030']
plt.plot(x,y,'o')
plt.xticks(rotation=90)
plt.show()
[Figure: scatter plot of Automation_Probability_2030 for each Job_Title, x labels rotated 90 degrees]
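Because job title is a categorical variable, a per-title summary can be easier to read than a scatter plot. The sketch below groups a DataFrame by one column and reports the mean, standard deviation, and count of another; the helper name `summarize_by_group` and the tiny demo frame are hypothetical and only illustrate the call.

```python
import pandas as pd

def summarize_by_group(frame, group_col, value_col):
    """Return mean, std and count of value_col per group, sorted by mean."""
    return (
        frame.groupby(group_col)[value_col]
        .agg(["mean", "std", "count"])
        .sort_values("mean", ascending=False)
    )

# Hypothetical demo data, NOT taken from the Kaggle dataset.
demo = pd.DataFrame({
    "Job_Title": ["A", "A", "B", "B"],
    "Automation_Probability_2030": [0.4, 0.6, 0.1, 0.3],
})
summary = summarize_by_group(demo, "Job_Title", "Automation_Probability_2030")

# Assumed usage with the notebook's DataFrame (not executed here):
# summarize_by_group(df, "Job_Title", "Automation_Probability_2030")
```

A small standard deviation within each title, combined with clearly different means between titles, would suggest that job title explains much of the automation probability.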

Conclusion: this is not a good candidate for fitting...¶
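One way to check this conclusion is to attempt a straight-line least-squares fit anyway and look at the coefficient of determination R². The sketch below uses `np.polyfit` with degree 1; the helper name `linear_fit_r2` is hypothetical and not part of the original notebook. An R² close to 0 would support the conclusion that a linear model explains almost none of the variation.

```python
import numpy as np

def linear_fit_r2(x, y):
    """Fit y = slope * x + intercept by least squares; return (slope, intercept, R^2)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)      # degree-1 polynomial fit
    residuals = y - (slope * x + intercept)
    ss_res = np.sum(residuals ** 2)             # unexplained variation
    ss_tot = np.sum((y - y.mean()) ** 2)        # total variation around the mean
    return float(slope), float(intercept), float(1.0 - ss_res / ss_tot)

# Assumed usage with the notebook's DataFrame (not executed here):
# slope, intercept, r2 = linear_fit_r2(df['Years_Experience'],
#                                      df['Automation_Probability_2030'])
# print("R^2:", round(r2, 3))
```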
