Week 1: Assignment 1 - Playground
DB selection
Title: National Survey on Self-Reported Well-Being (ENBIARE) 2021. Timely data
Description: The 2021 National Survey on Self-Reported Well-Being (ENBIARE) has the overall objective of producing statistical information on different dimensions of well-being measurement; highlighting how women and men perceive and evaluate their own situation; collecting information on a wide range of circumstances and events experienced by individuals in order to identify drivers and detractors of well-being; and highlighting inequalities between population groups. All of this applies to the entire adult population aged 18 and over, who are literate and Spanish-speaking, and who reside in urban and rural areas.
Modified: 2021-12-14
Publisher: Instituto Nacional de Estadística y Geografía, INEGI
Keyword: "actividad", "apoyo", "bienestar", "confianza", "doméstico", "encuesta", "hogar", "Indicadores", "población", "progreso", "redes", "redes", "servicio", "social", "socioeconómica", "subjetiva", "tiempo", "vulnerabilidad"
License: https://www.inegi.org.mx/inegi/terminos.html
Why this database?
I found this database very interesting because it contains 31,200 survey responses to 285 questions, such as “How satisfied are you with your life right now?”, and the answers can be related to many other variables: security, government institutions, social media use, health, household expenses, household income, type of music you listen to, violence, drug addiction, educational level... With all this data, I think you can explore some very interesting correlations.
Other DBs used
After analyzing the data, I found that the database I had chosen did not meet the requirements to be analyzed properly, so I switched to other databases that let me understand the exercises better. I used the following:
Title: Traffic Accidents in Mexico DB
Description: The traffic accident DB lists traffic accidents in different municipalities in Mexico, including accidents with injuries and deaths.
License: https://www.inegi.org.mx/inegi/terminos.html
Title: Titanic DB
Description: The "Titanic - Machine Learning from Disaster" dataset is used for a Kaggle competition. It records which passengers survived the Titanic disaster, along with their gender, class, age, fare, etc.
You can find the dataset here
Title: Diamonds DB
Description: Also found on Kaggle, and available through the seaborn library. It is a popular choice for practicing distribution analysis because it is large (approx. 54,000 rows), clean, and contains continuous variables that follow interesting, non-normal distributions.
You can find the dataset here
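As a rough illustration of what "non-normal" means in the Diamonds description, the sketch below computes the skewness of a sample: values near 0 suggest a symmetric distribution, while clearly positive values suggest a long right tail. The helper name `skewness` is my own, and the two samples are synthetic stand-ins generated with the standard library, not the real diamonds data. With the real dataset, one could pass a continuous column such as `price` from `seaborn.load_dataset("diamonds")` instead (this requires seaborn and an internet connection, so it is not run here).

```python
import random


def skewness(values):
    """Return the population skewness (third standardized moment) of a sequence."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    mean = sum(values) / n
    # Second and third central moments.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    if m2 == 0:
        raise ValueError("all values are identical; skewness is undefined")
    return m3 / m2 ** 1.5


# Synthetic stand-ins (assumption: these are NOT the diamonds columns).
rng = random.Random(0)  # fixed seed so the run is repeatable
symmetric = [rng.gauss(0.0, 1.0) for _ in range(5000)]
right_skewed = [rng.lognormvariate(0.0, 1.0) for _ in range(5000)]

print(f"normal sample skewness:     {skewness(symmetric):.2f}")
print(f"log-normal sample skewness: {skewness(right_skewed):.2f}")
```

A single number like this is only a first check; a histogram of the column is usually the more informative next step.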