[Pablo Nuñez] - Fab Futures 2025 - Data Science

Class 2: Showing data visualization

In this first class Neil introduced us to the world of data science, explaining how understanding and interpreting data can make a huge difference in different areas of life, from tracing the spread of cholera in London to discovering the Higgs boson as a very small deviation from the expected curve in a plot of data.

So our goal in this first class is to understand the variety of uses that data can have today. Right now everybody is collecting data: from our surroundings, online services, governments and private companies. Some of it is available to the public... but we don't know what to do with it.

Taking this data and making something with it is the goal of this course: learning the tools and resources to deal with it. So, let's start!

Starting example

First of all I'm going to try to visualize some data downloaded from a website. I'm going to use the JSON file, and I will follow the example from this website: https://www.w3schools.com/python/pandas/pandas_json.asp

I'm going to download a dataset from datos.gob.es, a Spanish government open data initiative.

The data I will download is the ageing index of the population from 1975 to 2024: https://datos.gob.es/es/catalogo/ea0010587-indice-de-envejecimiento-idb-identificador-api-1418

Then let's visualize it:

In [8]:
import pandas as pd  # pd is the conventional alias for pandas

# load the dataset into the variable df
df = pd.read_csv('datasets/1418.csv')

df.head(10)  # show the first 10 rows of the data
Out[8]:
Totales Territoriales;Periodo;Total
Total Nacional;2024;142 35.0
Total Nacional;2023;137 33.0
Total Nacional;2022;133 64.0
Total Nacional;2021;129 16.0
Total Nacional;2020;125 82.0
Total Nacional;2019;123 NaN
Total Nacional;2018;120 56.0
Total Nacional;2017;118 36.0
Total Nacional;2016;116 33.0
Total Nacional;2015;114 69.0
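
The output above ends up with everything squeezed into one column. That suggests the file separates fields with semicolons and writes decimals with a comma, so the default comma separator of `read_csv` cuts each number in two. Here is a minimal sketch of how those two options could be passed. The sample text is built from three of the rows shown above, and restoring the decimal comma (for example `142,35`) is my assumption about what the original file contains.

```python
import io

import pandas as pd

# Sample rows rebuilt from the output above; the decimal commas are assumed.
sample = """Totales Territoriales;Periodo;Total
Total Nacional;2024;142,35
Total Nacional;2023;137,33
Total Nacional;2019;123
"""

# sep=';' splits the fields on semicolons,
# decimal=',' reads "142,35" as the number 142.35.
df_sample = pd.read_csv(io.StringIO(sample), sep=';', decimal=',')

print(df_sample.dtypes)  # data type of each of the three columns
print(df_sample)
```

If the assumption holds, the same two arguments should work on the real file: `pd.read_csv('datasets/1418.csv', sep=';', decimal=',')`.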

Visualizing example

Base cartography of Spain in shapefile format, from the cartography institute's download center (CNIG): https://centrodedescargas.cnig.es/CentroDescargas/catalogo.do?Serie=CAANE

The download contains several files (.dbf, .prj and so on), so I made a folder called "espana" inside the datasets folder and uploaded all the files there.

Now I need to install geopandas on my server, so let's try something like: conda install -c conda-forge geopandas

Here is the prompt I gave Gemini to ask for a way to plot a map from this data: "jupyter visualizar mapa españa shapefile" (roughly "jupyter visualize map Spain shapefile").

It gave me this code:

    import geopandas as gpd
    file_path = './datasets/espana/ee89_14_admin_pais_a.shp'
    mapa_es = gpd.read_file(file_path)
    mapa_es.plot()

It shows a map of the whole world, so let's see if I can adjust it to show only Spain:

In [1]:
import pandas as pd

df = pd.read_csv('datasets/data.csv')
print(df.to_string())  # display all rows and columns of the DataFrame as a string
---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
Cell In[1], line 1
----> 1 import geopandas as gpd
      2 file_path = './datasets/espana/ee89_14_admin_pais_a.shp'
      3 mapa_es = gpd.read_file(file_path)

ModuleNotFoundError: No module named 'geopandas'
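
The traceback above means the Python environment running the notebook kernel cannot find geopandas. This can happen when the package was installed into a different environment than the one the kernel uses. Here is a small sketch, using only the standard library, to check which interpreter the kernel runs and whether a module can be found. The helper name `is_installed` is my own, not part of any library.

```python
import importlib.util
import sys


def is_installed(module_name):
    """Return True if the module can be found, without importing it."""
    return importlib.util.find_spec(module_name) is not None


# The interpreter path shows which environment the kernel is using.
print("Python used by this kernel:", sys.executable)

for name in ("pandas", "geopandas"):
    status = "installed" if is_installed(name) else "missing"
    print(name, "->", status)
```

If geopandas shows up as missing, it needs to be installed into the environment at that interpreter path, for example with the conda command mentioned earlier run inside that same environment.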

Now let's try another example I found on the internet, because I want to show a more detailed map of Spain, with the provinces in different colours. I tried the tutorial Gemini showed me for this prompt: "jupyter plot un mapa de españa conlas provincias" (roughly "jupyter plot a map of Spain with the provinces"). First of all I need a map of Spain divided by provinces, which you can download here: https://gist.github.com/josemamira/3af52a4698d42b3f676fbc23f807a605?short_path=45ec3d9

In [21]:
import geopandas as gpd
import matplotlib.pyplot as plt

# Load the GeoJSON file with the provinces of Spain
provincias = gpd.read_file("./datasets/provincias_spain.geojson")

fig, ax = plt.subplots(figsize=(12, 8))
provincias.plot(ax=ax, edgecolor='black', linewidth=0.5, cmap='Set3')
ax.set_title('Map of spain with provinces', fontsize=16)
ax.set_axis_off()
plt.show()
[Figure: Map of Spain with provinces]
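
In the cell above the colours only tell the provinces apart, they do not represent any data. To colour each region by a value, one possible approach is to join a table of values to the geometry and pass the `column` argument to `plot`. This is only a sketch under assumptions: three squares stand in for the real province polygons, and the `name` and `index_value` columns and their numbers are made up for illustration (I have not checked which columns the real GeoJSON has).

```python
import geopandas as gpd
import matplotlib.pyplot as plt
import pandas as pd
from shapely.geometry import box

# Stand-in geometry: three squares instead of real province polygons.
shapes = gpd.GeoDataFrame(
    {"name": ["A", "B", "C"]},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1), box(2, 0, 3, 1)],
    crs="EPSG:4326",
)

# Made-up values to colour by (not real statistics).
values = pd.DataFrame(
    {"name": ["A", "B", "C"], "index_value": [90.0, 120.0, 150.0]}
)

# Join the table to the geometry on the shared "name" column.
joined = shapes.merge(values, on="name", how="left")

# column= tells geopandas which values drive the colour scale.
fig, ax = plt.subplots(figsize=(8, 4))
joined.plot(column="index_value", cmap="OrRd", legend=True,
            edgecolor="black", linewidth=0.5, ax=ax)
ax.set_title("Stand-in choropleth")
ax.set_axis_off()
```

With the real files, `shapes` would be the `provincias` GeoDataFrame and `values` a table with one row per province, joined on whichever column holds the province name or code in both.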

Now let's use a Plotly example to show data on a map.

In [22]:
import plotly.graph_objects as go

import pandas as pd

df = pd.read_csv('https://raw.githubusercontent.com/plotly/datasets/master/2014_us_cities.csv')
df.head()

df['text'] = df['name'] + '<br>Population ' + (df['pop']/1e6).astype(str)+' million'
limits = [(0,3),(3,11),(11,21),(21,50),(50,3000)]
colors = ["royalblue","crimson","lightseagreen","orange","lightgrey"]
cities = []
scale = 5000

fig = go.Figure()

for i in range(len(limits)):
    lim = limits[i]
    df_sub = df[lim[0]:lim[1]]
    fig.add_trace(go.Scattergeo(
        locationmode = 'USA-states',
        lon = df_sub['lon'],
        lat = df_sub['lat'],
        text = df_sub['text'],
        marker = dict(
            size = df_sub['pop']/scale,
            color = colors[i],
            line_color='rgb(40,40,40)',
            line_width=0.5,
            sizemode = 'area'
        ),
        name = '{0} - {1}'.format(lim[0],lim[1])))

fig.update_layout(
        title_text = '2014 US city populations<br>(Click legend to toggle traces)',
        showlegend = True,
        geo = dict(
            scope = 'usa',
            landcolor = 'rgb(217, 217, 217)',
        )
    )

fig.show()
[Figure: 2014 US city populations]
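
One detail in the cell above that is easy to miss: `df[lim[0]:lim[1]]` selects rows by position, not by population. The `limits` list therefore describes ranges of rows (for example the first three rows), which only correspond to "the biggest cities" if the file is sorted by population. Here is a small sketch of the difference, using a made-up table of five cities rather than the real dataset.

```python
import pandas as pd

# Made-up data, already sorted from largest to smallest population.
cities_demo = pd.DataFrame({
    "name": ["A", "B", "C", "D", "E"],
    "pop": [500, 400, 300, 200, 100],
})

# Positional slicing, as in the Plotly example: rows 0-1, then rows 2-4.
demo_limits = [(0, 2), (2, 5)]
for start, stop in demo_limits:
    group = cities_demo[start:stop]
    print(f"rows {start} to {stop - 1}:", list(group["name"]))

# Selecting by value instead: cities with a population from 250 to 600.
by_value = cities_demo[cities_demo["pop"].between(250, 600)]
print("pop between 250 and 600:", list(by_value["name"]))
```

To adapt the example to other data, the table should either be sorted first (for example with `sort_values`) or the groups should be selected by value as in the last lines.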

LINKS

How to make a map with Python:
https://geopandas.org/en/stable/docs/user_guide/mapping.html
https://plotly.com/python/maps/
