[Frosti Gíslason] - Fab Futures - Data Science

Week 1: Assignment 2

Visualize your data set(s)

Data sets

The idea is to analyse which fish are landed in Vestmannaeyjar harbor and what happens to them. Which boats bring the fish in, and when?

Another line of inquiry is where the boats from Vestmannaeyjar land their fish. How much quota do the boats get? How much do they catch? Where do they land their catch? What happens to the catch: is it processed locally or exported without processing?

  • https://island.is/v/gagnasidur-fiskistofu/gagnasidur?pageName=ReportSection81d8215e94990

Check out tutorials

I start with the Python tutorial.

  • https://docs.python.org/3/tutorial/index.html

I also check out the pandas tutorials:

  • https://pandas.pydata.org/docs/getting_started/intro_tutorials/01_table_oriented.html
  • https://pandas.pydata.org/docs/getting_started/intro_tutorials/06_calculate_statistics.html

The amount of fish landed is listed in the column amountkg. My goal is to create a visual analysis of the monthly, seasonal catch for each species across the whole dataset. How would I do that?
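
One possible way to approach this question is sketched below with pandas on a few made-up rows. Only amountkg comes from the dataset description above; the column names landing_date and species, and all the numbers, are assumptions for illustration and would need to be replaced with the real column names.

```python
import pandas as pd

# Made-up landing records. "landing_date" and "species" are assumed column
# names; only "amountkg" is named in the dataset description.
landings = pd.DataFrame({
    "landing_date": ["2023-01-15", "2023-01-20", "2023-02-03",
                     "2024-01-10", "2024-02-11", "2024-02-25"],
    "species": ["Cod", "Capelin", "Cod", "Cod", "Capelin", "Cod"],
    "amountkg": [1200, 50000, 800, 1000, 42000, 600],
})

# Split the date into year and calendar month
landings["landing_date"] = pd.to_datetime(landings["landing_date"])
landings["year"] = landings["landing_date"].dt.year
landings["month"] = landings["landing_date"].dt.month

# Total catch per species, year and month
monthly = landings.groupby(
    ["species", "year", "month"], as_index=False
)["amountkg"].sum()

# Seasonal profile: mean monthly catch across years, one column per species
seasonal = monthly.pivot_table(
    index="month", columns="species", values="amountkg", aggfunc="mean"
)
print(seasonal)
```

With matplotlib installed, seasonal.plot(marker="o") would draw one line per species with the month on the x axis. Note that a month with no landings in a given year is simply absent here rather than counted as zero, so the mean only covers years in which the species was landed that month.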

Here is a new dataset: average temperature in the Nordic capitals, downloaded from https://nordicstatistics.org/areas/geography-and-climate/

In [27]:
import pandas as pd

# Load the dataset
df = pd.read_csv("datasets/nordicaveragetemp.csv")

# Show the first 5 rows
df.head()
Out[27]:
    Category    DK    FO    GL    FI    AX    IS    NO    SE
0       1874   7.8   NaN   NaN   4.8   NaN   NaN   5.9   6.0
1       1875   6.9   NaN   NaN   1.9   NaN   NaN   4.3   4.3
2       1876   7.1   NaN   NaN   3.1   NaN   NaN   4.6   4.9
3       1877   6.8   NaN   NaN   3.3   NaN   NaN   3.6   4.7
4       1878   7.8   NaN   NaN   5.2   NaN   NaN   5.9   6.2

Show the last 5 rows

In [28]:
# Show the last 5 rows
df.tail()
Out[28]:
    Category    DK    FO    GL    FI    AX    IS    NO    SE
146     2020  10.7   7.2  -0.7   8.7   8.5   5.1   8.9   9.7
147     2021   9.6   7.0   0.1   6.6   6.9   5.4   7.3   8.1
148     2022  10.3   7.3  -1.0   7.3   7.4   5.1   8.0   8.8
149     2023  10.1   7.3  -0.3   7.1   6.7   5.0   7.0   8.0
150     2024  10.5   7.1  -0.7   NaN   7.4   4.3   7.8   8.9

Now I want to visualize the data, so I check out the Matplotlib quick start guide at https://matplotlib.org/stable/users/explain/quick_start.html#quick-start

I ask ChatGPT for help: "Now I want to create a visualisation using matplotlib and numpy (import matplotlib.pyplot as plt, import numpy as np). I want each Nordic country to have its own color and a continuous line, with the year on the x axis and the temperature on the y axis."

In [29]:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

# Load your dataset
df = pd.read_csv("datasets/nordicaveragetemp.csv")

# Convert the Category column (years) to integers
df["Category"] = df["Category"].astype(int)

# Nordic country columns
countries = ["DK", "FO", "GL", "FI", "AX", "IS", "NO", "SE"]

# Assign a color to each country
colors = {
    "DK": "red",
    "FO": "purple",
    "GL": "blue",
    "FI": "green",
    "AX": "orange",
    "IS": "cyan",
    "NO": "black",
    "SE": "brown"
}

# Create the plot
plt.figure(figsize=(14, 7))

for country in countries:
    if country in df.columns:
        plt.plot(
            df["Category"],
            df[country],
            color=colors[country],
            label=country,
            linewidth=2
        )

plt.title("Average Annual Temperature in Nordic Countries (°C)")
plt.xlabel("Year")
plt.ylabel("Temperature (°C)")
plt.grid(True, linestyle="--", alpha=0.4)
plt.legend(title="Country", ncol=4)
plt.tight_layout()
plt.show()
[Figure: line chart "Average Annual Temperature in Nordic Countries (°C)", x axis: Year, y axis: Temperature (°C), one colored line per country]

Min, Max and Average temperature

Again I ask ChatGPT for further assistance:

Now I want to analyse the data and create a table of the max, min, and average temperature in each of those countries.

In [30]:
import pandas as pd

# Load data
df = pd.read_csv("datasets/nordicaveragetemp.csv")

# Convert years to integers
df["Category"] = df["Category"].astype(int)

# Nordic country columns
countries = ["DK", "FO", "GL", "FI", "AX", "IS", "NO", "SE"]

# Create a summary table
summary = pd.DataFrame({
    "Min °C": df[countries].min(),
    "Max °C": df[countries].max(),
    "Average °C": df[countries].mean()
})

# Display the table
summary
Out[30]:
    Min °C  Max °C  Average °C
DK     5.9    10.7    7.962252
FO     4.9     8.1    6.595556
GL    -4.4     2.6   -1.173077
FI     1.9     8.7    5.266667
AX     3.3     8.5    6.170833
IS     2.8     6.1    4.763158
NO     3.6     8.9    5.941722
SE     4.1     9.7    6.619868
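
One thing to keep in mind when reading this table: pandas skips NaN values by default in min(), max() and mean(), and the first rows of the dataset show that FO, GL, AX and IS have no values in the earliest years. The averages therefore cover different periods for different countries. The sketch below, on a small made-up frame in the same shape as the temperature data, shows one way to compare averages over only the years that all countries share. The numbers are invented for illustration.

```python
import numpy as np
import pandas as pd

# Made-up frame shaped like the temperature data (years in "Category")
temps = pd.DataFrame({
    "Category": [1900, 1901, 1902, 1903],
    "DK": [7.0, 7.5, 8.0, 8.5],
    "IS": [np.nan, np.nan, 4.0, 5.0],
})
countries = ["DK", "IS"]

# Default behaviour: each column is averaged over its own non-missing years
full_mean = temps[countries].mean()

# How many years stand behind each of those averages
n_years = temps[countries].count()

# Keep only the years where every country has a value, then average
common = temps.dropna(subset=countries)
common_mean = common[countries].mean()

print(pd.DataFrame({
    "Years": n_years,
    "Mean (all years)": full_mean,
    "Mean (common years)": common_mean,
}))
```

In this toy example the DK average rises when restricted to the common years, because the earlier, colder years are dropped. Whether the same effect matters for the real dataset would need to be checked by running it on the actual file.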

I ask ChatGPT:

I would like to go further and make an interactive map. How can I start with that using Jupyter Notebook?

In [31]:
import pandas as pd

df = pd.read_csv("datasets/nordicaveragetemp.csv")
df["Category"] = df["Category"].astype(int)

countries = ["DK", "FO", "GL", "FI", "AX", "IS", "NO", "SE"]

summary = pd.DataFrame({
    "country": countries,
    "Min": df[countries].min().values,
    "Max": df[countries].max().values,
    "Avg": df[countries].mean().round(2).values
})

summary
Out[31]:
   country   Min   Max    Avg
0       DK   5.9  10.7   7.96
1       FO   4.9   8.1   6.60
2       GL  -4.4   2.6  -1.17
3       FI   1.9   8.7   5.27
4       AX   3.3   8.5   6.17
5       IS   2.8   6.1   4.76
6       NO   3.6   8.9   5.94
7       SE   4.1   9.7   6.62
In [32]:
iso_map = {
    "DK": "DNK",
    "FO": "FRO",
    "GL": "GRL",
    "FI": "FIN",
    "AX": "ALA",
    "IS": "ISL",
    "NO": "NOR",
    "SE": "SWE"
}

summary["ISO3"] = summary["country"].map(iso_map)
summary
Out[32]:
   country   Min   Max    Avg  ISO3
0       DK   5.9  10.7   7.96   DNK
1       FO   4.9   8.1   6.60   FRO
2       GL  -4.4   2.6  -1.17   GRL
3       FI   1.9   8.7   5.27   FIN
4       AX   3.3   8.5   6.17   ALA
5       IS   2.8   6.1   4.76   ISL
6       NO   3.6   8.9   5.94   NOR
7       SE   4.1   9.7   6.62   SWE
In [38]:
import plotly.express as px

fig = px.choropleth(
    summary,
    locations="ISO3",
    color="Avg",
    hover_name="country",
    hover_data={"Min": True, "Max": True, "Avg": True, "ISO3": False},
    color_continuous_scale="RdBu_r",
    projection="mercator",
    title="Average Temperature in Nordic Countries"
)

fig.update_geos(fitbounds="locations", visible=False)

fig.show()
[Figure: choropleth map "Average Temperature in Nordic Countries", colored by average temperature]

I ask ChatGPT for help: I would like to add the time dimension as a slider for the years under the map, and I would like the data displayed above each country for the year the slider is on.

Prepare the dataset

In [36]:
import pandas as pd

df = pd.read_csv("datasets/nordicaveragetemp.csv")
df["Category"] = df["Category"].astype(int)

countries = ["DK", "FO", "GL", "FI", "AX", "IS", "NO", "SE"]

# Long format for animation frames
long_df = df.melt(
    id_vars="Category",
    value_vars=countries,
    var_name="country",
    value_name="temp"
).dropna()

# ISO codes
iso_map = {
    "DK": "DNK",
    "FO": "FRO",
    "GL": "GRL",
    "FI": "FIN",
    "AX": "ALA",
    "IS": "ISL",
    "NO": "NOR",
    "SE": "SWE"
}

long_df["ISO3"] = long_df["country"].map(iso_map)

# rough coordinates for labels
coords = {
    "DK": (10.0, 56.5),
    "FO": (-6.9, 62.0),
    "GL": (-41.0, 74.0),
    "FI": (25.0, 64.5),
    "AX": (19.9, 60.2),
    "IS": (-18.0, 65.0),
    "NO": (10.0, 62.0),
    "SE": (15.0, 62.0)
}

long_df["lon"] = long_df["country"].map(lambda x: coords[x][0])
long_df["lat"] = long_df["country"].map(lambda x: coords[x][1])
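
To make the reshaping step above easier to follow, here is a minimal sketch of what melt does, using a two-year, two-country excerpt of the values shown earlier. The names wide and long are only used for this illustration.

```python
import pandas as pd

# Wide format: one row per year, one column per country
wide = pd.DataFrame({
    "Category": [2023, 2024],
    "DK": [10.1, 10.5],
    "FI": [7.1, None],   # FI has no value for 2024 in the dataset
})

# Long format: one row per (year, country) pair; rows without a value are dropped
long = wide.melt(
    id_vars="Category",
    value_vars=["DK", "FI"],
    var_name="country",
    value_name="temp",
).dropna()

print(long)
```

The long table has one row for each year and country that has a measurement, which is the shape needed to filter by year when building one animation frame per year.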

Build the interactive animation

In [37]:
import plotly.graph_objects as go

years = sorted(long_df["Category"].unique())

# Base figure
fig = go.Figure()

# --- FRAME GENERATION ---

frames = []

for year in years:
    df_year = long_df[long_df["Category"] == year]

    frame = go.Frame(
        name=str(year),
        data=[
            # Choropleth Layer
            go.Choropleth(
                locations=df_year["ISO3"],
                z=df_year["temp"],
                colorscale="RdBu_r",
                zmin=-10, zmax=10,
                marker_line_width=0,
                colorbar_title="°C"
            ),
            # Text Label Layer
            go.Scattergeo(
                lon=df_year["lon"],
                lat=df_year["lat"],
                text=df_year["temp"].round(1),
                mode="text",
                textfont=dict(size=14, color="black")
            )
        ]
    )

    frames.append(frame)

fig.frames = frames

# --- INITIAL DATA (first year) ---

df0 = long_df[long_df["Category"] == years[0]]

fig.add_trace(go.Choropleth(
    locations=df0["ISO3"],
    z=df0["temp"],
    colorscale="RdBu_r",
    zmin=-10, zmax=10,
    marker_line_width=0,
    colorbar_title="°C"  # same title as in the frames, so it shows before playback
))

fig.add_trace(go.Scattergeo(
    lon=df0["lon"],
    lat=df0["lat"],
    text=df0["temp"].round(1),
    mode="text",
    textfont=dict(size=14, color="black")
))

# --- LAYOUT ---

fig.update_geos(
    fitbounds="locations",
    visible=False
)

fig.update_layout(
    title="Nordic Temperatures by Year (Interactive Slider)",
    margin=dict(l=0, r=0, t=40, b=0),
    updatemenus=[
        dict(
            type="buttons",
            showactive=False,
            x=0.05, y=0,
            buttons=[dict(label="Play", method="animate", args=[None])]
        )
    ],
    sliders=[
        dict(
            steps=[
                dict(method="animate",
                     args=[[str(year)], {"mode": "immediate"}],
                     label=str(year))
                for year in years
            ],
            x=0.1, y=-0.05,
            currentvalue={"prefix": "Year: "}
        )
    ]
)

fig.show()
[Figure: interactive map "Nordic Temperatures by Year (Interactive Slider)" with a Play button and a year slider]