Week 1: Assignment 2
Visualize your data set(s)
Data sets
The idea is to analyse which fish is landed in Vestmannaeyjar harbor and what happens to it. Which boats bring the fish in, and when?
Another line of inquiry is where the boats from Vestmannaeyjar land their fish. How much quota do the boats get? How much do they fish? Where do they land their catch? What happens to the catch: is it processed locally or exported without processing?
The amount of fish landed is listed in the column amountkg. My goal is to create a visual monthly seasonal catch analysis for each distinct species across the whole dataset. How would I do that?
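One possible starting point for the seasonal question is sketched below. Only the column name `amountkg` comes from the dataset description; the `date` and `species` columns and all the numbers are made up for illustration, so the names would need to be adapted to the real landing data.

```python
import pandas as pd

# Made-up sample rows. Only `amountkg` is a real column name from the text;
# `date` and `species` are hypothetical names for landing date and species.
landings = pd.DataFrame({
    "date": ["2023-01-15", "2023-01-20", "2023-02-03", "2024-01-10", "2024-02-11"],
    "species": ["Cod", "Cod", "Capelin", "Cod", "Capelin"],
    "amountkg": [1000, 500, 20000, 700, 30000],
})


def monthly_catch_by_species(df):
    """Sum landed kg per calendar month (1-12) and species over all years."""
    out = df.copy()
    out["month"] = pd.to_datetime(out["date"]).dt.month
    return out.pivot_table(
        index="month",
        columns="species",
        values="amountkg",
        aggfunc="sum",
        fill_value=0,
    )


seasonal = monthly_catch_by_species(landings)
print(seasonal)
```

The result has one row per month and one column per species, so something like `seasonal.plot(kind="bar")` could then draw the seasonal pattern for each species side by side.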
Here is a new dataset: Average temperature in the Nordic Capitals downloaded from https://nordicstatistics.org/areas/geography-and-climate/
import pandas as pd
# Load the dataset
df = pd.read_csv("datasets/nordicaveragetemp.csv")
# Show the first 5 rows
df.head()
| | Category | DK | FO | GL | FI | AX | IS | NO | SE |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 1874 | 7.8 | NaN | NaN | 4.8 | NaN | NaN | 5.9 | 6.0 |
| 1 | 1875 | 6.9 | NaN | NaN | 1.9 | NaN | NaN | 4.3 | 4.3 |
| 2 | 1876 | 7.1 | NaN | NaN | 3.1 | NaN | NaN | 4.6 | 4.9 |
| 3 | 1877 | 6.8 | NaN | NaN | 3.3 | NaN | NaN | 3.6 | 4.7 |
| 4 | 1878 | 7.8 | NaN | NaN | 5.2 | NaN | NaN | 5.9 | 6.2 |
Show the last 5 rows
# Show the last 5 rows
df.tail()
| | Category | DK | FO | GL | FI | AX | IS | NO | SE |
|---|---|---|---|---|---|---|---|---|---|
| 146 | 2020 | 10.7 | 7.2 | -0.7 | 8.7 | 8.5 | 5.1 | 8.9 | 9.7 |
| 147 | 2021 | 9.6 | 7.0 | 0.1 | 6.6 | 6.9 | 5.4 | 7.3 | 8.1 |
| 148 | 2022 | 10.3 | 7.3 | -1.0 | 7.3 | 7.4 | 5.1 | 8.0 | 8.8 |
| 149 | 2023 | 10.1 | 7.3 | -0.3 | 7.1 | 6.7 | 5.0 | 7.0 | 8.0 |
| 150 | 2024 | 10.5 | 7.1 | -0.7 | NaN | 7.4 | 4.3 | 7.8 | 8.9 |
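The first and last rows show that several columns start with NaN and that FI has no value for 2024. A quick way to see how much data each column actually has is sketched here on a few rows copied from the output above; the helper name `coverage` is just an illustrative choice, and the same function could be applied to the full `df`.

```python
import numpy as np
import pandas as pd

# A few rows copied from the head() and tail() output
sample = pd.DataFrame({
    "Category": [1874, 1875, 2023, 2024],
    "DK": [7.8, 6.9, 10.1, 10.5],
    "FO": [np.nan, np.nan, 7.3, 7.1],
    "FI": [4.8, 1.9, 7.1, np.nan],
})


def coverage(df, year_col="Category"):
    """Per column: number of missing values and first/last year with data."""
    indexed = df.set_index(year_col)
    return pd.DataFrame({
        "missing": indexed.isna().sum(),
        "first_year": indexed.apply(lambda s: s.first_valid_index()),
        "last_year": indexed.apply(lambda s: s.last_valid_index()),
    })


report = coverage(sample)
print(report)
```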
Now I want to visualize the data. I check out the tutorial at https://matplotlib.org/stable/users/explain/quick_start.html#quick-start
I ask ChatGPT for help: Now I want to create a visualisation using matplotlib and numpy (import matplotlib.pyplot as plt, import numpy as np). I want each Nordic country to have its own color and a continuous line, with the year on the x axis and the temperature on the y axis.
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
# Load your dataset
df = pd.read_csv("datasets/nordicaveragetemp.csv")
# Convert the Category column (years) to integers
df["Category"] = df["Category"].astype(int)
# Nordic country columns
countries = ["DK", "FO", "GL", "FI", "AX", "IS", "NO", "SE"]
# Assign a color to each country
colors = {
    "DK": "red",
    "FO": "purple",
    "GL": "blue",
    "FI": "green",
    "AX": "orange",
    "IS": "cyan",
    "NO": "black",
    "SE": "brown"
}
# Create the plot
plt.figure(figsize=(14, 7))
for country in countries:
    if country in df.columns:
        plt.plot(
            df["Category"],
            df[country],
            color=colors[country],
            label=country,
            linewidth=2
        )
plt.title("Average Annual Temperature in Nordic Countries (°C)")
plt.xlabel("Year")
plt.ylabel("Temperature (°C)")
plt.grid(True, linestyle="--", alpha=0.4)
plt.legend(title="Country", ncol=4)
plt.tight_layout()
plt.show()
Min, Max and Average temperature
Again I ask ChatGPT for further assistance.
Now I want to analyse the data and create a table of the max, min, and average temperature for each of those countries.
import pandas as pd
# Load data
df = pd.read_csv("datasets/nordicaveragetemp.csv")
# Convert years to integers
df["Category"] = df["Category"].astype(int)
# Nordic country columns
countries = ["DK", "FO", "GL", "FI", "AX", "IS", "NO", "SE"]
# Create a summary table
summary = pd.DataFrame({
    "Min °C": df[countries].min(),
    "Max °C": df[countries].max(),
    "Average °C": df[countries].mean()
})
# Display the table
summary
| | Min °C | Max °C | Average °C |
|---|---|---|---|
| DK | 5.9 | 10.7 | 7.962252 |
| FO | 4.9 | 8.1 | 6.595556 |
| GL | -4.4 | 2.6 | -1.173077 |
| FI | 1.9 | 8.7 | 5.266667 |
| AX | 3.3 | 8.5 | 6.170833 |
| IS | 2.8 | 6.1 | 4.763158 |
| NO | 3.6 | 8.9 | 5.941722 |
| SE | 4.1 | 9.7 | 6.619868 |
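One thing to keep in mind when reading this table: pandas skips NaN values by default in `min()`, `max()` and `mean()`, so columns that start later (FO, GL, AX, IS in the rows shown earlier) are averaged over fewer years than DK or SE. A minimal sketch of the same summary with an extra count column, using a small made-up subset instead of the full file, could look like this; the column label "Years with data" is my own choice.

```python
import numpy as np
import pandas as pd

# Small illustrative subset; the real data would come from the CSV
sample = pd.DataFrame({
    "DK": [7.8, 6.9, 10.5],
    "FO": [np.nan, 7.3, 7.1],
})

# agg() with several function names gives one row per statistic;
# transposing puts the countries on the rows, as in the table above
summary = sample.agg(["min", "max", "mean", "count"]).T
summary.columns = ["Min °C", "Max °C", "Average °C", "Years with data"]
print(summary)
```

With the count next to the average it is easier to judge whether two countries are really comparable over the same period.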
I ask ChatGPT:
I would like to go further and make an interactive map. How can I start with that using Jupyter Notebook?
import pandas as pd
df = pd.read_csv("datasets/nordicaveragetemp.csv")
df["Category"] = df["Category"].astype(int)
countries = ["DK", "FO", "GL", "FI", "AX", "IS", "NO", "SE"]
summary = pd.DataFrame({
    "country": countries,
    "Min": df[countries].min().values,
    "Max": df[countries].max().values,
    "Avg": df[countries].mean().round(2).values
})
summary
| | country | Min | Max | Avg |
|---|---|---|---|---|
| 0 | DK | 5.9 | 10.7 | 7.96 |
| 1 | FO | 4.9 | 8.1 | 6.60 |
| 2 | GL | -4.4 | 2.6 | -1.17 |
| 3 | FI | 1.9 | 8.7 | 5.27 |
| 4 | AX | 3.3 | 8.5 | 6.17 |
| 5 | IS | 2.8 | 6.1 | 4.76 |
| 6 | NO | 3.6 | 8.9 | 5.94 |
| 7 | SE | 4.1 | 9.7 | 6.62 |
iso_map = {
    "DK": "DNK",
    "FO": "FRO",
    "GL": "GRL",
    "FI": "FIN",
    "AX": "ALA",
    "IS": "ISL",
    "NO": "NOR",
    "SE": "SWE"
}
summary["ISO3"] = summary["country"].map(iso_map)
summary
| | country | Min | Max | Avg | ISO3 |
|---|---|---|---|---|---|
| 0 | DK | 5.9 | 10.7 | 7.96 | DNK |
| 1 | FO | 4.9 | 8.1 | 6.60 | FRO |
| 2 | GL | -4.4 | 2.6 | -1.17 | GRL |
| 3 | FI | 1.9 | 8.7 | 5.27 | FIN |
| 4 | AX | 3.3 | 8.5 | 6.17 | ALA |
| 5 | IS | 2.8 | 6.1 | 4.76 | ISL |
| 6 | NO | 3.6 | 8.9 | 5.94 | NOR |
| 7 | SE | 4.1 | 9.7 | 6.62 | SWE |
import plotly.express as px
fig = px.choropleth(
    summary,
    locations="ISO3",
    color="Avg",
    hover_name="country",
    hover_data={"Min": True, "Max": True, "Avg": True, "ISO3": False},
    color_continuous_scale="RdBu_r",
    projection="mercator",
    title="Average Temperature in Nordic Countries"
)
fig.update_geos(fitbounds="locations", visible=False)
fig.show()
I ask ChatGPT for help: I would like to add the time dimension as a slider for the years under the map, and I would like the values displayed above the countries for whichever year the slider is set to.
Prepare the dataset
import pandas as pd
df = pd.read_csv("datasets/nordicaveragetemp.csv")
df["Category"] = df["Category"].astype(int)
countries = ["DK", "FO", "GL", "FI", "AX", "IS", "NO", "SE"]
# Long format for animation frames
long_df = df.melt(
    id_vars="Category",
    value_vars=countries,
    var_name="country",
    value_name="temp"
).dropna()
# ISO codes
iso_map = {
    "DK": "DNK",
    "FO": "FRO",
    "GL": "GRL",
    "FI": "FIN",
    "AX": "ALA",
    "IS": "ISL",
    "NO": "NOR",
    "SE": "SWE"
}
long_df["ISO3"] = long_df["country"].map(iso_map)
# rough coordinates for labels
coords = {
    "DK": (10.0, 56.5),
    "FO": (-6.9, 62.0),
    "GL": (-41.0, 74.0),
    "FI": (25.0, 64.5),
    "AX": (19.9, 60.2),
    "IS": (-18.0, 65.0),
    "NO": (10.0, 62.0),
    "SE": (15.0, 62.0)
}
long_df["lon"] = long_df["country"].map(lambda x: coords[x][0])
long_df["lat"] = long_df["country"].map(lambda x: coords[x][1])
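To make the reshaping step easier to follow, here is a small sketch of what `melt` followed by `dropna` does, using two columns and two years taken from the tail of the table. The variable names `wide` and `long_sample` are only for this illustration.

```python
import numpy as np
import pandas as pd

# Wide format: one row per year, one column per country
wide = pd.DataFrame({
    "Category": [2023, 2024],
    "FI": [7.1, np.nan],
    "IS": [5.0, 4.3],
})

# Long format: one row per (year, country) pair; rows without a value are dropped
long_sample = wide.melt(
    id_vars="Category",
    value_vars=["FI", "IS"],
    var_name="country",
    value_name="temp",
).dropna()
print(long_sample)
```

Because of the `dropna()`, the FI row for 2024 disappears, which means that a year's animation frame only contains the countries that have a measurement for that year.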
Build the interactive animation
import plotly.graph_objects as go
years = sorted(long_df["Category"].unique())
# Base figure
fig = go.Figure()
# --- FRAME GENERATION ---
frames = []
for year in years:
    df_year = long_df[long_df["Category"] == year]
    frame = go.Frame(
        name=str(year),
        data=[
            # Choropleth Layer
            go.Choropleth(
                locations=df_year["ISO3"],
                z=df_year["temp"],
                colorscale="RdBu_r",
                zmin=-10, zmax=10,
                marker_line_width=0,
                colorbar_title="°C"
            ),
            # Text Label Layer
            go.Scattergeo(
                lon=df_year["lon"],
                lat=df_year["lat"],
                text=df_year["temp"].round(1),
                mode="text",
                textfont=dict(size=14, color="black")
            )
        ]
    )
    frames.append(frame)
fig.frames = frames
# --- INITIAL DATA (first year) ---
df0 = long_df[long_df["Category"] == years[0]]
fig.add_trace(go.Choropleth(
    locations=df0["ISO3"],
    z=df0["temp"],
    colorscale="RdBu_r",
    zmin=-10, zmax=10,
    marker_line_width=0,
))
fig.add_trace(go.Scattergeo(
    lon=df0["lon"],
    lat=df0["lat"],
    text=df0["temp"].round(1),
    mode="text",
    textfont=dict(size=14, color="black")
))
# --- LAYOUT ---
fig.update_geos(
    fitbounds="locations",
    visible=False
)
fig.update_layout(
    title="Nordic Temperatures by Year (Interactive Slider)",
    margin=dict(l=0, r=0, t=40, b=0),
    updatemenus=[
        dict(
            type="buttons",
            showactive=False,
            x=0.05, y=0,
            buttons=[dict(label="Play", method="animate", args=[None])]
        )
    ],
    sliders=[
        dict(
            steps=[
                dict(method="animate",
                     args=[[str(year)], {"mode": "immediate"}],
                     label=str(year))
                for year in years
            ],
            x=0.1, y=-0.05,
            currentvalue={"prefix": "Year: "}
        )
    ]
)
fig.show()