Paul Wensveen - Fab Futures - Data Science

Week 1: Introduction and data visualisation

Description of data set

My data set was collected using an animal-borne multisensor tag, known as the Mixed-DTAG, that was attached to a killer whale in Vestmannaeyjar on 30 June 2023 (deployment ID: oo23_181a). Data from the tag were pre-processed and downsampled in MATLAB to create a multivariate time series at 1 Hz resolution.

Reference for data set:

  • Selbmann et al. (in review). Aversive behavioural responses of killer whales to sounds of long-finned pilot whales. Sci. Rep.

Variables

  • Time in UTC
  • Latitude in decimal degrees
  • Longitude in decimal degrees
  • Depth in m
  • Pitch in radians
  • Roll in radians
  • Heading in radians (relative to true North)
  • Acceleration in g, x
  • Acceleration in g, y
  • Acceleration in g, z

Load data set

In [2]:
import numpy as np 
#data = np.genfromtxt("datasets/data_oo23_181a.csv", delimiter=",", names=True)
#print(data)

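Time stamps were not imported properly: np.genfromtxt reads every column as a float by default, so the date strings in column t come back as nan. Let's try pandas instead.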

In [3]:
import pandas as pd
import math
df = pd.read_csv("datasets/data_oo23_181a.csv", sep=",") # import as data frame
#print(df) 

# Keep every 10th row for speed (row slicing, giving 10 s spacing) and reset index
df = df.iloc[::10].reset_index(drop=True)
print(df) 

# Convert radians to degrees
df[['pitch','roll','head']] = df[['pitch','roll','head']]/math.pi*180
                        t        lat        lon     depth     pitch      roll  \
0     30/06/2023 15:51:28  63.358561 -20.375201  0.211505  0.221146  0.055093   
1     30/06/2023 15:51:38  63.358584 -20.374982  1.183622 -0.021102 -0.000687   
2     30/06/2023 15:51:48  63.358596 -20.374764  0.154974  0.110676 -0.033282   
3     30/06/2023 15:51:58  63.358589 -20.374552  1.213925  0.137001  0.684219   
4     30/06/2023 15:52:08  63.358581 -20.374357  0.262711 -0.525213 -0.074451   
...                   ...        ...        ...       ...       ...       ...   
8887  01/07/2023 16:32:38  63.498557 -21.444829  9.743754  0.614204 -0.027725   
8888  01/07/2023 16:32:48  63.498464 -21.444778  4.602014 -0.222737  3.024607   
8889  01/07/2023 16:32:58  63.498305 -21.445001  2.618553 -0.111240 -3.014576   
8890  01/07/2023 16:33:08  63.498140 -21.445293  2.097909 -0.155277 -2.432014   
8891  01/07/2023 16:33:18  63.497971 -21.445446  0.765376  0.368527 -0.227269   

          head        ax        ay        az  
0    -0.839003  0.204646  0.050123  0.908871  
1    -0.088148 -0.020562 -0.000670  0.974270  
2     0.213532  0.107333 -0.032139  0.965299  
3    -0.022524  0.138110  0.633191  0.776290  
4     0.038756 -0.473372 -0.060760  0.814592  
...        ...       ...       ...       ...  
8887 -2.709003  0.600632 -0.023611  0.851398  
8888  1.795635 -0.230087  0.118570 -1.008914  
8889  2.012999 -0.107815 -0.122270 -0.957441  
8890  2.716053 -0.154473 -0.642924 -0.748633  
8891  2.761948  0.357027 -0.208314  0.900760  

[8892 rows x 10 columns]
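
The t column in the output above is still plain text in day-first format (for example 30/06/2023 15:51:28). A minimal sketch of how it could be parsed with pd.to_datetime and an explicit format string, so that elapsed time comes from the time stamps themselves rather than from assumed row spacing. The small frame below is only a stand-in that reuses three time stamps from the printed output, and elapsed_h is a hypothetical column name.

```python
import pandas as pd

# Small stand-in for df, using time stamps copied from the printed output
sample = pd.DataFrame({"t": ["30/06/2023 15:51:28",
                             "30/06/2023 15:51:38",
                             "01/07/2023 16:33:18"]})

# Parse day-first time stamps with an explicit format (avoids month/day mix-ups)
sample["t"] = pd.to_datetime(sample["t"], format="%d/%m/%Y %H:%M:%S")

# Elapsed time in hours since the first sample (hypothetical column name)
sample["elapsed_h"] = (sample["t"] - sample["t"].iloc[0]).dt.total_seconds() / 3600
print(sample)
```

On the full data frame the same two lines would apply to df["t"], and the elapsed hours could replace the hand-built time axis in the plots, assuming all rows share this format.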

Visualise data

Depth, pitch and roll plots

In [4]:
import numpy as np 
import matplotlib.pyplot as plt

# Convert to numpy array but skip the time stamps in column 0
#data = np.array(df) # use only with numerical data
data = df.iloc[:, 1:].to_numpy(dtype=float) 

t = np.arange(data.shape[0])/3600*10  # time in hours since start

# Plot dive profile
plt.figure(figsize=(6, 4))
plt.subplot(3,1,1)
plt.plot(t, -data[:,2], '-', linewidth = .5)
plt.ylabel('Depth (m)')

# Plot pitch and roll angles
plt.subplot(3,1,2)
plt.plot(t, data[:,3], 'g-', linewidth = .5)
plt.ylabel('Pitch (deg)')

plt.subplot(3,1,3)
plt.plot(t, data[:,4], 'r-', linewidth = .5)
plt.xlabel('Time (h)')
plt.ylabel('Roll (deg)')
plt.tight_layout()
#plt.show()

The same variables over a shorter period, hours 21 to 23 of the deployment. It would be nice to try interactive plotting in the future.

In [5]:
# Zoom in on hours 21 to 23 (360 rows per hour at 10 s spacing)
idx = np.arange(21*360,23*360)

# Plot dive profile
plt.figure(figsize=(6, 4))
plt.subplot(3,1,1)
plt.plot(t[idx], -data[idx,2], '-', linewidth = .5)
plt.ylabel('Depth (m)')

# Plot pitch and roll angles
plt.subplot(3,1,2)
plt.plot(t[idx], data[idx,3], 'g-', linewidth = .5)
plt.ylabel('Pitch (deg)')

plt.subplot(3,1,3)
plt.plot(t[idx], data[idx,4], 'r-', linewidth = .5)
plt.xlabel('Time (h)')
plt.ylabel('Roll (deg)')
plt.tight_layout()
#plt.show()

Track plot

I asked ChatGPT:

  • what's a nice way to plot an animal track using matplotlib
  • actually format is lat/long so I need different scaling

and adapted the code based on that.

In [5]:
# Note: uses a simple conversion from lat/lon to x,y that is OK for short tracks

# reference latitude (mean latitude of the track)
lat0 = np.mean(data[:,0])

# approximate meters per degree
meters_per_deg_lat = 111_320
meters_per_deg_lon = 111_320 * np.cos(np.radians(lat0))

# convert to Cartesian coordinates
x = (data[:,1] - data[0,1]) * meters_per_deg_lon
y = (data[:,0] - data[0,0]) * meters_per_deg_lat

# Plot the track
pmin = 25 # min depth (m) for coloured points
deep = data[:,2] > pmin # mask of samples deeper than pmin
plt.figure(figsize=(10, 6))
plt.scatter(x[deep]/1000, y[deep]/1000, c=data[deep,2], cmap="jet", s=30)
plt.plot(x/1000, y/1000, 'k-', linewidth=1)
plt.axis('equal')     # keeps scale consistent
plt.title("Track of killer whale oo23_181a")
plt.xlabel("x (km)")
plt.ylabel("y (km)")
plt.grid(True, alpha=0.3)
plt.colorbar(label="depth (m)")
plt.show()
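
To get a feel for how good the simple lat/lon conversion is over a track of this length, here is a rough check that compares the flat-map distance between the first and last positions with a great-circle (haversine) distance. This is only a sketch: the helper names to_local_xy and haversine_m are made up for illustration, the Earth radius of 6371 km is an assumed spherical mean, and only the two end points from the printed data frame are used, so the mean latitude differs slightly from the one computed over the whole track.

```python
import numpy as np

def to_local_xy(lat, lon):
    """Equirectangular approximation: degrees to metres relative to the first fix."""
    lat = np.asarray(lat, dtype=float)
    lon = np.asarray(lon, dtype=float)
    m_per_deg_lat = 111_320
    m_per_deg_lon = 111_320 * np.cos(np.radians(np.mean(lat)))
    return (lon - lon[0]) * m_per_deg_lon, (lat - lat[0]) * m_per_deg_lat

def haversine_m(lat1, lon1, lat2, lon2, radius_m=6_371_000):
    """Great-circle distance in metres on a sphere (assumed mean Earth radius)."""
    p1, p2 = np.radians(lat1), np.radians(lat2)
    dphi = p2 - p1
    dlmb = np.radians(lon2 - lon1)
    a = np.sin(dphi / 2) ** 2 + np.cos(p1) * np.cos(p2) * np.sin(dlmb / 2) ** 2
    return 2 * radius_m * np.arcsin(np.sqrt(a))

# First and last positions copied from the printed data frame
lat = np.array([63.358561, 63.497971])
lon = np.array([-20.375201, -21.445446])

x, y = to_local_xy(lat, lon)
d_flat = np.hypot(x[-1], y[-1])                       # straight line on the flat map
d_sphere = haversine_m(lat[0], lon[0], lat[1], lon[1])  # great-circle distance
print(f"flat: {d_flat/1000:.2f} km, great-circle: {d_sphere/1000:.2f} km")
```

If the two distances agree to within about a percent, the flat approximation is probably fine for a plot like this one; for longer tracks a proper map projection would be the safer choice.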

Pair plot

Here's a fun plot to inspect the data. Orange represents depths greater than 25 m. Is there more specific acceleration during deep dives?

In [6]:
import seaborn as sns  # Load seaborn library

# Create categorical variable with deep vs shallow
df = df.assign(deep=df['depth'] > pmin)  # boolean Series: True where depth exceeds pmin
df["deep"] = df["deep"].map({True: ">25m", False: "<=25m"}).astype("category")

#print(df.columns) # column names

# Pair Plot
sns.pairplot(df[['depth','pitch','roll','head','ax','ay','az','deep']], hue='deep', corner=True)
plt.show()
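
One crude way to put a number on the specific acceleration question is to compare the deviation of the accelerometer norm from 1 g between the two depth classes. This proxy is sometimes called minimum specific acceleration (MSA); treating it as a stand-in for specific acceleration is an assumption here, and at 10 s spacing it will miss most of the fast body movements. The sketch below uses a tiny synthetic frame with made-up values rather than the tag data, and msa is a hypothetical column name.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for df (values are made up, not from the tag)
demo = pd.DataFrame({
    "depth": [2.0, 10.0, 40.0, 60.0],
    "ax": [0.0, 0.1, 0.3, -0.4],
    "ay": [0.0, 0.0, 0.2, 0.1],
    "az": [1.0, 0.98, 1.1, 0.7],
})

# Proxy for specific acceleration: absolute deviation of the norm from 1 g
norm = np.sqrt(demo["ax"]**2 + demo["ay"]**2 + demo["az"]**2)
demo["msa"] = (norm - 1.0).abs()

# Same deep vs shallow split as in the pair plot (threshold of 25 m)
demo["deep"] = np.where(demo["depth"] > 25, ">25m", "<=25m")
summary = demo.groupby("deep")["msa"].agg(["median", "mean", "count"])
print(summary)
```

Applied to the real data frame, a clearly higher median in the deep class would support the impression from the pair plot, although the 1 Hz series before decimation would be the better input for this.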