A first look at ApRES data

A first look at ApRES data#

Earlier in this chapter we looked at the theory behind the autonomous phase-sensitive radio-echo sounder (ApRES), developed by the British Antarctic Survey (BAS). We have seen how it is the frequency content of the signal recorded by ApRES which tells us the range to sub-surface reflectors and how to use a fourier transform to estimate this frequency content and estimate the range to reflectors.

Below we will load and examine ApRES data collected on Thwaites Glacier as part of the International Thwaites Glacier Collaboration (ITGC).

Load packages#

As well as numpy and matplotlib, we will use a package called xapres which was developed to load and process ApRES data in python. xapres is built around xarray, a package for working with multi-dimensional arrays.

import numpy as np
import matplotlib.pyplot as plt
import os
import math
import sys
#sys.path.append("/Users/jkingslake/Documents/science/ApRES/xapres/")
import xapres as xa

The data#

The data we will use was collected on Thwaites Glacier, West Antarctica, by Elizabeth Case as part of the GHOST overland traverse in the 2022-2023 field season. Follow up data was collected Alex Brisbourne and Sierra Melton in 2023-2024. Part of these ApRES surveys was a line of measurements taken along a profile approximately aligned with flow down the middle of the glacier.

As described in previous pages, ApRES data is collected in ‘chirps’, each one second in length. Generally, many chirps are recorded repeatedly at each location and together these are known as a burst.

At each of ~250 waypoints, Elizabeth collected a burst consisting of 100 chirps.

ApRES data is stored in a binary format with the file extension .dat. The xapres package has tools to load this data format.

Plot the chirps#

Plotting one of these chirps, we get something superficially reminiscent of the synthetic chirp we generated earlier in this chapter using 10 randomly placed reflectors.

chirp.isel(chirp_num=0).plot(x = 'chirp_time', figsize = (15, 5))
plt.title("The first chirp from this burst");

../../../_images/75594a167a71503ff6545c9961edfdf089e09e06f7cf5da6c6e812b3002adbcd.png

We can also plot all 100 chirps on the same plot to see how they compare. There are at least two way to do this. One is to plot each chirp as a line on the same plot:

chirp.plot(x = 'chirp_time', hue = 'chirp_num',  add_legend = False, figsize = (15, 5))
plt.title("All the chirps from this burst");

../../../_images/fd42449fb031217f646a54b6e9d0161ea6bcddb03608037cc343465572a89767.png

another is to plot a 2D plot with chirp number on one axis and chirp time on the other:

chirp.plot(figsize = (15, 5))
plt.title("All the chirps from this burst");

../../../_images/665ecdaf2859d4d3425ad406336282a6fcaba70ca14f013f995b7039c4fec890.png

It is evident just from looking at either of the two plots above that (1) there is a high degree of agreement between the chirps, and (2) some frequencies are particularly prominent.

Agreement between chirps#

Let’s zoom in on the line plot above to examine the agreement between chirps.

chirp.plot(x = 'chirp_time', hue = 'chirp_num',  add_legend = False, figsize = (15, 5), xlim = (0, 0.1))
plt.title('All the chirps from this burst');

../../../_images/7fcd9fde4bd4ed69bc9d1948a8dc03de4c3787faff893b3f33eca971cbf62bf3.png

At this scale its clear that the chirps are similar, particularly at long periods, but not identical.

More useful still is to compute the deviation from the mean chirp.

mean_chirp = chirp.mean(dim='chirp_num')
(chirp - mean_chirp).plot(x = 'chirp_time', hue = 'chirp_num', add_legend = False, figsize = (15, 5))
plt.title('Chirps with the mean removed');

../../../_images/1c779085bd705b655abcdd573d0d39ac71e717ce61645789aae32a32ed4ec077.png

This highlights how the deviations are an order of magnitude smaller than the signal and relatively high frequency. This is good news, as these chirps were all recorded at the same location within a 100-second interval, so we would hope for significant agreement.

A prominent frequency#

To examine the second point above, as a very approximate estimate, you can manually count the peaks (or the horizontal lines in the lower plot). Doing this I counted around 25 peaks in the first 0.2 seconds of the chirp. This suggests a frequency of around 125 Hz.

From earlier in this chapter, the range \(R\) is related to the frequency \(f_d\) by

\[ R = \frac{c}{2\sqrt\epsilon} \frac{f_d}{K}, \]

where \(c\) is the speed of light, \(\epsilon\) is the permittivity of the ice, and \(K\) is the rate of change of frequency during the chirps. This can be understood by considering that the delay in the received signal returning to the radar is the two-way travel time \(\tau = 2R\frac{\sqrt\epsilon}{c}\) and because \(K\) is constant, this delay is linearly proportional to the difference in the transmitted and received frequencies: \(f_d = K\tau.\) eliminating \(\tau\) between these two expressions gives the equation above.

Using this equation for \(R\), a frequency of 125Hz corresponds to a range of around 53 m:

T = 1               # chirp duration [s]
f_1 = 200e6         # starting frequency [Hz]
f_2 = 400e6         # ending frequency [Hz]
B = f_2 - f_1       # bandwidth [Hz]
K = B/T             # rate of chnge of frequency [Hz/s]
c = 300000000.0     # speed of lightin a vacuum [m/s]
ep = 3.18           # permittivity of ice

def range(frequencies):
    return c * frequencies / (2*np.sqrt(ep)*K)

range(125)

52.57238944612915

So we will know to look out for a reflector at around 53 m depth in the amplitude plot we will make next.

Plot the power#

As discussed, the frequency content of the chirps contains infomation on the locations of sub-surface reflectors. We can estimate this frequency content using a fourier transform, and then plot the result as a function of range to produce something similar to the time series of return strength you would record with a conventional radar system.

Initially we will use the same fourier transform code as we used earlier in the chapter when playing with synthetic data.

First let’s extract just one chirp on which to perform the fourier transform.

s_one_chirp = chirp.isel(chirp_num=0).values.squeeze()  # again, .squeeze() removes unnecessary singleton dimensions)

Then we define a function that takes a chirp, performs an fft on it, and returns the result and the frequencies of the bins.

Note

It is very nearly identical to the code used in Autonomous phase-sensitive radio-echo sounder theory) except for the addition of , axis=0 in the call to np.fft.fft. This has no effect except to allows us to use the same function to process every chirp in one go later on.

dt = chirp.chirp_time.values[1] - chirp.chirp_time.values[0]
sampling_frequency = 1/dt   # [Hz]
def fft(s):
    no_of_samples = len(s)
    S = np.fft.fft(s, axis=0)/no_of_samples         
    indexes      = np.arange(no_of_samples) 
    frequencies  = indexes * sampling_frequency/no_of_samples
    return S, frequencies

Next we perform the fft and plot the results. The fft returns a complex number for each frequency bin. We can start by plotting the magnitude of these complex numbers:

S_one_chirp, frequencies = fft(s_one_chirp)
r = range(frequencies)

plt.figure(figsize = (15, 5))
plt.plot(r, abs(S_one_chirp)); 
plt.xlabel('range, $R$ [m]')
plt.ylabel('amplitude')
plt.xlim([0, 2500])
plt.title('FFT of the first chirp');

../../../_images/c0a49c4c7bd3dbf5831cf99e1c89fb95d849dd1f9c924baaa4a87fd7b9f380b6.png

It is difficult to see details in the region \(R>\) 500 me plot above. A logarithmic scale will be useful here. In particular, radar data is often plotted on a decibel scale as follows:

def dB(S):
    return 20*np.log10(np.abs(S))

plt.figure(figsize = (15, 5))
plt.plot(r, dB(S_one_chirp))
plt.xlabel('range, $R$ [m]')
plt.ylabel('Amplitude')
plt.xlim([0, 2500]);
plt.title('FFT of one chirp');

../../../_images/33a48945d2267b8c2372b8919ed2678a20f0a2537b07d9abc7e12bce14709e23.png

Now we can see more detail. The power initially increases with range to a peak at around 50 m, then decreases steadily until around 1000 m, after which the power stays approximately uniform with range. A peak in power around 50 m is consistent with our simple estimate based on counting peaks in the chirp above. Based on past experience with similar data, we can intepret all the little spikes in power in the region between the surface (\(R\) = 0) and \(R\) = 1000 m as englacial reflectors. While where \(R>1000\) m, any reflectors are too lower power to be detected above the noise and the power we see in that region is background noise from the environment and possibly the radar system.

Stacking#

To reduce the level of noise we can ‘stack’ the data. This involves averaging returns from multiple chirps so that the random noise tends to cancel out while the signal (reflections from sub-surface reflectors) remains.

Order of operations#

You can either average all the chirps and compute the fft of the result, or you can compute the fft of each chirp and average the resulting power profiles. We will start with the simpler, former option.

First, we average the chirps acorss the chirp_num dimension:

s_stacked = chirp.mean(dim='chirp_num').values.squeeze()

Then we perform the fft and plot the results as before:

S_stacked, frequencies = fft(s_stacked)
r = range(frequencies)

plt.figure(figsize = (15, 5))
plt.plot(r, dB(S_stacked)); 
plt.xlabel('range, $R$ [m]')
plt.ylabel('Amplitude')
plt.xlim([0, 2500])
plt.title('FFT of 100 chirps stacked together');

../../../_images/18edcfc566af01753b3ea38521e0f875631294ab9bfe7406e1ec177689fa03dd.png

For completeness, we also perform the stacking operations in the other order.

We feed the whole 2D array of chirps into our fft function

s_all = chirp.values
S_all, frequencies = fft(s_all)

Then plot the result and verify that the two methods give the same result to within numeical precision:

plt.figure(figsize = (7, 6))
plt.plot(dB(S_all.mean(axis=1)), dB(S_stacked), marker = '.') 
plt.xlabel('stacked power from computing the fft of each chirp, then averaging [dB]')
plt.ylabel('stacked power from averaging the chirps, then computing the fft [dB]')
plt.title('stacked using with different order of operations');

assert np.allclose(S_all.mean(axis=1), S_stacked)

../../../_images/0b72a055cb3688d1d3d28ae9a68ba7da666268414043da760acd1af99bb42670.png

Stacking reveals more details#

Comparing the single-chirp plot to the stacked plot above shows how stacking lowers the level of noise and reveals more detail at depth.

fig = plt.figure(figsize=(15, 10))
gs = fig.add_gridspec(2, 2, height_ratios=[1, 1])

ax1 = fig.add_subplot(gs[0, :])  # Top subplot spanning both columns
ax2 = fig.add_subplot(gs[1, 0])  # Bottom left subplot
ax3 = fig.add_subplot(gs[1, 1])  # Bottom right subplot

for ax in [ax1, ax2, ax3]:
    ax.plot(r, dB(S_one_chirp), label='one chirp'); 
    ax.plot(r, dB(S_stacked), label = '100 chirps stacked together'); 
    ax.set_xlabel('range, $R$ [m]')
    ax.set_ylabel('amplitude')
    ax.legend()

ax1.set_title('FFT of 100 chirps stacked together');
ax2.set_title('zoom in around 500 m');
ax3.set_title('zoom in around 1200 m');

ax1.set_xlim([0, 2500])
ax2.set_xlim([500, 550])
ax3.set_xlim([1200, 1250]);

../../../_images/e5119be5e762fe1efbc6a4e24a1403a244f441d0c9498b290b6f42787971f55c.png

The bottom left plot above shows a zoom in of the unstacked and stacked power profiles around \(R=\) 500m. At this depth the reflectors are much higher power than the noise and there is little difference between the curves. The bottom right plot above is a zoom in around \(R=\) 1200 m. In this region stacking significantly affects the power. This potentially allows reflectors to be detected that would not have been detected in a power profile from a single chirp.

This is consistent with our observation earlier that the chirps agree well at low frequencies but appeared to disagree at higher frequencies; low frequencies correspond to shallow depths (low \(R\)), while higher frequencies correspond to deeper depths (higher \(R\)).

One important reflector that the stacking reveals is the reflection from the bed. It is visible at around \(R=\) 2200m; a local peak in the power is visible in the stacked data, but not in the unstacked data. This reflection is weak compared to reflectors higher up because the radio signal is attenuated during its 4400 m journey through the ice, but incredibly it can be traced between stacked ApRES measurements taken at adjacent waypoints along the TG flowline traverse. See Elizabeth Case’s thesis for more details.

Preprocessing#

So far we have simply directly computed the FFT of the chirps.

Based on software developed by BAS, several preprocessing steps can be applied to ApRES data before the FFT is computed, including applying a blackman filter and padding the data with zeros to increase the number of frequency bins. This helps with abotaininga clreaer power return, revealing additional details in the data, and helps with subsequent processing steps aimed at measuring the displacement of reflectors.

Trim the data#

It is useful to defining two simple rounding functions for making sure the data have an even number of samples.

def rdi(x):
    """round down to the nearest integer and return an integer"""
    return int(np.floor(x))
def rdei(x):
    """round down to the nearest even integer and return an integerr"""
    return int(np.floor(x/2) * 2)

Next extract and trim the data to have an even number of samples

t = chirp.chirp_time.values
N = rdei(len(s_one_chirp))
s_t = s_one_chirp[:N]             # t suffix stands for 'trimmed'
t_t = t[:N]                       # t suffix stands for 'trimmed'

Filtering#

We filter the data using the numpy function np.blackman and simply multipying the data by the output of this function.

s_tw = s_t * np.blackman(N)       # w suffix stands for 'windowed'
plt.figure(figsize = (15, 5))
plt.plot(t_t, s_t, label = 'chirp')
plt.plot(t_t, s_tw, label = 'windowed chirp')
plt.legend()
plt.title('trimmed and windowed chirp');

../../../_images/511187e5394ab20906d1a5c45eaca64ca6f775ee00794baedb813f4b349749af.png

Padding the data#

Padding the data with zeros increases the frequency resolution of the resulting prower profile, simplifying later processing steps.

We define a pad factor, which is the factor by which we will increase the number of samples in the data. I.e. if there are initially 40,000 samples in the chirp (after trimming), and we set pad=2, there will be 80,000 samples in the padded chirp.

pad_factor = 2

Numpy’s pad function requires the number of zeros to add to each side of the array. Considering the total number of additional elements (N*pad_factor - N ) and dividing this by two relates this to the pad factor as (N*pad_factor-N)/2.

s_twp = np.pad(s_tw, pad_width=int((N*pad_factor-N)/2))      # p suffix stands for 'padded'
plt.figure(figsize = (15, 5))
plt.plot(s_twp)
plt.title('trimmed, windowed and padded chirp')
len(s_twp)

../../../_images/0aefaa8f2bc32da2801396cd98bbcc9a586a8f8d03a4dc9343c5fad6cbc2db28.png

Roll#

Another step in the preprocessing is to rearrange the data with a roll operation as follows:

s_twpr = np.roll(s_twp, shift = int(N*pad_factor/2))   # r suffix stands for 'rolled'
plt.figure(figsize = (15, 5))
plt.plot(s_twpr)
plt.title('trimmed, windowed, padded and rolled chirp');

../../../_images/fb689714bf817d671940047be490f3c3cc6c8805f5badc35e80a17e810613986.png

Perform the FFT on the rolled, padded chirp#

S_twpr, frequencies = fft(s_twpr) 
S_twpr = S_twpr * np.sqrt(2*pad_factor) 
r_twpr = range(frequencies)

You can see the effect of padding on the range resolution by comparing the ragne bins obtained with and without padding

print(f"Range resolution without padding; {r[1] - r[0]} m.")
print(f"Range resolution with padding: {r_twpr[1] - r_twpr[0]} m.")

Range resolution without padding; 0.4205686013539993 m.
Range resolution with padding: 0.2102895577845166 m.

Finally, we plot the power profiles from the preprocessed an original chirps.

plt.figure(figsize = (15, 5))
plt.plot(r, dB(S_one_chirp), label = 'FFT of original chirp')
plt.plot(r_twpr, dB(S_twpr), label = 'FFT of trimmed, windowed, padded, rolled chirp')
plt.xlabel('range, $R$ [m]')
plt.ylabel('Amplitude')
plt.xlim([0, 2500])
plt.title('the effect of preprocessing a chirp')
plt.legend();

../../../_images/2522ccd0c9e80239916abdf330299d1734bb8d26b9d2510ad5b32385dc11f745.png

Performing all these steps with xapres#

The xapres package performs all these steps on the chirps as they are loaded and saves the result into another data variable called profile. The package also adds a method called .dB to the xarray to simplify plotting the power profile on a decibel scale.

To plot the the power profiles you can use xarray’s plotting tools as follows:

ds.profile.isel(chirp_num=0).dB().plot(figsize = (15, 5), marker = '.',label = 'from xapres')
plt.plot(r_twpr, dB(S_twpr), marker = '.', label = 'FFT of trimmed, windowed, padded, rolled chirp, processed above')
plt.xlim([0, 2500])
plt.title("a profile loaded from xapres and a profile processed above")
plt.legend()

<matplotlib.legend.Legend at 0x17fe36050>

../../../_images/c0dc880fc268d2724596601e04131e52ceb1c375c3166db3493b77c8e6893a4d.png

The results from tha analysis above and those produced by xapres are very similar. They differ in low-power locations. The reason is explained in this issue in the xapres github repository. An upcoming, small correction to the xapres code will remedy this.

Having the results in xarray makes stacking profiles very simple:

ds.profile.mean(dim='chirp_num').dB().plot(figsize = (15, 5), label = 'processed with xapres and stacked')
plt.xlim([0, 2500])

(0.0, 2500.0)

../../../_images/f73a32431eab53dfdae185157510062b590930a385f12f4297b107bda74d6b4c.png

The pre-processing, in particular the filtering, has improved the visibility of the bed in comparison to the stacked profile we plotted above.

Load part of a much larger dataset#

We can also use xapres to load a much larger ApRES dataset collected in Greenland. This dataset was collected in the ablation zone of the Greenland Ice Sheet by a team led by Meredith Nettles (Lamont-Doherty Earth Observatory, LDEO) and Laura Stevens (University of Oxford), including George Lu (LDEO), Stacy Larochelle (LDEO), Marianne Okal (Earthscope), Kristin Arnold (IRIS Alpine), and Josh Rines (Stanford University). The project was funded by the US National Science Foundation (project number: 2003464).

Three ApRES units were positioned near several supraglacial lakes that periodically drain to the bed of the ice sheet. The units collected a burst every 15 minutes for up to 18 months.

We have already processed this large dataset and saved the result in a google bucket (a cloud storage location that can be quickly accessed from using cloud computing, e.g. AWS, GCS).

We can easily load an example large dataset using xapres as follows:

ds_GR = xa.load.load_zarr()
ds_GR.nbytes/1e9

202.513838596

This data has only bean ‘lazily’ loaded, meaning that the data has not yet been loaded into memory (see here for more details). We will load just a few time slices as a demonstration. To load the full dataset, you could use a cloud-computing service like cryocloud and coarsen the data before plotting.

The line of code below selects the timeslices and attenuator setting pair we want, selects the profile variable, computes the mean along the chirp_num dimension (i.e. stacks the chirps), and then plots the result on a decibel scale.

ds_GR.isel(time=slice(300,2000,150), attenuator_setting_pair = 0).profile.mean(dim = 'chirp_num').dB().plot(y = 'profile_range', figsize = (15, 5), yincrease = False);

../../../_images/048841bcbd22f7cbd7181a19e668481eae62e578ffcdecad87ce6e54c4ffa79e.png

The time slices plotted above were chosen rather arbitrarily; they span about two weeks at the start of July 2022. The bed is visible at around 1100 m depth and between the surface and 500 m englacial layers are visible across the 12 stacked profiles.

Summary#

On this page we:

loaded ApRES data collected in Antarctica and Greenland using a package called xapres,
examined the chirps in one burst of data from Antarctica individually and compared them to each other,
performed an fft on the chirps and plotted the power profile as a function of range,
stacked the data (i.e. computed the mean across all the chirps in a burst) and demonstrated that it doesnt matter if you stack the chirps then perform the fft, or perform the fft on each chirp and stack the results,
preprocessed the data to improve the visibility of the bed and englacial layers in the power profiles,
used the xapres package to process the Antarctic data, and
used the xapres package to load and plot a much larger dataset collected in Greenland.

In later pages we will demonstrate how to copute displacement from ApRES data using the phase of the power profiles.