ml4gw.transforms.whitening

Classes

FixedWhiten(num_channels, kernel_length, ...)

Transform that whitens timeseries by a fixed power spectral density that's determined by calling the .fit method.

Whiten(fduration, sample_rate[, highpass, ...])

Normalize the frequency content of timeseries data by a provided power spectral density, such that if the timeseries are sampled from the same distribution as the PSD the normalized power will be approximately unity across all frequency bins.

class ml4gw.transforms.whitening.FixedWhiten(num_channels, kernel_length, sample_rate, dtype=torch.float64)

Bases: FittableSpectralTransform

Transform that whitens timeseries by a fixed power spectral density that's determined by calling the .fit method.

Parameters:
  • num_channels (float) -- Number of channels to whiten

  • kernel_length (float) -- Expected length of tensors to whiten in seconds. Determines the number of frequency bins in the fit PSD.

  • sample_rate (float) -- Rate at which timeseries will be sampled, in Hz

  • dtype (dtype) -- Datatype with which background PSD will be stored

fit(fduration, *background, fftlength=None, highpass=None, lowpass=None, overlap=None)

Compute the PSD of channel-wise background to use to whiten timeseries at call time. PSDs will be resampled to have self.kernel_length * self.sample_rate // 2 + 1 frequency bins.

Parameters:
  • fduration (float) -- Desired length of the impulse response of the whitening filter, in seconds. Fit PSDs will have their spectrum truncated to approximate this response time. A longer fduration can resolve narrower spikes in frequency, but at the expense of a longer filter settle-in time. As such, fduration / 2 seconds of data will be removed from each edge of whitened timeseries.

  • *background (Union[Float[Tensor, 'time'], Float[Tensor, 'frequency']]) -- 1D arrays capturing the signal to be used to whiten each channel at call time. If fftlength is left as None, it will be assumed that these already represent frequency-domain data that will be possibly resampled and truncated to whiten timeseries at call time. Otherwise, it will be assumed that these represent time-domain data that will be converted to the frequency domain via Welch's method using the specified fftlength and overlap, with a Hann window used to window the FFT frames by default. Should have the same number of args as self.num_channels.

  • fftlength (Optional[float]) -- Length, in seconds, of frames used to convert time-domain data to the frequency domain via Welch's method. If left as None, it will be assumed that the background arrays passed already represent frequency-domain data and don't require any conversion.

  • highpass (Optional[float]) -- Cutoff frequency, in Hz, used for highpass filtering with the fit whitening filter. This is achieved by setting the frequency response of the fit PSDs in the frequency bins below this value to 0. If left as None, the fit filter won't have any highpass filtering properties.

  • lowpass (Optional[float]) -- Cutoff frequency, in Hz, used for lowpass filtering with the fit whitening filter. This is achieved by setting the frequency response of the fit PSDs in the frequency bins above this value to 0. If left as None, the fit filter won't have any lowpass filtering properties.

  • overlap (Optional[float]) -- Overlap between FFT frames used to convert time-domain data to the frequency domain via Welch's method. If fftlength is None, this is ignored. Otherwise, if left as None, it will be set to half of fftlength by default.

Return type:

None
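The bin count quoted for fit can be checked by hand. The following is a minimal sketch in plain Python rather than ml4gw code: the helper name fit_psd_layout is hypothetical, and it assumes that "bins below this value" means bins whose center frequency is strictly less than highpass.

```python
def fit_psd_layout(kernel_length, sample_rate, highpass=None):
    """Return (num_samples, num_freq_bins, df, num_zeroed_bins).

    Hypothetical helper for illustration only; it mirrors the arithmetic
    described for FixedWhiten.fit but is not part of ml4gw.
    """
    # Number of time-domain samples in one kernel
    num_samples = int(kernel_length * sample_rate)
    # One-sided spectrum length: kernel_length * sample_rate // 2 + 1
    num_freq_bins = num_samples // 2 + 1
    # Spacing between frequency bins, in Hz (equal to 1 / kernel_length)
    df = sample_rate / num_samples
    if highpass is None:
        num_zeroed = 0
    else:
        # Assumption: a bin is zeroed when its frequency is strictly below highpass
        num_zeroed = sum(1 for k in range(num_freq_bins) if k * df < highpass)
    return num_samples, num_freq_bins, df, num_zeroed


# A 2 s kernel sampled at 2048 Hz with a 20 Hz highpass
print(fit_psd_layout(kernel_length=2, sample_rate=2048, highpass=20))
# -> (4096, 2049, 0.5, 40)
```

A longer kernel_length gives finer frequency resolution (smaller df), so the same highpass cutoff covers more bins.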

forward(X)

Whiten the input timeseries tensor using the PSD fit by the .fit method, which must be called before the first call to .forward.

Return type:

Float[Tensor, 'batch channel time']

Parameters:

X (Float[Tensor, 'batch channel time'])

class ml4gw.transforms.whitening.Whiten(fduration, sample_rate, highpass=None, lowpass=None)

Bases: Module

Normalize the frequency content of timeseries data by a provided power spectral density, such that if the timeseries are sampled from the same distribution as the PSD the normalized power will be approximately unity across all frequency bins. The whitened timeseries will then also have 0 mean and unit variance.

In order to avoid edge effects due to filter settle-in, the provided PSDs will have their spectrum truncated such that their impulse response time in the time domain is fduration seconds, and fduration / 2 seconds worth of data will be removed from each edge of the whitened timeseries.

For more information, see the documentation for whiten().

Parameters:
  • fduration (float) -- The length of the whitening filter's impulse response, in seconds. fduration / 2 seconds worth of data will be cropped from the edges of the whitened timeseries.

  • sample_rate (float) -- Rate at which timeseries data passed at call time is expected to be sampled, in Hz

  • highpass (Optional[float]) -- Cutoff frequency, in Hz, for highpass filtering during whitening. If left as None, no highpass filtering will be performed.

  • lowpass (Optional[float]) -- Cutoff frequency, in Hz, for lowpass filtering during whitening. If left as None, no lowpass filtering will be performed.

forward(X, psd)

Whiten a batch of multichannel timeseries by a background power spectral density.

Parameters:
  • X (Float[Tensor, 'batch channel time']) -- Batch of multichannel timeseries to whiten. Should have the shape (B, C, N), where B is the batch size, C is the number of channels, and N is the number of samples, i.e. the duration of the timeseries in seconds times self.sample_rate.

  • psd (Union[Float[Tensor, 'frequency'], Float[Tensor, 'channel frequency'], Float[Tensor, 'batch channel frequency']]) -- Power spectral density used to whiten the provided timeseries. Can be 1D, 2D, or 3D, with the last dimension representing power at each frequency value. All other dimensions must match their corresponding value in X, starting from the right. (e.g. if psd.ndim == 2, psd.size(0) should be equal to X.size(1). If psd.ndim == 3, psd.size(1) and psd.size(0) should be equal to X.size(1) and X.size(0), respectively.) For more information about what these different shapes for psd represent, consult the documentation for whiten().

Return type:

Float[Tensor, 'batch channel time']

Returns:

Whitened timeseries, with fduration * sample_rate / 2 samples cropped from each edge. Output shape will then be (B, C, N - fduration * sample_rate).
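The shape rules for forward can be sketched in plain Python without torch. The helper name whitened_shape below is hypothetical and is not part of ml4gw. It checks only the leading dimensions of psd against X, aligned from the right, and then applies the edge cropping. It does not validate the number of frequency bins.

```python
def whitened_shape(x_shape, psd_shape, fduration, sample_rate):
    """Return the expected output shape of Whiten.forward.

    Hypothetical helper for illustration only; it works on shape tuples
    and does not call into ml4gw or torch.
    """
    batch, channels, num_samples = x_shape
    if not 1 <= len(psd_shape) <= 3:
        raise ValueError("psd must be 1D, 2D, or 3D")
    # Drop the trailing axis of each shape (frequency for psd, time for X),
    # then compare what remains from right to left.
    pairs = zip(reversed(psd_shape[:-1]), reversed(x_shape[:-1]))
    for psd_dim, x_dim in pairs:
        if psd_dim != x_dim:
            raise ValueError(
                f"psd dimension {psd_dim} does not match X dimension {x_dim}"
            )
    # fduration * sample_rate / 2 samples are cropped from each edge
    crop = int(fduration * sample_rate / 2)
    return (batch, channels, num_samples - 2 * crop)


# Batch of 8 two-channel, 2 s timeseries at 2048 Hz with a per-channel PSD
print(whitened_shape((8, 2, 4096), (2, 2049), fduration=1, sample_rate=2048))
# -> (8, 2, 2048)
```

A 1D psd shape such as (2049,) passes the check for any batch and channel count, since there are no leading dimensions to compare.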