ml4gw.dataloading.chunked_dataset
Classes
ChunkedTimeSeriesDataset -- Wrapper dataset that will loop through chunks of timeseries data produced by another iterable and sample windows from these chunks.
- class ml4gw.dataloading.chunked_dataset.ChunkedTimeSeriesDataset(chunk_it, kernel_size, batch_size, batches_per_chunk, coincident=True, device='cpu')
Bases: IterableDataset
Wrapper dataset that will loop through chunks of timeseries data produced by another iterable and sample windows from these chunks.
- Parameters:
chunk_it (Iterable) -- Iterator which will produce chunks of timeseries data to sample windows from. Each item should have shape (N, C, T), where N is the number of chunks to sample from, C is the number of channels, and T is the number of samples along the time dimension for each chunk.
kernel_size (float) -- Size of windows to be sampled from each chunk. Should be less than the size of each chunk along the time dimension.
batch_size (int) -- Number of windows to sample at each iteration.
batches_per_chunk (int) -- Number of batches of windows to sample from each chunk before moving on to the next one. Sampling fewer batches from each chunk means a lower likelihood of sampling duplicate windows, but an increase in chunk-loading overhead.
coincident (bool) -- Whether the windows sampled from individual channels in each batch element should be sampled coincidentally, i.e. consisting of the same timesteps, or whether each window should be sampled independently from the others.
device (str) -- Which device chunks should be moved to upon loading.
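The sampling behaviour described by these parameters can be illustrated with a small standard-library sketch. It is not ml4gw's implementation: the helper names `sample_windows` and `iterate_batches` are hypothetical, each item from `chunk_it` is simplified to a single chunk of shape (C, T) stored as nested lists, and `kernel_size` is assumed to be an integer number of samples.

```python
import random


def sample_windows(chunk, kernel_size, batch_size, coincident=True, rng=random):
    """Sample `batch_size` windows from one chunk (hypothetical helper).

    `chunk` is a list of C channels, each a list of T samples.
    Returns a list of windows, each a list of C slices of length `kernel_size`.
    """
    num_samples = len(chunk[0])
    if kernel_size >= num_samples:
        raise ValueError("kernel_size must be smaller than the chunk length")
    max_start = num_samples - kernel_size  # last valid start index

    batch = []
    for _ in range(batch_size):
        if coincident:
            # One start index shared by every channel: same timesteps.
            start = rng.randint(0, max_start)
            starts = [start] * len(chunk)
        else:
            # A separate start index for each channel.
            starts = [rng.randint(0, max_start) for _ in chunk]
        batch.append([ch[s:s + kernel_size] for ch, s in zip(chunk, starts)])
    return batch


def iterate_batches(chunk_it, kernel_size, batch_size, batches_per_chunk,
                    coincident=True, rng=random):
    """Yield `batches_per_chunk` batches per chunk before loading the next."""
    for chunk in chunk_it:
        for _ in range(batches_per_chunk):
            yield sample_windows(chunk, kernel_size, batch_size, coincident, rng)
```

In this sketch, a larger `batches_per_chunk` draws more windows from the same data before `chunk_it` is advanced, which is the trade-off noted above: fewer chunk loads, but a higher chance of overlapping or duplicate windows.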