RegularTimeSeries

class RegularTimeSeries(*, sampling_rate, domain=None, domain_start=0.0, **kwargs)[source]

Bases: ArrayDict

A regular time series is like an irregular time series, except that its samples are evenly spaced at a fixed sampling rate. The regular spacing allows faster indexing, makes it possible to patch the data, and makes Fourier operations meaningful. The first dimension of every attribute must be the time dimension.

Note

If you have a matrix of shape (N, T), where N is the number of channels and T is the number of time points, you should transpose it to (T, N) before passing it to the constructor, since the first dimension should always be time.
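The transposition described in this note can be sketched with NumPy alone. The array name `channels_first` and its sizes are made up for illustration.

```python
import numpy as np

# Hypothetical recording stored channels-first: 128 channels, 1000 time points.
channels_first = np.zeros((128, 1000))

# Transpose so that time becomes the first dimension, as the constructor expects.
time_first = channels_first.T

print(time_first.shape)  # (1000, 128)
```

Note that `.T` returns a view of the original array rather than a copy.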

Parameters:
  • sampling_rate (float) – Sampling rate in Hz.

  • domain (Union[Interval, Literal['auto'], None]) – "auto" or an Interval object that defines the domain over which the timeseries is defined.

  • **kwargs (ndarray) – Arbitrary keyword arguments whose values are multi-dimensional (2d, 3d, …, nd) arrays of shape (T, *), where T is the number of time points.

Example

>>> import numpy as np
>>> from temporaldata import Interval, RegularTimeSeries

>>> lfp = RegularTimeSeries(
...     raw=np.zeros((1000, 128)),
...     sampling_rate=250.,
...     domain=Interval(0., 4.),
... )

>>> lfp.slice(0, 1)
RegularTimeSeries(
  raw=[250, 128]
)

>>> lfp.to_irregular()
IrregularTimeSeries(
  timestamps=[1000],
  raw=[1000, 128]
)
property sampling_rate: float

Returns the sampling rate in Hz.

property domain: Interval

Returns the domain of the time series.

select_by_mask(mask)[source]

Return a new ArrayDict object where all array attributes are indexed using the boolean mask.

Parameters:
  • mask (ndarray) – Boolean array used for masking. The mask must be 1-dimensional and have the same length as the first dimension of the ArrayDict.

  • **kwargs – Private attributes that are not masked must be passed as arguments.

Example

>>> from temporaldata import ArrayDict
>>> import numpy as np

>>> units = ArrayDict(
...     unit_id=np.array(["unit01", "unit02"]),
...     brain_region=np.array(["M1", "M1"]),
...     waveform_mean=np.random.rand(2, 48),
... )

>>> units_subset = units.select_by_mask(np.array([True, False]))
>>> units_subset
ArrayDict(
  unit_id=[1],
  brain_region=[1],
  waveform_mean=[1, 48]
)
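As a rough illustration of what masking along the first dimension means, the sketch below applies a boolean mask to a plain dictionary of NumPy arrays. It is not the library's implementation. The dictionary `arrays` is a hypothetical stand-in for the attributes of an ArrayDict.

```python
import numpy as np

# Hypothetical attribute arrays that share the same first dimension (2 units).
arrays = {
    "unit_id": np.array(["unit01", "unit02"]),
    "waveform_mean": np.zeros((2, 48)),
}
mask = np.array([True, False])

# The mask must be 1-dimensional and match the first dimension of every array.
assert mask.ndim == 1
assert all(len(a) == len(mask) for a in arrays.values())

# Boolean indexing keeps the rows where the mask is True.
# Trailing dimensions are left untouched.
subset = {name: a[mask] for name, a in arrays.items()}

for name, a in subset.items():
    print(name, a.shape)
```

Each array keeps its trailing dimensions, which is why `waveform_mean` goes from shape (2, 48) to (1, 48) in the example above.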
slice(start, end, reset_origin=True, eps=1e-09)[source]

Returns a new RegularTimeSeries object that contains the data between the start (inclusive) and end (exclusive) times, i.e., [start, end).

Parameters:
  • start (float) – Start time.

  • end (float) – End time.

  • reset_origin (bool) – If True, all time attributes will be updated to be relative to the new start time. Defaults to True.

  • eps (float) – A tiny ‘rounding buffer’ to handle floating-point noise when computing indices. If your sampling rate is very high, you may need to increase this (e.g., to 1e-7) to avoid off-by-one errors.

Returns:

A new instance of the same class containing a subset of the data. The new object will have a modified Interval domain reflecting the actual sampled boundaries.

Return type:

RegularTimeSeries
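To make the role of eps concrete, here is a minimal sketch of how a half-open time window could be mapped to sample indices. It is an assumption about the general technique, not the library's actual code. The helper `sample_index_range` is hypothetical, it assumes sample i sits at time domain_start + i / sampling_rate, and it does not clip to the number of available samples.

```python
import math


def sample_index_range(start, end, sampling_rate, domain_start=0.0, eps=1e-9):
    """Return (first, stop) sample indices covering the window [start, end).

    Hypothetical helper: sample i is assumed to sit at
    domain_start + i / sampling_rate.
    """
    # Subtracting eps before rounding up absorbs floating-point noise that
    # would otherwise push an index one sample too far.
    first = math.ceil((start - domain_start) * sampling_rate - eps)
    stop = math.ceil((end - domain_start) * sampling_rate - eps)
    return max(first, 0), max(stop, 0)


first, stop = sample_index_range(0.0, 1.0, sampling_rate=250.0)
print(first, stop, stop - first)  # 0 250 250

# 0.1 + 0.2 is slightly larger than 0.3 in floating point; eps keeps the index at 3.
noisy_first, _ = sample_index_range(0.1 + 0.2, 1.0, sampling_rate=10.0)
print(noisy_first)  # 3
```

Under these assumptions, a one-second window at 250 Hz covers 250 samples, which matches the `raw=[250, 128]` shape in the class example.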

to_irregular()[source]

Converts the time series to an irregular time series.

property timestamps

Returns the timestamps of the time series.
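Because the sampling is regular, the timestamps can be derived from the sampling rate. The sketch below shows the assumed relation with plain NumPy. The variable names are illustrative, and placing the first sample exactly at the start of the domain is an assumption.

```python
import numpy as np

sampling_rate = 250.0  # Hz
domain_start = 0.0     # assumed time of the first sample, in seconds
num_samples = 1000

# Assumed relation: sample i is taken at domain_start + i / sampling_rate.
timestamps = domain_start + np.arange(num_samples) / sampling_rate

print(timestamps.shape)               # (1000,)
print(timestamps[0], timestamps[-1])  # 0.0 3.996
```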

to_hdf5(file)[source]

Saves the data object to an HDF5 file.

Parameters:

file (h5py.File) – HDF5 file.

import h5py
import numpy as np
from temporaldata import Interval, RegularTimeSeries

data = RegularTimeSeries(
    raw=np.zeros((1000, 128)),
    sampling_rate=250.,
    domain=Interval(0., 4.),
)

with h5py.File("data.h5", "w") as f:
    data.to_hdf5(f)
classmethod from_hdf5(file)[source]

Loads the data object from an HDF5 file.

Parameters:

file (h5py.File) – HDF5 file.

Note

This method loads all data into memory. To use lazy loading, call LazyRegularTimeSeries.from_hdf5() instead.

import h5py
from temporaldata import RegularTimeSeries

with h5py.File("data.h5", "r") as f:
    data = RegularTimeSeries.from_hdf5(f)
classmethod from_dataframe(df, unsigned_to_long=True, **kwargs)

Creates an ArrayDict object from a pandas DataFrame.

The columns in the DataFrame are converted to arrays when possible; otherwise they are skipped.

Parameters:
  • df (pandas.DataFrame) – DataFrame.

  • unsigned_to_long (bool, optional) – If True, automatically converts unsigned integers to int64. Defaults to True.

keys()

Returns a list of all array attribute names.

Return type:

List[str]

materialize()

Materializes the data object, i.e., loads into memory all of the data that is still referenced in the HDF5 file.

Return type:

ArrayDict

class LazyRegularTimeSeries(*, sampling_rate, domain=None, domain_start=0.0, **kwargs)[source]

Bases: RegularTimeSeries

Lazy variant of RegularTimeSeries. The data is not loaded until it is accessed. This class is meant to be used when the data is too large to fit in memory, and is intended to be instantiated via LazyRegularTimeSeries.from_hdf5.

Note

To access an attribute without loading it into memory, use self.__dict__[key]. Accessing it with self.key or getattr(self, key) triggers the lazy loading, converts the h5py dataset to a numpy array, and applies any outstanding masks.
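The difference between the two access paths can be illustrated with a toy class. This is a generic sketch of the lazy-attribute pattern, not the library's implementation. `Pending`, `LazyRecord`, and `load_raw` are hypothetical names, and a plain list stands in for the h5py dataset.

```python
class Pending:
    """Hypothetical placeholder for data that has not been loaded yet."""

    def __init__(self, loader):
        self.loader = loader


class LazyRecord:
    """Toy container that resolves Pending values on normal attribute access."""

    def __init__(self, **attributes):
        self.__dict__.update(attributes)

    def __getattribute__(self, name):
        value = object.__getattribute__(self, name)
        if isinstance(value, Pending):
            # First access: run the loader and cache the result in place.
            value = value.loader()
            object.__getattribute__(self, "__dict__")[name] = value
        return value


load_calls = []


def load_raw():
    # Stand-in for reading a dataset from disk.
    load_calls.append("raw")
    return [0.0, 0.0, 0.0, 0.0]


record = LazyRecord(raw=Pending(load_raw))

# Looking the key up in __dict__ skips the attribute hook for "raw",
# so nothing is loaded.
placeholder = record.__dict__["raw"]
print(type(placeholder).__name__, load_calls)  # Pending []

# Normal attribute access triggers the load.
data = record.raw
print(data, load_calls)  # [0.0, 0.0, 0.0, 0.0] ['raw']
```

In this toy version the loaded value is cached, so a second `record.raw` does not call the loader again.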

slice(start, end, reset_origin=True, eps=1e-09)[source]

Returns a new LazyRegularTimeSeries object that contains the data between the start and end times.

Parameters:
  • start (float) – Start time.

  • end (float) – End time.

  • reset_origin (bool) – If True, all time attributes will be updated to be relative to the new start time. Defaults to True.

  • eps (float) – A tiny ‘rounding buffer’ to handle floating-point noise when computing indices. If your sampling rate is very high, you may need to increase this (e.g., to 1e-7) to avoid off-by-one errors.

Returns:

A new instance of the same class containing a subset of the data. The new object will have a modified Interval domain reflecting the actual sampled boundaries.

Return type:

LazyRegularTimeSeries

to_hdf5(file)[source]

Saves the data object to an HDF5 file.

Parameters:

file (h5py.File) – HDF5 file.

import h5py
import numpy as np
from temporaldata import Interval, RegularTimeSeries

data = RegularTimeSeries(
    raw=np.zeros((1000, 128)),
    sampling_rate=250.,
    domain=Interval(0., 4.),
)

with h5py.File("data.h5", "w") as f:
    data.to_hdf5(f)
classmethod from_hdf5(file)[source]

Loads the data object from an HDF5 file.

Parameters:

file (h5py.File) – HDF5 file.

import h5py
from temporaldata import LazyRegularTimeSeries

with h5py.File("data.h5", "r") as f:
    data = LazyRegularTimeSeries.from_hdf5(f)
property domain: Interval

Returns the domain of the time series.

classmethod from_dataframe(df, unsigned_to_long=True, **kwargs)

Creates an ArrayDict object from a pandas DataFrame.

The columns in the DataFrame are converted to arrays when possible; otherwise they are skipped.

Parameters:
  • df (pandas.DataFrame) – DataFrame.

  • unsigned_to_long (bool, optional) – If True, automatically converts unsigned integers to int64. Defaults to True.

keys()

Returns a list of all array attribute names.

Return type:

List[str]

materialize()

Materializes the data object, i.e., loads into memory all of the data that is still referenced in the HDF5 file.

Return type:

ArrayDict

property sampling_rate: float

Returns the sampling rate in Hz.

select_by_mask(mask)

Return a new ArrayDict object where all array attributes are indexed using the boolean mask.

Parameters:
  • mask (ndarray) – Boolean array used for masking. The mask must be 1-dimensional and have the same length as the first dimension of the ArrayDict.

  • **kwargs – Private attributes that are not masked must be passed as arguments.

Example

>>> from temporaldata import ArrayDict
>>> import numpy as np

>>> units = ArrayDict(
...     unit_id=np.array(["unit01", "unit02"]),
...     brain_region=np.array(["M1", "M1"]),
...     waveform_mean=np.random.rand(2, 48),
... )

>>> units_subset = units.select_by_mask(np.array([True, False]))
>>> units_subset
ArrayDict(
  unit_id=[1],
  brain_region=[1],
  waveform_mean=[1, 48]
)
property timestamps

Returns the timestamps of the time series.

to_irregular()

Converts the time series to an irregular time series.