clouddrift.binning
Module for binning Lagrangian data.
Functions
- binned_statistics: Perform N-dimensional binning and compute statistics of values in each bin.
- handle_datetime_conversion: A decorator to handle datetime64/timedelta64 conversion for statistics functions.
- clouddrift.binning.binned_statistics(coords: ndarray | list[ndarray], data: ndarray | list[ndarray] | None = None, bins: int | list = 10, bins_range: list | None = None, dim_names: list[str] | None = None, output_names: list[str] | None = None, statistics: str | list | Callable[[ndarray], float] = 'count') -> Dataset
Perform N-dimensional binning and compute statistics of values in each bin. The result is returned as an Xarray Dataset.
Parameters
- coords : array-like or list of array-like
Array(s) of Lagrangian data coordinates to be binned. For 1D, provide a single array. For N-dimensions, provide a list of N arrays, each giving coordinates along one dimension.
- data : array-like or list of array-like, optional
Data values associated with the Lagrangian coordinates in coords. Can be a single array or a list of arrays for multiple variables. Complex values are supported by all built-in statistics except ‘min’, ‘max’, and ‘median’.
- bins : int or list, optional
Number of bins or bin edges per dimension. It can be:
  - an int: the same number of bins for all dimensions,
  - a list of ints or arrays: one entry per dimension, specifying either a bin count or the bin edges,
  - None: defaults to 10 bins per dimension.
- bins_range : list of tuples, optional
Outer bin limits for each dimension.
- statistics : str or list of str, Callable[[np.ndarray], float] or list[Callable[[np.ndarray], float]]
Statistics to compute for each bin (default: ‘count’). It can be:
  - a string; supported values are ‘count’, ‘sum’, ‘mean’, ‘median’, ‘std’, ‘min’, and ‘max’,
  - a custom function, passed as a callable, for univariate statistics. It takes a 1D array of values and returns a single value. The callable is applied to each variable of data,
  - a tuple of (output_name, callable) for multivariate statistics. output_name is used to identify the resulting variable. In this case, the callable receives the list of arrays provided in data. For example, to calculate kinetic energy from data with velocity components u and v, you can pass data=[u, v] and statistics=("ke", lambda data: np.sqrt(np.mean(data[0] ** 2 + data[1] ** 2))),
  - a list containing any combination of the above, e.g., ['mean', np.nanmax, ('ke', lambda data: np.sqrt(np.mean(data[0] ** 2 + data[1] ** 2)))].
- dim_names : list of str, optional
Names for the dimensions of the output xr.Dataset. If None, default names are “coord_0”, “coord_1”, etc.
- output_names : list of str, optional
Names for output variables in the xr.Dataset. If None, default names are “data_0_{statistic}”, “data_1_{statistic}”, etc.
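The difference between the univariate and multivariate forms of statistics can be illustrated on the values that fall into a single bin. The following is a minimal sketch, not clouddrift's implementation: the helper apply_statistics_to_bin is hypothetical, and the output key pattern for callables (built from the function's __name__) is an assumption that only mirrors the documented “data_0_{statistic}” default.

```python
import numpy as np


def apply_statistics_to_bin(data, statistics):
    """Hypothetical helper: evaluate statistic specs on the values of ONE bin.

    data: list of 1D arrays, one per variable, already restricted to one bin.
    statistics: list of callables or (output_name, callable) tuples.
    """
    results = {}
    for stat in statistics:
        if isinstance(stat, tuple):
            # Multivariate form: the callable receives the whole list of arrays.
            name, func = stat
            results[name] = func(data)
        else:
            # Univariate form: the callable is applied to each variable in turn.
            # The key pattern below is an assumption made for this sketch.
            for i, values in enumerate(data):
                results[f"data_{i}_{stat.__name__}"] = stat(values)
    return results


# Velocity components of the samples that landed in one bin.
u = np.array([3.0, 0.0])
v = np.array([4.0, 0.0])

out = apply_statistics_to_bin(
    [u, v],
    [np.mean, ("ke", lambda d: np.sqrt(np.mean(d[0] ** 2 + d[1] ** 2)))],
)
```

Here np.mean produces one result per variable, while the ("ke", ...) tuple produces a single result that combines both variables.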
Returns
- xr.Dataset
Xarray Dataset containing the requested statistics for each variable in each bin.
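To make the idea of per-bin statistics concrete, the sketch below computes a count and a mean on a 2D grid using plain NumPy (np.histogramdd). It only illustrates the concept and is not how binned_statistics is implemented. The sample arrays, the bin edges, and the choice to mark empty bins with NaN are all assumptions of this example.

```python
import numpy as np

# Toy Lagrangian samples: two coordinates and one data variable.
lon = np.array([0.5, 0.6, 1.5, 1.7])
lat = np.array([0.5, 0.4, 0.5, 1.5])
temp = np.array([10.0, 12.0, 20.0, 30.0])

# Explicit bin edges, one array per dimension (two bins along each axis).
edges = [np.array([0.0, 1.0, 2.0]), np.array([0.0, 1.0, 2.0])]
sample = np.column_stack([lon, lat])

# Number of samples per bin, and the sum of the data values per bin.
count, _ = np.histogramdd(sample, bins=edges)
total, _ = np.histogramdd(sample, bins=edges, weights=temp)

# Mean per bin; empty bins are set to NaN (a choice made for this sketch).
with np.errstate(invalid="ignore", divide="ignore"):
    mean = np.where(count > 0, total / count, np.nan)
```

The first axis of count and mean corresponds to lon and the second to lat, in the same order as the columns of sample.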
- clouddrift.binning.handle_datetime_conversion(func: Callable) -> Callable
A decorator to handle datetime64/timedelta64 conversion for statistics functions. For datetime values, it converts the time to float seconds since epoch before calling the function, and converts the result back to datetime64 after the function call.
Assumes that the function accepts values as keyword arguments.
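The conversion pattern described above can be sketched as follows. This is an illustrative decorator written for this page, not the clouddrift source: the names datetime_aware and mean_stat are hypothetical, the keyword name values is assumed, and only datetime64 input is handled (timedelta64 is left out for brevity).

```python
import functools

import numpy as np


def datetime_aware(func):
    """Hypothetical decorator: run func on float seconds, return datetime64."""

    @functools.wraps(func)
    def wrapper(*, values, **kwargs):
        values = np.asarray(values)
        if np.issubdtype(values.dtype, np.datetime64):
            # Convert to float seconds since the Unix epoch.
            seconds = values.astype("datetime64[ns]").astype("int64") / 1e9
            result = func(values=seconds, **kwargs)
            # Convert the float result back to datetime64.
            return np.datetime64(int(round(result * 1e9)), "ns")
        # Non-datetime input is passed through unchanged.
        return func(values=values, **kwargs)

    return wrapper


@datetime_aware
def mean_stat(*, values):
    return np.mean(values)


times = np.array(
    ["2020-01-01T00:00:00", "2020-01-01T00:00:10"], dtype="datetime64[s]"
)
mid = mean_stat(values=times)
```

Because the intermediate values are floats, very fine time resolutions can lose precision in a scheme like this.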