clouddrift.ragged.apply_ragged
- clouddrift.ragged.apply_ragged(func: callable, arrays: list[numpy.ndarray | xarray.core.dataarray.DataArray] | numpy.ndarray | xarray.core.dataarray.DataArray, rowsize: list[int] | numpy.ndarray[int] | xarray.core.dataarray.DataArray, *args: tuple, rows: int | collections.abc.Iterable[int] = None, axis: int = 0, executor: concurrent.futures._base.Executor = <concurrent.futures.thread.ThreadPoolExecutor object>, **kwargs: dict) -> tuple[ndarray] | ndarray
Apply a function to a ragged array.
The function func will be applied to each contiguous row of arrays as indicated by row sizes rowsize. The output of func will be concatenated into a single ragged array.
You can pass arrays as NumPy arrays or xarray DataArrays, but the result will always be a NumPy array. Passing rows as an integer or a sequence of integers makes apply_ragged process and return only those specific rows; otherwise, all rows in the input ragged array are processed. Further, you can use the axis parameter to specify the ragged axis of the input array(s) (default is 0).
By default this function uses concurrent.futures.ThreadPoolExecutor to run func in multiple threads. The number of threads can be controlled by passing the max_workers argument to the executor instance passed to apply_ragged. Alternatively, you can pass a concurrent.futures.ProcessPoolExecutor instance to use processes instead. Passing alternative (third-party library) concurrent executors may work if they follow the same executor interface as that of concurrent.futures; however, this has not been tested yet.
Parameters
- func : callable
  Function to apply to each row of each ragged array in arrays.
- arrays : list[np.ndarray] or np.ndarray or xr.DataArray
  An array or a list of arrays to apply func to.
- rowsize : list[int] or np.ndarray[int] or xr.DataArray[int]
  List of integers specifying the number of data points in each row.
- *args : tuple
  Additional arguments to pass to func.
- rows : int or Iterable[int], optional
  The row(s) of the ragged array to apply func to. If rows is None (default), then func will be applied to all rows.
- axis : int, optional
  The ragged axis of the input arrays. Default is 0.
- executor : concurrent.futures.Executor, optional
  Executor to use for concurrent execution. Default is ThreadPoolExecutor with the default number of max_workers. Another supported option is ProcessPoolExecutor.
- **kwargs : dict
  Additional keyword arguments to pass to func.
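The executor pattern described for the executor parameter can be illustrated with the standard library alone. The following is a minimal sketch, not clouddrift's implementation: it submits a per-row function to a `ThreadPoolExecutor` whose thread count is capped with `max_workers`, then concatenates the per-row results. The function `row_range` and the sample rows are hypothetical and exist only for this illustration.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np

# Three rows of a ragged array, already split for the purpose of this sketch.
rows = [
    np.array([1.0, 2.0]),
    np.array([10.0, 12.0, 14.0]),
    np.array([30.0, 33.0, 36.0, 39.0]),
]


def row_range(row):
    """Hypothetical per-row function: the spread (max - min) of one row."""
    return np.array([row.max() - row.min()])


# max_workers caps the number of threads used to process the rows.
with ThreadPoolExecutor(max_workers=2) as executor:
    futures = [executor.submit(row_range, row) for row in rows]
    # Collect results in submission order so the row order is preserved.
    ranges = np.concatenate([future.result() for future in futures])

print(ranges)
```

With apply_ragged itself, the corresponding step would be to construct the executor with the desired `max_workers` and pass it through the executor argument. When switching to `ProcessPoolExecutor`, keep in mind that the function and its arguments must be picklable, since they are sent to worker processes.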
Returns
- out : tuple[np.ndarray] or np.ndarray
  Output array(s) from func.
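How per-row outputs could become either a single array or a tuple of arrays can be sketched in plain NumPy. This is an illustrative approximation under stated assumptions, not the library's code: `apply_rows_sketch`, `demean`, and `displacement_and_elapsed` are hypothetical names, the sketch runs serially, and it only handles the ragged axis 0.

```python
import numpy as np


def apply_rows_sketch(func, arrays, rowsize, rows=None):
    """Illustrative only: apply func row by row and concatenate the outputs."""
    # Row i spans indices [starts[i], ends[i]) in every input array.
    ends = np.cumsum(rowsize)
    starts = ends - np.asarray(rowsize)

    if rows is None:
        rows = range(len(rowsize))  # process every row
    elif isinstance(rows, int):
        rows = [rows]  # a single row index

    results = [func(*(a[starts[i]:ends[i]] for a in arrays)) for i in rows]

    if isinstance(results[0], tuple):
        # func returned several outputs: concatenate each output separately.
        return tuple(np.concatenate(parts) for parts in zip(*results))
    return np.concatenate(results)


rowsize = [2, 3, 4]
x = np.array([1, 2, 10, 12, 14, 30, 33, 36, 39])
t = np.array([1, 2, 1, 2, 3, 1, 2, 3, 4])


def demean(x_row):
    """One output per row: the row with its mean removed."""
    return x_row - x_row.mean()


def displacement_and_elapsed(x_row, t_row):
    """Two outputs per row: offsets from the first position and first time."""
    return x_row - x_row[0], t_row - t_row[0]


single = apply_rows_sketch(demean, [x], rowsize)
dx, dt = apply_rows_sketch(displacement_and_elapsed, [x, t], rowsize)
first_two = apply_rows_sketch(demean, [x], rowsize, rows=[0, 1])
```

In this sketch, a function with one output yields one concatenated array with the same ragged layout as the input, while a function returning a tuple yields a tuple of concatenated arrays, one per output.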
Examples
Using velocity_from_position with apply_ragged, calculate the velocities of multiple particles, the coordinates of which are found in the ragged arrays x, y, and t that share row sizes 2, 3, and 4:
>>> import numpy as np
>>> from clouddrift.kinematics import velocity_from_position
>>> from clouddrift.ragged import apply_ragged
>>> rowsize = [2, 3, 4]
>>> x = np.array([1, 2, 10, 12, 14, 30, 33, 36, 39])
>>> y = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8])
>>> t = np.array([1, 2, 1, 2, 3, 1, 2, 3, 4])
>>> u1, v1 = apply_ragged(velocity_from_position, [x, y, t], rowsize, coord_system="cartesian")
>>> u1
array([1., 1., 2., 2., 2., 3., 3., 3., 3.])
>>> v1
array([1., 1., 1., 1., 1., 1., 1., 1., 1.])
To apply func to only a subset of rows, use the rows argument:
>>> u1, v1 = apply_ragged(velocity_from_position, [x, y, t], rowsize, rows=0, coord_system="cartesian")
>>> u1
array([1., 1.])
>>> v1
array([1., 1.])
>>> u1, v1 = apply_ragged(velocity_from_position, [x, y, t], rowsize, rows=[0, 1], coord_system="cartesian")
>>> u1
array([1., 1., 2., 2., 2.])
>>> v1
array([1., 1., 1., 1., 1.])
Raises
- ValueError
  If the sum of rowsize does not equal the length of arrays.
- IndexError
  If arrays is empty.
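The two error conditions listed above can be made concrete with a small stand-alone check. This is a sketch under assumptions: `check_ragged_inputs` and `error_name` are hypothetical helpers written for this illustration, and the source does not state how or where the library performs these checks.

```python
import numpy as np


def check_ragged_inputs(arrays, rowsize, axis=0):
    """Hypothetical helper mirroring the documented error conditions."""
    if len(arrays) == 0:
        raise IndexError("arrays is empty")

    expected = int(np.sum(rowsize))
    for arr in arrays:
        # Every array must hold exactly sum(rowsize) points along the ragged axis.
        if arr.shape[axis] != expected:
            raise ValueError(
                f"sum of rowsize ({expected}) does not match "
                f"array length ({arr.shape[axis]}) along axis {axis}"
            )


def error_name(arrays, rowsize):
    """Return the name of the exception raised by the check, or None."""
    try:
        check_ragged_inputs(arrays, rowsize)
    except (ValueError, IndexError) as err:
        return type(err).__name__
    return None


x = np.arange(9)
ok = error_name([x], [2, 3, 4])        # 2 + 3 + 4 == 9: consistent
mismatch = error_name([x], [2, 3, 3])  # 2 + 3 + 3 == 8: inconsistent
empty = error_name([], [2, 3, 4])      # no arrays at all
```

Running a check like this before calling apply_ragged can surface a mismatched rowsize early, for example after subsetting the data without updating the row sizes.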