clouddrift.pairs
Functions to analyze pairs of contiguous data segments.
Functions
| chance_pair | Given two sets of longitude, latitude, and time arrays, return the paired indices of collocated data points that are within prescribed distances in space and time. |
| chance_pairs_from_ragged | Return all chance pairs of contiguous trajectories in a ragged array, and their collocated points in space and (optionally) time, given input ragged arrays of longitude, latitude, and (optionally) time, and chance pair criteria as maximum allowable distances in space and time. |
| pair_bounding_box_overlap | Given two arrays of longitudes and latitudes, return boolean masks for their overlapping bounding boxes. |
| pair_space_distance | Given two arrays of longitudes and latitudes, return the distance on a sphere between all pairs of points. |
| pair_time_distance | Given two arrays of times (or any other monotonically increasing quantity), return the temporal distance between all pairs of times. |
| pair_time_overlap | Given two arrays of times (or any other monotonically increasing quantity), return indices where the times are within a prescribed distance. |
- clouddrift.pairs.chance_pair(lon1: list[float] | ndarray[float] | Series | DataArray, lat1: list[float] | ndarray[float] | Series | DataArray, lon2: list[float] | ndarray[float] | Series | DataArray, lat2: list[float] | ndarray[float] | Series | DataArray, time1: list[float] | ndarray[float] | Series | DataArray | None = None, time2: list[float] | ndarray[float] | Series | DataArray | None = None, space_distance: float = 0, time_distance: float = 0)
Given two sets of longitude, latitude, and time arrays, return the paired indices of collocated data points that are within prescribed distances in space and time. Such collocated points are also known as chance pairs.
Parameters
- lon1 : array_like
First array of longitudes in degrees.
- lat1 : array_like
First array of latitudes in degrees.
- lon2 : array_like
Second array of longitudes in degrees.
- lat2 : array_like
Second array of latitudes in degrees.
- time1 : array_like, optional
First array of times.
- time2 : array_like, optional
Second array of times.
- space_distance : float, optional
Maximum allowable space distance in meters for a pair to qualify as a chance pair. If the separation is within this distance, the pair is considered to be a chance pair. Default is 0, i.e. the positions must be exactly the same.
- time_distance : float, optional
Maximum allowable time distance for a pair to qualify as a chance pair. If the separation in time is within this distance and the space distance condition is satisfied, the pair is considered a chance pair. Its type should be compatible with differences of the time arrays, e.g. a np.timedelta64 when the times are np.datetime64, as in the example. Default is 0, i.e. the times must be exactly the same.
Returns
- indices1 : np.ndarray[int]
Indices within the first set of arrays that form chance pairs.
- indices2 : np.ndarray[int]
Indices within the second set of arrays that form chance pairs.
Examples
In the following example, we load the GLAD dataset, extract the first two trajectories, and find the array indices that satisfy the chance pair criteria of 6 km separation distance and no time separation:
>>> import numpy as np
>>> from clouddrift.datasets import glad
>>> from clouddrift.pairs import chance_pair
>>> from clouddrift.ragged import unpack
>>> ds = glad()
>>> lon1 = unpack(ds["longitude"], ds["rowsize"], rows=0).pop()
>>> lat1 = unpack(ds["latitude"], ds["rowsize"], rows=0).pop()
>>> time1 = unpack(ds["time"], ds["rowsize"], rows=0).pop()
>>> lon2 = unpack(ds["longitude"], ds["rowsize"], rows=1).pop()
>>> lat2 = unpack(ds["latitude"], ds["rowsize"], rows=1).pop()
>>> time2 = unpack(ds["time"], ds["rowsize"], rows=1).pop()
>>> i1, i2 = chance_pair(lon1, lat1, lon2, lat2, time1, time2, 6000, np.timedelta64(0))
>>> i1, i2
(array([177, 180, 183, 186, 189, 192]), array([166, 169, 172, 175, 178, 181]))
Check that the collocation in space worked by calculating the distance between the identified pairs:
>>> from clouddrift import sphere
>>> sphere.distance(lon1[i1], lat1[i1], lon2[i2], lat2[i2])
array([5967.4844, 5403.253 , 5116.9136, 5185.715 , 5467.8555, 5958.4917], dtype=float32)
Check the collocation in time:
>>> time1[i1] - time2[i2]
<xarray.DataArray 'time' (obs: 6)>
array([0, 0, 0, 0, 0, 0], dtype='timedelta64[ns]')
Coordinates:
    time     (obs) datetime64[ns] 2012-07-21T21:30:00.524160 ... 2012-07-22T0...
Dimensions without coordinates: obs
Raises
- ValueError
If only one of time1 and time2 is provided; they must be either both provided or both omitted.
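The criteria described above can be illustrated with a brute-force sketch in plain NumPy. This is not clouddrift's implementation: the helper name brute_force_chance_pair, the haversine formula, and the Earth radius of 6.3781e6 m are assumptions made for illustration, and the order of the returned pairs may differ from that of chance_pair.

```python
import numpy as np

EARTH_RADIUS_M = 6.3781e6  # assumed spherical Earth radius in meters


def brute_force_chance_pair(lon1, lat1, lon2, lat2, time1, time2,
                            space_distance=0.0, time_distance=0.0):
    """Hypothetical helper: test every point of set 1 against every point of set 2."""
    lon1, lat1, lon2, lat2 = (
        np.radians(np.asarray(a, dtype=float)) for a in (lon1, lat1, lon2, lat2)
    )
    time1, time2 = np.asarray(time1), np.asarray(time2)

    # Haversine distance in meters; rows index set 1, columns index set 2.
    dlon = lon1[:, None] - lon2[None, :]
    dlat = lat1[:, None] - lat2[None, :]
    a = (np.sin(dlat / 2) ** 2
         + np.cos(lat1[:, None]) * np.cos(lat2[None, :]) * np.sin(dlon / 2) ** 2)
    separation = 2 * EARTH_RADIUS_M * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))

    # Absolute time lag between every pair of observations.
    lag = np.abs(time1[:, None] - time2[None, :])

    # A chance pair must satisfy the space and time criteria simultaneously.
    indices1, indices2 = np.nonzero(
        (separation <= space_distance) & (lag <= time_distance)
    )
    return indices1, indices2


# Two short synthetic tracks along the equator (made-up values).
lon1, lat1, time1 = [0.0, 0.10, 0.20], [0.0, 0.0, 0.0], [0, 1, 2]
lon2, lat2, time2 = [0.5, 0.12, 0.20], [0.0, 0.0, 0.0], [0, 1, 3]

strict = brute_force_chance_pair(lon1, lat1, lon2, lat2, time1, time2, 5000, 0)
relaxed = brute_force_chance_pair(lon1, lat1, lon2, lat2, time1, time2, 5000, 1)
```

With a time distance of 0, only the points that are close in space and simultaneous qualify. Relaxing the time distance to 1 also admits the pair that coincides in space but is offset by one time unit.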
- clouddrift.pairs.chance_pairs_from_ragged(lon: list[float] | ndarray[float] | Series | DataArray, lat: list[float] | ndarray[float] | Series | DataArray, rowsize: list[float] | ndarray[float] | Series | DataArray, space_distance: float = 0, time: list[float] | ndarray[float] | Series | DataArray | None = None, time_distance: float = 0) -> list[tuple[tuple[int, int], tuple[ndarray, ndarray]]]
Return all chance pairs of contiguous trajectories in a ragged array, and their collocated points in space and (optionally) time, given input ragged arrays of longitude, latitude, and (optionally) time, and chance pair criteria as maximum allowable distances in space and time.
If time and time_distance are omitted, the search is done on the spatial criterion only, and the result does not include the time arrays.
If time and time_distance are provided, the search is done on both the spatial and temporal criteria, and the result includes the time arrays.
Parameters
- lon : array_like
Array of longitudes in degrees.
- lat : array_like
Array of latitudes in degrees.
- rowsize : array_like
Array of row sizes, i.e. the number of observations in each trajectory of the ragged array.
- space_distance : float, optional
Maximum space distance in meters for a pair to qualify as a chance pair. If the separation is within this distance, the pair is considered to be a chance pair. Default is 0, i.e. the positions must be exactly the same.
- time : array_like, optional
Array of times.
- time_distance : float, optional
Maximum time distance allowed for a pair to qualify as a chance pair. If the separation in time is within this distance and the space distance condition is satisfied, the pair is considered a chance pair. Default is 0, i.e. the times must be exactly the same.
Returns
- pairs : List[Tuple[Tuple[int, int], Tuple[np.ndarray, np.ndarray]]]
List of tuples. Each tuple contains (1) a tuple of two integer indices identifying the trajectory rows in the ragged array that satisfy the chance pair criteria, and (2) a tuple of two arrays containing the indices of the collocated points for each trajectory in the chance pair.
Examples
In the following example, we load the GLAD dataset as a ragged array dataset, subset the result to retain the first five trajectories, and find all trajectory pairs that satisfy the chance pair criteria of 12 km separation distance and no time separation, as well as the indices of the collocated points for each pair.
>>> import numpy as np
>>> from clouddrift.datasets import glad
>>> from clouddrift.pairs import chance_pairs_from_ragged
>>> from clouddrift.ragged import subset
>>> ds = subset(glad(), {"id": ["CARTHE_001", "CARTHE_002", "CARTHE_003", "CARTHE_004", "CARTHE_005"]}, id_var_name="id")
>>> pairs = chance_pairs_from_ragged(
...     ds["longitude"].values,
...     ds["latitude"].values,
...     ds["rowsize"].values,
...     space_distance=12000,
...     time=ds["time"].values,
...     time_distance=np.timedelta64(0)
... )
>>> pairs
[((0, 1),
  (array([153, 156, 159, 162, 165, 168, 171, 174, 177, 180, 183, 186, 189, 192, 195, 198, 201, 204, 207, 210, 213, 216]),
   array([142, 145, 148, 151, 154, 157, 160, 163, 166, 169, 172, 175, 178, 181, 184, 187, 190, 193, 196, 199, 202, 205]))),
 ((3, 4),
  (array([141, 144, 147, 150, 153, 156, 159, 162, 165, 168, 171, 174, 177, 180, 183]),
   array([136, 139, 142, 145, 148, 151, 154, 157, 160, 163, 166, 169, 172, 175, 178])))]
The result above shows that two chance pairs were found: one between trajectory rows 0 and 1, and one between trajectory rows 3 and 4.
Raises
- ValueError
If rowsize has fewer than two elements.
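To make the role of rowsize concrete, the sketch below shows one way to turn row sizes into per-trajectory slices and to enumerate the candidate trajectory pairs. The helper names trajectory_slices and candidate_row_pairs are hypothetical and this is not the library's code. The sketch also assumes, as the example output above suggests, that the returned point indices count from the start of each trajectory.

```python
from itertools import combinations

import numpy as np


def trajectory_slices(rowsize):
    """Hypothetical helper: slice of the flat ragged array for each trajectory."""
    sizes = np.asarray(rowsize, dtype=int)
    ends = np.cumsum(sizes)   # one past the last observation of each row
    starts = ends - sizes     # first observation of each row
    return [slice(int(s), int(e)) for s, e in zip(starts, ends)]


def candidate_row_pairs(rowsize):
    """Hypothetical helper: every unordered pair of trajectory rows."""
    return list(combinations(range(len(rowsize)), 2))


rowsize = [3, 2, 4]                   # three trajectories, nine observations
slices = trajectory_slices(rowsize)   # [0:3], [3:5], [5:9]
pairs = candidate_row_pairs(rowsize)  # (0, 1), (0, 2), (1, 2)

# Converting an index local to trajectory 2 into an index of the flat array
# (assumes local indices start at 0 for each trajectory).
local_index = 1
flat_index = slices[2].start + local_index
```

Each candidate pair would then be screened with the space and time criteria. At least two rows are needed for any pair to exist, which is consistent with the ValueError described above.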
- clouddrift.pairs.pair_bounding_box_overlap(lon1: list[float] | ndarray[float] | Series | DataArray, lat1: list[float] | ndarray[float] | Series | DataArray, lon2: list[float] | ndarray[float] | Series | DataArray, lat2: list[float] | ndarray[float] | Series | DataArray, distance: float = 0) -> tuple[ndarray[bool], ndarray[bool]]
Given two arrays of longitudes and latitudes, return boolean masks for their overlapping bounding boxes.
Parameters
- lon1 : array_like
First array of longitudes in degrees.
- lat1 : array_like
First array of latitudes in degrees.
- lon2 : array_like
Second array of longitudes in degrees.
- lat2 : array_like
Second array of latitudes in degrees.
- distance : float, optional
Tolerance in degrees for the overlap. Bounding boxes that come within this distance of each other are considered to overlap. Default is 0.
Returns
- overlap1 : np.ndarray[int]
Indices of lon1 and lat1 where their bounding box overlaps with that of lon2 and lat2.
- overlap2 : np.ndarray[int]
Indices of lon2 and lat2 where their bounding box overlaps with that of lon1 and lat1.
Examples
>>> from clouddrift.pairs import pair_bounding_box_overlap
>>> lon1 = [0, 0, 1, 1]
>>> lat1 = [0, 0, 1, 1]
>>> lon2 = [1, 1, 2, 2]
>>> lat2 = [1, 1, 2, 2]
>>> pair_bounding_box_overlap(lon1, lat1, lon2, lat2, 0.5)
(array([2, 3]), array([0, 1]))
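The example above can be reproduced with a small NumPy sketch of a padded bounding-box test. The helper name bounding_box_overlap_indices is hypothetical, and the sketch deliberately ignores longitude wrapping at the dateline, so it should be read as an illustration of the idea rather than as the library's implementation.

```python
import numpy as np


def bounding_box_overlap_indices(lon1, lat1, lon2, lat2, distance=0.0):
    """Hypothetical helper: indices of points inside the shared, padded box."""
    lon1, lat1, lon2, lat2 = (
        np.asarray(a, dtype=float) for a in (lon1, lat1, lon2, lat2)
    )

    # Region common to both bounding boxes, widened by `distance` degrees.
    lon_start = max(lon1.min(), lon2.min()) - distance
    lon_end = min(lon1.max(), lon2.max()) + distance
    lat_start = max(lat1.min(), lat2.min()) - distance
    lat_end = min(lat1.max(), lat2.max()) + distance

    # If the boxes are too far apart, start exceeds end and both masks are empty.
    in1 = ((lon1 >= lon_start) & (lon1 <= lon_end)
           & (lat1 >= lat_start) & (lat1 <= lat_end))
    in2 = ((lon2 >= lon_start) & (lon2 <= lon_end)
           & (lat2 >= lat_start) & (lat2 <= lat_end))
    return np.flatnonzero(in1), np.flatnonzero(in2)


overlap1, overlap2 = bounding_box_overlap_indices(
    [0, 0, 1, 1], [0, 0, 1, 1], [1, 1, 2, 2], [1, 1, 2, 2], 0.5
)
far1, far2 = bounding_box_overlap_indices([0, 1], [0, 1], [10, 11], [10, 11], 0.5)
```

A cheap test of this kind is typically used to discard most points before the more expensive pairwise distance computation.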
- clouddrift.pairs.pair_space_distance(lon1: list[float] | ndarray[float] | Series | DataArray, lat1: list[float] | ndarray[float] | Series | DataArray, lon2: list[float] | ndarray[float] | Series | DataArray, lat2: list[float] | ndarray[float] | Series | DataArray) -> ndarray[float]
Given two arrays of longitudes and latitudes, return the distance on a sphere between all pairs of points.
Parameters
- lon1 : array_like
First array of longitudes in degrees.
- lat1 : array_like
First array of latitudes in degrees.
- lon2 : array_like
Second array of longitudes in degrees.
- lat2 : array_like
Second array of latitudes in degrees.
Returns
- distance : np.ndarray[float]
Array of distances in meters between all pairs of points.
Examples
>>> from clouddrift.pairs import pair_space_distance
>>> lon1 = [0, 0, 1, 1]
>>> lat1 = [0, 0, 1, 1]
>>> lon2 = [1, 1, 2, 2]
>>> lat2 = [1, 1, 2, 2]
>>> pair_space_distance(lon1, lat1, lon2, lat2)
array([[157424.62387233, 157424.62387233,      0.        ,      0.        ],
       [157424.62387233, 157424.62387233,      0.        ,      0.        ],
       [314825.26360286, 314825.26360286, 157400.64794884, 157400.64794884],
       [314825.26360286, 314825.26360286, 157400.64794884, 157400.64794884]])
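In the example output, each row corresponds to a point of the second set and each column to a point of the first set. The sketch below builds a matrix with that layout using the haversine formula and NumPy broadcasting. The helper name pairwise_haversine and the Earth radius of 6.3781e6 m are assumptions, so the values may differ slightly from those returned by pair_space_distance.

```python
import numpy as np

EARTH_RADIUS_M = 6.3781e6  # assumed spherical Earth radius in meters


def pairwise_haversine(lon1, lat1, lon2, lat2):
    """Hypothetical helper: great-circle distance between all pairs of points.

    Element [i, j] is the distance between point i of the second set and
    point j of the first set.
    """
    lon1, lat1, lon2, lat2 = (
        np.radians(np.asarray(a, dtype=float)) for a in (lon1, lat1, lon2, lat2)
    )
    # Column vectors for set 2 broadcast against row vectors for set 1.
    dlon = lon2[:, None] - lon1[None, :]
    dlat = lat2[:, None] - lat1[None, :]
    a = (np.sin(dlat / 2) ** 2
         + np.cos(lat2[:, None]) * np.cos(lat1[None, :]) * np.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_M * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))


d = pairwise_haversine([0, 0, 1, 1], [0, 0, 1, 1], [1, 1, 2, 2], [1, 1, 2, 2])
```

The matrix holds one entry per pair of points, so memory use grows with the product of the two array lengths.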
- clouddrift.pairs.pair_time_distance(time1: list[float] | ndarray[float] | Series | DataArray, time2: list[float] | ndarray[float] | Series | DataArray) -> ndarray[float]
Given two arrays of times (or any other monotonically increasing quantity), return the temporal distance between all pairs of times.
Parameters
- time1 : array_like
First array of times.
- time2 : array_like
Second array of times.
Returns
- distance : np.ndarray[float]
Array of distances between all pairs of times.
Examples
>>> import numpy as np
>>> from clouddrift.pairs import pair_time_distance
>>> time1 = np.arange(4)
>>> time2 = np.arange(2, 6)
>>> pair_time_distance(time1, time2)
array([[2, 1, 0, 1],
       [3, 2, 1, 0],
       [4, 3, 2, 1],
       [5, 4, 3, 2]])
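The layout of the example output can be reproduced with a one-line broadcasting expression. In this sketch, time_distance_matrix is a hypothetical helper name and not part of clouddrift. Rows follow time2 and columns follow time1, as in the output above.

```python
import numpy as np


def time_distance_matrix(time1, time2):
    """Hypothetical helper: absolute difference between every pair of times."""
    t1 = np.asarray(time1)
    t2 = np.asarray(time2)
    # t2 as a column and t1 as a row broadcast to shape (len(t2), len(t1)).
    return np.abs(t2[:, np.newaxis] - t1[np.newaxis, :])


d = time_distance_matrix(np.arange(4), np.arange(2, 6))
```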
- clouddrift.pairs.pair_time_overlap(time1: list[float] | ndarray[float] | Series | DataArray, time2: list[float] | ndarray[float] | Series | DataArray, distance: float = 0) -> tuple[ndarray[int], ndarray[int]]
Given two arrays of times (or any other monotonically increasing quantity), return indices where the times are within a prescribed distance.
Although higher-level array containers such as xarray and pandas objects are supported as input, this function is an order of magnitude faster when given NumPy arrays.
Parameters
- time1 : array_like
First array of times.
- time2 : array_like
Second array of times.
- distance : float, optional
Maximum distance within which the values of time1 and time2 are considered to overlap. Default is 0, i.e. the values must be exactly the same.
Returns
- overlap1 : np.ndarray[int]
Indices of time1 where its time overlaps with time2.
- overlap2 : np.ndarray[int]
Indices of time2 where its time overlaps with time1.
Examples
>>> import numpy as np
>>> from clouddrift.pairs import pair_time_overlap
>>> time1 = np.arange(4)
>>> time2 = np.arange(2, 6)
>>> pair_time_overlap(time1, time2)
(array([2, 3]), array([0, 1]))
>>> pair_time_overlap(time1, time2, 1)
(array([1, 2, 3]), array([0, 1, 2]))
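Both examples above are consistent with a simple interval test: find the time window shared by the two arrays, widen it by distance, and keep the indices that fall inside. The sketch below follows that reading. The helper name time_overlap_indices is hypothetical, and the approach is an assumption inferred from the examples rather than a statement of how the library works.

```python
import numpy as np


def time_overlap_indices(time1, time2, distance=0):
    """Hypothetical helper: indices of each array inside the shared time window."""
    t1 = np.asarray(time1)
    t2 = np.asarray(time2)

    # Window common to both arrays, widened by `distance` on each side.
    start = max(t1.min(), t2.min()) - distance
    end = min(t1.max(), t2.max()) + distance

    overlap1 = np.flatnonzero((t1 >= start) & (t1 <= end))
    overlap2 = np.flatnonzero((t2 >= start) & (t2 <= end))
    return overlap1, overlap2


time1 = np.arange(4)
time2 = np.arange(2, 6)
exact1, exact2 = time_overlap_indices(time1, time2)
padded1, padded2 = time_overlap_indices(time1, time2, 1)
```

Because only minima, maxima, and element-wise comparisons are involved, the cost is linear in the array lengths, unlike the full pairwise matrix of pair_time_distance.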