clouddrift.adapters.gdp.gdp6h.to_raggedarray

clouddrift.adapters.gdp.gdp6h.to_raggedarray(drifter_ids: list[int] | None = None, n_random_id: int | None = None, tmp_path: str = '/tmp/clouddrift/gdp6h', skip_download: bool = False) → RaggedArray

Download and process individual GDP 6-hourly files and return a RaggedArray instance with the data.

Parameters

drifter_ids : list[int], optional

List of drifter IDs to retrieve (default: all).

n_random_id : int, optional

Number of drifter NetCDF files to select at random.

tmp_path : str, optional

Path to the directory where the individual NetCDF files are stored (default varies depending on operating system; /tmp/clouddrift/gdp6h on Linux)

skip_download : bool, optional

If True, make no network requests: discover drifter IDs by scanning tmp_path for existing drifter_6h_*.nc files and use locally cached dirfl_*.dat metadata files. Default is False.
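As a rough illustration of what the skip_download description implies, the sketch below scans a directory for drifter_6h_*.nc files and extracts integer IDs from the file names. It is not the library's implementation. The helper names find_cached_drifter_ids and default_tmp_path are hypothetical, the assumption that the text between drifter_6h_ and .nc is the numeric drifter ID is inferred from the file pattern above, and building the default directory from tempfile.gettempdir() is only a guess at how an operating-system-dependent default might be derived.

```python
import re
import tempfile
from pathlib import Path

# Assumed file-name layout: drifter_6h_<id>.nc (inferred from the pattern above).
_FILE_PATTERN = re.compile(r"^drifter_6h_(\d+)\.nc$")


def default_tmp_path() -> Path:
    """Hypothetical OS-dependent default, e.g. /tmp/clouddrift/gdp6h on Linux."""
    return Path(tempfile.gettempdir()) / "clouddrift" / "gdp6h"


def find_cached_drifter_ids(tmp_path) -> list[int]:
    """Return the sorted drifter IDs of the NetCDF files found in tmp_path."""
    directory = Path(tmp_path)
    if not directory.is_dir():
        return []
    ids = []
    for entry in directory.glob("drifter_6h_*.nc"):
        match = _FILE_PATTERN.match(entry.name)
        if match:  # skip names whose suffix is not purely numeric
            ids.append(int(match.group(1)))
    return sorted(ids)
```

A helper like this can be used to check which drifters are already cached before calling to_raggedarray(skip_download=True).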

Returns

out : RaggedArray

A RaggedArray instance of the requested dataset.

Raises

ValueError

If no matching drifter files are found for the requested selection.

Examples

Invoke to_raggedarray without any arguments to download all drifter data from the 6-hourly GDP feed:

>>> from clouddrift.adapters.gdp.gdp6h import to_raggedarray
>>> ra = to_raggedarray()

To download a random sample of 100 drifters, for example for development or testing, use the n_random_id argument:

>>> ra = to_raggedarray(n_random_id=100)
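The signature above shows no seed parameter, so a sample drawn via n_random_id may differ between runs. If a repeatable subset is needed, one option is to draw the IDs yourself and pass the result through drifter_ids. The following is a minimal sketch under that assumption; sample_drifter_ids is a hypothetical helper, not part of clouddrift, and the candidate IDs must come from a list you already have.

```python
import random


def sample_drifter_ids(candidate_ids, n, seed=0):
    """Draw n distinct IDs from candidate_ids, reproducibly for a given seed."""
    pool = sorted(set(candidate_ids))  # de-duplicate and fix the order
    if n > len(pool):
        raise ValueError(f"requested {n} IDs but only {len(pool)} are available")
    rng = random.Random(seed)  # local generator; global random state is untouched
    return sorted(rng.sample(pool, n))
```

The returned list can then be supplied as the drifter_ids argument, so that the same seed selects the same drifters on every run.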

To download a specific list of drifters, use the drifter_ids argument:

>>> ra = to_raggedarray(drifter_ids=[54375, 114956, 126934])

The function to_raggedarray returns a RaggedArray instance, which provides a convenience method to produce an xarray.Dataset instance for analysis:

>>> ds = ra.to_xarray()

To write the ragged array dataset to a NetCDF file or a Zarr file on disk, you can use the to_netcdf or to_zarr method of the xarray.Dataset instance:

>>> ds.to_netcdf("gdp6h.nc")
>>> ds.to_zarr("gdp6h.zarr", mode="w")

To write the ragged array dataset to a Parquet file, you can directly use the to_parquet method of the RaggedArray instance:

>>> ra.to_parquet("gdp6h.parquet")