clouddrift.adapters.gdp.gdp1h.to_raggedarray
- clouddrift.adapters.gdp.gdp1h.to_raggedarray(drifter_ids: list[int] | None = None, n_random_id: int | None = None, url: str = 'https://www.aoml.noaa.gov/ftp/pub/phod/buoydata/hourly_product/v2.01', tmp_path: str | None = None) -> RaggedArray
Download and process individual GDP hourly files and return a RaggedArray instance with the data.
Parameters
- drifter_ids : list[int], optional
List of drifters to retrieve (Default: all)
- n_random_id : int, optional
Number of drifter NetCDF files to select at random
- url : str
URL from which to download the data (Default: GDP_DATA_URL). Alternatively, it can be GDP_DATA_URL_EXPERIMENTAL.
- tmp_path : str, optional
Path to the directory where the individual NetCDF files are stored (default varies depending on operating system; /tmp/clouddrift/gdp on Linux)
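The operating-system dependence of the tmp_path default can be illustrated with a short sketch. It assumes the default is built from the system temporary directory (consistent with /tmp/clouddrift/gdp on Linux, but not confirmed here), and the gdp_cache directory name is hypothetical:

```python
import os
import tempfile

# Assumption: the default cache directory is derived from the system temp
# directory, which is why it differs between operating systems.
default_like = os.path.join(tempfile.gettempdir(), "clouddrift", "gdp")

# Hypothetical project-local directory to pass explicitly instead, so the
# downloaded NetCDF files are kept in a known location.
cache_dir = os.path.join(os.getcwd(), "gdp_cache")

# The call downloads data, so it is left commented out in this sketch:
# ra = to_raggedarray(n_random_id=10, tmp_path=cache_dir)
```

Passing an explicit tmp_path may be preferable when the system temporary directory is cleared between sessions.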
Returns
- out : RaggedArray
A RaggedArray instance of the requested dataset
Examples
Invoke to_raggedarray without any arguments to download all drifter data from the 2.01 GDP feed:
>>> from clouddrift.adapters.gdp.gdp1h import to_raggedarray
>>> ra = to_raggedarray()
To download a random sample of 100 drifters, for example for development or testing, use the n_random_id argument:
>>> ra = to_raggedarray(n_random_id=100)
To download a specific list of drifters, use the drifter_ids argument:
>>> ra = to_raggedarray(drifter_ids=[44136, 54680, 83463])
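This page does not say whether the n_random_id selection can be seeded. If a reproducible subset is needed, one possible approach is to draw the sample yourself and pass it through drifter_ids. The sketch below reuses the three IDs from the example above as the candidate pool; the seed value is arbitrary:

```python
import random

# Candidate pool: the drifter IDs used in the example above.
candidate_ids = [44136, 54680, 83463]

# A dedicated generator with a fixed seed draws the same subset on every run.
rng = random.Random(42)
sample_ids = sorted(rng.sample(candidate_ids, k=2))

# The call downloads data, so it is left commented out in this sketch:
# ra = to_raggedarray(drifter_ids=sample_ids)
```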
To download the experimental 2.01 GDP feed, use the url argument to specify the experimental feed URL:
>>> from clouddrift.adapters.gdp.gdp1h import GDP_DATA_URL_EXPERIMENTAL, to_raggedarray
>>> ra = to_raggedarray(url=GDP_DATA_URL_EXPERIMENTAL)
Finally, to_raggedarray returns a RaggedArray instance, which provides a convenience method to emit an xarray.Dataset instance:
>>> ds = ra.to_xarray()
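For readers new to the ragged layout, the following pure-Python sketch shows the general idea: observations from all trajectories are stored end to end, and a per-trajectory count marks where each one starts and stops. The name rowsize and all values are assumptions made for illustration, not output from the GDP dataset:

```python
from itertools import accumulate

# Made-up values: three trajectories with 3, 2, and 4 observations each.
rowsize = [3, 2, 4]
lon = [10.0, 10.1, 10.2, -5.0, -5.1, 30.0, 30.1, 30.2, 30.3]

# Start offset of each trajectory in the flat observation list,
# plus a final entry marking the end of the last trajectory.
offsets = [0, *accumulate(rowsize)]


def trajectory(values, index):
    """Return the observations that belong to trajectory `index`."""
    return values[offsets[index]:offsets[index + 1]]


second = trajectory(lon, 1)
```

Storing the counts rather than padding every trajectory to the same length avoids filling the array with missing values when trajectories differ widely in duration.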
To write the ragged array dataset to a NetCDF file on disk, call to_netcdf on the dataset:
>>> ds.to_netcdf("gdp1h.nc", format="NETCDF4")
Alternatively, to write the ragged array to a Parquet file, first convert it to an Awkward Array, then write it with ak.to_parquet:
>>> import awkward as ak
>>> arr = ra.to_awkward()
>>> ak.to_parquet(arr, "gdp1h.parquet")