clouddrift.datasets.gdp_source

clouddrift.datasets.gdp_source(tmp_path: str = '/tmp/clouddrift/gdpsource', max: int | None = None, skip_download: bool = False, use_fill_values: bool = True, decode_times: bool = True) → Dataset

Returns the NOAA Global Drifter Program (GDP) source (raw) dataset as a ragged array Xarray dataset.

The function first looks for the ragged-array dataset on the local filesystem. If it is not found, the dataset is downloaded using the corresponding adapter function and stored as a Zarr archive for later access.
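The look-up-then-build behavior described above can be sketched generically. This is a minimal illustration and not the clouddrift implementation: `load_or_build`, `fake_build`, and the file name `aggregate.json` are hypothetical, and a JSON file stands in for the Zarr archive so that the example needs only the standard library.

```python
import json
import tempfile
from pathlib import Path


def load_or_build(cache_file: Path, build):
    """Return the cached data if present; otherwise build, store, and return it."""
    if cache_file.exists():
        # Cache hit: read the stored copy and skip the expensive build step.
        return json.loads(cache_file.read_text())
    # Cache miss: `build` stands in for downloading and aggregating the files.
    data = build()
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    cache_file.write_text(json.dumps(data))
    return data


calls = []  # records how many times the expensive step actually runs


def fake_build():
    # Hypothetical stand-in for the adapter function.
    calls.append(1)
    return {"id": [1, 2, 3]}


with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp) / "gdpsource" / "aggregate.json"
    first = load_or_build(cache, fake_build)   # builds and stores
    second = load_or_build(cache, fake_build)  # served from the local copy
```

The second call returns the same data without invoking the build step again, which is the point of keeping a local archive.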

The data is accessed from a public HTTPS server at NOAA’s Atlantic Oceanographic and Meteorological Laboratory (AOML) at https://www.aoml.noaa.gov/ftp/pub/phod/pub/pazos/data/shane/sst/.

Parameters

tmp_path: str, '/tmp/clouddrift/gdpsource' (default)

Temporary path where intermediary files are stored.

max: int, optional

Maximum number of files to retrieve and parse to generate the aggregate file. Mainly used for testing purposes.

skip_download: bool, False (default)

If True, skips downloading the data files and assumes they have already been downloaded. This is mainly useful when the remote server doesn’t provide the HTTP Last-Modified header.

use_fill_values: bool, True (default)

When True, missing metadata fields are replaced with fill values. When False, the observations of any drifter for which no metadata is found are ignored.

decode_times: bool, True (default)

If True, decode the time coordinate into a datetime object. If False, the time coordinate will be an int64 or float64 array of increments since the origin time indicated in the units attribute.
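To make the `decode_times` option concrete, the sketch below shows what decoding "increments since the origin time indicated in the units attribute" amounts to. It is an illustration only: the helper `decode_time` is hypothetical, and the units string `"seconds since 1970-01-01"` is an assumption, since the actual units of this dataset are not stated here. Xarray performs the equivalent conversion internally when decoding is enabled.

```python
from datetime import datetime, timedelta, timezone


def decode_time(increments, units):
    """Decode numeric offsets given a units string such as 'seconds since 1970-01-01'."""
    # Split "seconds since 1970-01-01" into the unit and the origin date.
    unit, _, origin = units.partition(" since ")
    origin_dt = datetime.fromisoformat(origin).replace(tzinfo=timezone.utc)
    # `unit` must be a timedelta keyword: seconds, minutes, hours, or days.
    return [origin_dt + timedelta(**{unit: float(v)}) for v in increments]


# Undecoded values, as they would appear with decode_times=False (assumed units).
raw = [0, 86400, 90000]
decoded = decode_time(raw, "seconds since 1970-01-01")
```

Leaving the values undecoded can be useful when the raw numeric increments are needed for arithmetic or when the time values fall outside the range a datetime type can represent.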

Returns

xarray.Dataset

Source GDP dataset as a ragged array.

Examples

>>> from clouddrift.datasets import gdp_source
>>> ds = gdp_source()
>>> ds
<xarray.Dataset> Size: ...
Dimensions:            (traj: ..., obs: ...)
Coordinates:
    id                 (traj) int64 222kB ...
    obs_index          (obs) int32 1GB ...
Dimensions without coordinates: traj, obs
Data variables: (12/22)
    buoys_type         (traj) <U5 ...
    death_code         (traj) int64 ...
    drogue             (obs) float32 ...
    drogue_off_date    (traj) datetime64[ns] ...
    end_date           (traj) datetime64[ns] ...
    end_lat            (traj) float64 222kB ...
    ...                 ...
    sst                (obs) float32 ...
    start_date         (traj) datetime64[ns] ...
    start_lat          (traj) float64 ...
    start_lon          (traj) float64 ...
    voltage            (obs) float32 ...
    wmo_number         (traj) int64 ...
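In a ragged array, the observations of all trajectories are concatenated along the `obs` dimension, while per-trajectory metadata lives along `traj`. The sketch below shows how one trajectory can be sliced out of such a layout. It uses plain lists and made-up values; the `rowsize` variable (number of observations per trajectory) is an assumption here, since it does not appear in the truncated output above.

```python
from itertools import accumulate

# Made-up example: three trajectories with 3, 2, and 4 observations.
rowsize = [3, 2, 4]
sst = [20.1, 20.3, 20.2, 15.0, 15.4, 27.9, 28.0, 28.2, 28.1]

# Start offset of each trajectory in the flat array, plus the final end offset.
offsets = [0, *accumulate(rowsize)]


def trajectory(values, offsets, i):
    """Return the observations that belong to trajectory `i`."""
    return values[offsets[i]:offsets[i + 1]]


second = trajectory(sst, offsets, 1)  # observations of the second trajectory
```

The same offsets apply to every variable along `obs`, so a single pass over the row sizes is enough to index all per-observation variables.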