clouddrift.adapters.gdp

This module provides functions and metadata to convert the Global Drifter Program (GDP) data to a clouddrift.RaggedArray instance. The functions defined in this module are common to both hourly (clouddrift.adapters.gdp1h) and six-hourly (clouddrift.adapters.gdp6h) GDP modules.

Functions

cast_float64_variables_to_float32(ds[, ...])

Cast all float64 variables except variables_to_skip to float32.

cut_str(value, max_length)

Cut a string to a specific length and return it as a numpy chararray.

decode_date(t)

The date format is specified as 'seconds since 1970-01-01 00:00:00' but the missing values are stored as -1e+34 which is not supported by the default parsing mechanism in xarray.

drogue_presence(lost_time, time)

Create drogue status from the drogue lost time and the trajectory time.

fetch_netcdf(url, file)

Download and save the file from the given url, if not already downloaded.

fill_values(var[, default])

Change fill values (-1e+34, inf, -inf) in var array to the value specified by default.

get_gdp_metadata([tmp_path])

Download and parse GDP metadata and return it as a Pandas DataFrame.

order_by_date(df, idx)

From the previously sorted DataFrame of directory files, return the unique set of drifter IDs sorted by their start date (the date of the first quality-controlled data point).

parse_directory_file(filename, tmp_path)

Read a GDP directory file that contains metadata of drifter releases.

rowsize(index, **kwargs)

str_to_float(value[, default])

Convert a string to float, while returning the value of default if the string is not convertible to a float, or if it's a NaN.

clouddrift.adapters.gdp.cast_float64_variables_to_float32(ds: Dataset, variables_to_skip: list[str] = ['time', 'lat', 'lon']) -> Dataset

Cast all float64 variables except variables_to_skip to float32. Extra precision from float64 is not needed and takes up memory and disk space.

Parameters

ds : xr.Dataset

Dataset to modify

variables_to_skip : list[str]

List of variables to skip; default is [“time”, “lat”, “lon”].

Returns

ds : xr.Dataset

Modified dataset

clouddrift.adapters.gdp.cut_str(value: str, max_length: int) -> chararray

Cut a string to a specific length and return it as a numpy chararray.

Parameters

value : str

String to cut

max_length : int

Length of the output

Returns

out : np.chararray

String with max_length characters

clouddrift.adapters.gdp.decode_date(t)

The date format is specified as ‘seconds since 1970-01-01 00:00:00’, but missing values are stored as -1e+34, which is not supported by the default parsing mechanism in xarray.

This function replaces the missing values with NaN and returns a datetime instance.

Parameters

t : array

Array of time values

Returns

out : datetime

Datetime instance with the missing values replaced by NaN

clouddrift.adapters.gdp.drogue_presence(lost_time, time) -> ndarray

Create drogue status from the drogue lost time and the trajectory time.

Parameters

lost_time

Timestamp of the drogue loss (or NaT)

time

Observation time

Returns

out : bool

True if the drifter is drogued at the observation time and False otherwise
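The comparison described above can be sketched with NumPy datetimes. This is an illustration only, not the library's source: the name drogue_presence_sketch is hypothetical, and the sketch assumes that a NaT lost time means no drogue loss was recorded, so the drogue is treated as present for the whole trajectory.

```python
import numpy as np


def drogue_presence_sketch(lost_time, time):
    """Hypothetical illustration of a drogue-status flag per observation."""
    time = np.asarray(time)
    # Assumption: NaT, or a loss at or after the last observation,
    # means the drogue is present for every observation.
    if np.isnat(lost_time) or lost_time >= time[-1]:
        return np.ones(time.shape, dtype=bool)
    # Otherwise the drogue is present only before the loss time.
    return time < lost_time


time = np.array(["2020-01-01", "2020-01-02", "2020-01-03"], dtype="datetime64[D]")
status = drogue_presence_sketch(np.datetime64("2020-01-02"), time)
never_lost = drogue_presence_sketch(np.datetime64("NaT"), time)
```

Here status flags only the first observation as drogued, while never_lost flags all three.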

clouddrift.adapters.gdp.fetch_netcdf(url: str, file: str)

Download and save the file from the given url, if not already downloaded.

Parameters

url : str

URL from which to download the file.

file : str

Name of the file to save.
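The download-if-missing behavior can be sketched with the standard library alone. This is a minimal sketch under stated assumptions, not the function's actual implementation: the name fetch_if_missing is hypothetical, and the boolean return value is added here purely to make the behavior visible.

```python
import os
import urllib.request


def fetch_if_missing(url: str, file: str) -> bool:
    """Hypothetical helper: download url to file unless file already exists.

    Returns True if a download happened and False if the existing file
    was reused (the return value is an assumption of this sketch).
    """
    if os.path.isfile(file):
        # The file is already on disk, so skip the download.
        return False
    urllib.request.urlretrieve(url, file)
    return True
```

Calling the helper twice with the same arguments downloads on the first call and reuses the saved file on the second.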

clouddrift.adapters.gdp.fill_values(var, default=nan)

Change fill values (-1e+34, inf, -inf) in var array to the value specified by default.

Parameters

var : array

Array to fill

default : float

Default value to use for fill values
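As a rough illustration of the replacement described above, the following NumPy sketch masks the three fill values and overwrites them. The name fill_values_sketch is hypothetical, and unlike the documented function it works on a copy of the input; whether the library modifies var in place is not stated here.

```python
import numpy as np


def fill_values_sketch(var, default=np.nan):
    """Hypothetical illustration: replace -1e+34, inf and -inf with default."""
    out = np.array(var, dtype=float)  # copy, so the caller's data is untouched
    # np.isclose tolerates floating-point rounding of the -1e+34 sentinel.
    missing = np.isclose(out, -1e34) | np.isinf(out)
    out[missing] = default
    return out


cleaned = fill_values_sketch([1.0, -1e34, np.inf, -np.inf], default=0.0)
```

Only the first element of cleaned keeps its original value; the other three are set to the default.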

clouddrift.adapters.gdp.get_gdp_metadata(tmp_path: str = '/tmp/clouddrift/gdp') -> DataFrame

Download and parse GDP metadata and return it as a pandas DataFrame.

Returns

df : pd.DataFrame

Sorted list of drifters as a pandas DataFrame.

clouddrift.adapters.gdp.order_by_date(df: DataFrame, idx: list[int]) -> list[int]

From the previously sorted DataFrame of directory files, return the unique set of drifter IDs sorted by their start date (the date of the first quality-controlled data point).

Parameters

idx : list

List of drifters to include in the ragged array

Returns

idx : list

Unique set of drifter IDs sorted by their start date.

clouddrift.adapters.gdp.parse_directory_file(filename: str, tmp_path: str) -> DataFrame

Read a GDP directory file that contains metadata of drifter releases.

Parameters

filename : str

Name of the directory file to parse.

Returns

df : pd.DataFrame

List of drifters from a single directory file as a pandas DataFrame.

clouddrift.adapters.gdp.str_to_float(value: str, default: float = nan) -> float

Convert a string to float, while returning the value of default if the string is not convertible to a float, or if it’s a NaN.

Parameters

value : str

String to convert to float

default : float

Default value to return if the string is not convertible to float

Returns

out : float

Float value of the string, or default if the string is not convertible to float.
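The fallback logic described above can be sketched in plain Python. This is an illustrative re-implementation under the documented behavior, not the library's source, and the name str_to_float_sketch is hypothetical.

```python
import math


def str_to_float_sketch(value: str, default: float = math.nan) -> float:
    """Hypothetical illustration of converting a string with a fallback."""
    try:
        result = float(value)
    except (TypeError, ValueError):
        # The string cannot be parsed as a float.
        return default
    # float("nan") parses successfully, so NaN needs its own check.
    return default if math.isnan(result) else result


parsed = str_to_float_sketch("3.5")
fallback = str_to_float_sketch("n/a", default=-999.0)
nan_case = str_to_float_sketch("nan", default=-999.0)
```

Both the unparseable string and the literal "nan" fall back to the supplied default, while "3.5" is parsed normally.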

Modules

gdp1h

This module provides functions and metadata that can be used to convert the hourly Global Drifter Program (GDP) data to a clouddrift.RaggedArray instance.

gdp6h

This module provides functions and metadata that can be used to convert the 6-hourly Global Drifter Program (GDP) data to a clouddrift.RaggedArray instance.

gdpsource