hydro.exchange#

Submodules#

Package Contents#

Classes#

ExchangeBottleFlag

Enum representing a WHP Bottle flag.

ExchangeCTDFlag

Enum where members are also (and must be) ints

ExchangeFlag

Enum where members are also (and must be) ints

ExchangeSampleFlag

Enum where members are also (and must be) ints

FileType

Create a collection of name/value pairs.

_ExchangeData

Dataclass containing exchange data which has been parsed into ndarrays

_ExchangeInfo

Low level dataclass containing the parts of an exchange file

CheckOptions

Flags and config that control how strict the file checks are

Functions#

_has_no_nones(val)

_transform_whp_to_csv(params, units)

_get_params(params_units)

_ctd_get_header(line[, dtype])

_is_all_dataarray(val)

flatten_cdom_coordinate(dataset)

Takes a dataset with a CDOM wavelength coordinate and explodes it back into individual variables

add_cdom_coordinate(dataset)

Find all the parameters in the cdom group and add their wavelength in a new coordinate

add_geometry_var(dataset)

Adds a CF-1.8 Geometry container variable to the dataset

add_profile_type(dataset, ftype)

Adds a profile_type string variable to the dataset.

finalize_ancillary_variables(dataset)

Turn the ancillary variable attr into a space separated string

combine_bottle_time(dataset)

Combine the bottle dates and times if present

check_is_subset_shape(a1, a2[, strict])

Ensure that the shape of the data in a2 is a subset (or strict subset) of the data shape of a1

check_flags(dataset[, raises])

Check WOCE flag values against their param and ensure that the param either has a value or is "nan" depending on the flag definition.

_get_fill_locs(arr[, fill_values])

extract_numeric_precisions(data)

Get the numeric precision of a printed decimal number

_is_valid_exchange_numeric(data)

_combine_dt_ndarray(date_arr[, time_arr, time_pad])

sort_ds(dataset)

Sorts the data values in the dataset

check_sorted(dataset)

Check that the dataset is sorted by the rules in sort_ds()

combine_dt(dataset[, is_coord, date_name, time_name, ...])

Combine the exchange style string variables of date and optionally time into a single variable containing real datetime objects

set_axis_attrs(dataset)

Set the CF axis attribute on our axis variables (XYZT)

set_coordinate_encoding_fill(dataset)

Sets the _FillValue encoding to None for 1D coordinate vars

_load_raw_exchange(filename_or_obj, *[, ...])

all_same(ndarr)

Test if all the values of an ndarray are the same value

read_csv(filename_or_obj, *[, fill_values, ftype, ...])

read_exchange(filename_or_obj, *[, fill_values, ...])

Loads the data from filename_or_obj and returns an xr.Dataset with the CCHDO CF/netCDF structure

_from_exchange_data(exchange_data, *[, ftype, checks])

Attributes#

exception hydro.exchange.ExchangeBOMError[source]#

Bases: ExchangeError

Error raised when the exchange file has a byte order mark.

exception hydro.exchange.ExchangeDataFlagPairError(error_data)[source]#

Bases: ExchangeDataError

There is a mismatch between what the flag value expects, and the fill/data value.

Examples#

  • something with a flag of 9 has a non-fill value

  • something with a flag of 2 has a fill value instead of data

Parameters:

error_data (xarray.Dataset) –
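
The two example cases above can be illustrated in plain Python. This is only a sketch and not the library's implementation: the helper name `find_flag_pair_mismatches`, the use of `None` as the fill marker, and the assumption that flag 9 is the only no-data flag are all hypothetical simplifications.

```python
def find_flag_pair_mismatches(values, flags, no_data_flags=frozenset({9})):
    """Return the indices where a flag and its value disagree.

    Assumptions of this sketch: None marks a fill value, and only the
    flags listed in `no_data_flags` expect a fill value.
    """
    bad = []
    for idx, (value, flag) in enumerate(zip(values, flags)):
        is_fill = value is None
        expects_fill = flag in no_data_flags
        if is_fill != expects_fill:
            bad.append(idx)
    return bad


values = [34.5, None, 35.1, None]
flags = [2, 9, 9, 2]
# index 2: flag 9 with a real value, index 3: flag 2 with a fill value
mismatches = find_flag_pair_mismatches(values, flags)
print(mismatches)
```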

exception hydro.exchange.ExchangeDataInconsistentCoordinateError[source]#

Bases: ExchangeDataError

Error raised if the reported latitude, longitude, and date (and time) vary for a single profile.

A “profile” in an exchange file is a grouping of data rows which all have the same EXPOCODE, STNNBR, and CASTNO. The SAMPNO/CTDPRS is allowed/required to vary for a single profile and is what identifies samples within one profile.
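
As a rough illustration of the profile definition above, the following sketch groups rows by EXPOCODE, STNNBR, and CASTNO and reports any profile whose latitude, longitude, or date varies. The function name `find_inconsistent_profiles` and the list-of-dicts row layout are assumptions made for this example; the library itself works on parsed arrays.

```python
from collections import defaultdict


def find_inconsistent_profiles(rows):
    """Return profile keys whose reported position or date is not constant."""
    seen = defaultdict(set)
    for row in rows:
        key = (row["EXPOCODE"], row["STNNBR"], row["CASTNO"])
        seen[key].add((row["LATITUDE"], row["LONGITUDE"], row["DATE"]))
    # A consistent profile has exactly one (lat, lon, date) combination.
    return [key for key, coords in seen.items() if len(coords) > 1]


rows = [
    {"EXPOCODE": "TESTEXPO", "STNNBR": "1", "CASTNO": 1, "SAMPNO": "1",
     "LATITUDE": -10.0, "LONGITUDE": 150.0, "DATE": "20200101"},
    {"EXPOCODE": "TESTEXPO", "STNNBR": "1", "CASTNO": 1, "SAMPNO": "2",
     "LATITUDE": -10.0, "LONGITUDE": 150.0, "DATE": "20200101"},
    {"EXPOCODE": "TESTEXPO", "STNNBR": "2", "CASTNO": 1, "SAMPNO": "1",
     "LATITUDE": -11.0, "LONGITUDE": 150.0, "DATE": "20200102"},
    {"EXPOCODE": "TESTEXPO", "STNNBR": "2", "CASTNO": 1, "SAMPNO": "2",
     "LATITUDE": -11.5, "LONGITUDE": 150.0, "DATE": "20200102"},
]
inconsistent = find_inconsistent_profiles(rows)
print(inconsistent)
```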

exception hydro.exchange.ExchangeDataPartialCoordinateError[source]#

Bases: ExchangeDataError

Error raised if values for latitude, longitude, or pressure are missing.

It is OK by the standard to omit the time of day.

exception hydro.exchange.ExchangeDataPartialKeyError[source]#

Bases: ExchangeDataError

Error raised when there is no value for one (or more) of the following parameters.

  • EXPOCODE

  • STNNBR

  • CASTNO

  • SAMPNO (only for bottle files)

  • CTDPRS (only for CTD files)

These form the “composite key” which uniquely identifies the “row” of exchange data.

exception hydro.exchange.ExchangeDuplicateKeyError[source]#

Bases: ExchangeDataError

Error raised when there is a duplicate composite key in the exchange file.

This would occur if the exact values for the following parameters occur in more than one data row:

  • EXPOCODE

  • STNNBR

  • CASTNO

  • SAMPNO (only for bottle files)

  • CTDPRS (only for CTD files)
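
The duplicate check described above amounts to counting composite keys. A minimal sketch, assuming rows are plain dicts and using a hypothetical helper name `find_duplicate_keys` (neither is part of the documented API):

```python
from collections import Counter


def find_duplicate_keys(rows, sample_column="SAMPNO"):
    """Return composite keys that occur in more than one data row.

    Use sample_column="CTDPRS" for CTD-style rows.
    """
    key_columns = ("EXPOCODE", "STNNBR", "CASTNO", sample_column)
    counts = Counter(tuple(row[col] for col in key_columns) for row in rows)
    return [key for key, count in counts.items() if count > 1]


rows = [
    {"EXPOCODE": "TESTEXPO", "STNNBR": "1", "CASTNO": 1, "SAMPNO": "1"},
    {"EXPOCODE": "TESTEXPO", "STNNBR": "1", "CASTNO": 1, "SAMPNO": "2"},
    {"EXPOCODE": "TESTEXPO", "STNNBR": "1", "CASTNO": 1, "SAMPNO": "2"},
]
duplicates = find_duplicate_keys(rows)
print(duplicates)
```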

exception hydro.exchange.ExchangeDuplicateParameterError[source]#

Bases: ExchangeParameterError

Error raised when the same parameter/unit pair occurs more than once in the exchange file.

exception hydro.exchange.ExchangeEncodingError[source]#

Bases: ExchangeError

Error raised when the bytes for some exchange file cannot be decoded as UTF-8.

exception hydro.exchange.ExchangeError[source]#

Bases: ValueError

This is the base exception which all the other exceptions derive from. It is a subclass of ValueError.
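
Because every exception derives from ExchangeError, and ExchangeError is itself a ValueError, a caller can catch at whichever level is convenient. The sketch below uses local stand-in classes that mirror the documented hierarchy so that it runs without the package installed; the `load` function is a hypothetical placeholder, not the library's reader.

```python
class ExchangeError(ValueError):
    """Stand-in for the documented base exception."""


class ExchangeBOMError(ExchangeError):
    """Stand-in for the documented byte order mark error."""


def load(data: bytes) -> str:
    # Hypothetical loader used only to trigger the error.
    if data.startswith(b"\xef\xbb\xbf"):
        raise ExchangeBOMError("file starts with a byte order mark")
    return data.decode("utf-8")


caught = []
# The same error is caught by the package base class and by ValueError.
for handler in (ExchangeError, ValueError):
    try:
        load(b"\xef\xbb\xbfBOTTLE,20210301CCHSIOAMB")
    except handler as err:
        caught.append(type(err).__name__)
print(caught)
```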

exception hydro.exchange.ExchangeFlaglessParameterError[source]#

Bases: ExchangeParameterError

Error raised when a parameter has a flag column when it is not supposed to.

exception hydro.exchange.ExchangeInconsistentMergeType[source]#

Bases: ExchangeError

Error raised when the merge_ex method is called on mixed ctd and bottle exchange types.

exception hydro.exchange.ExchangeMagicNumberError[source]#

Bases: ExchangeError

Error raised when the exchange file does not start with BOTTLE or CTD.

exception hydro.exchange.ExchangeOrphanErrorError[source]#

Bases: ExchangeParameterError

Error raised when there exists an error column with no corresponding parameter column.

exception hydro.exchange.ExchangeOrphanFlagError[source]#

Bases: ExchangeParameterError

Error raised when there exists a flag column with no corresponding parameter column.

exception hydro.exchange.ExchangeParameterUndefError(error_data)[source]#

Bases: ExchangeParameterError

Error raised when the library does not have a definition for a parameter/unit pair in the exchange file.

Parameters:

error_data (list[str]) –

exception hydro.exchange.ExchangeParameterUnitAlignmentError[source]#

Bases: ExchangeParameterError

Error raised when there is a mismatch between the number of parameters and number of units in the exchange file.

class hydro.exchange.ExchangeBottleFlag(flag)[source]#

Bases: ExchangeFlag

Enum representing a WHP Bottle flag.

This flag represents information about the sampling device itself (i.e. the niskin bottle). It should only be used for “BTLNBR_FLAG_W” values and should never be used with CTD files.

property _no_data_flags#
property _flag_definitions#
NOFLAG = 0#
NO_INFO = 1#
GOOD = 2#
LEAKING = 3#
BAD_TRIP = 4#
NOT_REPORTED = 5#
DISCREPANCY = 6#
UNKNOWN = 7#
PAIR = 8#
NOT_SAMPLED = 9#
class hydro.exchange.ExchangeCTDFlag(flag)[source]#

Bases: ExchangeFlag

Enum where members are also (and must be) ints

property _no_data_flags#
property _flag_definitions#
NOFLAG = 0#
UNCALIBRATED = 1#
GOOD = 2#
QUESTIONABLE = 3#
BAD = 4#
NOT_REPORTED = 5#
INTERPOLATED = 6#
DESPIKED = 7#
NOT_SAMPLED = 9#
class hydro.exchange.ExchangeFlag(flag)[source]#

Bases: enum.IntEnum

Enum where members are also (and must be) ints

property definition#
property cf_def#
property has_value#
class hydro.exchange.ExchangeSampleFlag(flag)[source]#

Bases: ExchangeFlag

Enum where members are also (and must be) ints

property _no_data_flags#
property _flag_definitions#
NOFLAG = 0#
MISSING = 1#
GOOD = 2#
QUESTIONABLE = 3#
BAD = 4#
NOT_REPORTED = 5#
MEAN = 6#
CHROMA_MANUAL = 7#
CHROMA_IRREGULAR = 8#
NOT_SAMPLED = 9#
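
Since the flag classes are IntEnum subclasses, members compare equal to plain integers and can be looked up by value. The sketch below rebuilds the sample flag values listed above in a local class named `SampleFlagSketch` (a hypothetical name). Its `has_value` property assumes that flags 1, 5, and 9 carry no data, which is a guess for illustration and may differ from the library's own `_no_data_flags`.

```python
from enum import IntEnum


class SampleFlagSketch(IntEnum):
    NOFLAG = 0
    MISSING = 1
    GOOD = 2
    QUESTIONABLE = 3
    BAD = 4
    NOT_REPORTED = 5
    MEAN = 6
    CHROMA_MANUAL = 7
    CHROMA_IRREGULAR = 8
    NOT_SAMPLED = 9

    @property
    def has_value(self) -> bool:
        # Assumption: these flags mean the parameter holds a fill value.
        return int(self) not in {1, 5, 9}


flag = SampleFlagSketch(2)        # value lookup from a parsed integer
print(flag.name, flag == 2, flag.has_value)
print(SampleFlagSketch(9).has_value)
```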
hydro.exchange.CCHDO_VERSION[source]#
hydro.exchange.log[source]#
hydro.exchange.DIMS = ('N_PROF', 'N_LEVELS')[source]#
hydro.exchange.EXPOCODE[source]#
hydro.exchange.STNNBR[source]#
hydro.exchange.CASTNO[source]#
hydro.exchange.SAMPNO[source]#
hydro.exchange.DATE[source]#
hydro.exchange.TIME[source]#
hydro.exchange.LATITUDE[source]#
hydro.exchange.LONGITUDE[source]#
hydro.exchange.CTDPRS[source]#
hydro.exchange.BTLNBR[source]#
hydro.exchange.COORDS[source]#
hydro.exchange.FLAG_SCHEME: dict[str, type[flags.ExchangeFlag]][source]#
hydro.exchange.GEOMETRY_VARS = ('expocode', 'station', 'cast', 'section_id', 'time')[source]#
hydro.exchange.FILLS_MAP[source]#
hydro.exchange.FileTypes[source]#
class hydro.exchange.FileType(*args, **kwds)[source]#

Bases: enum.Enum

Create a collection of name/value pairs.

Example enumeration:

>>> class Color(Enum):
...     RED = 1
...     BLUE = 2
...     GREEN = 3

Access them by:

  • attribute access:

>>> Color.RED
<Color.RED: 1>
  • value lookup:

>>> Color(1)
<Color.RED: 1>
  • name lookup:

>>> Color['RED']
<Color.RED: 1>

Enumerations can be iterated over, and know how many members they have:

>>> len(Color)
3
>>> list(Color)
[<Color.RED: 1>, <Color.BLUE: 2>, <Color.GREEN: 3>]

Methods can be added to enumerations, and members can have their own attributes – see the documentation for details.

CTD = 'C'[source]#
BOTTLE = 'B'[source]#
hydro.exchange.WHPNameIndex[source]#
hydro.exchange.WHPParamUnit[source]#
hydro.exchange._has_no_nones(val)[source]#
Parameters:

val (list[str | None]) –

Return type:

TypeGuard[list[str]]

hydro.exchange._transform_whp_to_csv(params, units)[source]#
Return type:

list[str]

hydro.exchange._get_params(params_units)[source]#
Parameters:

params_units (collections.abc.Iterable[str]) –

Return type:

tuple[WHPNameIndex, WHPNameIndex, WHPNameIndex]

hydro.exchange._ctd_get_header(line, dtype=str)[source]#
hydro.exchange._is_all_dataarray(val)[source]#
Parameters:

val (list[Any]) –

Return type:

TypeGuard[list[xarray.DataArray]]

hydro.exchange.flatten_cdom_coordinate(dataset)[source]#

Takes a dataset with a CDOM wavelength coordinate and explodes it back into individual variables

Parameters:

dataset (xarray.Dataset) –

Return type:

xarray.Dataset

hydro.exchange.add_cdom_coordinate(dataset)[source]#

Find all the parameters in the cdom group and add their wavelength in a new coordinate

Parameters:

dataset (xarray.Dataset) –

Return type:

xarray.Dataset

hydro.exchange.add_geometry_var(dataset)[source]#

Adds a CF-1.8 Geometry container variable to the dataset

This allows for compatibility with tools like gdal

Parameters:

dataset (xarray.Dataset) –

Return type:

xarray.Dataset

hydro.exchange.add_profile_type(dataset, ftype)[source]#

Adds a profile_type string variable to the dataset.

This is for ODV compatibility

Warning

Currently mixed profile types are not supported

Return type:

xarray.Dataset

hydro.exchange.finalize_ancillary_variables(dataset)[source]#

Turn the ancillary variable attr into a space separated string

It is nice to have the ancillary variable be a list while things are being read into it

Parameters:

dataset (xarray.Dataset) –

hydro.exchange.combine_bottle_time(dataset)[source]#

Combine the bottle dates and times if present

Raises if only one is present

Parameters:

dataset (xarray.Dataset) –

hydro.exchange.check_is_subset_shape(a1, a2, strict='disallowed')[source]#

Ensure that the shape of the data in a2 is a subset (or strict subset) of the data shape of a1

For a given set of param, flag, and error arrays you would want to ensure that:

  • errors are a subset of params (strict is allowed)

  • params are a subset of flags (strict is allowed)

For string vars, the empty string is considered the “nothing” value. For WOCE flags, flag 9s should be converted to nans (depending on the scheme, flags 5 and 1 may not have param values).

Return a boolean array of invalid locations

Parameters:
  • a1 (numpy.typing.NDArray) –

  • a2 (numpy.typing.NDArray) –

Return type:

numpy.typing.NDArray[numpy.bool_]
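
The non-strict subset rule can be illustrated without numpy: a location is invalid when a2 holds a value but a1 does not. In this sketch plain lists stand in for ndarrays, `None` stands in for the “nothing” value, and the `strict` argument is not modelled at all; the function name `invalid_subset_locations` is hypothetical.

```python
def invalid_subset_locations(a1, a2):
    """Return a list of booleans, True where a2 has data and a1 has none."""
    return [y is not None and x is None for x, y in zip(a1, a2)]


# "params are a subset of flags": a1 is the flag column, a2 the param column.
flags = [2, 2, None]
params = [1.0, None, 3.0]
invalid = invalid_subset_locations(flags, params)
print(invalid)
```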

hydro.exchange.check_flags(dataset, raises=True)[source]#

Check WOCE flag values against their param and ensure that the param either has a value or is “nan” depending on the flag definition.

Return a boolean array of invalid locations?

Parameters:

dataset (xarray.Dataset) –

class hydro.exchange._ExchangeData[source]#

Dataclass containing exchange data which has been parsed into ndarrays

single_profile: bool[source]#
param_cols: dict[cchdo.params.WHPName, numpy.ndarray][source]#
flag_cols: dict[cchdo.params.WHPName, numpy.ndarray][source]#
error_cols: dict[cchdo.params.WHPName, numpy.ndarray][source]#
param_precisions: dict[cchdo.params.WHPName, numpy.typing.NDArray[numpy.int_]][source]#
error_precisions: dict[cchdo.params.WHPName, numpy.typing.NDArray[numpy.int_]][source]#
comments: str[source]#
__post_init__()[source]#
set_expected(params, flags, errors)[source]#

Puts fill columns for expected params which are missing

This can occur when there are disjoint columns in CTD files

Parameters:
  • params (set[cchdo.params.WHPName]) –

  • flags (set[cchdo.params.WHPName]) –

  • errors (set[cchdo.params.WHPName]) –

split_profiles()[source]#

Split into single profile containing _ExchangeData instances

Done by looking at the expocode+station+cast composite keys

str_lens()[source]#

Figure out the length of all the string params

The char size can vary by platform.

Return type:

dict[cchdo.params.WHPName, int]

hydro.exchange._get_fill_locs(arr, fill_values=('-999',))[source]#
Parameters:

fill_values (tuple[str, Ellipsis]) –

class hydro.exchange._ExchangeInfo[source]#

Low level dataclass containing the parts of an exchange file

property stamp[source]#

Returns the filestamp of the exchange file

e.g. “BOTTLE,20210301CCHSIOAMB”
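
A stamp line like the one above can be split into its file type and the remainder with a few lines of Python. This is a sketch only: `split_stamp` is a hypothetical helper, and it raises a plain ValueError where the library documents ExchangeMagicNumberError.

```python
def split_stamp(stamp_line: str):
    """Split an exchange stamp line into (file type, remaining stamp)."""
    for magic in ("BOTTLE", "CTD"):
        if stamp_line.startswith(magic):
            _, _, stamp = stamp_line.partition(",")
            return magic, stamp
    raise ValueError("exchange files must start with BOTTLE or CTD")


parts = split_stamp("BOTTLE,20210301CCHSIOAMB")
print(parts)
```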

property comments[source]#

Returns the comments of the exchange file with leading # stripped

property ctd_headers[source]#

Returns a dict of the CTD headers and their value

property data[source]#

Returns the data block of an exchange file as a tuple of strs. One line per entry.

property post_data[source]#

Returns any post data content as a tuple of strs

property whp_params[source]#
property whp_flags[source]#

Parses the params and units for flag values

returns a dict with a WHPName to column index of flags mapping

property whp_errors[source]#

Parses the params and units for uncertainty values

returns a dict with a WHPName to column index of errors mapping

property _np_data_block[source]#
stamp_slice: slice[source]#
comments_slice: slice[source]#
ctd_headers_slice: slice[source]#
params_idx: int[source]#
units_idx: int[source]#
data_slice: slice[source]#
post_data_slice: slice[source]#
_raw_lines: tuple[str, Ellipsis][source]#
_ctd_override: bool = False[source]#
params()[source]#

Returns a list of all parameters in the file (including CTD “headers”)

units()[source]#

Returns a list of all the units in the file (including CTD “headers”)

Will have the same shape as params

_whp_param_info()[source]#

Parses the params and units for base parameters

Returns a dict with a WHPName to column index mapping

finalize(fill_values=('-999',), precision_source='file')[source]#

Parse all the data into ndarrays of the correct dtype and shape

Returns an _ExchangeData dataclass

Return type:

_ExchangeData

classmethod from_lines(lines, ftype)[source]#

Figure out the line numbers/indices of the parts of the exchange file
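
To give a feel for what locating the parts involves, here is a simplified sketch for a bottle-style file only. It assumes the layout stamp line, `#` comment lines, parameter line, unit line, data rows, then an `END_DATA` line, and it ignores CTD headers entirely. The function `locate_parts` is hypothetical; only the key names are borrowed from the attributes documented for this class.

```python
def locate_parts(lines):
    """Return slices and indices for the parts of a bottle-style file."""
    comment_end = 1
    while comment_end < len(lines) and lines[comment_end].startswith("#"):
        comment_end += 1
    params_idx = comment_end
    units_idx = params_idx + 1
    data_end = lines.index("END_DATA")  # raises ValueError if missing
    return {
        "stamp_slice": slice(0, 1),
        "comments_slice": slice(1, comment_end),
        "params_idx": params_idx,
        "units_idx": units_idx,
        "data_slice": slice(units_idx + 1, data_end),
        "post_data_slice": slice(data_end + 1, len(lines)),
    }


lines = [
    "BOTTLE,20210301CCHSIOAMB",
    "# a comment",
    "EXPOCODE,STNNBR,CASTNO,SAMPNO,CTDPRS",
    ",,,,DBAR",
    "TESTEXPO,1,1,1,10.0",
    "TESTEXPO,1,1,2,20.0",
    "END_DATA",
]
parts = locate_parts(lines)
print(lines[parts["data_slice"]])
```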

hydro.exchange.extract_numeric_precisions(data)[source]#

Get the numeric precision of a printed decimal number

Parameters:

data (list[str] | numpy.typing.NDArray[numpy.str_]) –

Return type:

numpy.typing.NDArray[numpy.int_]
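
The idea of a printed precision is simply the number of digits after the decimal point in the text form of a number. A minimal pure-Python sketch, assuming plain decimal notation (no exponents) and returning a list rather than the ndarray the real function returns; `numeric_precision` is a hypothetical helper:

```python
def numeric_precision(text: str) -> int:
    """Count the digits printed after the decimal point."""
    _, sep, fraction = text.strip().partition(".")
    return len(fraction) if sep else 0


precisions = [numeric_precision(v) for v in ["34.500", "-999", "2.1", "10."]]
print(precisions)
```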

hydro.exchange._is_valid_exchange_numeric(data)[source]#
Parameters:

data (numpy.typing.NDArray[numpy.str_]) –

Return type:

numpy.bool_

hydro.exchange.ExchangeIO[source]#
hydro.exchange._combine_dt_ndarray(date_arr, time_arr=None, time_pad=False)[source]#
Parameters:
  • date_arr (numpy.typing.NDArray[numpy.str_]) –

  • time_arr (numpy.typing.NDArray[numpy.str_] | None) –

Return type:

numpy.ndarray

hydro.exchange.sort_ds(dataset)[source]#

Sorts the data values in the dataset

Ensures that profiles are in the following order:

  • Earlier before later (time will increase)

  • Southerly before northerly (latitude will increase)

  • Westerly before easterly (longitude will increase)

The two xy sorts are essentially tie breakers for when we are missing “time”

Inside profiles:

  • Shallower before Deeper (pressure will increase)

Parameters:

dataset (xarray.Dataset) –

Return type:

xarray.Dataset

hydro.exchange.check_sorted(dataset)[source]#

Check that the dataset is sorted by the rules in sort_ds()

Parameters:

dataset (xarray.Dataset) –

Return type:

bool
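
The ordering rules of sort_ds() and the check performed by check_sorted() can be illustrated with plain dicts and lists. This is a sketch under simplifying assumptions: profiles are dicts, only the pressure list is reordered inside a profile (the real function must reorder all data values together), and equal time strings stand in for the missing time case. The names `sort_profiles` and `is_sorted` are hypothetical.

```python
def sort_profiles(profiles):
    """Order profiles by time, then latitude, then longitude."""
    ordered = sorted(
        profiles, key=lambda p: (p["time"], p["latitude"], p["longitude"])
    )
    # Inside each profile, shallower (lower pressure) comes first.
    return [{**p, "pressure": sorted(p["pressure"])} for p in ordered]


def is_sorted(profiles):
    """True when sorting would not change anything."""
    return profiles == sort_profiles(profiles)


profiles = [
    {"time": "2021-03-02", "latitude": -10.0, "longitude": 150.0,
     "pressure": [100.0, 10.0]},
    {"time": "2021-03-01", "latitude": -5.0, "longitude": 150.0,
     "pressure": [10.0, 50.0]},
    {"time": "2021-03-01", "latitude": -20.0, "longitude": 150.0,
     "pressure": [5.0, 500.0]},
]
result = sort_profiles(profiles)
print([p["latitude"] for p in result])
```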

hydro.exchange.WHPNameAttr[source]#
hydro.exchange.combine_dt(dataset, is_coord=True, date_name=DATE, time_name=TIME, time_pad=False)[source]#

Combine the exchange style string variables of date and optionally time into a single variable containing real datetime objects

This will remove the time variable if present, and replace and then rename the date variable. Date is replaced/renamed to maintain variable order in the xr.Dataset

Parameters:
  • dataset (xarray.Dataset) –

  • is_coord (bool) –

  • date_name (cchdo.params.WHPName) –

  • time_name (cchdo.params.WHPName) –

Return type:

xarray.Dataset
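
For a single date/time pair, the combination can be sketched with the standard library. The example assumes exchange style strings of the form YYYYMMDD for dates and HHMM for times, and it guesses that `time_pad` means left-padding short time strings with zeros; that interpretation and the helper name `combine_date_time` are assumptions, not documented behavior.

```python
from datetime import datetime


def combine_date_time(date, time=None, time_pad=False):
    """Combine 'YYYYMMDD' and an optional 'HHMM' string into a datetime."""
    if time is None:
        return datetime.strptime(date, "%Y%m%d")
    if time_pad:
        # Assumed meaning of time_pad: "930" becomes "0930".
        time = time.zfill(4)
    return datetime.strptime(date + time, "%Y%m%d%H%M")


print(combine_date_time("20210301", "0930"))
print(combine_date_time("20210301"))
print(combine_date_time("20210301", "930", time_pad=True))
```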

hydro.exchange.set_axis_attrs(dataset)[source]#

Set the CF axis attribute on our axis variables (XYZT)

  • longitude = “X”

  • latitude = “Y”

  • pressure = “Z”, additionally, positive is down

  • time = “T”

Parameters:

dataset (xarray.Dataset) –

Return type:

xarray.Dataset
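
The mapping listed above can be written down as a small table of attributes. In this sketch, nested dicts stand in for dataset variables and their attrs; the `positive` key follows the CF convention for vertical coordinates, and the function name `set_axis_attrs_sketch` is hypothetical.

```python
AXIS_ATTRS = {
    "longitude": {"axis": "X"},
    "latitude": {"axis": "Y"},
    "pressure": {"axis": "Z", "positive": "down"},
    "time": {"axis": "T"},
}


def set_axis_attrs_sketch(variables):
    """Return a copy of `variables` with axis attributes merged in."""
    updated = {}
    for name, attrs in variables.items():
        # Variables that are not axes keep their attributes unchanged.
        updated[name] = {**attrs, **AXIS_ATTRS.get(name, {})}
    return updated


variables = {"pressure": {"units": "dbar"}, "time": {}, "salinity": {"units": "1"}}
result = set_axis_attrs_sketch(variables)
print(result["pressure"])
```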

hydro.exchange.set_coordinate_encoding_fill(dataset)[source]#

Sets the _FillValue encoding to None for 1D coordinate vars

Parameters:

dataset (xarray.Dataset) –

Return type:

xarray.Dataset

hydro.exchange._load_raw_exchange(filename_or_obj, *, file_seperator=None, keep_seperator=True)[source]#
Parameters:
  • filename_or_obj (ExchangeIO) –

  • file_seperator (str | None) –

Return type:

list[str]

hydro.exchange.all_same(ndarr)[source]#

Test if all the values of an ndarray are the same value

Parameters:

ndarr (numpy.ndarray) –

Return type:

numpy.bool_

class hydro.exchange.CheckOptions[source]#

Bases: TypedDict

Flags and config that control how strict the file checks are

flags: bool[source]#
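
Because CheckOptions is a TypedDict, an options value is an ordinary dict at runtime. The sketch below redefines a local stand-in with the one documented key so that it runs without the package; whether the real class marks keys as optional is not stated here, so treat the definition as an assumption.

```python
from typing import TypedDict


class CheckOptions(TypedDict):
    """Local stand-in mirroring the documented `flags` key."""

    flags: bool


# Build options that would relax the flag checks.
checks: CheckOptions = {"flags": False}

# The documented readers accept this through their keyword-only
# `checks` argument, e.g. read_exchange(filename_or_obj, checks=checks).
print(checks)
```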
hydro.exchange.read_csv(filename_or_obj, *, fill_values=('-999',), ftype=FileType.BOTTLE, checks=None, precision_source='file')[source]#
Parameters:
  • filename_or_obj (ExchangeIO) –

  • ftype (FileType | FileTypes) –

  • checks (CheckOptions | None) –

Return type:

xarray.Dataset

hydro.exchange.read_exchange(filename_or_obj, *, fill_values=('-999',), checks=None, precision_source='file', file_seperator=None, keep_seperator=True)[source]#

Loads the data from filename_or_obj and returns an xr.Dataset with the CCHDO CF/netCDF structure

Parameters:
  • filename_or_obj (ExchangeIO) –

  • checks (CheckOptions | None) –

Return type:

xarray.Dataset

hydro.exchange._from_exchange_data(exchange_data, *, ftype=FileType.BOTTLE, checks=None)[source]#
Return type:

xarray.Dataset