Dataset

class typhon.datasets.dataset.Dataset(name=None, **kwargs)[source]

Represents a dataset.

This is an abstract class. More specific subclasses are SingleFileDataset and MultiFileDataset.

To add a dataset, subclass one of the subclasses of Dataset, such as MultiFileDataset, and implement the abstract methods.

Dataset objects have a limited number of attributes. To limit the occurrence of bugs, dynamically setting attributes that do not already exist is restricted. Attributes can be set either by passing keyword arguments when creating the object or by setting the appropriate field in your typhon configuration file (such as .typhonrc). The configuration section corresponds to the object name, the key to the attribute, and the value to the value assigned to that attribute. See also typhon.config.
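The restriction and the two ways of setting attributes can be sketched as follows. This is a minimal illustration and not typhon's implementation: the class RestrictedDataset, the helper attributes_from_config, and the INI-style configuration text are all assumptions made for the example. The values are left as strings here; how typhon parses them is not shown.

```python
import configparser


class RestrictedDataset:
    """Hypothetical stand-in for a Dataset subclass (not typhon code)."""

    # Only attributes declared on the class may be set on instances.
    name = None
    start_date = None
    end_date = None

    def __init__(self, **kwargs):
        # Every keyword argument becomes an attribute.
        for key, value in kwargs.items():
            setattr(self, key, value)

    def __setattr__(self, key, value):
        # Reject names that do not already exist, so typos fail loudly.
        if not hasattr(type(self), key):
            raise AttributeError(
                f"Cannot set unknown attribute {key!r} on "
                f"{type(self).__name__}")
        super().__setattr__(key, value)


# Assumed configuration layout: section = object name, key = attribute.
CONFIG_TEXT = """
[demo]
start_date = 2010-01-01
"""


def attributes_from_config(text, section):
    """Return the key/value pairs of one configuration section."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    if not parser.has_section(section):
        return {}
    return dict(parser[section])


settings = attributes_from_config(CONFIG_TEXT, "demo")
ds = RestrictedDataset(name="demo", **settings)
```

With this sketch, assigning to a misspelled attribute such as ds.strat_date raises AttributeError instead of silently creating a new attribute.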

start_date

Starting date for the dataset. May be used to search through ALL granules. WARNING! If this is set to a time t_0 before the actual first measurement t_1, then the collocation algorithm (see CollocatedDataset) will conclude that there are 0 collocations in [t_0, t_1], and will not notice if data in [t_0, t_1] are added later!

Type:

datetime.datetime or numpy.datetime64

end_date

Similar to start_date, but for ending.

Type:

datetime.datetime or numpy.datetime64

name

Name for the dataset. Used to make sure there is only a single dataset object for any particular name. If a dataset is initialised with a pre-existing name, the existing object is returned.

Type:

str

aliases

Aliases for fields. The dictionary can be useful if you want to programmatically loop through the same field for many different datasets in which it is named differently. For example, an alias could be “ch4_profile”.

Type:

Mapping[str, str]
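As an illustration of why aliases help, the sketch below loops over two datasets that store the same quantity under different field names. Everything in it is hypothetical: the dataset names, the field names, the data values, and the helper get_by_alias. It also assumes the mapping runs from alias to the dataset's own field name, which the description above does not state.

```python
# Hypothetical datasets: the same quantity under two different field names.
datasets = {
    "instrument_a": {
        "aliases": {"ch4_profile": "CH4_VMR"},
        "data": {"CH4_VMR": [1.8, 1.7]},
    },
    "instrument_b": {
        "aliases": {"ch4_profile": "methane"},
        "data": {"methane": [1.9, 1.6]},
    },
}


def get_by_alias(dataset, alias):
    """Look up a field by alias, falling back to the name as given."""
    real_name = dataset["aliases"].get(alias, alias)
    return dataset["data"][real_name]


# One loop serves every dataset, whatever its own field name is.
profiles = {
    name: get_by_alias(dataset, "ch4_profile")
    for name, dataset in datasets.items()
}
```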

unique_fields

Set of fields that make any individual measurement unique. For example, the default value is {“time”, “lat”, “lon”}.

Type:

Container[str]
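One plausible use of such a set is removing duplicate measurements, for example where consecutive granules overlap. The sketch below shows that idea on plain dictionaries. It assumes deduplication is the purpose; the function drop_duplicates and the sample records are hypothetical and not part of typhon.

```python
def drop_duplicates(measurements, unique_fields=("time", "lat", "lon")):
    """Keep the first measurement for each combination of unique fields."""
    fields = sorted(unique_fields)  # fixed order, so keys are comparable
    seen = set()
    result = []
    for measurement in measurements:
        key = tuple(measurement[field] for field in fields)
        if key not in seen:
            seen.add(key)
            result.append(measurement)
    return result


# Hypothetical records; the second repeats the time and position of the first.
records = [
    {"time": 0, "lat": 10.0, "lon": 20.0, "value": 1.0},
    {"time": 0, "lat": 10.0, "lon": 20.0, "value": 1.0},
    {"time": 1, "lat": 10.0, "lon": 20.0, "value": 2.0},
]
unique_records = drop_duplicates(records)
```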

related

Dictionary whose keys may refer to other datasets with related information, such as DMPs or flags.

Type:

Mapping[str, Dataset]

__init__(**kwargs)[source]

Initialise a Dataset object.

All keyword arguments will be translated into attributes. Does not take positional arguments.

Note that if you create a dataset with a name that already exists, the existing object is returned, but __init__ is still called (Python does this, see https://docs.python.org/3.7/reference/datamodel.html#object.__new__).
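The Python behaviour mentioned in the note can be reproduced with a small stand-alone class. This is a sketch under assumptions, not typhon's code: the class NamedObject, its registry dictionary, and the init_count counter are hypothetical and exist only to make the repeated __init__ call visible.

```python
class NamedObject:
    """Hypothetical class that keeps one instance per name."""

    _registry = {}

    def __new__(cls, name=None, **kwargs):
        # Return the existing object if this name was used before.
        if name is not None and name in cls._registry:
            return cls._registry[name]
        self = super().__new__(cls)
        if name is not None:
            cls._registry[name] = self
        return self

    def __init__(self, name=None, **kwargs):
        # Python calls __init__ whenever __new__ returns an instance of
        # the class, including an instance that already existed.
        self.name = name
        self.init_count = getattr(self, "init_count", 0) + 1
        for key, value in kwargs.items():
            setattr(self, key, value)


first = NamedObject(name="demo", colour="red")
second = NamedObject(name="demo", colour="blue")
```

Here first and second are the same object, and the second call has overwritten colour, because __init__ ran again on the existing instance.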

Methods

__init__(**kwargs)

Initialise a Dataset object.

as_xarray_dataset()

combine(my_data, other_obj[, other_data, ...])

Combine with data from other dataset.

find_granules([start, end, include_last_before])

Loop through all granules for indicated period.

find_granules_sorted([start, end])

Yield all granules sorted by starting time then ending time.

find_most_recent_granule_before(instant, ...)

Find granule covering instant

get_additional_field(M, fld)

Get additional field.

read([f, fields, pseudo_fields])

Read granule in file and do some other fixes

read_period([start, end, onerror, fields, ...])

Read all granules between start and end, in bulk.

setlocal()

Set local attributes, from config or otherwise.

Attributes

aliases

concat_coor

default_orbit_filters

end_date

mandatory_fields

maxsize

my_pseudo_fields

name

read_returns

related

section

start_date

time_field

unique_fields