matcal.core.data
The data module contains classes and functions for converting data into the structure that MatCal requires for studies.
Functions
combine_data_sets_in_data_list | Given a list of Data objects, returns a dictionary of all values for each field, keyed by field name.
convert_data_to_dictionary | Converts a MatCal Data object into a dictionary of NumPy arrays.
convert_dictionary_to_data | Takes a dictionary and attempts to create a MatCal Data object.
scale_data_collection | Scales all data sets in a data collection that have the requested field.
Classes
AverageAbsDataConditioner | This data conditioner will condition data such that each field from the initializing data list is on the order of -1 to 1.
Data | Data is the base data structure for all MatCal data.
DataCollection | A collection of Data objects to be used for a study.
This class can be used to calculate basic statistics on the data in a data collection by state and field.
DataConditionerBase | This is the base class for MatCal data conditioners.
MaxAbsDataConditioner | This data conditioner will condition data such that each field from the initializing data list is in the range of -1 to 1.
RangeDataConditioner | This data conditioner will condition data such that each field from the initializing data list is in the range of 0 to 1.
ReturnPassedDataConditioner | This data conditioner will make no changes to the data sets included in the evaluation set.
Scaling | This class is used to apply a scaling multiplier and an offset to a specific field of a Data object.
ScalingCollection | A collection of Scaling objects.
- class matcal.core.data.Data(data, state=<matcal.core.state.SolitaryState object>, name=None)[source]
Data is the base data structure for all MatCal data. This data structure is an interface to data that are used for MatCal studies. It is derived from a NumPy ndarray but adds a name and a state so that the data can be uniquely identified.
Construction / initialization
Data may only be constructed from:
- A NumPy structured/record array (i.e., an np.ndarray with dtype.names is not None, or an np.record), or
- A dict/OrderedDict mapping field names to array-like values. If a dictionary is passed, it is converted using convert_dictionary_to_data().
Passing anything else (including a plain/unstructured np.ndarray) raises a built-in TypeError.
Accessing fields through field names returns the data for that field in either 1D or 2D arrays. If the data is ‘global’ such as time or load, the data will be reported as a 1D [n_times] array. If the data is field based the data is reported back as a 2D [n_times, n_points] array.
- Parameters:
data (numpy.ndarray | numpy.record | dict | OrderedDict) – data to be added to the MatCal data object. Must be either: (1) a NumPy structured/record array, or (2) a dict/OrderedDict of field_name -> array-like, which will be converted using convert_dictionary_to_data().
state (State) – the state associated with the data. If none is passed, it will be assigned the default state.
name (str) – the name for the data. By default it is set to a “data_set_#” name with a unique id number. If FileData() is used to import data, then its name is set to the filename from which the data was imported.
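As a rough illustration of the two accepted input forms, the NumPy-only sketch below builds a structured array and the equivalent dictionary. The helper `has_named_fields` is hypothetical; it only mirrors the `dtype.names is not None` condition described above, does not cover `np.record` scalars, and is not part of MatCal.

```python
import numpy as np


def has_named_fields(array):
    """Hypothetical helper: True for a NumPy array that has named fields."""
    return isinstance(array, np.ndarray) and array.dtype.names is not None


# Structured array with two named "global" fields of length 3.
structured = np.zeros(3, dtype=[("time", float), ("load", float)])
structured["time"] = [0.0, 1.0, 2.0]
structured["load"] = [0.0, 10.0, 20.0]

# A plain 2D array carries no field names, so it would not qualify.
plain = np.zeros((3, 2))

# The dictionary form maps the same field names to array-like values.
as_dict = {"time": [0.0, 1.0, 2.0], "load": [0.0, 10.0, 20.0]}

print(has_named_fields(structured))  # True
print(has_named_fields(plain))       # False
```

Either `structured` or `as_dict` is the kind of object the constructor describes; `plain` is the kind it rejects.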
- set_state(state)[source]
Sets the optional state value for the data.
- Parameters:
state (State) – The state for this particular data set.
- set_name(name)[source]
Sets the optional name value for the data. If the data is imported using FileData(), the name is set to the filename from which the data was imported. If no name is passed and the data was created from the constructor or another function, an arbitrary name will be given to the data.
- Parameters:
name (str) – The name for this particular data set.
- add_field(field_name, data)[source]
Adds a new 1D field to the data and returns the updated data. The original data object is not modified. The added field must have the same length as the existing fields.
- Parameters:
field_name (str) – The name of the new field to be added.
data (ArrayLike) – the data to be added.
- Returns:
the data with newly added field
- Return type:
Data
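The non-mutating behavior of add_field can be pictured with a plain-dictionary stand-in. This is only a sketch of the described contract (new object returned, original untouched, equal lengths required). The function below is hypothetical, operates on dictionaries rather than Data objects, and its choice of `ValueError` for a length mismatch is an assumption.

```python
def add_field_sketch(fields, field_name, values):
    """Return a new mapping with `field_name` added; `fields` is not modified."""
    values = list(values)
    existing_lengths = {len(v) for v in fields.values()}
    if existing_lengths and existing_lengths != {len(values)}:
        # Assumed error type; the library's actual exception is not documented here.
        raise ValueError("new field must match the length of the existing fields")
    updated = dict(fields)  # shallow copy so the caller's mapping is untouched
    updated[field_name] = values
    return updated


original = {"time": [0.0, 1.0, 2.0], "load": [0.0, 10.0, 20.0]}
extended = add_field_sketch(original, "displacement", [0.0, 0.1, 0.2])

print(sorted(original))  # ['load', 'time']
print(sorted(extended))  # ['displacement', 'load', 'time']
```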
- property length
- Returns:
The length of the data for each field.
- Return type:
int
- property state
- Returns:
The physical state of the data corresponding to the experimental conditions.
- Return type:
State
- property field_names
- Returns:
list of strings of all field names.
- Return type:
list
- property name
Returns the name for the data. If the data is imported using FileData(), the name is set to the filename from which the data was imported. If no name is passed and the data was created from the constructor or another function, an arbitrary name will be given to the data.
- Return type:
str
- remove_field(field)[source]
Returns a copy of the Data object with the desired field removed. The original data object is not modified.
- Return type:
Data
- rename_field(old_name, new_name)[source]
Returns the Data object with the desired field name changed. Note that the old name is overwritten and not saved.
- Parameters:
old_name (str) – the old field name that is to be updated
new_name (str) – the replacement field name for the field name that is being changed.
- class matcal.core.data.DataCollection(name, *data_sets)[source]
A collection of Data objects to be used for a study. No restrictions are enforced on the type or content of Data objects added to the collection. However, they are meant to hold data that is related by experiment and should generally have the same, or at least similar, fields.
Exceptions to this rule may be when two different types of data are taken from the same experiment using different data acquisition hardware. In this case it may make sense to store Data objects with different fields in the same data collection.
Warning
Not all MatCal objects or methods support data collections with Data objects that contain different field names. Appropriate errors should be raised if such data collections are passed to them.
- Parameters:
- Raises:
CollectionValueError – If name is an empty string.
CollectionTypeError – If name is not a string or the data objects to be added to the collection are not of the correct type.
- property field_names
- Returns:
a list of field names that exist in the data collection. These may not exist in all data objects or states and may only be in one data object in the collection.
- property state_names
- Returns:
the names of the State objects in the data collection.
- Return type:
list(str)
- state_field_names(state)[source]
Return all the field names in all Data objects for the given state. Note that not all Data objects need to have all field names. This is just a comprehensive list of field names that exist across all Data objects in the DataCollection for this state.
- Parameters:
state (str or State) – the state of interest to get all field names for
- Returns:
a list of all field names
- Return type:
list(str)
- state_common_field_names(state)[source]
Return all the field names common to all Data objects for the given state.
- Parameters:
state (str or State) – the state of interest to get the common field names for
- Returns:
a list of all field names that are common to all data sets for that state
- Return type:
list(str)
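The difference between state_field_names and state_common_field_names amounts to a set union versus a set intersection over the field names of each data set in the state. The sketch below shows that distinction on plain lists of names. Both functions are hypothetical stand-ins, and the sorted ordering of the result is an assumption made for reproducibility, not documented MatCal behavior.

```python
def all_field_names(field_name_lists):
    """Union: every field name that appears in at least one data set."""
    names = set()
    for field_names in field_name_lists:
        names.update(field_names)
    return sorted(names)


def common_field_names(field_name_lists):
    """Intersection: only field names present in every data set."""
    if not field_name_lists:
        return []
    common = set(field_name_lists[0])
    for field_names in field_name_lists[1:]:
        common &= set(field_names)
    return sorted(common)


# Field names of three data sets that share one state.
fields_per_data_set = [
    ["time", "load"],
    ["time", "load", "strain"],
    ["time", "temperature"],
]

print(all_field_names(fields_per_data_set))     # ['load', 'strain', 'temperature', 'time']
print(common_field_names(fields_per_data_set))  # ['time']
```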
- add(item)[source]
Add a Data object to a data collection.
- Parameters:
item (Data) – Data object to be added to the data collection.
- remove_field(field_name)[source]
Removes the field from all data sets stored in the data collection that have the passed field name. If the data collection does not have any data sets with the specified field name, a warning will be sent to MatCal output.
- Parameters:
field_name (str) – the name of the field to remove
- plot(independent_field: str, dependent_field: str, plot_function=None, figure=None, show: bool = True, labels: str = None, state: State = None, block: bool = True, **kwargs) -> None[source]
Plots the data with the independent field on the horizontal axis and dependent field on the vertical axis. It plots each state on a separate figure.
- Parameters:
independent_field (str) – field name to use as horizontal axis variable.
dependent_field (str) – field name to use as vertical axis variable.
plot_function (matplotlib plot function) – a valid matplotlib plot function such as plot, semilogx, etc
figure (matplotlib Figure) – a valid matplotlib Figure for the data collection to be plotted on.
show (bool) – option to show or not show plot
labels (str) – provide a label for each data set other than the data set name. This can take the form of “suppress”, “{user_provided_label}” or “{user_provided_label} (#)”. If “suppress” is passed, none of the data will be labeled. If “{user_provided_label}” is passed, the first data set will be labeled once as “{user_provided_label}”, where “user_provided_label” can be any user-provided string; the rest will not be labeled. If labels=“{user_provided_label} (#)” is passed, each data set will be labeled with “{user_provided_label}” and a number based on the order in which it is pulled from the data collection. For example, for a data collection with three data sets and labels=“experiment (#)”, the labels will be “experiment 1”, “experiment 2”, “experiment 3”.
state (State or str) – specify a specific state to plot using the state name or state object.
block (bool) – stops Python from executing code after the plot figure is created. Follow-on code will not execute until the figure is closed. Default is to block (i.e., block=True).
kwargs (dict(str, str)) – a set of valid keyword argument pairs for the Matplotlib plotting function
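The three forms of the labels argument can be summarized with a small stand-alone function. This is a sketch of the rules as described above, not the plotting code itself. `make_labels` is a hypothetical name, `None` stands for "no legend entry", and falling back to the data set names when labels is not given is an assumption.

```python
def make_labels(data_set_names, labels=None):
    """Return one legend label (or None for no label) per data set."""
    count = len(data_set_names)
    if labels is None:
        # Assumed default: each data set is labeled with its own name.
        return list(data_set_names)
    if labels == "suppress":
        return [None] * count
    numbered_suffix = " (#)"
    if labels.endswith(numbered_suffix):
        base = labels[: -len(numbered_suffix)]
        return [f"{base} {index}" for index in range(1, count + 1)]
    # Plain label: only the first data set is labeled.
    return [labels if index == 0 else None for index in range(count)]


names = ["file_a.csv", "file_b.csv", "file_c.csv"]
print(make_labels(names, "experiment (#)"))  # ['experiment 1', 'experiment 2', 'experiment 3']
print(make_labels(names, "experiment"))      # ['experiment', None, None]
print(make_labels(names, "suppress"))        # [None, None, None]
```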
- get_data_by_state_values(**kwargs)[source]
Get a DataCollection containing data that has the state variables with values passed into this method.
- Parameters:
kwargs (dict(str, str or float)) – keyword/value pairs of the desired state variables
- Returns:
all data in the data collection that have states with the state variable and values specified in kwargs.
- Return type:
DataCollection
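Selecting by state values can be thought of as keeping every data set whose state matches all of the requested keyword/value pairs. The sketch below applies that rule to plain tuples of a name and a dictionary of state variables. The function name and data layout are hypothetical, and exact equality (no floating-point tolerance) is an assumption.

```python
def select_by_state_values(data_sets, **state_values):
    """Return names of data sets whose state has every requested value."""
    selected = []
    for name, state_variables in data_sets:
        matches = all(
            key in state_variables and state_variables[key] == value
            for key, value in state_values.items()
        )
        if matches:
            selected.append(name)
    return selected


data_sets = [
    ("test_1", {"temperature": 298.0, "rate": 0.001}),
    ("test_2", {"temperature": 500.0, "rate": 0.001}),
    ("test_3", {"temperature": 298.0, "rate": 1.0}),
]

print(select_by_state_values(data_sets, temperature=298.0))            # ['test_1', 'test_3']
print(select_by_state_values(data_sets, temperature=298.0, rate=1.0))  # ['test_3']
```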
- get_states_by_state_values(**kwargs)[source]
Get a StateCollection containing the states with the state variable values passed into this method.
- Parameters:
kwargs (dict(str, str or float)) – keyword/value pairs of the desired state variables
- Returns:
a state collection that has all states with the state variables and values specified in kwargs.
- Return type:
StateCollection
- report_statistics(independent_field: str) -> dict[source]
Get a summary of the statistics information. The method will report the mean and standard deviation for all dependent fields across the independent field within each state. The data will be collocated to a common set of locations within the independent field. Statistics near the limits of the independent field range may be less accurate than those in the interior because of errors due to extrapolation that may occur in the collocation process.
- Parameters:
independent_field (str) – The string to designate which field should be interpreted as the independent field.
- Returns:
a dictionary that contains the statistical measurements of the data fields. The data is organized by [field_name][state_name][stat_name].
- Return type:
dict
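To make the collocation step concrete, the sketch below interpolates two curves onto a shared grid of the independent field and then takes the mean and standard deviation at each grid point. It is not MatCal's implementation: the function names are hypothetical, linear interpolation and the population standard deviation are assumptions, and the grid is restricted to the overlapping range so that no extrapolation occurs.

```python
from bisect import bisect_right
from statistics import fmean, pstdev


def interpolate(x_points, y_points, x):
    """Linear interpolation on ascending x_points."""
    index = bisect_right(x_points, x)
    index = min(max(index, 1), len(x_points) - 1)  # pick a valid segment
    x0, x1 = x_points[index - 1], x_points[index]
    y0, y1 = y_points[index - 1], y_points[index]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def collocated_statistics(curves, n_points=5):
    """curves: list of (x_values, y_values). Returns (grid, means, stds)."""
    lower = max(x_values[0] for x_values, _ in curves)
    upper = min(x_values[-1] for x_values, _ in curves)
    step = (upper - lower) / (n_points - 1)
    grid = [lower + k * step for k in range(n_points)]
    means, stds = [], []
    for x in grid:
        values = [interpolate(xs, ys, x) for xs, ys in curves]
        means.append(fmean(values))
        stds.append(pstdev(values))
    return grid, means, stds


# Two curves sampled at different locations: y = x and y = x + 2.
curve_a = ([0.0, 1.0, 2.0], [0.0, 1.0, 2.0])
curve_b = ([0.0, 0.5, 2.0], [2.0, 2.5, 4.0])
grid, means, stds = collocated_statistics([curve_a, curve_b])
print(grid)
print(means)
print(stds)
```

For these two curves the mean follows y = x + 1 and the standard deviation is 1 at every grid point, since the curves differ by a constant offset of 2.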
- dict()
- Returns:
the collection as a dictionary of items with name/value pairs.
- classmethod get_collection_type()
- Returns:
the data type the collection stores
- get_item_names()
- Returns:
a list of the names of all items added to the collection.
- get_number_of_items()
- Returns:
the number of items in the collection
- items()
- Returns:
a list of tuples of key, value pairs contained in the collection.
- keys()
- Returns:
a list of all available keys in the collection.
- property name
- Returns:
the name of the collection
- Return type:
str
- set_name(name)
Sets the name of the collection.
- Parameters:
name (str) – the new collection name
- values()
- Returns:
a list of all values in the collection.
- class matcal.core.data.Scaling(field, scalar=1, offset=0)[source]
This class is used to apply a scaling multiplier and an offset to a specific field of a Data object. The offset is applied first, followed by the scale factor.
- Parameters:
field (str) – The name of the field to be scaled.
scalar (float) – The magnitude of the scaling to be applied to the specified field.
offset (float) – The magnitude of the offset to be applied to the specified field.
- Raises:
TypeError – If the scaling object name and the field names are not strings.
TypeError – If the scalar value passed in is not a number.
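Because the offset is applied before the scale factor, the order of operations matters. The stand-alone sketch below illustrates it with a hypothetical `ScalingSketch` dataclass; it assumes the offset is added to the field (as the description of scale_data_collection suggests) and is not the MatCal Scaling class.

```python
from dataclasses import dataclass


@dataclass
class ScalingSketch:
    """Hypothetical stand-in holding a field name, a scalar and an offset."""

    field: str
    scalar: float = 1.0
    offset: float = 0.0

    def apply(self, values):
        # Offset first, then scale: (value + offset) * scalar.
        return [(value + self.offset) * self.scalar for value in values]


scaling = ScalingSketch("displacement", scalar=10.0, offset=-1.0)
print(scaling.apply([0.0, 1.0, 2.0]))  # [-10.0, 0.0, 10.0]
```

Scaling first and offsetting afterwards would instead give [-1.0, 9.0, 19.0], so the two orders are not interchangeable.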
- property field
- Returns:
The name of the field to be scaled by the scaling object.
- Return type:
str
- set_scalar(value)[source]
Sets the scalar value to a different value if needed.
- Parameters:
value (float) – the new scalar value for the scaling object.
- property scalar
- Returns:
the scaling value for the scaling object.
- property offset
- Returns:
the offset value for the scaling object.
- class matcal.core.data.ScalingCollection(name, *scalings)[source]
A collection of Scaling objects. This is used to combine multiple scaling objects so that more than one scaling function or value can be applied to a data set. This class is used when applying different scaling functions or values to different fields within a data set.
- Parameters:
name (str) – the name for the scaling collection used for identification for error catching.
scalings (list(Scaling)) – The scaling items to be added to the collection. They can be passed in as a comma-separated list or an unpacked list. Unpack a list using *list_name.
- Raises:
CollectionValueError – If name is an empty string.
CollectionTypeError – If name is not a string or the scalings to be added to the collection are not of the correct type.
- add(scaling)[source]
Adds a Scaling object to the collection.
- Parameters:
scaling (Scaling) – scaling object to be added to the collection
- dict()
- Returns:
the collection as a dictionary of items with name/value pairs.
- classmethod get_collection_type()
- Returns:
the data type the collection stores
- get_item_names()
- Returns:
a list of the names of all items added to the collection.
- get_number_of_items()
- Returns:
the number of items in the collection
- items()
- Returns:
a list of tuples of key, value pairs contained in the collection.
- keys()
- Returns:
a list of all available keys in the collection.
- property name
- Returns:
the name of the collection
- Return type:
str
- set_name(name)
Sets the name of the collection.
- Parameters:
name (str) – the new collection name
- values()
- Returns:
a list of all values in the collection.
- class matcal.core.data.DataConditionerBase[source]
This is the base class for MatCal data conditioners. The data conditioners attempt to modify all data sets for a state in a single evaluation set such that the experimental data is on the order of -1 to 1. The data is modified according to:
$$\mathbf{d}_c = \frac{\mathbf{d} - o}{s}$$
where $\mathbf{d}$ is a vector created from all data sets included in a single state, $o$ is a scalar data offset calculated from $\mathbf{d}$, and $s$ is a scalar scale factor calculated from $\mathbf{d}$. If $s \approx 0$ after it is calculated, the base conditioner class will change the scale factor such that $s = \operatorname{mean}(|\mathbf{d}|)$, the average of the absolute value of the relevant data. If $s$ is still near zero, then the vector is full of zero or near-zero values and the base conditioner sets the scale factor to $s = 1$.
The calculation of $o$ and $s$ is specific to the derived conditioner class. The abstract methods get_scale_for_data_field() and get_offset_for_data_field() define the calculations for $s$ and $o$. A custom user class can be defined to implement conditioning of the user’s choice by including only the implementation of these methods.
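The pattern described here, where a base class owns the near-zero fallback and a derived class supplies only the scale and offset calculations, can be sketched without MatCal as follows. Everything in this block is illustrative: `ConditionerSketch` and `StandardizingConditioner` are hypothetical classes, the conditioning form (value - offset) / scale and the near-zero tolerance are assumptions, and the example deliberately does not subclass the real DataConditionerBase.

```python
from abc import ABC, abstractmethod
from statistics import fmean, pstdev


class ConditionerSketch(ABC):
    """Stand-alone illustration of the base-class logic; not MatCal code."""

    _tolerance = 1e-14  # assumed threshold for "near zero"

    @abstractmethod
    def get_scale_for_data_field(self, field_data):
        """Return the scale factor for the field values."""

    @abstractmethod
    def get_offset_for_data_field(self, field_data):
        """Return the offset for the field values."""

    def initialize(self, field_data):
        self.offset = self.get_offset_for_data_field(field_data)
        scale = self.get_scale_for_data_field(field_data)
        if abs(scale) < self._tolerance:
            # First fallback: average absolute value of the data.
            scale = fmean([abs(value) for value in field_data])
        if abs(scale) < self._tolerance:
            # Second fallback: data is all (near) zero.
            scale = 1.0
        self.scale = scale

    def apply(self, field_data):
        return [(value - self.offset) / self.scale for value in field_data]


class StandardizingConditioner(ConditionerSketch):
    """Custom choice: center on the mean, scale by the standard deviation."""

    def get_scale_for_data_field(self, field_data):
        return pstdev(field_data)

    def get_offset_for_data_field(self, field_data):
        return fmean(field_data)


conditioner = StandardizingConditioner()
conditioner.initialize([1.0, 3.0])
print(conditioner.apply([1.0, 3.0]))  # [-1.0, 1.0]
```

With constant data such as [5.0, 5.0] the standard deviation is zero, so the first fallback sets the scale to 5.0; with all-zero data the second fallback sets it to 1.0.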
- apply_to_data(passed_data)[source]
Apply the conditioner to a data set. This can be any data set and does not need to be the one that was used to initialize the conditioner.
If a field name in the data set passed to this method was not in the data set used to initialize the conditioner, the passed data field is returned unchanged.
- Parameters:
passed_data (Data) – a data set to be conditioned using an initialized conditioner.
- initialize_data_conditioning_values(data_list)[source]
Initialize the conditioner for a given list of data sets from a single state of a data collection.
- Parameters:
data_list (list(Data)) – list of data sets to be used for conditioning. Generally obtained through the __getitem__ of a DataCollection for a given state.
- abstractmethod get_scale_for_data_field(field_data)[source]
Calculates the scale factor for the data conditioner given all values for a specific field name from the data collection for a single state. This scale factor will be used to condition all data with this state and field name when compared using an evaluation set.
- Parameters:
field_data (ArrayLike) – all data for a specific field from a single state of a data collection used to calculate an objective in an evaluation set.
- abstractmethod get_offset_for_data_field(field_data)[source]
Calculates the offset for the data conditioner given all values for a specific field name from the data collection for a single state. This offset will be used to condition all data with this state and field name when compared using an evaluation set.
- Parameters:
field_data (ArrayLike) – all data for a specific field from a single state of a data collection used to calculate an objective in an evaluation set.
- class matcal.core.data.ReturnPassedDataConditioner[source]
This data conditioner will make no changes to the data sets included in the evaluation set. Its scale and offset values are given by $s = 1$ and $o = 0$.
- get_scale_for_data_field(field_data)[source]
Calculates the scale factor for the data conditioner given all values for a specific field name from the data collection for a single state. This scale factor will be used to condition all data with this state and field name when compared using an evaluation set.
- Parameters:
field_data (ArrayLike) – all data for a specific field from a single state of a data collection used to calculate an objective in an evaluation set.
- get_offset_for_data_field(field_data)[source]
Calculates the offset for the data conditioner given all values for a specific field name from the data collection for a single state. This offset will be used to condition all data with this state and field name when compared using an evaluation set.
- Parameters:
field_data (ArrayLike) – all data for a specific field from a single state of a data collection used to calculate an objective in an evaluation set.
- apply_to_data(passed_data)
Apply the conditioner to a data set. This can be any data set and does not need to be the one that was used to initialize the conditioner.
If a field name in the data set passed to this method was not in the data set used to initialize the conditioner, the passed data field is returned unchanged.
- Parameters:
passed_data (Data) – a data set to be conditioned using an initialized conditioner.
- initialize_data_conditioning_values(data_list)
Initialize the conditioner for a given list of data sets from a single state of a data collection.
- Parameters:
data_list (list(Data)) – list of data sets to be used for conditioning. Generally obtained through the __getitem__ of a DataCollection for a given state.
- class matcal.core.data.RangeDataConditioner[source]
This data conditioner will condition data such that each field from the initializing data list is in the range of 0 to 1. To do so, the scale and offset values for the field data $\mathbf{d}$ are calculated as $s = \max(\mathbf{d}) - \min(\mathbf{d})$ and $o = \min(\mathbf{d})$.
- get_scale_for_data_field(field_data)[source]
Calculates the scale factor for the data conditioner given all values for a specific field name from the data collection for a single state. This scale factor will be used to condition all data with this state and field name when compared using an evaluation set.
- Parameters:
field_data (ArrayLike) – all data for a specific field from a single state of a data collection used to calculate an objective in an evaluation set.
- get_offset_for_data_field(field_data)[source]
Calculates the offset for the data conditioner given all values for a specific field name from the data collection for a single state. This offset will be used to condition all data with this state and field name when compared using an evaluation set.
- Parameters:
field_data (ArrayLike) – all data for a specific field from a single state of a data collection used to calculate an objective in an evaluation set.
- apply_to_data(passed_data)
Apply the conditioner to a data set. This can be any data set and does not need to be the one that was used to initialize the conditioner.
If a field name in the data set passed to this method was not in the data set used to initialize the conditioner, the passed data field is returned unchanged.
- Parameters:
passed_data (Data) – a data set to be conditioned using an initialized conditioner.
- initialize_data_conditioning_values(data_list)
Initialize the conditioner for a given list of data sets from a single state of a data collection.
- Parameters:
data_list (list(Data)) – list of data sets to be used for conditioning. Generally obtained through the __getitem__ of a DataCollection for a given state.
- class matcal.core.data.MaxAbsDataConditioner[source]
This data conditioner will condition data such that each field from the initializing data list is in the range of -1 to 1. To do so, the scale and offset values for the field data $\mathbf{d}$ are calculated as $s = \max(|\mathbf{d}|)$ and $o = 0$. Note that this only guarantees the data will be in the range of -1 to 1; it does not enforce that the data spans the entirety of -1 to 1.
- get_scale_for_data_field(field_data)[source]
Calculates the scale factor for the data conditioner given all values for a specific field name from the data collection for a single state. This scale factor will be used to condition all data with this state and field name when compared using an evaluation set.
- Parameters:
field_data (ArrayLike) – all data for a specific field from a single state of a data collection used to calculate an objective in an evaluation set.
- get_offset_for_data_field(field_data)[source]
Calculates the offset for the data conditioner given all values for a specific field name from the data collection for a single state. This offset will be used to condition all data with this state and field name when compared using an evaluation set.
- Parameters:
field_data (ArrayLike) – all data for a specific field from a single state of a data collection used to calculate an objective in an evaluation set.
- apply_to_data(passed_data)
Apply the conditioner to a data set. This can be any data set and does not need to be the one that was used to initialize the conditioner.
If a field name in the data set passed to this method was not in the data set used to initialize the conditioner, the passed data field is returned unchanged.
- Parameters:
passed_data (Data) – a data set to be conditioned using an initialized conditioner.
- initialize_data_conditioning_values(data_list)
Initialize the conditioner for a given list of data sets from a single state of a data collection.
- Parameters:
data_list (list(Data)) – list of data sets to be used for conditioning. Generally obtained through the __getitem__ of a DataCollection for a given state.
- class matcal.core.data.AverageAbsDataConditioner[source]
This data conditioner will condition data such that each field from the initializing data list is on the order of -1 to 1. To do so, the scale and offset values for the field data $\mathbf{d}$ are calculated as $s = \operatorname{mean}(|\mathbf{d}|)$ and $o = 0$. Note that this likely puts all the data in the field on the order of -1 to 1, but the data could be well outside of this range depending on the values in the data.
- get_scale_for_data_field(field_data)[source]
Calculates the scale factor for the data conditioner given all values for a specific field name from the data collection for a single state. This scale factor will be used to condition all data with this state and field name when compared using an evaluation set.
- Parameters:
field_data (ArrayLike) – all data for a specific field from a single state of a data collection used to calculate an objective in an evaluation set.
- get_offset_for_data_field(field_data)[source]
Calculates the offset for the data conditioner given all values for a specific field name from the data collection for a single state. This offset will be used to condition all data with this state and field name when compared using an evaluation set.
- Parameters:
field_data (ArrayLike) – all data for a specific field from a single state of a data collection used to calculate an objective in an evaluation set.
- apply_to_data(passed_data)
Apply the conditioner to a data set. This can be any data set and does not need to be the one that was used to initialize the conditioner.
If a field name in the data set passed to this method was not in the data set used to initialize the conditioner, the passed data field is returned unchanged.
- Parameters:
passed_data (Data) – a data set to be conditioned using an initialized conditioner.
- initialize_data_conditioning_values(data_list)
Initialize the conditioner for a given list of data sets from a single state of a data collection.
- Parameters:
data_list (list(Data)) – list of data sets to be used for conditioning. Generally obtained through the __getitem__ of a DataCollection for a given state.
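The range, max-abs and average-abs conditioners described above can be compared side by side on a small set of values. The sketch below is a plain-Python illustration: the function names are hypothetical, and the scale/offset formulas and the conditioning form (value - offset) / scale are assumptions inferred from the stated target ranges rather than taken from the MatCal source.

```python
def range_values(data):
    """(scale, offset) that map the data onto 0 to 1."""
    return max(data) - min(data), min(data)


def max_abs_values(data):
    """(scale, offset) that keep the data within -1 to 1."""
    return max(abs(value) for value in data), 0.0


def average_abs_values(data):
    """(scale, offset) that put the data on the order of -1 to 1."""
    return sum(abs(value) for value in data) / len(data), 0.0


def condition(data, scale, offset):
    return [(value - offset) / scale for value in data]


field_data = [-2.0, 0.0, 2.0, 4.0]

range_conditioned = condition(field_data, *range_values(field_data))
max_abs_conditioned = condition(field_data, *max_abs_values(field_data))
average_abs_conditioned = condition(field_data, *average_abs_values(field_data))

print(range_conditioned)        # starts at 0.0 and ends at 1.0
print(max_abs_conditioned)      # [-0.5, 0.0, 0.5, 1.0]
print(average_abs_conditioned)  # [-1.0, 0.0, 1.0, 2.0]
```

The last result shows the caveat noted for the average-abs conditioner: the value 2.0 lies outside -1 to 1 even though the data is on that order.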
- matcal.core.data.combine_data_sets_in_data_list(data_list)[source]
Given a list of Data objects, this function will return a dictionary where each key is a field name and each item holds all values for that field from all data sets.
- Parameters:
data_list (list(Data)) – list of data sets that will be combined.
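The combination amounts to concatenating the values of each field across all data sets. A minimal sketch on plain dictionaries is shown below. `combine_fields` is a hypothetical stand-in that works on dictionaries of lists rather than Data objects, and it assumes that fields missing from some data sets are simply combined from the data sets that have them.

```python
def combine_fields(data_list):
    """Concatenate the values of each field across all data sets."""
    combined = {}
    for data in data_list:
        for field_name, values in data.items():
            combined.setdefault(field_name, []).extend(values)
    return combined


data_list = [
    {"time": [0.0, 1.0], "load": [0.0, 5.0]},
    {"time": [0.0, 2.0], "load": [1.0, 9.0], "strain": [0.0, 0.1]},
]

print(combine_fields(data_list))
```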
- matcal.core.data.scale_data_collection(data_collection, field_name, scale, offset=0)[source]
Scales all data sets in a data collection that have the requested field. It will apply the scale factor and offset to each data set and return a new, scaled data collection; the original collection is unmodified. Note that if both are used, the offset is applied first and then the result is scaled.
- Parameters:
data_collection (DataCollection) – the data collection to be scaled
field_name (str) – the name of the field to be modified
scale (float) – a linear scale factor to scale the field
offset (float) – a constant offset to be added to the field
- Returns:
new scaled data collection
- Return type:
DataCollection
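Two aspects of this function are easy to overlook: data sets without the requested field pass through untouched, and the input collection is never modified. The sketch below reproduces both on a dictionary of dictionaries. `scale_collection_sketch` is a hypothetical stand-in, not the MatCal function, and the nested-dictionary layout is an assumption made for illustration.

```python
def scale_collection_sketch(collection, field_name, scale, offset=0):
    """Return a scaled copy; `collection` itself is left unmodified."""
    scaled = {}
    for data_name, fields in collection.items():
        new_fields = {key: list(values) for key, values in fields.items()}
        if field_name in new_fields:
            # Offset first, then scale.
            new_fields[field_name] = [
                (value + offset) * scale for value in new_fields[field_name]
            ]
        scaled[data_name] = new_fields
    return scaled


collection = {
    "tension_1": {"time": [0.0, 1.0], "load": [1.0, 2.5]},  # load in kN
    "hardness_1": {"depth": [0.0, 0.2]},                      # no load field
}

in_newtons = scale_collection_sketch(collection, "load", 1000.0)
print(in_newtons["tension_1"]["load"])  # [1000.0, 2500.0]
print(collection["tension_1"]["load"])  # [1.0, 2.5]
```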
- matcal.core.data.convert_data_to_dictionary(data)[source]
Converts a MatCal Data object into a dictionary of NumPy arrays.
- Parameters:
data (Data) – a MatCal data set
- Returns:
dictionary conversion of the data object
- Return type:
OrderedDict
- matcal.core.data.convert_dictionary_to_data(dict_data)[source]
Takes a dictionary and attempts to create a MatCal Data object. The keys for the dictionary are expected to be strings for the field names and the values are expected to be valid numeric or string data.
- Parameters:
dict_data (dict or OrderedDict) – a dictionary with field names as keys and the data values as the dictionary values.
- Returns:
a Data object with the default state SolitaryState.
- Return type:
Data
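The two conversion functions are inverses of each other in spirit: a dictionary of equal-length columns becomes an array with named fields, and that array can be unpacked back into a dictionary. The NumPy-only sketch below shows one way such a round trip can work for 1D fields. Both helper functions are hypothetical and operate on plain structured arrays, not on MatCal Data objects, and the `ValueError` for mismatched lengths is an assumption.

```python
from collections import OrderedDict

import numpy as np


def dict_to_structured(dict_data):
    """Build a structured array from a dict of equal-length 1D columns."""
    names = list(dict_data)
    columns = [np.asarray(dict_data[name]) for name in names]
    lengths = {len(column) for column in columns}
    if len(lengths) != 1:
        raise ValueError("all fields must have the same length")
    dtype = [(name, column.dtype) for name, column in zip(names, columns)]
    structured = np.zeros(lengths.pop(), dtype=dtype)
    for name, column in zip(names, columns):
        structured[name] = column
    return structured


def structured_to_dict(structured):
    """Unpack a structured array into an ordered dict of NumPy arrays."""
    return OrderedDict(
        (name, np.array(structured[name])) for name in structured.dtype.names
    )


source = {"time": [0.0, 1.0, 2.0], "load": [0.0, 10.0, 20.0]}
array = dict_to_structured(source)
round_trip = structured_to_dict(array)

print(array.dtype.names)   # ('time', 'load')
print(round_trip["load"])
```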