matcal.core.calibration_studies

Classes

ScipyLeastSquaresStudy(*parameters[, method])

This study class is the MatCal interface to the Scipy least_squares function.

ScipyMinimizeStudy(*parameters[, method])

This study class is the MatCal interface to the Scipy minimize function.

class matcal.core.calibration_studies.ScipyMinimizeStudy(*parameters, method=None, **kwargs)[source]

This study class is the MatCal interface to the Scipy minimize function. It can be used to perform local calibrations to objective functions that are generally smooth and convex. It has access to both gradient-based and gradient-free methods. We support all Scipy minimize methods except for the trust-krylov method. For methods that require Hessians and/or gradients, we use an internal finite difference algorithm so that we can take advantage of parallelism for expensive objective function evaluations. However, if desired, the jac and hess keyword arguments can be used to bypass the MatCal finite difference algorithm and use any valid Scipy option instead.
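The benefit of computing finite differences outside the optimizer can be sketched as follows: every perturbed point is known up front, so all evaluations can be submitted at once. This is a minimal illustration using only the Python standard library, not MatCal's implementation; the objective, the function name `parallel_forward_gradient`, and the way the relative step is applied are all assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def objective(x):
    # Hypothetical stand-in for an expensive model evaluation.
    return (x[0] - 1.0) ** 2 + 10.0 * (x[1] + 2.0) ** 2


def parallel_forward_gradient(func, x, rel_step=5e-5, max_workers=4):
    """Forward-difference gradient with all evaluations submitted together."""
    points = [list(x)]  # the base point is evaluated alongside the perturbed points
    steps = []
    for i, xi in enumerate(x):
        # Assumed convention: the step scales with the parameter magnitude.
        h = rel_step * (abs(xi) if xi != 0.0 else 1.0)
        shifted = list(x)
        shifted[i] = xi + h
        points.append(shifted)
        steps.append(h)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        values = list(pool.map(func, points))  # map preserves input order
    f0 = values[0]
    return [(fi - f0) / h for fi, h in zip(values[1:], steps)]


grad = parallel_forward_gradient(objective, [3.0, 1.0])
print(grad)  # close to the analytical gradient [4.0, 60.0]
```

With n parameters, the n + 1 evaluations are independent of each other, which is what makes them suitable for concurrent execution when each one is a long-running simulation.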

Note

MatCal’s finite difference steps do not currently adhere to bounds or constraints.
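To see why this matters, consider a parameter that sits exactly on its upper bound. The sketch below assumes a forward step with a relative step size; the helper `forward_step_point` and the bound values are hypothetical, and MatCal's actual stencil may differ.

```python
def forward_step_point(x, index, rel_step=5e-5):
    """Return a copy of x with one entry perturbed by a relative forward step."""
    h = rel_step * (abs(x[index]) if x[index] != 0.0 else 1.0)
    shifted = list(x)
    shifted[index] = x[index] + h
    return shifted


upper_bounds = [10.0, 200.0]   # hypothetical parameter upper bounds
x_at_bound = [10.0, 150.0]     # first parameter is on its upper bound

probe = forward_step_point(x_at_bound, 0)
outside = probe[0] > upper_bounds[0]
print(probe[0], outside)
```

Under these assumptions the perturbed point lies slightly above the bound, so a model used in such a study should tolerate evaluations marginally outside the specified parameter range.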

The default is the Scipy minimize l-bfgs-b method, a gradient-based method that uses only finite difference gradients (no Hessian) and enforces upper and lower bounds.

Parameters:
  • parameters (list(Parameter) or ParameterCollection) – The parameters of interest for the study.

  • method (str) – The name of a method that is valid for the Scipy optimize function used by the study.

  • kwargs (dict(str, float or str or dict(str, float or str))) – pass valid keyword arguments for the chosen method. The ‘bounds’ keyword argument is set by MatCal and cannot be used.

add_evaluation_set(model, objectives, data=None, states=None, data_conditioner_class=<class 'matcal.core.data.MaxAbsDataConditioner'>)

Adds an evaluation set to the study. An evaluation set is a set of datasets, objectives and states that are applicable to a model. For each evaluation set, the model will be evaluated for every state in the set. The results from each model state will be compared to each dataset for its state. This comparison is made with each objective in the passed objectives.

Parameters:
  • model (valid model type from the models module) – The model that will generate results for comparison to the data in the set.

  • objectives (Objective or ObjectiveCollection) – The objectives to quantitatively compare the model results to the data.

  • data (Data or DataCollection) – The data to be evaluated with this evaluation set. Data is not required when this method is called with a SimulationResultsSynchronizer.

  • states (State or StateCollection) – A subset of states in the data that are of interest for this study.

  • data_conditioner_class – the class that will be used as a data conditioner for this evaluation set. See data for valid data conditioners.

Raises:
  • StudyTypeError – if passed arguments are of the incorrect type.

  • StudyError – if all the passed states are not in the data.

add_parameter_preprocessor(parameter_preprocessor)

Add a parameter preprocessor to the study that will operate on the parameters before they are sent to the models. See UserDefinedParameterPreprocessor.

Parameters:

parameter_preprocessor (UserDefinedParameterPreprocessor) – the parameter preprocessor that will modify and update the given model parameters

property final_results_filename

Returns the filename for the final results file for the current study.

Returns:

final results filename as an absolute path

Return type:

str

launch()

Scipy calibration studies return calibration information in an OptimizeResult object. This includes the best fit parameter set, the final objective, and the Jacobian and/or Hessian of the objective if available. It also includes other useful information such as messages related to the success or failure of the chosen method, number of evaluations, and number of method iterations. For more information on what is returned in an OptimizeResult object for a given method see the Scipy documentation.

plot_progress()

Calling this method will cause MatCal to generate automatic plots after each batch of parameter evaluations. These plots are made using the standard plotter and will show quantities such as the evolution of the objective value.

restart()

Restarts are not supported with Scipy studies.

property results

Return access to the study’s results. Returns None if the study has not been run.

run_in_serial()

Tell MatCal to run evaluations in serial. This is only recommended if the study itself is serial, such as an MCMC Bayes study, and the model evaluations are fast, such as a Python model.

Running in serial avoids the overhead of reloading large data sets that are necessary in async studies.

set_cleanup_mode(new_pruner: DirectoryPrunerBase)

Changes the pruner to the object passed as an argument.

set_core_limit(core_limit, override_max_limit=False)

Sets the total number of cores that the study may use.

Parameters:
  • core_limit (int) – The max number of cores that the study can use at any time.

  • override_max_limit – Override the default max cores that can be specified for a given study. The current limit of 500 is recommended by the MatCal team but might not be best for all cases.

Raises:

StudyTypeError – if the passed value is not an int.

set_parameters(*parameters)

Parameters:

parameters (Parameter or ParameterCollection) – The parameters of interest for the study.

Raises:

StudyTypeError – if the parameters are of incorrect type.

set_results_storage_options(data: bool = True, qois: bool = True, residuals: bool = True, objectives: bool = True, weighted_conditioned: bool = False, results_save_frequency: int = 1)

Set which history information to save and return with the study results. You can also downsample which evaluations to save using results_save_frequency. This is particularly useful if you do not wish to store finite difference evaluations for gradient-based studies. The total objective is always stored.

Parameters:
  • data (bool) – Store the raw data for each simulation and the raw experimental data for each objective for each desired evaluation.

  • qois (bool) – Store the QoIs for each objective for each desired evaluation. This includes both experiment and simulation QoIs

  • residuals (bool) – Store the residuals for each objective for each desired evaluation.

  • objectives (bool) – Store the objective by state and evaluation set for each desired evaluation.

  • weighted_conditioned (bool) – Store the weighted and conditioned values for each desired evaluation. This will save the weighted and conditioned residuals, simulation QoIs and experiment QoIs.

  • results_save_frequency (int) – Set the results save interval. For studies where finite difference derivatives are used, an interval of n+1 will exclude finite difference results from the saved results history.
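The n+1 interval can be illustrated with a small bookkeeping sketch. The text does not define n; this example assumes it is the number of study parameters, that gradients use a two-point stencil, and that each iteration evaluates one base point followed by n perturbed points in that order. Methods that add extra evaluations (for example during a line search) would not follow this pattern exactly.

```python
n_parameters = 3   # assumed meaning of "n"
n_iterations = 4

# Label every evaluation in the assumed order: one base point,
# then one finite difference point per parameter.
history = []
for iteration in range(n_iterations):
    history.append(("base", iteration))
    for _ in range(n_parameters):
        history.append(("finite_difference", iteration))

save_interval = n_parameters + 1
saved = history[::save_interval]  # keep every (n+1)-th evaluation

print(saved)
```

Under these assumptions only the base evaluations remain in the saved history, one per iteration.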

set_step_size(step_size=5e-05)

When a MatCal-calculated finite difference gradient or Hessian is used for a method, this will set the finite difference relative step size.

Warning

If using Scipy finite difference methods, use the appropriate keyword argument for the method in the study __init__ to set the step size, not this method.

Parameters:

step_size (float) – the relative step size desired for finite difference gradients and Hessians.
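A relative step size is useful when parameters differ by many orders of magnitude, because a single value produces a sensible perturbation for each of them. The sketch below assumes the absolute step is the relative step multiplied by the parameter magnitude; the helper `absolute_steps`, the parameter names, and the zero-value fallback are hypothetical and may not match MatCal's internals.

```python
def absolute_steps(parameters, rel_step=5e-5):
    """Scale one relative step size by each parameter's current magnitude."""
    steps = {}
    for name, value in parameters.items():
        # Assumed fallback: use a unit scale when the value is exactly zero.
        scale = abs(value) if value != 0.0 else 1.0
        steps[name] = rel_step * scale
    return steps


# Hypothetical parameters with very different magnitudes.
current_values = {"elastic_modulus": 2.0e11, "hardening_exponent": 0.2}
steps = absolute_steps(current_values)
print(steps)
```

With the default of 5e-05, the large parameter is perturbed by about 1e7 and the small one by about 1e-5, so both see the same fractional change.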

set_use_threads(always_use_threads=False)

By default, MatCal assumes that the model being run is CPU intensive. As a result, it runs each model in a subprocess, which can result in some additional overhead. If running studies with cheaper Python models, it may be beneficial to use threading instead of a subprocess. Using this method will run the study with threading if only one model can be evaluated at a time. You can optionally run with threads even with concurrent model evaluations with the “always_use_threads” option; however, this can be less reliable. For large memory calibrations, we always recommend using subprocess.

Finally, any external executable is always run using subprocess, but threading can be used to manage that job and return its results.

Parameters:

always_use_threads (bool) – if true, MatCal will use threads over subprocess for concurrent modeling jobs. Defaults to False.
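The pattern of a thread managing a job that itself runs in a subprocess can be sketched with the standard library. This is an illustration of the general idea only, not MatCal's job management code; the function `run_external_job` is hypothetical, and the Python interpreter stands in for an external executable.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_external_job(value):
    """Launch a stand-in external executable and parse its printed result."""
    completed = subprocess.run(
        [sys.executable, "-c", f"print({value} ** 2)"],
        capture_output=True,
        text=True,
        check=True,  # raise if the external job exits with an error
    )
    return float(completed.stdout.strip())


# Each thread only waits on its subprocess, so the threads stay lightweight
# while the actual work happens in separate processes.
with ThreadPoolExecutor(max_workers=2) as pool:
    results = list(pool.map(run_external_job, [2.0, 3.0]))

print(results)
```

Because the threads spend their time waiting on the child processes, this arrangement avoids the cost of an extra managing process per job.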

set_working_directory(working_directory, remove_existing=False)

By default, MatCal runs in the current working directory. This method allows the user to specify a subdirectory of the current directory for the study to be run in. This method will create only the last directory in the path. If the desired subdirectory is nested under multiple folders from the current directory, MatCal will raise an error if the head of the path does not exist. See os.path.split() for a definition of the path “head”.

Parameters:
  • working_directory (str) – The desired working directory for the current study. MatCal will only create the last folder if the path is a nested path.

  • remove_existing – If True, then the directory will be removed if pre-existing at study launch.
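The head and tail terminology, and the consequence of creating only the last directory, can be checked with the standard library. In this sketch `os.mkdir` stands in for the single-directory creation described above; the directory names are hypothetical, and the exception MatCal itself raises may differ.

```python
import os
import tempfile

nested = os.path.join("studies", "run_1")
head, tail = os.path.split(nested)  # head is "studies", tail is "run_1"

with tempfile.TemporaryDirectory() as root:
    target = os.path.join(root, nested)
    try:
        os.mkdir(target)  # creates only the last directory in the path
        created_without_head = True
    except FileNotFoundError:
        created_without_head = False  # the head "studies" does not exist yet

    os.mkdir(os.path.join(root, head))  # create the head first
    os.mkdir(target)                    # now the last directory can be created
    created_with_head = os.path.isdir(target)

print(head, tail, created_without_head, created_with_head)
```

Creating the head of a nested path before the study is launched avoids the error.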

use_three_point_finite_difference(use_three_point_finite_difference=True)

This method sets the finite difference stencil for gradients to a three point finite difference scheme.

Note

This only affects the gradients. Only two point finite difference Hessians are available. However, if the method requires a finite difference Hessian and gradient, a three point gradient is automatically used.

Parameters:

use_three_point_finite_difference (bool) – an optional boolean that can be passed as False to turn off the three point finite difference stencil for the gradient. By default it is True, so calling this method without arguments turns on the three point finite difference for the gradient.
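The trade-off between the two stencils can be seen on a function with a known derivative. This sketch assumes the two point scheme is a forward difference and the three point scheme is a central difference, which is a common convention but is not confirmed here for MatCal; the function names are hypothetical.

```python
import math


def two_point(func, x, h):
    """Forward difference: one extra evaluation, error proportional to h."""
    return (func(x + h) - func(x)) / h


def three_point(func, x, h):
    """Central difference: two extra evaluations, error proportional to h**2."""
    return (func(x + h) - func(x - h)) / (2.0 * h)


x0 = 1.0
h = 5e-5 * abs(x0)        # relative step applied to the parameter value
exact = math.cos(x0)      # derivative of sin at x0

err_two = abs(two_point(math.sin, x0, h) - exact)
err_three = abs(three_point(math.sin, x0, h) - exact)
print(err_two, err_three)
```

The central difference is markedly more accurate at the same step size, at the cost of roughly twice as many model evaluations per gradient.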

class matcal.core.calibration_studies.ScipyLeastSquaresStudy(*parameters, method=None, **kwargs)[source]

This study class is the MatCal interface to the Scipy least_squares function. It can be used to perform local calibrations to objective functions that are smooth and convex. We support all Scipy least_squares methods. All methods require calculation of the Jacobian, and we use an internal finite difference algorithm so that we can take advantage of parallelism for expensive objective function evaluations.
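For least squares methods the derivative is a Jacobian of the residual vector rather than a gradient of a scalar objective: one row per residual and one column per parameter. The sketch below builds such a Jacobian with forward differences for a simple linear model. It is an illustration only; the functions `residuals` and `finite_difference_jacobian`, the data, and the step convention are assumptions, not MatCal's implementation.

```python
def residuals(params, times, observed):
    """Residuals of a hypothetical linear model against observed data."""
    slope, intercept = params
    return [slope * t + intercept - y for t, y in zip(times, observed)]


def finite_difference_jacobian(func, params, rel_step=5e-5):
    """Forward-difference Jacobian: rows are residuals, columns are parameters."""
    base = func(params)
    columns = []
    for j, pj in enumerate(params):
        h = rel_step * (abs(pj) if pj != 0.0 else 1.0)
        shifted = list(params)
        shifted[j] = pj + h
        perturbed = func(shifted)  # one independent evaluation per parameter
        columns.append([(rp - rb) / h for rp, rb in zip(perturbed, base)])
    # Transpose the per-parameter columns into per-residual rows.
    return [list(row) for row in zip(*columns)]


times = [0.0, 1.0, 2.0]
observed = [1.0, 3.0, 5.0]
jac = finite_difference_jacobian(
    lambda p: residuals(p, times, observed), [1.5, 0.5]
)
print(jac)  # close to [[0, 1], [1, 1], [2, 1]]
```

Each column requires one perturbed model evaluation, and these evaluations do not depend on one another, so they are natural candidates for parallel execution.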

Note

MatCal’s finite difference steps do not currently adhere to bounds or constraints.

We default to the Scipy least_squares trf method that enforces upper and lower bounds.

Parameters:
  • parameters (list(Parameter) or ParameterCollection) – The parameters of interest for the study.

  • method (str) – The name of a method that is valid for the Scipy optimize function used by the study.

  • kwargs (dict(str, float or str or dict(str, float or str))) – pass valid keyword arguments for the chosen method. The ‘bounds’ keyword argument is set by MatCal and cannot be used.

add_evaluation_set(model, objectives, data=None, states=None, data_conditioner_class=<class 'matcal.core.data.MaxAbsDataConditioner'>)

Adds an evaluation set to the study. An evaluation set is a set of datasets, objectives and states that are applicable to a model. For each evaluation set, the model will be evaluated for every state in the set. The results from each model state will be compared to each dataset for its state. This comparison is made with each objective in the passed objectives.

Parameters:
  • model (valid model type from the models module) – The model that will generate results for comparison to the data in the set.

  • objectives (Objective or ObjectiveCollection) – The objectives to quantitatively compare the model results to the data.

  • data (Data or DataCollection) – The data to be evaluated with this evaluation set. Data is not required when this method is called with a SimulationResultsSynchronizer.

  • states (State or StateCollection) – A subset of states in the data that are of interest for this study.

  • data_conditioner_class – the class that will be used as a data conditioner for this evaluation set. See data for valid data conditioners.

Raises:
  • StudyTypeError – if passed arguments are of the incorrect type.

  • StudyError – if all the passed states are not in the data.

add_parameter_preprocessor(parameter_preprocessor)

Add a parameter preprocessor to the study that will operate on the parameters before they are sent to the models. See UserDefinedParameterPreprocessor.

Parameters:

parameter_preprocessor (UserDefinedParameterPreprocessor) – the parameter preprocessor that will modify and update the given model parameters

property final_results_filename

Returns the filename for the final results file for the current study.

Returns:

final results filename as an absolute path

Return type:

str

launch()

Scipy calibration studies return calibration information in an OptimizeResult object. This includes the best fit parameter set, the final objective, and the Jacobian and/or Hessian of the objective if available. It also includes other useful information such as messages related to the success or failure of the chosen method, number of evaluations, and number of method iterations. For more information on what is returned in an OptimizeResult object for a given method see the Scipy documentation.

plot_progress()

Calling this method will cause MatCal to generate automatic plots after each batch of parameter evaluations. These plots are made using the standard plotter and will show quantities such as the evolution of the objective value.

restart()

Restarts are not supported with Scipy studies.

property results

Return access to the study’s results. Returns None if the study has not been run.

run_in_serial()

Tell MatCal to run evaluations in serial. This is only recommended if the study itself is serial, such as an MCMC Bayes study, and the model evaluations are fast, such as a Python model.

Running in serial avoids the overhead of reloading large data sets that are necessary in async studies.

set_cleanup_mode(new_pruner: DirectoryPrunerBase)

Changes the pruner to the object passed as an argument.

set_core_limit(core_limit, override_max_limit=False)

Sets the total number of cores that the study may use.

Parameters:
  • core_limit (int) – The max number of cores that the study can use at any time.

  • override_max_limit – Override the default max cores that can be specified for a given study. The current limit of 500 is recommended by the MatCal team but might not be best for all cases.

Raises:

StudyTypeError – if the passed value is not an int.

set_parameters(*parameters)

Parameters:

parameters (Parameter or ParameterCollection) – The parameters of interest for the study.

Raises:

StudyTypeError – if the parameters are of incorrect type.

set_results_storage_options(data: bool = True, qois: bool = True, residuals: bool = True, objectives: bool = True, weighted_conditioned: bool = False, results_save_frequency: int = 1)

Set which history information to save and return with the study results. You can also downsample which evaluations to save using results_save_frequency. This is particularly useful if you do not wish to store finite difference evaluations for gradient-based studies. The total objective is always stored.

Parameters:
  • data (bool) – Store the raw data for each simulation and the raw experimental data for each objective for each desired evaluation.

  • qois (bool) – Store the QoIs for each objective for each desired evaluation. This includes both experiment and simulation QoIs

  • residuals (bool) – Store the residuals for each objective for each desired evaluation.

  • objectives (bool) – Store the objective by state and evaluation set for each desired evaluation.

  • weighted_conditioned (bool) – Store the weighted and conditioned values for each desired evaluation. This will save the weighted and conditioned residuals, simulation QoIs and experiment QoIs.

  • results_save_frequency (int) – Set the results save interval. For studies where finite difference derivatives are used, an interval of n+1 will exclude finite difference results from the saved results history.

set_step_size(step_size=5e-05)

When a MatCal-calculated finite difference gradient or Hessian is used for a method, this will set the finite difference relative step size.

Warning

If using Scipy finite difference methods, use the appropriate keyword argument for the method in the study __init__ to set the step size, not this method.

Parameters:

step_size (float) – the relative step size desired for finite difference gradients and Hessians.

set_use_threads(always_use_threads=False)

By default, MatCal assumes that the model being run is CPU intensive. As a result, it runs each model in a subprocess, which can result in some additional overhead. If running studies with cheaper Python models, it may be beneficial to use threading instead of a subprocess. Using this method will run the study with threading if only one model can be evaluated at a time. You can optionally run with threads even with concurrent model evaluations with the “always_use_threads” option; however, this can be less reliable. For large memory calibrations, we always recommend using subprocess.

Finally, any external executable is always run using subprocess, but threading can be used to manage that job and return its results.

Parameters:

always_use_threads (bool) – if true, MatCal will use threads over subprocess for concurrent modeling jobs. Defaults to False.

set_working_directory(working_directory, remove_existing=False)

By default, MatCal runs in the current working directory. This method allows the user to specify a subdirectory of the current directory for the study to be run in. This method will create only the last directory in the path. If the desired subdirectory is nested under multiple folders from the current directory, MatCal will raise an error if the head of the path does not exist. See os.path.split() for a definition of the path “head”.

Parameters:
  • working_directory (str) – The desired working directory for the current study. MatCal will only create the last folder if the path is a nested path.

  • remove_existing – If True, then the directory will be removed if pre-existing at study launch.

use_three_point_finite_difference(use_three_point_finite_difference=True)

This method sets the finite difference stencil for gradients to a three point finite difference scheme.

Note

This only affects the gradients. Only two point finite difference Hessians are available. However, if the method requires a finite difference Hessian and gradient, a three point gradient is automatically used.

Parameters:

use_three_point_finite_difference (bool) – an optional boolean that can be passed as False to turn off the three point finite difference stencil for the gradient. By default it is True, so calling this method without arguments turns on the three point finite difference for the gradient.