Generate data calibration

Generate a calibration for a specific variable based on one or multiple calibration models. Requires properly nested and grouped data; see iso_prepare_for_calibration for details. Note that to calibrate different variables, separate calls to this function should be issued, each with a different calibration name.

iso_generate_calibration(
  dt,
  model,
  calibration = "",
  use_in_calib = default(is_std_peak),
  min_n_datapoints = 2,
  is_std_peak = default(is_std_peak),
  is_standard = default(is_std_peak),
  quiet = default(quiet)
)

Arguments

dt	nested data table with column `all_data` (see iso_prepare_for_calibration)
model	a single regression model (usually lm or glm) or a list of multiple alternative regression models for the calibration. If a named list is provided, the name(s) will be used instead of the formulas for the model identification column. If multiple models are provided, the rows of the data table will be duplicated so that the different models can be considered in parallel. Note that loess models are supported but discouraged (and will cause a warning) because local polynomial regression fitting does not calibrate based on a hypothesized regression model and can easily mis-calibrate sparse data. The exception is non-linear temporal drift corrections (use `calibration = "drift"` to flag as such), which may reasonably require local polynomial regression fitting of the type `loess(y ~ file_datetime)` to account for temporal machine variations.
calibration	an informative name for the calibration (could be e.g. `"d13C"` or `"conc"`). If provided, will be used as a prefix for the new columns generated by this function. This parameter is most useful if there are multiple variables in the data set that need to be calibrated (e.g. multiple delta values, concentration, etc.). If there is only a single variable to calibrate, the `calibration` parameter is completely optional and can just be left blank (the default).
use_in_calib	column or filter condition to determine which subset of data to actually use for the calibration (default is the `is_std_peak` field introduced by `iso_add_standards`).
min_n_datapoints	the minimum number of data points required for applying the model(s). Note that there is always an additional check to make sure the minimum number of degrees of freedom for each model is met. If it is not met, the model cannot be calculated no matter what `min_n_datapoints` is set to.
is_std_peak	deprecated in favor of `use_in_calib`
is_standard	deprecated in favor of `use_in_calib`
quiet	whether to display (`quiet = FALSE`) or silence (`quiet = TRUE`) information messages.
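
The interplay of `model`, `use_in_calib` and `min_n_datapoints` can be illustrated with a small base R sketch. This is not the package's implementation: the toy data, its column names, the helper `fit_one` and the estimability check (used here as a stand-in for the degrees-of-freedom check) are all assumptions made for illustration.

```r
# Toy peak table; all column names and values are hypothetical
peaks <- data.frame(
  is_std_peak = c(TRUE, TRUE, TRUE, TRUE, FALSE, FALSE),
  true_d13C   = c(-30, -20, -10, 0, NA, NA),
  area        = c(10, 20, 15, 30, 12, 25),
  d13C        = c(-29.1, -19.6, -9.8, 0.5, -15.2, -4.9)
)

# Named list of alternative models; the names identify each calibration
models <- list(
  linear    = d13C ~ true_d13C,
  with_area = d13C ~ true_d13C + area
)

min_n_datapoints <- 2

# Only rows flagged as standards are used (the role of `use_in_calib`)
standards <- peaks[peaks$is_std_peak, ]

# Hypothetical helper: fit one model if there are enough data points
fit_one <- function(formula, data, min_n) {
  if (nrow(data) < min_n) {
    return(list(ok = FALSE, points = nrow(data), fit = NULL))
  }
  fit <- lm(formula, data = data)
  # assumed stand-in for the degrees-of-freedom check:
  # every coefficient must be estimable from the available data
  list(ok = !anyNA(coef(fit)), points = nrow(data), fit = fit)
}

# One fit per model, considered side by side
calibs <- lapply(models, fit_one, data = standards, min_n = min_n_datapoints)

summary_table <- data.frame(
  calib        = names(calibs),
  calib_ok     = vapply(calibs, function(x) x$ok, logical(1)),
  calib_points = vapply(calibs, function(x) x$points, integer(1)),
  stringsAsFactors = FALSE
)
print(summary_table)
```

Both models are fit to the same four standards, so each one yields its own summary row, analogous to the duplicated data table rows described for the `model` parameter.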

Value

the data table with the following columns added (prefixed by the calibration parameter if provided):

calib: the name of the calibration if provided in the model parameter, otherwise the formula
calib_ok: a TRUE/FALSE column indicating whether there was enough data for calibration to be generated
calib_points: an integer column indicating the total number of data points used in the calibration. Note that this field counts replicate data points: multiple data points that fall at exactly the same x- or y-value still count as individual data points for this metric.
calib_params: a nested data frame that holds the actual regression model fit, coefficients, summary and data range. These parameters are most easily accessed using the functions iso_unnest_calibration_coefs, iso_unnest_calibration_summary, iso_unnest_calibration_parameters, iso_unnest_calibration_range, or directly via unnest.
resid within all_data: a new column within the nested all_data that holds the residuals for all standards used in the regression model
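
What `calib_points` and `resid` represent can be sketched in base R. Again, this is only an illustration under assumptions: the toy data and column names are hypothetical, and filling `resid` with `NA` for rows not used in the fit is an assumption of this sketch rather than documented behavior.

```r
# Toy data with two replicate standards at the same true value (-30)
peaks <- data.frame(
  is_std_peak = c(TRUE, TRUE, TRUE, TRUE, FALSE),
  true_d13C   = c(-30, -30, -10, 0, NA),
  d13C        = c(-29.2, -29.0, -9.7, 0.4, -12.3)
)

use <- peaks$is_std_peak
fit <- lm(d13C ~ true_d13C, data = peaks[use, ])

# Replicates count individually, so both points at -30 are included
calib_points <- sum(use)

# Residuals only exist for the rows that went into the regression
peaks$resid <- NA_real_
peaks$resid[use] <- residuals(fit)

print(calib_points)
print(peaks)
```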

See also