API Reference
RCR (Robust Chauvenet Outlier Rejection) Package API Details.
-
class rcr.FunctionalForm
Class used to initialize functional form/model-fitting RCR (see Rejecting Outliers While Model Fitting).
Constructor arguments:
Parameters:
- f (function) – Model function \(y(\vec{x}|\vec{\theta})\) to fit data to while performing outlier rejection, where \(\vec{x}\) is an \(n\)-dimensional list/array_like (or float, for 1D models) of independent variables and \(\vec{\theta}\) is an \(M\)-dimensional list/array_like of model parameters. Arguments for f must follow this prototype:
  param x: Independent variable(s) of model
  type x: float or 1D list/array_like
  param params: Parameters of model
  type params: list/array_like, 1D
  returns: y – Model evaluated at the corresponding values of x and params.
  rtype: float
- xdata (list/array_like, 1D or 2D) – \(n\)-dimensional independent variable data to fit model to. For 1D models (\(n=1\)), this will be a 1D list/array_like, while for \(n\)-D models, this will be a 2D list/array_like where each entry is a list/array_like of length \(n\).
- ydata (list/array_like, 1D) – Dependent variable (model function evaluation) data to fit model to.
- model_partials (list of functions) – A list of functions that return the partial derivatives of the model function f with respect to each model parameter in \(\vec{\theta}\), in the same order as the parameters (see Rejecting Outliers While Model Fitting for an example). Arguments for each of these functions must follow this prototype (the same as for the model function f):
  param x: Independent variable(s) of model
  type x: float or 1D list/array_like
  param params: Parameters of model
  type params: list/array_like, 1D
  returns: y – Derivative of model (with respect to the given model parameter), evaluated at the corresponding values of x and params.
  rtype: float
- guess (list/array_like, 1D) – Guess for the best-fit values of model parameters \(\vec{\theta}\) (for the fitting algorithm).
- weights (list/array_like, optional, 1D) – Optional weights to be applied to dataset (see Weighting Data).
- error_y (list/array_like, optional, 1D) – Optional error bars/\(y\)-uncertainties to be applied to dataset (see Data with Uncertainties and/or Weights).
- tol (float, optional) – Default: 1e-6. Convergence tolerance of modified Gauss-Newton fitting algorithm.
- has_priors (bool, optional) – Default: False. Set to True if you’re going to apply statistical priors to your model parameters (see Applying Prior Knowledge to Model Parameters (Advanced); you’ll also need to create an instance of rcr.Priors and set the priors attribute of this instance of FunctionalForm
equal to it).
- pivot_function (function, optional) – Default: None. Function that returns the pivot point of some linearized model (see pivots). Must follow this prototype:
  param xdata: \(n\)-dimensional independent variable data to fit model to; same as xdata above.
  type xdata: list/array_like, 1D or 2D
  param weights: Optional weights to be applied to dataset (see Weighting Data).
  type weights: list/array_like, optional, 1D
  param f: Model function; same as f above.
  type f: function
  param params: Parameters of model
  type params: list/array_like, 1D
  returns: pivot (float or 1D list/array_like) – Pivot point(s) of the model (float if you’re using a one-dimensional model/independent variable, list/array_like if \(n\)-dimensional).
  Note, however, that not all arguments need to be actually used for the pivot point computation. For example, a simple linear model \(y(x|b,m) = b + m(x-x_p)\) has a pivot point found by \(x_p=\sum_iw_ix_i/\sum_iw_i\), where \(w_i\) are the weights of the datapoints.
- pivot_guess (float or 1D list/array_like, optional) – Initial guess for the pivot point(s) of the model (float if you’re using a one-dimensional model/independent variable, list/array_like if \(n\)-dimensional; see pivots).
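As a minimal sketch of the f and model_partials prototypes described above, the following defines a one-dimensional linear model \(y(x|b,m)=b+mx\) and its two partial derivatives. The function names and the finite-difference check are illustrative assumptions and are not part of rcr; the check is only a way to confirm that the partials are listed in the same order as the parameters.

```python
def linear(x, params):
    """Model y(x | b, m) = b + m*x; params is ordered as [b, m]."""
    b, m = params
    return b + m * x


def d_linear_db(x, params):
    """Partial derivative of the model with respect to b."""
    return 1.0


def d_linear_dm(x, params):
    """Partial derivative of the model with respect to m."""
    return x


# Same order as params: [dy/db, dy/dm]
model_partials = [d_linear_db, d_linear_dm]


def finite_difference(f, x, params, index, step=1e-6):
    """Central-difference estimate of df/dparams[index] (sanity check only)."""
    up = list(params)
    down = list(params)
    up[index] += step
    down[index] -= step
    return (f(x, up) - f(x, down)) / (2.0 * step)


x0, guess = 2.0, [1.0, 3.0]
for i, partial in enumerate(model_partials):
    # Each analytic partial should agree with the numerical estimate.
    assert abs(partial(x0, guess) - finite_difference(linear, x0, guess, i)) < 1e-6
```

Functions like these would then be supplied as the f and model_partials constructor arguments, with guess holding one starting value per parameter.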
-
pivot_function
Function used to evaluate pivot point(s) (see the pivot_function optional argument of the rcr.FunctionalForm model constructor).
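A pivot function for the simple linear model can be sketched as below, using the weighted-mean formula \(x_p=\sum_iw_ix_i/\sum_iw_i\) given for the constructor argument. The name linear_pivot and the sample data are hypothetical, and the sketch assumes that a sequence of weights is always passed in; how rcr calls the function when no weights were supplied is not covered here.

```python
def linear_pivot(xdata, weights, f, params):
    """Weighted mean of xdata: x_p = sum(w_i * x_i) / sum(w_i).

    f and params are accepted only to match the prototype; the linear
    model's pivot does not depend on them.
    """
    total_weight = sum(weights)
    return sum(w * x for w, x in zip(weights, xdata)) / total_weight


xdata = [1.0, 2.0, 3.0, 4.0]
weights = [1.0, 1.0, 2.0, 4.0]
x_p = linear_pivot(xdata, weights, None, [0.0, 1.0])
print(x_p)  # 3.125
```

For models whose pivot depends on the current parameter values, the f and params arguments would enter the computation as well.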
-
priors
rcr.Priors object. Object describing parameter prior probability distribution(s) applied to the rcr.FunctionalForm model (see rcr.Priors).
To use priors on model parameters for some rcr.FunctionalForm model, this attribute of the model needs to be initialized as some instance of rcr.Priors (see Applying Prior Knowledge to Model Parameters (Advanced)).
-
result
rcr.FunctionalFormResults object. Access various results unique to Functional Form RCR with this (see rcr.FunctionalFormResults).
-
class rcr.FunctionalFormResults
Results from (and unique to) functional form/model-fitting RCR.
-
parameter_uncertainties
list of floats. Best-fit model parameter uncertainties, post-outlier rejection.
For example, if you’re fitting to some linear model \(y(x|b,m)=b+mx\), and you obtain a best fit of \(b=1.0\pm0.5\) and \(m=2\pm 1\), then parameter_uncertainties = [0.5, 1].
Note that in order for parameter uncertainties to be computed, weights and/or data error bars/uncertainties must have been provided when constructing the rcr.FunctionalForm model.
-
parameters
list of floats. Best-fit model parameters, post-outlier rejection.
For example, if you’re fitting to some linear model \(y(x|b,m)=b+mx\), and you obtain a best fit of \(b=1\) and \(m=2\), then parameters = [1, 2].
-
pivot
float. Recovered optimal “pivot” point for model that should minimize correlation between the slope and intercept parameters of the linearized model (1D independent variable case).
See pivots. For example, the pivot point for the model \(y(x|b,m) = b + m(x-x_p)\) is \(x_p\).
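To see why a well-chosen pivot decorrelates the two parameters in the linear case, assume (as a simplification) that the fit minimizes the weighted sum of squares \(\chi^2=\sum_i w_i\,[y_i-b-m(x_i-x_p)]^2\). The term that couples \(b\) and \(m\) is then

\[
\frac{1}{2}\frac{\partial^2\chi^2}{\partial b\,\partial m}=\sum_i w_i\,(x_i-x_p),
\]

which is the off-diagonal element of the \(2\times 2\) normal matrix. It vanishes, and with it the covariance between \(b\) and \(m\), exactly when \(x_p=\sum_i w_i x_i/\sum_i w_i\), the weighted mean of the data.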
-
pivot_ND
float. Recovered optimal \(n\)-dimensional “pivot” point for model that should minimize correlation between the slope and intercept parameters of the linearized model (\(n\)-D independent variable case).
See pivots. For example, the pivot point for the \(n\)-dimensional model \(y(\vec{x}|\vec{b},\vec{m}) = \vec{b} + \vec{m}^T(\vec{x}-\vec{x}_p)\) is \(\vec{x}_p\).
-
-
class rcr.Priors
Class that encapsulates probabilistic priors to be applied to model parameters when using model-fitting/functional form RCR (see Applying Prior Knowledge to Model Parameters (Advanced) for an example).
Constructor arguments:
Parameters:
- priorType (rcr.priorsTypes) – The type of priors that you’re applying to your model (see rcr.priorsTypes and Types of Model Parameter Priors in RCR).
- p (function, optional 2nd argument) – Custom priors function; takes in a vector of model parameters and returns a vector of the prior probability density for each value (see Applying Prior Knowledge to Model Parameters (Advanced) for an example).
- gaussianParams (2D list/array_like, optional 2nd argument) – A list that contains lists of mu and sigma for the Gaussian prior of each param. If no prior, then just use NaNs (see Applying Prior Knowledge to Model Parameters (Advanced) for an example).
- paramBounds (2D list/array_like, optional 2nd argument (or 3rd, for the case of rcr.MIXED_PRIORS)) – A list that contains lists of the lower and upper hard bounds of each param. If not bounded, use NaNs, and if there’s only one bound, use NaN for the other bound (see Applying Prior Knowledge to Model Parameters (Advanced) for an example).
-
gaussianParams
2D list/array_like of floats. A list that contains lists of mu and sigma for the Gaussian prior of each param. If no prior, then just use NaNs (see Applying Prior Knowledge to Model Parameters (Advanced) for an example).
-
p
function. Custom priors function; takes in a vector of model parameters and returns a vector of the prior probability density for each value (see Applying Prior Knowledge to Model Parameters (Advanced) for an example).
-
paramBounds
2D list/array_like of floats. A list that contains lists of the lower and upper hard bounds of each param. If not bounded, use NaNs, and if there’s only one bound, use NaN for the other bound (see Applying Prior Knowledge to Model Parameters (Advanced) for an example).
-
priorType
rcr.priorsTypes object. The type of priors that you’re applying to your model (see rcr.priorsTypes and Types of Model Parameter Priors in RCR).
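The shapes of gaussianParams, paramBounds, and a custom priors function p can be illustrated as follows for a hypothetical two-parameter model with params = [b, m]. The specific values, the helper gaussian_pdf, and the name custom_priors are assumptions made for illustration. In particular, the flat prior on m is left unnormalized, and whether rcr expects normalized densities is not stated in this reference.

```python
import math

NaN = math.nan

# One [mu, sigma] pair per parameter; NaNs mean "no Gaussian prior".
gaussianParams = [
    [1.0, 0.5],  # b: Gaussian prior with mu = 1.0, sigma = 0.5
    [NaN, NaN],  # m: no Gaussian prior
]

# One [lower, upper] pair per parameter; NaN marks a missing bound.
paramBounds = [
    [NaN, NaN],  # b: unbounded
    [0.0, NaN],  # m: lower bound only (m >= 0)
]


def gaussian_pdf(value, mu, sigma):
    """Normal probability density evaluated at value."""
    z = (value - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def custom_priors(params):
    """Custom priors function p: one prior density per model parameter.

    Mirrors the arrays above: Gaussian on b, and a flat (unnormalized)
    prior on m that drops to zero for m < 0.
    """
    b, m = params
    prior_b = gaussian_pdf(b, 1.0, 0.5)
    prior_m = 1.0 if m >= 0.0 else 0.0
    return [prior_b, prior_m]
```

The two arrays and the function express the same prior knowledge in the different forms the constructor accepts; only the form matching the chosen priorType would be passed.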
-
class rcr.RCR
Master class used to initialize and run RCR outlier rejection procedures.
-
performBulkRejection(*args, **kwargs)
Overloaded function.
performBulkRejection(self: rcr.RCR, data: List[float]) -> None
Perform outlier rejection WITH the speed-up of bulk pre-rejection (see What is Bulk Rejection?).
Parameters:
- data : list/array_like, 1D
Dataset to perform outlier rejection (RCR) on. Access results via the result attribute (rcr.RCRResults) of your instance of rcr.RCR.
performBulkRejection(self: rcr.RCR, weights: List[float], data: List[float]) -> None
Perform outlier rejection WITH the speed-up of bulk pre-rejection (see What is Bulk Rejection?).
Parameters:
- weights : list/array_like, 1D
Weights for dataset to perform outlier rejection (RCR) on.
- data : list/array_like, 1D
Dataset to perform outlier rejection (RCR) on. Access results via the result attribute (rcr.RCRResults) of your instance of rcr.RCR.
-
performRejection(*args, **kwargs)
Overloaded function.
performRejection(self: rcr.RCR, data: List[float]) -> None
Perform outlier rejection WITHOUT the speed-up of bulk pre-rejection (slower; see What is Bulk Rejection?).
Parameters:
- data : list/array_like, 1D
Dataset to perform outlier rejection (RCR) on. Access results via the result attribute (rcr.RCRResults) of your instance of rcr.RCR.
performRejection(self: rcr.RCR, weights: List[float], data: List[float]) -> None
Perform outlier rejection WITHOUT the speed-up of bulk pre-rejection (slower; see What is Bulk Rejection?).
Parameters:
- weights : list/array_like, 1D
Weights for dataset to perform outlier rejection (RCR) on.
- data : list/array_like, 1D
Dataset to perform outlier rejection (RCR) on. Access results via the result attribute (rcr.RCRResults) of your instance of rcr.RCR.
-
result
rcr.RCRResults object. Access various results of RCR with this (see rcr.RCRResults).
-
setParametricModel(self: rcr.RCR, model: FunctionalForm) → None
Initialize parametric/functional form model to be used with RCR (see Rejecting Outliers While Model Fitting for a tutorial).
Parameters: model (rcr.FunctionalForm) – \(n\)-dimensional model to fit data to while performing outlier rejection.
-
setRejectionTech(self: rcr.RCR, rejection_technique: rcr.RejectionTechniques) → None
Modify/set outlier rejection technique to be used with RCR.
See Table of Rejection Techniques for an explanation of each rejection technique, and when to use it.
Parameters: rejection_technique (rcr.RejectionTechniques) – The rejection technique to be used with your instance of rcr.RCR.
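The methods above can be combined into a small helper, sketched here under a few assumptions. The constructor of rcr.RCR is not described in this section, so the helper takes an existing instance r rather than creating one. The helper name run_rejection and its keyword arguments are hypothetical. It relies only on the documented methods setRejectionTech, performBulkRejection, and performRejection, and on the result attribute.

```python
def run_rejection(r, data, weights=None, technique=None, bulk=True):
    """Run RCR on a 1D dataset using an existing rcr.RCR instance.

    r         : an rcr.RCR instance (constructed elsewhere)
    technique : optional rcr.RejectionTechniques member to switch to
    bulk      : use performBulkRejection if True, performRejection otherwise
    """
    if technique is not None:
        r.setRejectionTech(technique)

    perform = r.performBulkRejection if bulk else r.performRejection

    # In the weighted overloads, weights come first and data second.
    if weights is None:
        perform(data)
    else:
        perform(weights, data)

    # Both methods return None; the results live on the result attribute.
    return r.result
```

A call such as run_rejection(r, y, weights=w, technique=rcr.RejectionTechniques.LS_MODE_68) would then return the rcr.RCRResults object, from which cleanY, flags, mu, and the other attributes can be read.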
-
-
class rcr.RCRResults
Various results from performing outlier rejection with RCR.
-
cleanW
list of floats. The user-provided datapoint weights that correspond to NON-outliers in the original dataset.
For example, if a dataset of y = [0, 1, -2, 1, 2, 37, 0.5, -100] with weights w = [1, 1.1, 0.9, 1.2, 0.8, 0.2, 0.95, 2] was provided, and only the 37 and -100 were found to be outliers, then cleanW = [1, 1.1, 0.9, 1.2, 0.8, 0.95].
-
cleanY
list of floats. After performing RCR on some original dataset, these are the datapoints that were NOT found to be outliers.
For example, if a dataset of y = [0, 1, -2, 1, 2, 37, 0.5, -100] was provided and only the 37 and -100 were found to be outliers, then cleanY = [0, 1, -2, 1, 2, 0.5].
-
flags
list of bools. Ordered flags describing outlier status of each inputted datapoint (True if datapoint is NOT an outlier).
For example, if a dataset of y = [0, 1, -2, 1, 2, 37, 0.5, -100] was provided and only the 37 and -100 were found to be outliers, then flags = [True, True, True, True, True, False, True, False].
-
indices
list of ints. A list of indices of datapoints from original inputted dataset that are NOT outliers.
For example, if a dataset of y = [0, 1, -2, 1, 2, 37, 0.5, -100] was provided and only the 37 and -100 were found to be outliers, then indices = [0, 1, 2, 3, 4, 6].
-
mu
float. Mean/median/mode (central value) of uncontaminated data distribution.
The central value of the uncontaminated part of the provided dataset, recovered from performing RCR.
-
originalW
list of floats. The user-provided datapoint weights, pre-RCR.
For example, if a dataset with weights w = [1, 1.1, 0.9, 1.2, 0.8, 0.2, 0.95, 2] was provided, then originalW = [1, 1.1, 0.9, 1.2, 0.8, 0.2, 0.95, 2].
-
originalY
list of floats. The user-provided dataset, pre-RCR.
For example, if a dataset of y = [0, 1, -2, 1, 2, 37, 0.5, -100] was provided, then originalY = [0, 1, -2, 1, 2, 37, 0.5, -100].
-
rejectedW
list of floats. The user-provided datapoint weights that correspond to outliers in the original dataset.
For example, if a dataset of y = [0, 1, -2, 1, 2, 37, 0.5, -100] with weights w = [1, 1.1, 0.9, 1.2, 0.8, 0.2, 0.95, 2] was provided, and only the 37 and -100 were found to be outliers, then rejectedW = [0.2, 2].
-
rejectedY
list of floats. After performing RCR on some original dataset, these are the datapoints that WERE found to be outliers.
For example, if a dataset of y = [0, 1, -2, 1, 2, 37, 0.5, -100] was provided and only the 37 and -100 were found to be outliers, then rejectedY = [37, -100].
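The attributes documented so far are all views of the same rejection outcome. The sketch below reuses the example dataset, weights, and flags from the entries above (hard-coded here, not computed by RCR) and shows how indices, cleanY, cleanW, rejectedY, and rejectedW relate to flags.

```python
y = [0, 1, -2, 1, 2, 37, 0.5, -100]
w = [1, 1.1, 0.9, 1.2, 0.8, 0.2, 0.95, 2]
flags = [True, True, True, True, True, False, True, False]  # True = NOT an outlier

# Positions of the datapoints that were kept.
indices = [i for i, keep in enumerate(flags) if keep]

# Kept values and their weights.
cleanY = [y[i] for i in indices]
cleanW = [w[i] for i in indices]

# Rejected values and their weights.
rejectedY = [value for value, keep in zip(y, flags) if not keep]
rejectedW = [weight for weight, keep in zip(w, flags) if not keep]

print(indices)    # [0, 1, 2, 3, 4, 6]
print(cleanY)     # [0, 1, -2, 1, 2, 0.5]
print(cleanW)     # [1, 1.1, 0.9, 1.2, 0.8, 0.95]
print(rejectedY)  # [37, -100]
print(rejectedW)  # [0.2, 2]
```

Because flags and indices refer to positions in the original dataset, they can also be used to select the matching entries of any other array of the same length, such as the independent variable data of a model fit.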
-
sigma
float. Recovered robust 68.3-percentile deviation of uncontaminated data distribution.
A more robust (less sensitive to outliers) version of the standard deviation/width \(\sigma\) of the uncontaminated part of the provided dataset (see Section 2.1 of The Paper), recovered from performing RCR. Applies to the case of a symmetric uncontaminated data distribution.
-
sigmaAbove
float. Recovered robust 68.3-percentile deviation above mu (mean/median/mode) of uncontaminated data distribution.
A more robust (less sensitive to outliers) version of the standard deviation/width \(\sigma_+\) of the positive side of a mildly asymmetric uncontaminated data distribution (see Section 2.1 of The Paper), recovered from performing RCR. (For the symmetric case, \(\sigma_+=\sigma_-\equiv\sigma\)).
-
sigmaBelow
float. Recovered robust 68.3-percentile deviation below mu (mean/median/mode) of uncontaminated data distribution.
A more robust (less sensitive to outliers) version of the standard deviation/width \(\sigma_-\) of the negative side of a mildly asymmetric uncontaminated data distribution (see Section 2.1 of The Paper), recovered from performing RCR. (For the symmetric case, \(\sigma_-=\sigma_+\equiv\sigma\)).
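To give a feel for what a 68.3-percentile deviation measures, the following simplified sketch sorts the deviations from a central value and reads off the nearest-rank 68.3rd percentile, once for all points and once for each side of mu. This is only an illustration under that simplifying assumption, not the algorithm RCR uses (which is described in The Paper); the function names and data are hypothetical.

```python
import math


def percentile_deviation(deviations, fraction=0.683):
    """Nearest-rank percentile of a list of non-negative deviations."""
    ordered = sorted(deviations)
    rank = max(1, math.ceil(fraction * len(ordered)))
    return ordered[rank - 1]


def robust_sigmas(data, mu):
    """Return (sigma, sigma_below, sigma_above) measured from the centre mu."""
    sigma = percentile_deviation([abs(v - mu) for v in data])
    below = [mu - v for v in data if v < mu]
    above = [v - mu for v in data if v > mu]
    sigma_below = percentile_deviation(below) if below else 0.0
    sigma_above = percentile_deviation(above) if above else 0.0
    return sigma, sigma_below, sigma_above


# A mildly asymmetric sample: the positive side is wider than the negative side.
data = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 4.0, 6.0, 8.0, 10.0]
sigma, sigma_below, sigma_above = robust_sigmas(data, mu=0.0)
print(sigma, sigma_below, sigma_above)  # 4.0 3.0 8.0
```

For a symmetric sample the two one-sided values would be close to each other and to the two-sided value, which corresponds to the statement \(\sigma_-=\sigma_+\equiv\sigma\) above.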
-
stDev
float. Standard deviation of uncontaminated data distribution.
The standard deviation/width \(\sigma\) of the uncontaminated part of the provided dataset, recovered from performing RCR. Applies to the case of a symmetric uncontaminated data distribution.
-
stDevAbove
float. Standard deviation above mu (mean/median/mode) of uncontaminated (asymmetric) data distribution.
The asymmetric standard deviation/width \(\sigma_+\) of the positive side of a mildly asymmetric uncontaminated data distribution, recovered from RCR (for the symmetric case, \(\sigma_+=\sigma_-\equiv\sigma\)).
-
stDevBelow
float. Standard deviation below mu (mean/median/mode) of uncontaminated (asymmetric) data distribution.
The asymmetric standard deviation/width \(\sigma_-\) of the negative side of a mildly asymmetric uncontaminated data distribution, recovered from RCR (for the symmetric case, \(\sigma_-=\sigma_+\equiv\sigma\)).
-
stDevTotal
float. Combined standard deviation both above and below mu (mean/median/mode) of uncontaminated (asymmetric) data distribution.
A combination of the asymmetric standard deviation/width \(\sigma_+\) of the positive side of a mildly asymmetric uncontaminated data distribution and the width \(\sigma_-\) of the negative side of the distribution, recovered from RCR. Can be used to approximate a mildly asymmetric data distribution as symmetric.
-
-
class rcr.RejectionTechniques
RCR Standard Rejection Techniques.
Members:
SS_MEDIAN_DL : Rejection technique for a symmetric uncontaminated distribution with two-sided contaminants.
LS_MODE_68 : Rejection technique for a symmetric uncontaminated distribution with one-sided contaminants.
LS_MODE_DL : Rejection technique for a symmetric uncontaminated distribution with a mixture of one-sided and two-sided contaminants.
ES_MODE_DL : Rejection technique for a mildly asymmetric uncontaminated distribution and/or a very low number of data points.
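The member descriptions above can be turned into a small lookup, sketched below. The function suggest_technique and its argument names are hypothetical, and it encodes nothing beyond the four one-line descriptions in this list; the Table of Rejection Techniques remains the fuller guide on when to use each one.

```python
def suggest_technique(symmetric, contaminants, few_points=False):
    """Return the name of the RejectionTechniques member matching the list above.

    symmetric    : True if the uncontaminated distribution is symmetric
    contaminants : "two-sided", "one-sided", or "mixed"
    few_points   : True for a very low number of data points
    """
    # Mildly asymmetric distributions and very small samples share one technique.
    if not symmetric or few_points:
        return "ES_MODE_DL"
    by_contaminants = {
        "two-sided": "SS_MEDIAN_DL",
        "one-sided": "LS_MODE_68",
        "mixed": "LS_MODE_DL",
    }
    return by_contaminants[contaminants]


print(suggest_technique(True, "one-sided"))   # LS_MODE_68
print(suggest_technique(False, "two-sided"))  # ES_MODE_DL
```

The returned name can be looked up on rcr.RejectionTechniques and passed to setRejectionTech.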
-
name
-
-
class rcr.priorsTypes
Types of prior probability density functions that can be applied to model parameters.
Members:
CUSTOM_PRIORS : Custom, function-defined prior probability density function(s).
GAUSSIAN_PRIORS : Gaussian (normal) prior probability density function(s).
CONSTRAINED_PRIORS : Bounded/hard-constrained prior probability density function(s).
MIXED_PRIORS : A mixture of Gaussian (normal), hard-constrained, and uninformative (uniform/flat) prior probability density functions.
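Each member determines which of the optional rcr.Priors constructor arguments follow priorType, according to the "optional 2nd argument" and "3rd, for the case of rcr.MIXED_PRIORS" notes in the rcr.Priors entry. The sketch below records that ordering; the helper priors_arguments is hypothetical and works with member names as plain strings so that it does not depend on rcr being installed.

```python
def priors_arguments(prior_type, p=None, gaussianParams=None, paramBounds=None):
    """Return the positional arguments that follow priorType for each prior type.

    prior_type is the member name as a string, e.g. "MIXED_PRIORS".
    """
    if prior_type == "CUSTOM_PRIORS":
        return (p,)
    if prior_type == "GAUSSIAN_PRIORS":
        return (gaussianParams,)
    if prior_type == "CONSTRAINED_PRIORS":
        return (paramBounds,)
    if prior_type == "MIXED_PRIORS":
        # Gaussian parameters are the 2nd argument, bounds the 3rd.
        return (gaussianParams, paramBounds)
    raise ValueError(f"unknown prior type: {prior_type}")
```

The resulting rcr.Priors instance is then assigned to the priors attribute of an rcr.FunctionalForm model that was constructed with has_priors set to True.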
-
name
-