API Reference

RCR (Robust Chauvenet Outlier Rejection) Package API Details.

class rcr.FunctionalForm

class. Class used to initialize functional form/model-fitting RCR (see Rejecting Outliers While Model Fitting).

Constructor arguments:

Parameters:
  • f (function) –

    Model function \(y(\vec{x}|\vec{\theta})\) to fit data to while performing outlier rejection, where \(\vec{x}\) is an \(n\)-dimensional list/array_like (or float, for 1D models) of independent variables and \(\vec{\theta}\) is an \(M\)-dimensional list/array_like of model parameters. Arguments for f must follow this prototype:

    • x (float or 1D list/array_like) – Independent variable(s) of model.
    • params (list/array_like, 1D) – Parameters of model.

    Returns: y (float) – Model evaluated at the corresponding values of x and params.
  • xdata (list/array_like, 1D or 2D) – \(n\)-dimensional independent variable data to fit model to. For 1D models (\(n=1\)), this will be a 1D list/array_like, while for \(n\)-D models, this will be a 2D list/array_like where each entry is a list/array_like of length \(n\).
  • ydata (list/array_like, 1D) – Dependent variable (model function evaluation) data to fit model to.
  • model_partials (list of functions) –

    A list of functions that return the partial derivatives of the model function f with respect to each, ordered, model parameter \(\vec{\theta}\) (See Rejecting Outliers While Model Fitting for an example). Arguments for each one of these functions must follow this prototype (same as for the model function f):

    • x (float or 1D list/array_like) – Independent variable(s) of model.
    • params (list/array_like, 1D) – Parameters of model.

    Returns: y (float) – Derivative of model (with respect to given model parameter), evaluated at the corresponding values of x and params.
  • guess (list/array_like, 1D) – Guess for best fit values of model parameters \(\vec{\theta}\) (for the fitting algorithm).
  • weights (list/array_like, optional, 1D) – Optional weights to be applied to dataset (see Weighting Data).
  • error_y (list/array_like, optional, 1D) – Optional error bars/\(y\)-uncertainties to be applied to dataset (see Data with Uncertainties and/or Weights).
  • tol (float, optional) – Default: 1e-6. Convergence tolerance of modified Gauss-Newton fitting algorithm.
  • has_priors (bool, optional) – Default: False. Set to True if you’re going to apply statistical priors to your model parameters (see Applying Prior Knowledge to Model Parameters (Advanced); you’ll also need to create an instance of rcr.Priors and set the priors attribute of this instance of FunctionalForm equal to it).
  • pivot_function (function, optional) –

    Default: None. Function that returns the pivot point of some linearized model (see pivots). Must be of the form/prototype of:

    • xdata (list/array_like, 1D or 2D) – \(n\)-dimensional independent variable data to fit model to; same as xdata above.
    • weights (list/array_like, optional, 1D) – Optional weights to be applied to dataset (see Weighting Data).
    • f (function) – Model function; same as f above.
    • params (list/array_like, 1D) – Parameters of model.

    Returns: pivot (float or 1D list/array_like) – Pivot point(s) of the model (float if you’re using a one-dimensional model/independent variable, list/array_like if \(n\)-dimensional).

    However, note that all arguments need to be actually used for the pivot point computation. For example, a simple linear model \(y(x|b,m) = b + m(x-x_p)\) has a pivot point found by \(x_p=\sum_iw_ix_i/\sum_iw_i\), where \(w_i\) are the weights of the datapoints.
  • pivot_guess (float or 1D list/array_like, optional) – Initial guess for the pivot point(s) of the model (float if you’re using a one-dimensional model/independent variable, list/array_like if \(n\)-dimensional; see pivots).
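As a minimal sketch of the f and model_partials prototypes described above, the linear model \(y(x|b,m)=b+mx\) and its two partial derivatives can be written as plain Python functions. The names linear, d_linear_db, d_linear_dm and numeric_partial are illustrative only, and the finite-difference check at the end is a sanity test that is not part of RCR.

```python
def linear(x, params):
    """Model y(x | b, m) = b + m*x for a 1D independent variable."""
    b, m = params
    return b + m * x


def d_linear_db(x, params):
    """Partial derivative of the model with respect to b."""
    return 1.0


def d_linear_dm(x, params):
    """Partial derivative of the model with respect to m."""
    return x


# Ordered to match the parameter vector (b, m).
model_partials = [d_linear_db, d_linear_dm]


def numeric_partial(f, x, params, k, h=1e-6):
    """Central finite-difference estimate of df/dparams[k] (sanity check only)."""
    up = list(params)
    up[k] += h
    down = list(params)
    down[k] -= h
    return (f(x, up) - f(x, down)) / (2 * h)


guess = [1.0, 2.0]
for k, partial in enumerate(model_partials):
    # Each analytic partial should agree with its numerical estimate.
    assert abs(partial(3.0, guess) - numeric_partial(linear, 3.0, guess, k)) < 1e-6
```

Functions of this shape, together with xdata, ydata and guess, are what the constructor arguments listed above expect.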
pivot_function

Function used to evaluate pivot point(s) (see pivot_function optional argument of rcr.FunctionalForm model constructor).
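The weighted-mean pivot formula quoted for the linear model, \(x_p=\sum_iw_ix_i/\sum_iw_i\), could be sketched as a function with the documented four-argument prototype. The name linear_pivot and the sample data are hypothetical; the formula itself depends only on xdata and weights, and f and params appear in the signature to match the prototype.

```python
def linear_pivot(xdata, weights, f, params):
    """Weighted mean of x: x_p = sum(w_i * x_i) / sum(w_i).

    Signature follows the documented pivot_function prototype
    (xdata, weights, f, params) for a 1D independent variable.
    """
    total_weight = sum(weights)
    return sum(w * x for w, x in zip(weights, xdata)) / total_weight


# Illustrative data: the heavily weighted last point pulls the pivot toward 4.
xdata = [1.0, 2.0, 3.0, 4.0]
weights = [1.0, 1.0, 1.0, 3.0]
x_p = linear_pivot(xdata, weights, None, [0.0, 1.0])
print(x_p)
```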

priors

rcr.Priors object. Object describing parameter prior probability distribution(s) applied to rcr.FunctionalForm model (see rcr.Priors).

To use priors on model parameters for some rcr.FunctionalForm model, this attribute of the model needs to be initialized as some instance of rcr.Priors (see Applying Prior Knowledge to Model Parameters (Advanced)).

result

rcr.FunctionalFormResults object. Access various results unique to Functional Form RCR with this (see rcr.FunctionalFormResults).

class rcr.FunctionalFormResults

Results from (and unique to) functional form/model-fitting RCR.

parameter_uncertainties

list of floats. Best-fit model parameter uncertainties, post-outlier rejection.

For example, if you’re fitting to some linear model \(y(x|b,m)=b+mx\), and you obtain a best fit of \(b=1.0\pm0.5\) and \(m=2\pm 1\), then parameter_uncertainties = [0.5, 1].

Note that in order for parameter uncertainties to be computed, weights, data error bars/uncertainties, or both must have been provided when constructing the rcr.FunctionalForm model.

parameters

list of floats. Best-fit model parameters, post-outlier rejection.

For example, if you’re fitting to some linear model \(y(x|b,m)=b+mx\), and you obtain a best fit of \(b=1\) and \(m=2\), then parameters = [1, 2].

pivot

float. Recovered optimal “pivot” point for model that should minimize correlation between the slope and intercept parameters of the linearized model (1D independent variable case).

See pivots. For example, the pivot point for the model \(y(x|b,m) = b + m(x-x_p)\) is \(x_p\).
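To illustrate why the pivot matters, consider ordinary weighted least squares for \(y(x|b,m) = b + m(x-x_p)\). The off-diagonal term of the normal matrix is \(\sum_iw_i(x_i-x_p)\), which vanishes when \(x_p\) is the weighted mean of the \(x_i\), so the slope and intercept decorrelate there. The sketch below evaluates the resulting correlation coefficient for two choices of pivot; it is standard least-squares algebra under that assumption, not RCR's internal code, and the function name is hypothetical.

```python
import math


def slope_intercept_correlation(xdata, weights, x_p):
    """Correlation between b and m for a weighted linear fit about pivot x_p.

    The normal matrix is [[s0, s1], [s1, s2]]; inverting it gives a
    covariance matrix whose correlation coefficient is -s1 / sqrt(s0 * s2).
    """
    s0 = sum(weights)
    s1 = sum(w * (x - x_p) for w, x in zip(weights, xdata))
    s2 = sum(w * (x - x_p) ** 2 for w, x in zip(weights, xdata))
    return -s1 / math.sqrt(s0 * s2)


xdata = [1.0, 2.0, 3.0, 4.0]
weights = [1.0, 1.0, 1.0, 3.0]
weighted_mean = sum(w * x for w, x in zip(weights, xdata)) / sum(weights)

# Pivot at the origin: b and m are strongly anti-correlated.
corr_origin = slope_intercept_correlation(xdata, weights, 0.0)
# Pivot at the weighted mean: the correlation vanishes.
corr_pivot = slope_intercept_correlation(xdata, weights, weighted_mean)
print(corr_origin, corr_pivot)
```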

pivot_ND

float. Recovered optimal \(n\)-dimensional “pivot” point for model that should minimize correlation between the slope and intercept parameters of the linearized model (\(n\)-D independent variable case).

See pivots. For example, the pivot point for the \(n\)-dimensional model \(y(\vec{x}|\vec{b},\vec{m}) = \vec{b} + \vec{m}^T(\vec{x}-\vec{x}_p)\) is \(\vec{x}_p\).
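For the \(n\)-dimensional case, xdata is a 2D list in which each entry is one point of length \(n\). The sketch below assumes a linear model with a single intercept fitted by weighted least squares, for which the weighted centroid of the points zeroes the intercept-slope terms of the normal matrix component by component. That choice of model and the name centroid_pivot are assumptions for illustration; this reference does not state how RCR itself computes pivot_ND.

```python
def centroid_pivot(xdata, weights):
    """Weighted centroid of n-dimensional points (one pivot value per dimension)."""
    n = len(xdata[0])
    total_weight = sum(weights)
    return [
        sum(w * point[j] for w, point in zip(weights, xdata)) / total_weight
        for j in range(n)
    ]


# Three 2D points; each entry of xdata is a list of length n = 2.
xdata = [[0.0, 10.0], [2.0, 20.0], [4.0, 60.0]]
weights = [1.0, 2.0, 1.0]
pivot = centroid_pivot(xdata, weights)
print(pivot)
```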

class rcr.Priors

class. Class that encapsulates probabilistic priors to be applied to model parameters when using model-fitting/functional form RCR (see Applying Prior Knowledge to Model Parameters (Advanced) for an example).

Constructor arguments:

Parameters:
gaussianParams

2D list/array_like of floats. A list that contains lists of mu and sigma for the Gaussian prior of each param. If no prior, then just use NaNs (see Applying Prior Knowledge to Model Parameters (Advanced) for an example).

p

function. Custom priors function; takes in a vector of model parameters and returns a vector of the prior probability density for each value (see Applying Prior Knowledge to Model Parameters (Advanced) for an example).

paramBounds

2D list/array_like of floats. A list that contains lists of the lower and upper hard bounds of each param. If not bounded, use NaNs, and if there’s only one bound, use NaN for the other bound (see Applying Prior Knowledge to Model Parameters (Advanced) for an example).

priorType

rcr.priorsTypes object. The type of priors that you’re applying to your model (see rcr.priorsTypes and Types of Model Parameter Priors in RCR).
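The layouts described for gaussianParams, paramBounds and p can be sketched with plain Python objects for a two-parameter model \((b, m)\). All numeric values and the name prior_pdfs are hypothetical, and the rcr.Priors constructor call itself is not shown because its signature is not given in this reference.

```python
import math

NAN = float("nan")

# One [mu, sigma] pair per model parameter; a NaN pair means "no Gaussian prior".
gaussianParams = [
    [1.0, 0.5],  # b: Gaussian prior with mu = 1.0, sigma = 0.5
    [NAN, NAN],  # m: no Gaussian prior
]

# One [lower, upper] pair per model parameter; NaN means "no bound on that side".
paramBounds = [
    [0.0, NAN],  # b: lower bound only
    [NAN, NAN],  # m: unbounded
]


def prior_pdfs(params):
    """Custom prior: return one prior probability density per model parameter."""
    b, m = params
    # Hypothetical choices: unit Gaussian on b, flat prior restricted to m >= 0.
    p_b = math.exp(-0.5 * b ** 2) / math.sqrt(2.0 * math.pi)
    p_m = 1.0 if m >= 0.0 else 0.0
    return [p_b, p_m]


print(prior_pdfs([0.0, 2.0]))
```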

class rcr.RCR

Master class used to initialize and run RCR outlier rejection procedures.

performBulkRejection(*args, **kwargs)

Overloaded function.

  1. performBulkRejection(self: rcr.RCR, data: List[float]) -> None

    Perform outlier rejection WITH the speed-up of bulk pre-rejection (see What is Bulk Rejection?).

    Parameters:

    data : list/array_like, 1D

    Dataset to perform outlier rejection (RCR) on. Access results via the result attribute (rcr.RCRResults) of your instance of rcr.RCR.

  2. performBulkRejection(self: rcr.RCR, weights: List[float], data: List[float]) -> None

    Perform outlier rejection WITH the speed-up of bulk pre-rejection (see What is Bulk Rejection?).

    Parameters:

    weights : list/array_like, 1D

    Weights for dataset to perform outlier rejection (RCR) on.

    data : list/array_like, 1D

    Dataset to perform outlier rejection (RCR) on. Access results via the result attribute (rcr.RCRResults) of your instance of rcr.RCR.

performRejection(*args, **kwargs)

Overloaded function.

  1. performRejection(self: rcr.RCR, data: List[float]) -> None

    Perform outlier rejection WITHOUT the speed-up of bulk pre-rejection (slower; see What is Bulk Rejection?).

    Parameters:

    data : list/array_like, 1D

    Dataset to perform outlier rejection (RCR) on. Access results via the result attribute (rcr.RCRResults) of your instance of rcr.RCR.

  2. performRejection(self: rcr.RCR, weights: List[float], data: List[float]) -> None

    Perform outlier rejection WITHOUT the speed-up of bulk pre-rejection (slower; see What is Bulk Rejection?).

    Parameters:

    weights : list/array_like, 1D

    Weights for dataset to perform outlier rejection (RCR) on.

    data : list/array_like, 1D

    Dataset to perform outlier rejection (RCR) on. Access results via the result attribute (rcr.RCRResults) of your instance of rcr.RCR.
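Both performRejection and performBulkRejection take weights before data in their weighted overloads, which is easy to get backwards. The hypothetical helper below wraps the four documented call forms; it assumes r is an already-configured rcr.RCR instance (its construction is not shown here), and the length check is a defensive addition rather than documented behaviour.

```python
def run_rejection(r, data, weights=None, bulk=True):
    """Run RCR on `r` and return its result attribute.

    r       : object exposing performRejection / performBulkRejection
              and a `result` attribute (assumed to be an rcr.RCR instance)
    data    : 1D list/array_like of values
    weights : optional 1D list/array_like, same length as data
    bulk    : use bulk pre-rejection (faster) when True
    """
    method = r.performBulkRejection if bulk else r.performRejection
    if weights is None:
        method(data)
    else:
        if len(weights) != len(data):
            raise ValueError("weights and data must have the same length")
        # Documented overload order: weights first, then data.
        method(weights, data)
    return r.result
```

With such a helper, results would then be read from the returned object, for example its flags or cleanY attributes described under rcr.RCRResults.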

result

rcr.RCRResults object. Access various results of RCR with this (see rcr.RCRResults).

setParametricModel(self: rcr.RCR, model: FunctionalForm) → None

Initialize parametric/functional form model to be used with RCR (see Rejecting Outliers While Model Fitting for a tutorial).

Parameters: model (rcr.FunctionalForm) – \(n\)-dimensional model to fit data to while performing outlier rejection.
setRejectionTech(self: rcr.RCR, rejection_technique: rcr.RejectionTechniques) → None

Modify/set outlier rejection technique to be used with RCR.

See Table of Rejection Techniques for an explanation of each rejection technique, and when to use it.

Parameters: rejection_technique (rcr.RejectionTechniques) – The rejection technique to be used with your instance of rcr.RCR.
class rcr.RCRResults

Various results from performing outlier rejection with RCR.

cleanW

list of floats. The user-provided datapoint weights that correspond to NON-outliers in the original dataset.

For example, if a dataset of y = [0, 1, -2, 1, 2, 37, 0.5, -100] with weights w = [1, 1.1, 0.9, 1.2, 0.8, 0.2, 0.95, 2] was provided, and only the 37 and -100 were found to be outliers, then cleanW = [1, 1.1, 0.9, 1.2, 0.8, 0.95].

cleanY

list of floats. After performing RCR on some original dataset, these are the datapoints that were NOT found to be outliers.

For example, if a dataset of y = [0, 1, -2, 1, 2, 37, 0.5, -100] was provided and only the 37 and -100 were found to be outliers, then cleanY = [0, 1, -2, 1, 2, 0.5].

flags

list of bools. Ordered flags describing outlier status of each inputted datapoint (True if datapoint is NOT an outlier).

For example, if a dataset of y = [0, 1, -2, 1, 2, 37, 0.5, -100] was provided and only the 37 and -100 were found to be outliers, then flags = [True, True, True, True, True, False, True, False].

indices

list of ints. A list of indices of datapoints from original inputted dataset that are NOT outliers.

For example, if a dataset of y = [0, 1, -2, 1, 2, 37, 0.5, -100] was provided and only the 37 and -100 were found to be outliers, then indices = [0, 1, 2, 3, 4, 6].

mu

float. Mean/median/mode (central value) of uncontaminated data distribution.

The central value of the uncontaminated part of the provided dataset, recovered from performing RCR.

originalW

list of floats. The user-provided datapoint weights, pre-RCR.

For example, if a dataset with weights w = [1, 1.1, 0.9, 1.2, 0.8, 0.2, 0.95, 2] was provided, then originalW = [1, 1.1, 0.9, 1.2, 0.8, 0.2, 0.95, 2].

originalY

list of floats. The user-provided dataset, pre-RCR.

For example, if a dataset of y = [0, 1, -2, 1, 2, 37, 0.5, -100] was provided, then originalY = [0, 1, -2, 1, 2, 37, 0.5, -100].

rejectedW

list of floats. The user-provided datapoint weights that correspond to outliers in the original dataset.

For example, if a dataset of y = [0, 1, -2, 1, 2, 37, 0.5, -100] with weights w = [1, 1.1, 0.9, 1.2, 0.8, 0.2, 0.95, 2] was provided, and only the 37 and -100 were found to be outliers, then rejectedW = [0.2, 2].

rejectedY

list of floats. After performing RCR on some original dataset, these are the datapoints that WERE found to be outliers.

For example, if a dataset of y = [0, 1, -2, 1, 2, 37, 0.5, -100] was provided and only the 37 and -100 were found to be outliers, then rejectedY = [37, -100].
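The per-attribute examples above all describe the same rejection, so they can be cross-checked against one another. The sketch below starts from the documented flags (hard-coded here, not computed by RCR) and rebuilds the other lists in plain Python, which shows how flags, indices and the clean/rejected lists relate.

```python
# Dataset, weights and flags taken from the examples in this section.
y = [0, 1, -2, 1, 2, 37, 0.5, -100]
w = [1, 1.1, 0.9, 1.2, 0.8, 0.2, 0.95, 2]
flags = [True, True, True, True, True, False, True, False]

# indices: positions of the points flagged as NOT outliers.
indices = [i for i, keep in enumerate(flags) if keep]

# Clean lists keep the flagged-True entries, in their original order.
cleanY = [y[i] for i in indices]
cleanW = [w[i] for i in indices]

# Rejected lists keep the flagged-False entries.
rejectedY = [value for value, keep in zip(y, flags) if not keep]
rejectedW = [value for value, keep in zip(w, flags) if not keep]

print(indices, cleanY, rejectedY)
```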

sigma

float. Recovered robust 68.3-percentile deviation of uncontaminated data distribution.

A more robust (less sensitive to outliers) version of the standard deviation/width \(\sigma\) of the uncontaminated part of the provided dataset (see Section 2.1 of The Paper), recovered from performing RCR. Applies to the case of a symmetric uncontaminated data distribution.
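To give a feel for why a 68.3-percentile deviation is less sensitive to outliers than a standard deviation, the sketch below computes a simple version of it: the 68.3rd percentile of the absolute deviations from the median, with linear interpolation. This simplified estimator is an assumption chosen for illustration; the one RCR actually uses is described in Section 2.1 of The Paper and may differ.

```python
import statistics


def percentile_deviation_683(data):
    """68.3rd percentile of |x - median|, linearly interpolated."""
    center = statistics.median(data)
    deviations = sorted(abs(x - center) for x in data)
    position = 0.683 * (len(deviations) - 1)
    lower = int(position)
    upper = min(lower + 1, len(deviations) - 1)
    fraction = position - lower
    return deviations[lower] + fraction * (deviations[upper] - deviations[lower])


# Example dataset from this section, still containing the outliers 37 and -100.
y = [0, 1, -2, 1, 2, 37, 0.5, -100]
robust = percentile_deviation_683(y)
classical = statistics.pstdev(y)

# The classical standard deviation is dominated by the two outliers,
# while the percentile-based width stays close to the bulk of the data.
print(robust, classical)
```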

sigmaAbove

float. Recovered robust 68.3-percentile deviation above mu (mean/median/mode) of uncontaminated data distribution.

A more robust (less sensitive to outliers) version of the standard deviation/width \(\sigma_+\) of the positive side of a mildly asymmetric uncontaminated data distribution (see Section 2.1 of The Paper), recovered from performing RCR. (For the symmetric case, \(\sigma_+=\sigma_-\equiv\sigma\)).

sigmaBelow

float. Recovered robust 68.3-percentile deviation below mu (mean/median/mode) of uncontaminated data distribution.

A more robust (less sensitive to outliers) version of the standard deviation/width \(\sigma_-\) of the negative side of a mildly asymmetric uncontaminated data distribution (see Section 2.1 of The Paper), recovered from performing RCR. (For the symmetric case, \(\sigma_-=\sigma_+\equiv\sigma\)).

stDev

float. Standard deviation of uncontaminated data distribution.

The standard deviation/width \(\sigma\) of the uncontaminated part of the provided dataset, recovered from performing RCR. Applies to the case of a symmetric uncontaminated data distribution.

stDevAbove

float. Standard deviation above mu (mean/median/mode) of uncontaminated (asymmetric) data distribution.

The asymmetric standard deviation/width \(\sigma_+\) of the positive side of a mildly asymmetric uncontaminated data distribution, recovered from RCR (for the symmetric case, \(\sigma_+=\sigma_-\equiv\sigma\)).

stDevBelow

float. Standard deviation below mu (mean/median/mode) of uncontaminated (asymmetric) data distribution.

The asymmetric standard deviation/width \(\sigma_-\) of the negative side of a mildly asymmetric uncontaminated data distribution, recovered from RCR (for the symmetric case, \(\sigma_-=\sigma_+\equiv\sigma\)).

stDevTotal

float. Combined standard deviation both above and below mu (mean/median/mode) of uncontaminated (asymmetric) data distribution.

A combination of the asymmetric standard deviation/width \(\sigma_+\) of the positive side of a mildly asymmetric uncontaminated data distribution and the width \(\sigma_-\) of the negative side of the distribution, recovered from RCR. Can be used to approximate a mildly asymmetric data distribution as symmetric.

class rcr.RejectionTechniques

RCR Standard Rejection Techniques.

Members:

SS_MEDIAN_DL : Rejection technique for a symmetric uncontaminated distribution with two-sided contaminants.

LS_MODE_68 : Rejection technique for a symmetric uncontaminated distribution with one-sided contaminants.

LS_MODE_DL : Rejection technique for a symmetric uncontaminated distribution with a mixture of one-sided and two-sided contaminants.

ES_MODE_DL : Rejection technique for a mildly asymmetric uncontaminated distribution and/or a very low number of data points.
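The four member descriptions above amount to a small decision table. The hypothetical helper below encodes it and returns the member name as a string; it is only a restatement of those descriptions, and the Table of Rejection Techniques remains the place to look for guidance on when to use each one.

```python
def suggest_technique(symmetric, contaminants, few_points=False):
    """Return the name of the rcr.RejectionTechniques member matching the inputs.

    symmetric    : True if the uncontaminated distribution is symmetric
    contaminants : "two-sided", "one-sided" or "mixed"
    few_points   : True for a very low number of data points
    """
    if not symmetric or few_points:
        # Mildly asymmetric distribution and/or very few data points.
        return "ES_MODE_DL"
    by_contaminants = {
        "two-sided": "SS_MEDIAN_DL",
        "one-sided": "LS_MODE_68",
        "mixed": "LS_MODE_DL",
    }
    return by_contaminants[contaminants]


print(suggest_technique(True, "one-sided"))
```

The returned name could then be looked up on rcr.RejectionTechniques and passed to setRejectionTech.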

name
class rcr.priorsTypes

Types of prior probability density functions that can be applied to model parameters.

Members:

CUSTOM_PRIORS : Custom, function-defined prior probability density function(s).

GAUSSIAN_PRIORS : Gaussian (normal) prior probability density function(s).

CONSTRAINED_PRIORS : Bounded/hard-constrained prior probability density function(s).

MIXED_PRIORS : A mixture of Gaussian (normal), hard-constrained, and uninformative (uniform/flat) prior probability density functions.

name