Find the aggregate rule set from a list of bootstrapped BRS rule sets

Usage

agg_BRS(
  fit,
  X,
  Y,
  maxLen,
  split = F,
  train = 0.7,
  maxRules = 3,
  stat = "acc",
  topRules = 5,
  minProp = 0,
  simplify = F,
  oppmat = NULL,
  oppind = NULL
)

Arguments

fit

the output from the BRS function. A list whose first element is a list of rule sets and whose second element is a list of bootstrap indices. The third element is ignored.

X

data frame or matrix of the data, excluding the outcome

Y

vector of outcomes

maxLen

maximum possible length of a rule

split

logical for whether to split the sample into a training set on which the aggregate rule set is found and a test set on which that rule set's performance is evaluated

train

numeric for proportion of the data to use as training data. If split=F, this argument is ignored.

maxRules

integer for the maximum number of rules in the aggregate rule set

stat

the statistic on which to evaluate the aggregated rule sets. Currently only accuracy ("acc") is supported.

topRules

integer for the number of high prevalence rules of each length to consider

minProp

numeric for proportion of times a rule must appear in order to be considered

simplify

logical for whether equivalent rules are combined for determining prevalence

oppmat

a matrix with two columns and K rows, where K is the length of the list oppind. The kth row contains values v1 and v2 (i.e., v1=oppmat[k,1] and v2=oppmat[k,2]) such that for any variable var in oppind[[k]], var_v1 and !var_v2 are equivalent. v1 should be the preferred return value.

oppind

a list of vectors of variables. Each vector oppind[[k]] contains variables var such that var_v1 and !var_v2 are equivalent, where v1 and v2 form the kth row of oppmat, v1=oppmat[k,1] and v2=oppmat[k,2].
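
As a rough illustration of how oppmat and oppind line up, the sketch below builds both for a few two-level variables. The variable names (smoker, diabetic, sex), their values, and the reading of var_v1 as "variable name, underscore, value" are assumptions made up for this example. Whether oppind should hold column names or column indices is not stated here, so names are used only for readability.

```r
# Hypothetical example: two groups of two-level variables.
# Row 1: variables coded "1"/"0", with "1" as the preferred value (v1).
# Row 2: variables coded "M"/"F", with "M" as the preferred value (v1).
oppmat <- matrix(
  c("1", "0",
    "M", "F"),
  ncol = 2, byrow = TRUE
)

oppind <- list(
  c("smoker", "diabetic"),  # row 1 of oppmat applies to these variables
  c("sex")                  # row 2 of oppmat applies to this variable
)

# Sanity check: oppmat has two columns and one row per element of oppind.
stopifnot(nrow(oppmat) == length(oppind), ncol(oppmat) == 2)

# Spell out the equivalences implied by each row (assumed var_value naming).
equiv <- unlist(lapply(seq_along(oppind), function(k) {
  v1 <- oppmat[k, 1]
  v2 <- oppmat[k, 2]
  paste0(oppind[[k]], "_", v1, " == !", oppind[[k]], "_", v2)
}))
print(equiv)
```

Each row of oppmat is matched to the element of oppind with the same index, so the two objects must be built in the same order.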

Value

the aggregate rule set, which has the highest stat out of all possible rule sets constructed from at most maxRules candidate rules
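
The description above implies a search over small subsets of candidate rules. A minimal sketch of that idea follows, assuming each candidate rule has already been evaluated to a logical vector over the observations and that a rule set predicts a positive outcome when any of its rules is satisfied. The helper best_rule_set, the toy data, and the any-rule-fires convention are assumptions for illustration and may not match what agg_BRS does internally.

```r
# Hypothetical helper: exhaustive search over rule subsets of size <= maxRules.
# `rule_hits` is a logical matrix (rows = observations, columns = candidate
# rules); `Y` is a 0/1 outcome vector.
best_rule_set <- function(rule_hits, Y, maxRules = 3) {
  best <- list(rules = integer(0), acc = -Inf)
  n_rules <- ncol(rule_hits)
  for (m in seq_len(min(maxRules, n_rules))) {
    # All subsets of exactly m candidate rules, as column indices.
    subsets <- combn(n_rules, m, simplify = FALSE)
    for (s in subsets) {
      # Assumed convention: predict 1 if any rule in the subset is satisfied.
      pred <- as.integer(rowSums(rule_hits[, s, drop = FALSE]) > 0)
      acc <- mean(pred == Y)
      # Keep the first subset that strictly improves accuracy.
      if (acc > best$acc) best <- list(rules = s, acc = acc)
    }
  }
  best
}

# Toy data: six observations and three candidate rules.
Y <- c(1, 1, 1, 0, 0, 0)
rule_hits <- cbind(
  r1 = c(TRUE, TRUE, FALSE, FALSE, FALSE, FALSE),
  r2 = c(FALSE, FALSE, TRUE, FALSE, FALSE, FALSE),
  r3 = c(TRUE, FALSE, FALSE, TRUE, TRUE, FALSE)
)
res <- best_rule_set(rule_hits, Y, maxRules = 2)
print(colnames(rule_hits)[res$rules])
print(res$acc)
```

The number of subsets grows combinatorially with the number of candidate rules, which is presumably why topRules and minProp exist to narrow the candidate pool before a search like this.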