Find aggregate rule set — agg

Find the aggregate rule set from a list of bootstrapped BRS rule sets

Usage

agg_BRS(
  fit,
  X,
  Y,
  maxLen,
  split = F,
  train = 0.7,
  maxRules = 3,
  stat = "acc",
  topRules = 5,
  minProp = 0,
  simplify = F,
  oppmat = NULL,
  oppind = NULL
)

Arguments

fit: the output from the BRS function. A list whose first element is a list of rule sets and whose second element is a list of bootstrap indices. The third element is ignored.
X: data frame or matrix of the data, excluding the outcome
Y: vector of outcomes
maxLen: maximum possible length of a rule
split: logical for whether to split the sample into a training set on which the aggregate rule set is found and a test set on which that rule set's performance is evaluated
train: numeric for proportion of the data to use as training data. If split=F, this argument is ignored.
maxRules: integer for the maximum number of rules in the aggregate rule set
stat: the statistic used to evaluate the aggregated rule sets. Currently only accuracy ("acc") is supported.
topRules: integer for the number of high-prevalence rules of each length to consider
minProp: numeric for the minimum proportion of times a rule must appear in order to be considered
simplify: logical for whether equivalent rules are combined when determining prevalence
oppmat: a matrix with two columns and K rows, where K is the length of the list oppind. The kth row contains values v1 and v2 (i.e., v1=oppmat[k,1] and v2=oppmat[k,2]) such that for any variable var in oppind[[k]], var_v1 and !var_v2 are equivalent. v1 should be the preferred return value.
oppind: a list of vectors of variables. Each vector oppind[[k]] contains variables var such that var_v1 and !var_v2 are equivalent, where v1 and v2 form the kth row of oppmat, i.e., v1=oppmat[k,1] and v2=oppmat[k,2].
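
The relationship between oppmat and oppind may be easier to see with a small example. The sketch below is illustrative only: the data frame, the variable names (smoker, diabetic, sex) and the choice to identify variables by column name are assumptions, not something this page specifies. It builds a two-row oppmat with a matching two-element oppind and checks that, for each listed variable, "equals v1" and "does not equal v2" select the same observations.

```r
# Hypothetical binary variables: two coded 0/1 and one coded 1/2.
X <- data.frame(
  smoker   = c(0, 1, 1, 0),
  diabetic = c(1, 1, 0, 0),
  sex      = c(1, 2, 2, 1)
)

# Row k holds (v1, v2); v1 is the preferred value to report.
oppmat <- rbind(c(1, 0),   # var_1 is equivalent to !var_0
                c(2, 1))   # var_2 is equivalent to !var_1

# Assumption: variables are identified by column name.
oppind <- list(c("smoker", "diabetic"), "sex")

# Check the stated equivalence for every variable listed in oppind.
equiv_ok <- all(vapply(seq_along(oppind), function(k) {
  all(vapply(oppind[[k]], function(v) {
    identical(X[[v]] == oppmat[k, 1], !(X[[v]] == oppmat[k, 2]))
  }, logical(1)))
}, logical(1)))
```

Note that the equivalence only holds for variables that take exactly two values, which is why a variable with three or more levels would not belong in oppind.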

Value

The aggregate rule set: the rule set with the highest value of stat among all rule sets constructed from at most maxRules candidate rules.
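
The selection described here can be sketched in a few lines of base R. This is not the package's implementation; it is a minimal illustration under two assumptions: that a rule set predicts a positive outcome when any of its rules is satisfied, and that candidate rules can be represented as logical columns (the hypothetical matrix sat with rules r1, r2, r3). It enumerates every subset of at most maxRules candidates and keeps the one with the highest accuracy.

```r
# Toy outcome and a hypothetical matrix recording, per observation,
# whether each candidate rule is satisfied.
Y <- c(1, 1, 1, 0, 0, 0, 1, 0)
sat <- cbind(
  r1 = c(TRUE,  TRUE,  FALSE, FALSE, FALSE, FALSE, FALSE, FALSE),
  r2 = c(FALSE, FALSE, TRUE,  FALSE, FALSE, FALSE, TRUE,  FALSE),
  r3 = c(TRUE,  FALSE, FALSE, TRUE,  TRUE,  FALSE, FALSE, FALSE)
)
maxRules <- 2

# Assumption: a rule set predicts 1 when any of its rules is satisfied.
rule_set_acc <- function(idx) {
  pred <- rowSums(sat[, idx, drop = FALSE]) > 0
  mean(pred == (Y == 1))
}

# Enumerate every subset of 1..maxRules candidate rules.
subsets <- unlist(
  lapply(seq_len(maxRules),
         function(k) combn(ncol(sat), k, simplify = FALSE)),
  recursive = FALSE
)

# Score each subset and keep the most accurate one.
acc        <- vapply(subsets, rule_set_acc, numeric(1))
best       <- subsets[[which.max(acc)]]
best_rules <- colnames(sat)[best]
best_acc   <- max(acc)
```

In this toy example the pair r1 and r2 classifies every observation correctly, so it is selected over any single rule. Exhaustive enumeration grows quickly with the number of candidates, which is presumably why topRules and minProp limit the candidate pool.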