`add_candidates()` collates the assessment set predictions and additional attributes from the supplied model definition (i.e. a set of "candidates") into a data stack.

Behind the scenes, data stack objects are just `tibble::tbl_df`s, where the first column gives the true response values and the remaining columns give the assessment set predictions for each candidate. In the regression setting, there is one column per candidate ensemble member. In classification settings, there are as many columns per candidate ensemble member as there are levels of the outcome variable.
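The column layout described above can be checked directly. The following is a minimal sketch, assuming the installed version of stacks ships the example objects `reg_res_lr`, `reg_res_svm`, and `reg_res_sp` described later on this page; the names `reg_st` and `reg_tbl` are only illustrative.

```r
library(stacks)

# build a regression data stack from the packaged example tuning results
# (assumes these example objects are available in the installed version)
reg_st <-
  stacks() %>%
  add_candidates(reg_res_lr) %>%
  add_candidates(reg_res_svm) %>%
  add_candidates(reg_res_sp)

# a data stack inherits from tbl_df, so it can be viewed as a plain tibble
reg_tbl <- tibble::as_tibble(reg_st)

# the first column holds the true response values
names(reg_tbl)[1]

# every remaining column holds one candidate member's predictions
names(reg_tbl)[-1]
```

Viewing the stack as a tibble is only for inspection; the stack itself is what gets passed on to `blend_predictions()`.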

To initialize a data stack, use the `stacks()` function. Model definitions are appended to a data stack iteratively using several calls to `add_candidates()`. Data stacks are evaluated using the `blend_predictions()` function.

    add_candidates(
      data_stack,
      candidates,
      name = deparse(substitute(candidates)),
      ...
    )

Argument | Description |
---|---|
data_stack | A `data_stack` object. |
candidates | A model definition: either a |
name | The label for the model definition; defaults to the name of the `candidates` object. |
... | Additional arguments. Currently ignored. |
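The default for `name` in the usage above is `deparse(substitute(candidates))`, which turns the expression passed as `candidates` into a string. The base R sketch below illustrates that mechanism only; `label_of()` and `reg_res_demo` are hypothetical names made up for this illustration and are not part of stacks.

```r
# hypothetical helper mirroring the `name` default from the usage above
label_of <- function(candidates, name = deparse(substitute(candidates))) {
  # `name` is evaluated lazily, so substitute() still sees the
  # unevaluated expression that the caller supplied
  name
}

# hypothetical stand-in for a tuning results object
reg_res_demo <- list()

label_of(reg_res_demo)
#> [1] "reg_res_demo"

# an explicit name overrides the default
label_of(reg_res_demo, name = "linear_reg")
#> [1] "linear_reg"
```

This is why the printed data stacks in the examples label each model definition with the object name unless `name` is supplied.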

A `data_stack` object; see `stacks()` for more details!

This package provides some resampling objects and datasets, derived from a study on 1212 red-eyed tree frog embryos, for use in examples and vignettes!

Red-eyed tree frog (RETF) embryos can hatch earlier than their normal 7ish days if they detect a potential predator threat. Researchers wanted to determine how, and when, these tree frog embryos were able to detect stimuli from their environment. To do so, they subjected the embryos at varying developmental stages to "predator stimulus" by jiggling the embryos with a blunt probe. Beforehand, though, some of the embryos were treated with gentamicin, a compound that knocks out their lateral line (a sensory organ). Researcher Julie Jung and her crew found that these factors inform whether an embryo hatches prematurely or not!

Note that the data included with the stacks package is not necessarily a representative or unbiased subset of the complete dataset, and is only for demonstrative purposes.

`reg_folds` and `class_folds` are `rset` cross-fold validation objects from `rsample`, splitting the training data into folds for the regression and classification model objects, respectively. `tree_frogs_reg_test` and `tree_frogs_class_test` are the analogous testing sets.

`reg_res_lr`, `reg_res_svm`, and `reg_res_sp` contain regression tuning results for a linear regression, support vector machine, and spline model, respectively, fitting `latency` (i.e. how long the embryos took to hatch in response to the jiggle) in the `tree_frogs` data, using most of the other variables as predictors. Note that the data underlying these models is filtered to include only embryos that hatched in response to the stimulus.

`class_res_rf` and `class_res_nn` contain multiclass classification tuning results for a random forest and neural network classification model, respectively, fitting `reflex` (a measure of ear function) in the data using most of the other variables as predictors.

`log_res_rf` and `log_res_nn` contain binary classification tuning results for a random forest and neural network classification model, respectively, fitting `hatched` (whether or not the embryos hatched in response to the stimulus) using most of the other variables as predictors.

See `?example_data` to learn more about these objects, as well as browse the source code that generated them.

Other core verbs: `blend_predictions()`, `fit_members()`, `stacks()`

    library(stacks)

    # see the "Example Data" section above for
    # clarification on the objects used in these examples!

    # put together a data stack using
    # tuning results for regression models
    reg_st <-
      stacks() %>%
      add_candidates(reg_res_lr) %>%
      add_candidates(reg_res_svm) %>%
      add_candidates(reg_res_sp)

    reg_st
    #> # A data stack with 3 model definitions and 15 candidate members:
    #> #   reg_res_lr: 1 model configuration
    #> #   reg_res_svm: 5 model configurations
    #> #   reg_res_sp: 9 model configurations
    #> # Outcome: latency (numeric)

    # do the same with multinomial classification models
    class_st <-
      stacks() %>%
      add_candidates(class_res_nn) %>%
      add_candidates(class_res_rf)

    class_st
    #> # A data stack with 2 model definitions and 11 candidate members:
    #> #   class_res_nn: 1 model configuration
    #> #   class_res_rf: 10 model configurations
    #> # Outcome: reflex (factor)

    # ...or binomial classification models
    log_st <-
      stacks() %>%
      add_candidates(log_res_nn) %>%
      add_candidates(log_res_rf)

    log_st
    #> # A data stack with 2 model definitions and 11 candidate members:
    #> #   log_res_nn: 1 model configuration
    #> #   log_res_rf: 10 model configurations
    #> # Outcome: hatched (factor)

    # use custom names for each model:
    log_st2 <-
      stacks() %>%
      add_candidates(log_res_nn, name = "neural_network") %>%
      add_candidates(log_res_rf, name = "random_forest")

    log_st2
    #> # A data stack with 2 model definitions and 11 candidate members:
    #> #   neural_network: 1 model configuration
    #> #   random_forest: 10 model configurations
    #> # Outcome: hatched (factor)

    # these objects would likely then be
    # passed to blend_predictions():
    log_st2 %>% blend_predictions()
    #> # A tibble: 4 x 3
    #>   member                       type        weight
    #>   <chr>                        <chr>        <dbl>
    #> 1 .pred_yes_neural_network_1_1 mlp          6.09
    #> 2 .pred_yes_random_forest_1_09 rand_forest  1.84
    #> 3 .pred_yes_random_forest_1_05 rand_forest  1.45
    #> 4 .pred_yes_random_forest_1_06 rand_forest  0.792