Title: | (Adaptive) Boosting Trees Algorithm |
---|---|
Description: | Performs (Adaptive) Boosting Trees for Poisson distributed response variables, using the log-link function. The code approach is similar to the one used in 'gbm'/'gbm3'. Moreover, each tree in the expansion is built thanks to the 'rpart' package. This package is based on the following books and articles: Denuit, M., Hainaut, D., Trufin, J. (2019) <doi:10.1007/978-3-030-25820-7> Denuit, M., Hainaut, D., Trufin, J. (2019) <doi:10.1007/978-3-030-57556-4> Denuit, M., Hainaut, D., Trufin, J. (2019) <doi:10.1007/978-3-030-25827-6> Denuit, M., Hainaut, D., Trufin, J. (2022) <doi:10.1080/03461238.2022.2037016> Denuit, M., Huyghe, J., Trufin, J. (2022) <https://dial.uclouvain.be/pr/boreal/fr/object/boreal%3A244325/datastream/PDF_01/view> Denuit, M., Trufin, J., Verdebout, T. (2022) <https://dial.uclouvain.be/pr/boreal/fr/object/boreal%3A268577>. |
Authors: | Gireg Willame [aut, cre, cph] |
Maintainer: | Gireg Willame <[email protected]> |
License: | GPL (>= 3) |
Version: | 0.4 |
Built: | 2024-10-29 04:19:50 UTC |
Source: | https://github.com/giregwillame/bt |
Performs the (Adaptive) Boosting Trees algorithm. This code prepares the inputs and calls the function BT_call. Each tree in the process is built thanks to the rpart function.
In case of cross-validation, this function prepares the folds and performs multiple calls to the fitting function BT_call.
BT(
  formula = formula(data),
  data = list(),
  tweedie.power = 1,
  ABT = TRUE,
  n.iter = 100,
  train.fraction = 1,
  interaction.depth = 4,
  shrinkage = 1,
  bag.fraction = 1,
  colsample.bytree = NULL,
  keep.data = TRUE,
  is.verbose = FALSE,
  cv.folds = 1,
  folds.id = NULL,
  n.cores = 1,
  tree.control = rpart.control(
    xval = 0,
    maxdepth = (if (!is.null(interaction.depth)) interaction.depth else 10),
    cp = -Inf,
    minsplit = 2
  ),
  weights = NULL,
  seed = NULL,
  ...
)
formula |
a symbolic description of the model to be fitted. Note that offsets are not supported by this algorithm. Instead, everything is performed through the log-link function, under which a direct relationship exists between the response, the offset and the weights (a brief sketch of this equivalence is given after the Details note below). |
data |
an optional data frame containing the variables in the model. By default, the variables are taken from environment(formula). |
tweedie.power |
Experimental parameter, currently not used. Set to 1, referring to the Poisson distribution. |
ABT |
a boolean parameter. If TRUE, the Adaptive Boosting Tree algorithm is fitted; if FALSE, the classical Boosting Tree one. By default, it is set to TRUE. |
n.iter |
the total number of iterations to fit. This is equivalent to the number of trees and the number of basis functions in the additive expansion. Please note that the initialization is not taken into account in the n.iter count. By default, it is set to 100. |
train.fraction |
the first train.fraction * nrow(data) observations are used to fit the model, and the remainder are used as a validation set to compute out-of-sample performance estimates. By default, it is set to 1. |
interaction.depth |
the maximum depth of variable interactions: 1 builds an additive model, 2 builds a model with up to two-way interactions, etc. This parameter can also be interpreted as the maximum number of non-terminal nodes. By default, it is set to 4. Please note that if this parameter is NULL, the maximum tree depth is governed by the maxdepth argument of tree.control. |
shrinkage |
a shrinkage parameter (in the interval (0,1]) applied to each tree in the expansion. Also known as the learning rate or step-size reduction. By default, it is set to 1. |
bag.fraction |
the fraction of independent training observations randomly selected to propose the next tree in the expansion. This introduces randomness into the model fit; if bag.fraction < 1, running the same model twice will result in similar but different fits, and out-of-bag performance measures become available. By default, it is set to 1. |
colsample.bytree |
each tree will be trained on a random subset of colsample.bytree explanatory variables, drawn at each iteration. By default, it is set to NULL, meaning that all the explanatory variables are used. |
keep.data |
a boolean variable indicating whether to keep the data frames. This is particularly useful if one wants to keep track of the initial data frames, which are further used for predicting whenever no new data frame is supplied. Note that in case of cross-validation, if keep.data = TRUE, the data frames are kept for the final model fit only (not for the per-fold models). By default, it is set to TRUE. |
is.verbose |
if TRUE, the algorithm prints out progress and performance indicators during the fit. By default, it is set to FALSE. |
cv.folds |
a positive integer representing the number of cross-validation folds to perform. If cv.folds > 1, a cross-validation is run on top of the usual fit. By default, it is set to 1, meaning no cross-validation. |
folds.id |
an optional vector of values identifying what fold each observation is in. If supplied, this parameter takes precedence over cv.folds. |
n.cores |
the number of cores to use for parallelization. This parameter is used during the cross-validation. This parameter is bounded between 1 and the maximum number of available cores. By default, it is set to 1 leading to a sequential approach. |
tree.control |
for advanced users only. It allows one to define additional tree parameters that will be used at each iteration. See rpart.control for further information; the default values appear in the usage above. |
weights |
optional vector of weights used in the fitting process. These weights must be positive but do not need to be normalized. By default, it is set to NULL, which corresponds to unit weights for all observations. |
seed |
optional number used as seed, ensuring reproducibility of the fit. Please note that if a cross-validation is performed, the seed also governs the fold-level computations. |
... |
not currently used. |
The NA values are currently dropped using na.omit.
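To make the log-link relationship between response, offset and weights concrete, here is a minimal sketch using glm on hypothetical toy data (BT follows the same weighted-rate convention; the variable names are made up for illustration):

# Under the log-link, a Poisson model for counts Y with offset log(ExpoR) is
# equivalent to modelling the rate Y/ExpoR with weights ExpoR.
d <- data.frame(Y = c(0, 1, 2, 4), ExpoR = c(0.5, 1, 0.8, 0.9), Age = c(25, 40, 60, 35))
fit_offset  <- glm(Y ~ Age, family = poisson(link = "log"),
                   offset = log(ExpoR), data = d)
fit_weights <- suppressWarnings(                       # rates are non-integer.
  glm(I(Y / ExpoR) ~ Age, family = poisson(link = "log"),
      weights = ExpoR, data = d)
)
all.equal(coef(fit_offset), coef(fit_weights))         # TRUE: identical coefficients.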
a BTFit object.
Gireg Willame [email protected]
This package is inspired by the gbm3 package. For more details, see https://github.com/gbm-developers/gbm3/.
M. Denuit, D. Hainaut and J. Trufin (2019). Effective Statistical Learning Methods for Actuaries I: GLMs and Extensions, Springer Actuarial.
M. Denuit, D. Hainaut and J. Trufin (2019). Effective Statistical Learning Methods for Actuaries II: Tree-Based Methods and Extensions, Springer Actuarial.
M. Denuit, D. Hainaut and J. Trufin (2019). Effective Statistical Learning Methods for Actuaries III: Neural Networks and Extensions, Springer Actuarial.
M. Denuit, D. Hainaut and J. Trufin (2022). Response versus gradient boosting trees, GLMs and neural networks under Tweedie loss and log-link. Accepted for publication in Scandinavian Actuarial Journal.
M. Denuit, J. Huyghe and J. Trufin (2022). Boosting cost-complexity pruned trees on Tweedie responses: The ABT machine for insurance ratemaking. Paper submitted for publication.
M. Denuit, J. Trufin and T. Verdebout (2022). Boosting on the responses with Tweedie loss functions. Paper submitted for publication.
See also: BTFit, BTCVFit, BT_call, BT_perf, predict.BTFit, summary.BTFit, print.BTFit, .BT_cv_errors.
## Load dataset.
dataset <- BT::BT_Simulated_Data

## Fit a Boosting Tree model.
BT_algo <- BT(
  formula = Y_normalized ~ Age + Sport + Split + Gender,  # formula
  data = dataset,           # data
  ABT = FALSE,              # Classical Boosting Tree.
  n.iter = 200,
  train.fraction = 0.8,
  interaction.depth = 3,
  shrinkage = 0.01,
  bag.fraction = 0.5,
  colsample.bytree = 2,     # 2 explanatory variables used at each iteration.
  keep.data = FALSE,        # Do not keep a data copy.
  is.verbose = FALSE,       # Do not print progress.
  cv.folds = 3,             # 3-fold cross-validation will be performed.
  folds.id = NULL,
  n.cores = 1,
  weights = ExpoR,          # <=> Poisson model on response Y with ExpoR as offset.
  seed = NULL
)

## Determine the model performance and plot results.
best_iter_val <- BT_perf(BT_algo, method = "validation")
best_iter_oob <- BT_perf(BT_algo, method = "OOB", oobag.curve = TRUE)
best_iter_cv <- BT_perf(BT_algo, method = "cv", oobag.curve = TRUE)
best_iter <- best_iter_val

## Variable influence and plot results.
# Based on the first iteration.
variable_influence1 <- summary(BT_algo, n.iter = 1)
# Using all iterations up to best_iter.
variable_influence_best_iter <- summary(BT_algo, n.iter = best_iter)

## Print results: call, best iterations and summarized relative influence.
print(BT_algo)

## Model predictions.
# Predict on the link scale, using only the best_iter-th tree.
pred_single_iter <- predict(BT_algo, newdata = dataset, n.iter = best_iter,
                            type = "link", single.iter = TRUE)
# Predict on the response scale, using the first best_iter trees.
pred_best_iter <- predict(BT_algo, newdata = dataset, n.iter = best_iter,
                          type = "response")
Fits the (Adaptive) Boosting Trees algorithm. This is for "power" users who have a large number of variables and wish to avoid calling model.frame, which can be slow in this instance. This function is in particular called by BT.
It is mainly split in two parts: the first one performs the initialization (see BT_callInit) whereas the second performs all the boosting iterations (see BT_callBoosting).
By default, this function does not perform input checks (those are all done in BT) and all the parameters should be given in the right format. We therefore suppose that the user is aware of all the choices made.
BT_call(
  training.set, validation.set, tweedie.power, respVar, w, explVar, ABT,
  tree.control, train.fraction, interaction.depth, bag.fraction, shrinkage,
  n.iter, colsample.bytree, keep.data, is.verbose
)

BT_callInit(training.set, validation.set, tweedie.power, respVar, w)

BT_callBoosting(
  training.set, validation.set, tweedie.power, ABT, tree.control,
  interaction.depth, bag.fraction, shrinkage, n.iter, colsample.bytree,
  train.fraction, keep.data, is.verbose, respVar, w, explVar
)
training.set |
a data frame containing all the related variables on which one wants to fit the algorithm. |
validation.set |
a held-out data frame containing all the related variables on which one wants to assess the algorithm performance. This can be NULL. |
tweedie.power |
Experimental parameter, currently not used. Set to 1, referring to the Poisson distribution. |
respVar |
the name of the target/response variable. |
w |
a vector of weights. |
explVar |
a vector containing the name of explanatory variables. |
ABT |
a boolean parameter. If TRUE, the Adaptive Boosting Tree algorithm is fitted; if FALSE, the classical Boosting Tree one. |
tree.control |
allows one to define additional tree parameters that will be used at each iteration. See rpart.control for further information. |
train.fraction |
the first train.fraction * nrow(data) observations are used to fit the model, and the remainder are used as a validation set to compute out-of-sample performance estimates. |
interaction.depth |
the maximum depth of variable interactions: 1 builds an additive model, 2 builds a model with up to two-way interactions, etc. This parameter can also be interpreted as the maximum number of non-terminal nodes. By default, it is set to 4. Please note that if this parameter is NULL, the maximum tree depth is governed by the maxdepth argument of tree.control. |
bag.fraction |
the fraction of independent training observations randomly selected to propose the next tree in the expansion. This introduces randomness into the model fit; if bag.fraction < 1, running the same model twice will result in similar but different fits, and out-of-bag performance measures become available. |
shrinkage |
a shrinkage parameter applied to each tree in the expansion. Also known as the learning rate or step-size reduction. |
n.iter |
the total number of iterations to fit. This is equivalent to the number of trees and the number of basis functions in the additive expansion. Please note that the initialization is not taken into account in the n.iter count. |
colsample.bytree |
each tree will be trained on a random subset of colsample.bytree explanatory variables, drawn at each iteration. If NULL, all the explanatory variables are used. |
keep.data |
a boolean variable indicating whether to keep the data frames. This is particularly useful if one wants to keep track of the initial data frames, which are further used for predicting whenever no new data frame is supplied. Note that in case of cross-validation, if keep.data = TRUE, the data frames are kept for the final model fit only (not for the per-fold models). |
is.verbose |
if TRUE, the algorithm prints out progress and performance indicators during the fit. |
a BTFit object.
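A minimal, assumption-laden sketch of a direct call (normally one would simply use BT); the training/validation split below is arbitrary and each argument is assumed to be in the exact format BT would prepare:

library(rpart)
dataset <- BT::BT_Simulated_Data
train <- dataset[1:40000, ]      # arbitrary split, for illustration only.
valid <- dataset[40001:50000, ]
fit <- BT_call(
  training.set = train,
  validation.set = valid,
  tweedie.power = 1,
  respVar = "Y_normalized",                        # response variable name.
  w = train$ExpoR,                                 # assumed to align with training.set rows.
  explVar = c("Age", "Sport", "Split", "Gender"),
  ABT = FALSE,
  tree.control = rpart.control(xval = 0, maxdepth = 3, cp = -Inf, minsplit = 2),
  train.fraction = 1,
  interaction.depth = 3,
  bag.fraction = 1,
  shrinkage = 0.01,
  n.iter = 50,
  colsample.bytree = NULL,
  keep.data = FALSE,
  is.verbose = FALSE
)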
Gireg Willame [email protected]
See also: BTFit, BTCVFit, BT_perf, predict.BTFit, summary.BTFit, print.BTFit, .BT_cv_errors.
Compute the deviance for the Tweedie family case.
BT_devTweedie(y, mu, tweedieVal, w = NULL)
y |
a vector containing the observed values. |
mu |
a vector containing the fitted values. |
tweedieVal |
a numeric representing the Tweedie power. It has to be a positive number lying outside the open interval (0,1). |
w |
an optional vector of weights. |
This function computes the Tweedie-related deviance. For a power parameter p (tweedieVal), observed value y, fitted value \mu and weight w, the individual deviance contribution is defined as:
d_p(y, \mu) = \begin{cases} w\,(y - \mu)^2, & p = 0,\\ 2w\left[y \log(y/\mu) - (y - \mu)\right], & p = 1,\\ 2w\left[\log(\mu/y) + (y - \mu)/\mu\right], & p = 2,\\ 2w\left[\dfrac{y^{2-p}}{(1-p)(2-p)} - \dfrac{y\,\mu^{1-p}}{1-p} + \dfrac{\mu^{2-p}}{2-p}\right], & \text{otherwise,} \end{cases}
with the usual convention 0 \log 0 = 0.
A vector of individual deviance contribution.
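A short example, computing individual Poisson (tweedieVal = 1) deviance contributions for toy observed and fitted vectors:

y  <- c(0, 1, 2, 5)        # observed values.
mu <- c(0.5, 1.2, 1.8, 4)  # fitted values.
dev_contrib <- BT_devTweedie(y, mu, tweedieVal = 1, w = rep(1, length(y)))
dev_contrib                # individual deviance contributions.
sum(dev_contrib)           # total Poisson deviance.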
Gireg Willame [email protected]
Method to perform additional iterations of the Boosting Tree algorithm, starting from an initial BTFit object. This does not support further cross-validation. Moreover, this approach is only allowed if keep.data = TRUE in the original call.
BT_more(BTFit_object, new.n.iter = 100, is.verbose = FALSE, seed = NULL)
BTFit_object |
a BTFit object resulting from an initial call to BT, fitted with keep.data = TRUE. |
new.n.iter |
number of new boosting iterations to perform. |
is.verbose |
a logical specifying whether or not the additional fitting should run "noisily", with feedback on progress provided to the user. |
seed |
optional seed used to perform the new iterations. By default, no seed is set. |
Returns a new BTFit object containing the initial call as well as the new iterations performed.
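A minimal sketch, assuming BT_fit is a model previously obtained through BT with keep.data = TRUE and train.fraction < 1:

# Perform 100 additional boosting iterations on top of the existing ones.
BT_fit_extended <- BT_more(BT_fit, new.n.iter = 100, is.verbose = FALSE, seed = 44)
# The returned BTFit object can be used as usual, e.g. to re-assess performance.
best_iter <- BT_perf(BT_fit_extended, method = "validation", plot.it = FALSE)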
Gireg Willame [email protected]
Function to compute the performances of a fitted boosting tree.
BT_perf(
  BTFit_object,
  plot.it = TRUE,
  oobag.curve = FALSE,
  overlay = TRUE,
  method,
  main = ""
)
BTFit_object |
a BTFit object resulting from a call to BT. |
plot.it |
a boolean indicating whether to plot the performance measures. Setting plot.it to FALSE disables the plot. |
oobag.curve |
indicates whether to plot the out-of-bag performance measures in a second plot. Note that this option only makes sense if bag.fraction < 1 was used in the original call. |
overlay |
if set to TRUE, the out-of-bag curve (when requested) is overlaid on the performance plot rather than shown separately. |
method |
indicates the method used to estimate the optimal number of boosting iterations. Setting method to "validation" bases the estimate on the validation-set errors, "OOB" on the out-of-bag improvements and "cv" on the cross-validation errors; the corresponding quantities must be available in the fitted object. |
main |
optional parameter allowing the user to define a specific plot title. |
Returns the estimated optimal number of iterations. The method of computation depends on the method argument.
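For instance, reusing the BT_algo object fitted in the BT example above (which uses cross-validation, bag.fraction < 1 and train.fraction < 1, so that all three methods are available):

best_iter_cv  <- BT_perf(BT_algo, method = "cv")                       # cross-validation errors.
best_iter_val <- BT_perf(BT_algo, method = "validation", plot.it = FALSE)
best_iter_oob <- BT_perf(BT_algo, method = "OOB", oobag.curve = TRUE)  # out-of-bag improvements.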
Gireg Willame [email protected]
A simulated database used for examples and vignettes. The variables are related to a motor insurance pricing context.
BT_Simulated_Data
A simulated data frame with 50,000 rows and 7 columns, containing simulated records for different policyholders:
Gender | the policyholder's gender, varying between male and female. |
Age | the policyholder's age, varying from 18 to 65 years old. |
Split | a noisy variable, not used to simulate the response variable. It allows one to assess how the algorithm handles such features. |
Sport | the car type, varying between yes (sports car) and no. |
ExpoR | the yearly exposure-to-risk, varying between 0 and 1. |
Y | the yearly claim number, simulated from a Poisson distribution. |
Y_normalized | the yearly claim frequency, corresponding to the ratio between Y and ExpoR. |
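A quick look at the database (the column names below follow the BT example in this documentation):

dataset <- BT::BT_Simulated_Data
str(dataset)   # 50,000 rows and 7 columns.
head(dataset)
# Overall observed claim frequency, weighted by the exposure-to-risk.
sum(dataset$Y) / sum(dataset$ExpoR)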
These are objects representing CV fitted boosting trees.
CV (Adaptive) Boosting Tree Model Object.
A legitimate BTCVFit object is a list of BTFit objects, with each element corresponding to a specific BT fit on a particular fold.
Gireg Willame [email protected]
See also: BT.
These are objects representing fitted boosting trees.
Boosting Tree Model Object.
BTInit |
an object of class BTInit containing the initial fit, i.e. the initial value together with the related training and validation errors. |
BTErrors |
an object of class BTErrors containing the error vectors (training, validation, out-of-bag improvements and/or cross-validation errors, whenever available) computed across the boosting iterations. |
BTIndivFits |
an object of class BTIndivFits containing the list of individual rpart trees fitted at each boosting iteration. |
distribution |
the Tweedie power (and so the distribution) that has been used to perform the algorithm. It will currently always output 1. |
var.names |
a vector containing the names of the explanatory variables. |
response |
the name of the target/response variable. |
w |
a vector containing the weights used. |
seed |
the used seed, if any. |
BTData |
if keep.data = TRUE, the training and validation sets used to fit the model; not present otherwise. |
BTParams |
an object of class BTParams containing the boosting parameters used, i.e. ABT, train.fraction, shrinkage, interaction.depth, bag.fraction, colsample.bytree, tree.control and n.iter. |
keep.data |
the keep.data argument value as used in the original call. |
is.verbose |
the is.verbose argument value as used in the original call. |
fitted.values |
the training set fitted values on the score scale, using all the n.iter fitted trees. |
cv.folds |
the number of cross-validation folds. Set to 1 if no cross-validation performed. |
call |
the original call to the BT function. |
Terms |
the model.frame Terms argument. |
folds |
a vector of values identifying to which fold each observation belongs. This argument is not present if there is no cross-validation. On the other hand, it corresponds to folds.id whenever this parameter was originally supplied. |
cv.fitted |
a vector containing the cross-validation fitted values, if a cross-validation was performed. More precisely, for a given observation, the prediction will be furnished by the cv-model
for which this specific observation was out-of-fold. See |
The following components must be included in a legitimate BTFit object.
Gireg Willame [email protected]
See also: BT.
Predicted values based on a boosting tree model object.
## S3 method for class 'BTFit'
predict(object, newdata, n.iter, type = "link", single.iter = FALSE, ...)
object |
a BTFit object resulting from a call to BT. |
newdata |
data frame of observations for which to make predictions. If missing or not a data frame, the original training data are used, provided keep.data = TRUE was set in the initial fit. |
n.iter |
number of boosting iterations used for the prediction. This parameter can be a vector in which case predictions are returned for each iteration specified. |
type |
the scale on which the BT makes the predictions. Can either be "link" or "response". Note that, by construction, a log-link function is used during the fit. |
single.iter |
if TRUE, the predictions are computed using only the single n.iter-th tree rather than the whole expansion up to n.iter. |
... |
not currently used. |
predict.BTFit produces predicted values for each observation in newdata using the first n.iter boosting iterations. If n.iter is a vector, then the result is a matrix with each column corresponding to the BT predictions with n.iter[1] boosting iterations, n.iter[2] boosting iterations, and so on.
As for the fit, the predictions do not include any offset term. In the Poisson case, please keep in mind that a weighted approach is initially favored.
Returns a vector (or a matrix, if n.iter is a vector) of predictions. By default, the predictions are on the score scale. If type = "response", then BT converts them back to the same scale as the outcome. Note that a log-link is assumed by construction.
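As an illustration of the vector case, reusing the BT_algo object and dataset from the BT example above:

# One column per requested number of iterations.
pred_mat <- predict(BT_algo, newdata = dataset, n.iter = c(50, 100, 200),
                    type = "response")
dim(pred_mat)  # nrow(dataset) rows and 3 columns.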
Gireg Willame [email protected]
Function to print the BT results.
## S3 method for class 'BTFit'
print(x, ...)
x |
a BTFit object. |
... |
arguments passed to print.default. |
Prints the different input parameters as well as the obtained results (best iteration/performance and relative influence), given the chosen approach.
No value returned.
Gireg Willame [email protected]
See also: BT, .BT_relative_influence, BT_perf.
Computes the relative influence of each variable in the BTFit object.
## S3 method for class 'BTFit'
summary(
  object,
  cBars = length(object$var.names),
  n.iter = object$BTParams$n.iter,
  plot_it = TRUE,
  order_it = TRUE,
  method = .BT_relative_influence,
  normalize = TRUE,
  ...
)
object |
a BTFit object resulting from a call to BT. |
cBars |
the number of bars to plot. If order_it = TRUE, only the cBars variables with the largest relative influence appear in the barplot; otherwise, the first cBars variables are shown. |
n.iter |
the number of trees used to compute the relative influence. Only the first n.iter trees are taken into account. |
plot_it |
an indicator as to whether the plot is generated. |
order_it |
an indicator as to whether the plotted and/or returned relative influences are sorted. |
method |
the function used to compute the relative influence. Currently, only .BT_relative_influence is available. |
normalize |
if TRUE, the computed relative influences are normalized to sum up to 100. |
... |
additional argument passed to the plot function. |
Please note that the relative influence for variables having an original negative relative influence is forced to 0.
Returns a data frame where the first component is the variable name and the second one is the computed relative influence, normalized to sum up to 100. Depending on the plot_it value, the relative influence plot is also displayed.
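For instance, reusing the BT_algo object from the BT example above:

# Relative influence based on the first 100 trees only, without plotting.
rel_inf <- summary(BT_algo, n.iter = 100, plot_it = FALSE, order_it = TRUE)
head(rel_inf)  # variable names and their normalized relative influences.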
Gireg Willame [email protected]