I know this is an ill-defined question, but it would be reassuring to know that a loss function does not perform worse than expected.
Here is a small experiment with a staple classification data set: the spam data.
- 2/3 of the data is used for training and the remaining 1/3 for testing.
- The gbm package (in R) is used with interaction.depth = 2 and the default settings for the other parameters.
- Two loss functions: distribution = "bernoulli" and distribution = "adaboost".
The results: a 4.1% test error rate for "bernoulli" and 5.5% for "adaboost". According to The Elements of Statistical Learning, logistic regression achieves roughly a 5.5% test error rate on this data. Does the experiment show that the AdaBoost loss function is no better than vanilla logistic regression?