Jake Campbell

Product analyst at Vimeo
3 > 2

Mixed-Effects Models

One of the main assumptions of regression is independence of observations. This means we don’t want to measure the same subject more than once, or deal with observations that are connected to each other. Violations show up a lot in longitudinal studies and in studies where observations fall into groups. If we violate this assumption, our coefficients and p-values aren’t going to be reliable; standard errors are often too small, so effects can look more significant than they really are. How can we deal with this issue? One way that we’ll go over today is the mixed-effects model!

Read More

Parameter Tuning with Caret

One of the most important aspects of any model is its parameters. For example, how do we know the ideal number of trees for a random forest, or the minimum number of observations needed in each node of a GBM? Finding just the right parameters can take your model from good to great… the question is (as with most things in life), how can I do this without putting a ton of manual work into it? Well, this tutorial shows how to search for ideal model parameters using the caret package we went over in a previous tutorial!

Read More

Decision Trees Pt. 3: Boosted Decision Trees

The past two tutorials have focused on different models involving a tree-based structure. The last one gave an introduction to the idea of ensemble modeling, or combining multiple models together to get a better prediction. This lesson focuses on a different type of ensemble model called boosting.

Read More

Decision Trees Pt. 2: Bagged Trees and Random Forest

In the last tutorial, we saw the basics of a single decision tree. While easy to interpret and understand, it still leaves some things to be desired. In general, CARTs (Classification and Regression Trees) are weak learners that struggle with high variance. A single tree often doesn’t perform much better than basic regression, and small changes in the data can lead to different trees and interpretations. To address this, we can use ensemble methods, creating several tree models rather than relying on a single tree. Two related ensemble tree methods we’ll look at in this tutorial are bagged trees and random forests.

Read More

Decision Trees Pt. 1

So far, we’ve looked at parametric models. These models rest on a set of assumptions that must hold for their results to be statistically valid. There are other classes of bad-boy type models that don’t follow the rules, and don’t really care. One of the more popular types that we’ll go over today is the basic decision tree. This lesson is going to cover the basics of a simple decision tree model, but we’ll expand on it much more in the near future.

Read More