Inspired by a recent conversation about machine learning and decision trees, I have made my own for predicting 50+ PTS. Using the same dataset we used in the previous post, I built the decision tree model below in R, which is ~84% accurate with a sensitivity score of 64% (of the performances flagged as 50+, 64% were correct). This is an OK model, and it continues to back up what was found in the regression model that was built.
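Both of those numbers fall out of the model's confusion matrix. Here is a quick Python sketch with hypothetical counts (the post doesn't list the actual matrix) showing how they're computed. One small note: the share of *predicted* 50+ games that were right is technically precision; sensitivity is the share of *actual* 50+ games the model caught.

```python
# Sanity-checking the reported numbers with a HYPOTHETICAL confusion
# matrix -- these counts were chosen only to reproduce ~84% / 64%,
# they are not the real counts from the model.

def accuracy(tp, tn, fp, fn):
    # share of all games classified correctly
    return (tp + tn) / (tp + tn + fp + fn)

def sensitivity(tp, fn):
    # true positive rate: of the actual 50+ games, how many were caught
    return tp / (tp + fn)

def precision(tp, fp):
    # of the games the model flagged as 50+, how many were right
    return tp / (tp + fp)

tp, tn, fp, fn = 16, 194, 31, 9  # hypothetical counts
print(accuracy(tp, tn, fp, fn))   # 0.84
print(sensitivity(tp, fn))        # 0.64
```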

This is a very simple model. One of the most impactful settings is that I required the model to see at least 10 examples at a node before a split can be made. In other words, when the model searches the data for the variable that best separates the 50+ results, each split has to be backed by at least 10 examples. This keeps the model fairly generalized, but then again, hitting 50+ isn't a common occurrence. I ran the data through another algorithm and got back nothing!
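To make that minimum-examples rule concrete, here is a small pure-Python toy (not the actual R model, and the data in the test below is made up): a Gini-based split finder that simply refuses to split any node holding fewer than 10 rows, the same idea as the constraint described above.

```python
def gini(labels):
    # Gini impurity of a set of 0/1 labels
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def best_split(rows, labels, minsplit=10):
    """Return the (feature_index, threshold) that most reduces Gini
    impurity, or None if the node holds fewer than `minsplit` rows
    (mirroring the at-least-10-examples rule from the post)."""
    if len(rows) < minsplit:
        return None  # too few examples at this node: leave it as a leaf
    parent = gini(labels)
    best, best_gain = None, 0.0
    for f in range(len(rows[0])):
        for t in sorted(set(r[f] for r in rows)):
            left = [l for r, l in zip(rows, labels) if r[f] < t]
            right = [l for r, l in zip(rows, labels) if r[f] >= t]
            if not left or not right:
                continue
            w = len(left) / len(rows)
            gain = parent - (w * gini(left) + (1 - w) * gini(right))
            if gain > best_gain:
                best_gain, best = gain, (f, t)
    return best
```

With 10+ rows it returns the best cut point; hand it a node with fewer than 10 rows and it returns None, so that node stays a leaf no matter how cleanly it could be split.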

Let’s use this decision tree against a recent 56-point performance from Trae Young.

He had 17 FGM, so we go left with a yes.

FGM < 17? NO! We go right.

FTM < 11? NO! He had 15, so we go right.

FG3A < 7? NO! He had 12, so we go right and end with a YES.
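The walk above can be sketched as a short function. Two assumptions to flag: the post only walks the right-hand path, so the "yes" branches are treated here as predicting no 50+ game, and the first node shown isn't fully specified, so this sketch starts at the FGM < 17 split.

```python
def predicts_50_plus(fgm, ftm, fg3a):
    """Follow the right-hand path of the tree described in the post.
    The left ('yes') branches aren't detailed there, so this sketch
    treats them as predicting no 50+ game -- an assumption, not the
    full model."""
    if fgm < 17:
        return False  # left branch, not detailed in the post
    if ftm < 11:
        return False  # left branch, not detailed in the post
    if fg3a < 7:
        return False  # left branch, not detailed in the post
    return True       # FGM >= 17, FTM >= 11, FG3A >= 7: predict 50+

# Trae Young's 56-point game: 17 FGM, 15 FTM, 12 FG3A
print(predicts_50_plus(17, 15, 12))  # True
```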

Nice man