Posts

Symbiosis in the NBA

2022-03-22

It’s been a while since I’ve done an NBA analytics project, but I’ve recently been intrigued by player-player interactions within teams. Oftentimes, fans have a hunch that two players “mesh” well together or two players’ playstyles do not complement one another. However, for the most part, this is a qualitative observation. In this article, I will present a simple, quantitative way of discovering favorable/unfavorable duos in the NBA (in addition to investigating specific duos).

[Read more]

Finding Determinants of NBA Shot Probability using Interpretable Machine Learning Methods

2020-10-25

This is a project that I am presenting as a poster at the CMU Sports Analytics Conference. A full version of this research (and associated code) is here: https://github.com/avyayv/CMSACRepo.

You may also view the poster I created at: http://www.stat.cmu.edu/cmsac/poster2020/posters/Varadarajan-NBAShotProb.pdf

Overview

Since the advent of basketball analytics, a metric that can accurately determine the relative worth of a player’s defense has been widely sought after. Features like shot defense are widely regarded as key to a player’s defensive identity, but regularized on-off metrics like RAPM are unable to take them into account. Using player-tracking data, we are able to extract information about shot defense.

[Read more]

Where do assists come from? (Part 2)

2020-07-05

A few weeks ago, I did some analysis with archived SportVU Player Tracking data (2015-16), looking at where on the court assists come from. You can read about that analysis at these links:

Blog post: https://analyzeball.com/2020/06/02/where-do-assists-come-from/

Specific players: https://twitter.com/avyvar/status/1267189790388056064

League wide trends: https://twitter.com/avyvar/status/1270892658437705733

Here, I’m taking a deeper dive into this data, looking at assists off misses and comparisons by position (Guards, Forwards, Centers).

In addition, you might notice that the overall distribution of assists differs slightly from my previous tweets. This is because I used a more robust method of discovering assists from the raw SportVU logs. Previously, I used play-by-play data to find a general timeframe for when assists occurred. Now, I determine the exact shot release time using calculus (smoothing the ball’s height above the ground and then taking its derivative with respect to time) and then backtrack to find the last pass before the shot. This led to a significant increase in the number of assists in my dataset (roughly twice as many as before), which changed the overall distribution.
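As a rough illustration of the release-time idea, here is a minimal sketch in plain Python. It assumes the ball height is sampled at a fixed frame rate (the 25 frames per second default is an assumption), smooths it with a simple moving average, and takes a finite-difference derivative with respect to time. Treating the frame of maximum upward velocity as the release is only one plausible heuristic, and the function names and window size are hypothetical; the actual pipeline may differ.

```python
def moving_average(values, window=5):
    """Smooth a sequence with a centered moving average (shorter at the edges)."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


def derivative(values, dt):
    """Central finite difference in time (one-sided at the two ends)."""
    n = len(values)
    rates = []
    for i in range(n):
        lo = max(0, i - 1)
        hi = min(n - 1, i + 1)
        rates.append((values[hi] - values[lo]) / ((hi - lo) * dt))
    return rates


def estimate_release_frame(ball_heights, fps=25.0, window=5):
    """Return the frame index where the smoothed ball height rises fastest.

    Assumption: the ball accelerates upward while in the shooter's hands and
    decelerates under gravity afterwards, so peak upward velocity is taken
    as an approximation of the release.
    """
    smoothed = moving_average(ball_heights, window)
    velocity = derivative(smoothed, 1.0 / fps)
    return max(range(len(velocity)), key=lambda i: velocity[i])
```

Once a release frame is estimated, the last pass before it can be found by walking backwards through the tracking frames. Note that smoothing shifts the estimate by a frame or two, so a wider window trades noise for timing precision.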

[Read more]

Playing With Win Probability Models

2020-06-20

I recently developed a win probability model for the awesome py_ball package in Python. The package itself makes NBA/WNBA data accessible to a wide audience. If you haven’t seen it, you should definitely check it out. The link is https://github.com/basketballrelativity/py_ball.

In this blog post, I’ll describe the methods I used to develop the model.

Methods

Our model relies heavily on a series of logistic regressions, which depend on (a) the amount of time remaining in the game, (b) the point differential, and (c) who has possession. Right now, the only bias we introduce at the beginning of the game is home-court advantage, which is why the home team always has slightly better odds than the away team. This happens because we feed everything into the model with respect to the home team, so the model learns that the home team has a slight advantage. We are hoping to add betting odds to find true pre-game win probabilities.
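To make the structure concrete, here is a minimal sketch of what "a series of logistic regressions" could look like: one set of coefficients per time bucket, applied to the point differential and possession from the home team's point of view. Every number and name below is hypothetical and chosen only for illustration; these are not the fitted py_ball coefficients, and the bucket boundaries are an assumption.

```python
import math

# Hypothetical coefficients, one set per time bucket (minutes remaining).
# Real values would be fit with logistic regression on historical games.
# Everything is expressed from the home team's point of view.
HYPOTHETICAL_MODELS = [
    # (max minutes remaining, intercept, point-diff weight, possession weight)
    (1.0, 0.05, 0.90, 0.60),
    (12.0, 0.10, 0.20, 0.10),
    (48.0, 0.15, 0.05, 0.02),
]


def home_win_probability(minutes_left, point_diff, home_has_ball):
    """Home win probability from the logistic model for this time bucket.

    point_diff is home score minus away score. If minutes_left exceeds
    every bucket, the last (earliest-game) bucket is used.
    """
    for max_minutes, intercept, w_diff, w_poss in HYPOTHETICAL_MODELS:
        if minutes_left <= max_minutes:
            break
    z = intercept + w_diff * point_diff + w_poss * (1.0 if home_has_ball else 0.0)
    return 1.0 / (1.0 + math.exp(-z))
```

In this sketch the positive intercept plays the role of home-court advantage: with a tied score at tip-off, the home team still comes out slightly above 50%. The larger point-differential weight in late buckets reflects that a lead matters more when little time remains.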

[Read more]

Where do assists come from? (Part 1)

2020-06-02

I recently tweeted some assist heat maps that were generated using 2015-16 SportVU data here.

Although the individual player heat maps are interesting, I wanted to look at more league-wide trends. I also wanted to explain my methods a little bit more.

Why?

I found this specific problem interesting because of its potential implications.

Players in the NBA, and in basketball generally, have inherent biases about where they prefer to shoot. For instance, if a player like Ben Simmons were standing at the 3-point line, you wouldn’t guard him as tightly as you would Stephen Curry. Essentially, you could adjust coaching strategy if you better understood player tendencies.

[Read more]

Elam Ending Analytics

2020-03-15

With the NBA season postponed, there has been a lack of basketball in the world. As a result, I thought it would be interesting to look in depth at whether the Elam Ending has a place in the current NBA and how it would work.

What is the Elam Ending?

If you didn’t watch the All-Star Game in 2020, the Elam Ending is an idea where, at the start of a period, the teams are given a target score rather than fighting against the clock. Rather than playing a 5-minute overtime or a 12-minute fourth quarter, each team would have to reach a certain number of points, based on the higher score in the game.
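As a small sketch of the rule described above, the target can be computed from the higher of the two scores plus a fixed margin. The margin is left as a parameter because it is a design choice; the 2020 All-Star Game used 24. The function names are hypothetical.

```python
def elam_target_score(home_score, away_score, margin):
    """Target score: the leading team's score plus a fixed margin."""
    return max(home_score, away_score) + margin


def game_is_over(home_score, away_score, target):
    """With no clock, the game ends when either team reaches the target."""
    return max(home_score, away_score) >= target
```

For example, with a 100-95 score and a margin of 24, both teams play to the same target, so the trailing team has to outscore the leader by more than the current deficit before the leader scores the margin.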

[Read more]

Clustering NBA Shot Charts (Part 2)

2019-12-24

My previous blog post showed how cluster-able NBA shot charts were. I recently made a few improvements to the model and explored some things that I didn’t cover in the previous article.

A quick summary of that article is that I generated a 14-dimensional vector of shot frequencies for different locations on the court for each player over a season. Then I ran k-means clustering on these vectors.
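For readers unfamiliar with k-means, here is a minimal plain-Python sketch of the standard Lloyd's algorithm applied to a list of frequency vectors. It is a generic illustration, not the code from the article; the function names, the random initialization, and the fixed iteration count are assumptions.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=50, seed=0):
    """Cluster vectors into k groups; returns (labels, centroids)."""
    rng = random.Random(seed)
    # Initialize centroids from k distinct input vectors.
    centroids = [list(v) for v in rng.sample(vectors, k)]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each vector joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids
```

Each input would be one player's 14-dimensional shot-frequency vector for a season, and players sharing a label have similar shot distributions. In practice a library implementation with smarter initialization is usually preferable.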

Most of the methodology is the same between the two, so please read the other article for more depth.

[Read more]

Clustering NBA Shot Charts (Part 1)

2019-11-29

Methodology

In the NBA, we often assign labels to players without really looking in depth at what constitutes these labels. One way to figure out the “definition” of these labels, and to see whether they actually exist, is to use an algorithm known as k-means clustering to cluster shot charts (that is, to find similar shot charts given a set of features).

My approach for clustering the shot charts was to bin groups of shots, much like we sometimes do for visualization. Binning means that I represented each player’s shots as a vector holding the shot frequency for each location, like so.
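A minimal sketch of the binning step might look like the following. For brevity it uses four coarse distance bands rather than the actual court locations used in the article, and the zone names, cutoffs, and coordinate convention (hoop at the origin, units in feet) are all assumptions for illustration.

```python
import math

# Hypothetical coarse zones, ordered so every vector has the same layout.
ZONES = ["rim", "short", "midrange", "three"]


def zone_of(x, y):
    """Map a shot location (feet, hoop at the origin) to a zone name."""
    distance = math.hypot(x, y)
    if distance < 4:
        return "rim"
    if distance < 14:
        return "short"
    if distance < 23.75:  # simplification: ignores the shorter corner three
        return "midrange"
    return "three"


def shot_frequency_vector(shots):
    """Fraction of a player's shots that fall in each zone."""
    counts = {zone: 0 for zone in ZONES}
    for x, y in shots:
        counts[zone_of(x, y)] += 1
    total = len(shots)
    return [counts[zone] / total if total else 0.0 for zone in ZONES]
```

Using frequencies rather than raw counts keeps high-volume and low-volume shooters comparable, since every vector sums to 1.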

[Read more]

Playmaking in the Playoffs vs. the Regular Season

2019-05-20

The 2019 NBA Playoffs have been excellent, with teams playing at their absolute best. We’ve seen teams like the Warriors and the Bucks absolutely dominate, but how have these teams, along with others, changed their playmaking strategies? For instance, the Bucks in the Playoffs have clearly decided to have Giannis drive into the paint more and pass out less. This is because Giannis’s shots in the paint generate more points per shot (field goal percentage × shot value, either three or two) than a typical three-point shooter’s attempts would.
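The points-per-shot comparison is simple arithmetic, sketched below. The percentages in the usage note are made-up round numbers for illustration, not actual playoff statistics.

```python
def points_per_shot(field_goal_pct, shot_value):
    """Expected points per attempt: make probability times shot value."""
    return field_goal_pct * shot_value


def better_option(paint_fg_pct, three_fg_pct):
    """Return which shot type yields more expected points per attempt."""
    paint = points_per_shot(paint_fg_pct, 2)
    three = points_per_shot(three_fg_pct, 3)
    return "paint" if paint > three else "three"
```

For instance, a hypothetical 65% finisher in the paint compared with a hypothetical 36% three-point shooter would favor the drive, since 0.65 × 2 exceeds 0.36 × 3.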

[Read more]

Usage Rate - Regular Season vs. Playoffs

2019-05-08

When we look at games in the playoffs, we see completely different strategies employed by teams. Star players seem to be relied on more than they are in the regular season, while players with smaller roles seem to be less useful. This ‘hunch’ can be represented with a graph of usage rates in the playoffs vs. in the regular season. Here is a graph (with a line fit using basic linear regression).
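For reference, a basic linear regression line of this kind can be fit with ordinary least squares, as in the plain-Python sketch below. The function name is hypothetical, and the inputs would be each player's regular-season usage rate (x) and playoff usage rate (y).

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of x-y co-deviations.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept
```

A fitted slope above 1 would be consistent with the hunch: high-usage players would see their usage rise in the playoffs while low-usage players would see theirs fall.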

[Read more]