Great (and Not-So-Great) Expectations
Posted by Neil Paine on October 31, 2008
The 2008-09 NBA season is finally getting started, and just about every fan and media member is concerned with what we can expect from each team. So here's a question: from a strictly statistical perspective, which teams in NBA history most exceeded their preseason expectations? By the same token, which teams were the most disappointing in light of their expectations?
Well, first of all, we're going to have to come up with a formula to define "expectations" for every team. Logically, you'd think that past W-L records would be the place to start, so I created 5 linear regression models, regressing a team's winning percentage for the year in question (year Y) on the previous 5, 4, 3, 2, & 1 years (Y-5, Y-4, etc.). The results weren't surprising: the only year that really matters in terms of establishing expectations is Y-1, the season immediately prior to the one we're trying to predict. In other words, for every 2002 Lakers and 1998 Spurs — where Y-1 didn't accurately reflect the true abilities of a team as well as Y-2 did — there are countless other examples of teams whose expectations were best set by simply looking at the previous year's record.
As a matter of fact, we can even do one better than simply looking at W-L records. A common piece of APBRmetrics wisdom is that point differential is actually more informative than winning % when assessing a team's strength. And here at Basketball-Reference, we happen to have the ultimate incarnation of point differential: the Simple Rating System, which also adjusts for strength of schedule. Sure enough, when we use the SRS to run the same series of regressions that we did with winning percentage, we find that Y-1 is again the only season that is significant in establishing our win expectations in year Y — and we also see that the SRS is a slightly better predictor of future performance than winning percentage (R-squared of 0.44 vs. 0.42). Armed with that information, we can use the following equation to create our "expected wins" for any given season:
xWins_Y = 41 + (1.88 * SRS_Y-1)
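In code, the expectation formula is a one-liner. This is just a minimal sketch of the equation above; the function name `expected_wins` is mine, not anything from the site:

```python
def expected_wins(srs_prev):
    """Expected wins in year Y from the prior season's SRS (year Y-1).

    The intercept of 41 is simply a .500 record over an 82-game
    schedule; each point of SRS is worth roughly 1.88 expected wins.
    """
    return 41.0 + 1.88 * srs_prev

# The 2008 Celtics entered the season off a -3.706 SRS:
print(round(expected_wins(-3.706), 1))  # 34.0
```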
So, according to our very simple method for establishing preseason expectations, which teams were the biggest surprises of all-time?
Year  Team  SRS_Y-1  xWins  Wins  Diff
2008  BOS   -3.706   34.0   66.0  32.0
1998  SAS   -7.926   26.1   56.0  29.9
1990  SAS   -7.450   27.0   56.0  29.0
1980  BOS   -4.775   32.0   61.0  29.0
2005  PHO   -2.941   35.5   62.0  26.5
1970  MIL   -5.067   31.5   56.0  24.5
1989  PHO   -4.801   32.0   55.0  23.0
1996  CHI    4.311   49.1   72.0  22.9
1972  LAL    3.264   47.1   69.0  21.9
2000  LAL    2.675   46.0   67.0  21.0
What's the common thread here? Each of these teams either had key offseason additions, or they exploded for an historically great season (or both). In the first category, last year's Celtics obviously added Kevin Garnett, Ray Allen, and James Posey; the '98 Spurs added Tim Duncan & a healthy David Robinson; in 1990, San Antonio added Robinson, Terry Cummings, & Sean Elliott; the '80 Celtics added Larry Bird; the 2005 Suns added Steve Nash; the '70 Bucks added Kareem Abdul-Jabbar; and the '89 Suns added Tom Chambers & a full season of Kevin Johnson. The other 3 teams were already good in Y-1, but each peaked in year Y as one of the greatest teams in NBA history. Needless to say, rare outbursts like that are pretty hard to see coming.
And how about the most disappointing teams of all-time?
Year  Team  SRS_Y-1  xWins  Wins  Diff
1999  CHI    7.244   54.6   21.3  -33.3
1997  SAS    5.975   52.2   20.0  -32.2
1965  SFW    4.390   49.3   17.4  -31.8
1983  HOU   -0.393   40.3   14.0  -26.3
2007  MEM    3.738   48.0   22.0  -26.0
1973  PHI   -3.441   34.5    9.0  -25.5
1953  PHW   -1.071   39.0   14.3  -24.7
1985  NYK    3.789   48.1   24.0  -24.1
1991  DEN    1.562   43.9   20.0  -23.9
2008  MIA   -1.209   38.7   15.0  -23.7

(Note: All teams were pro-rated to an 82-game schedule.)
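The pro-rating mentioned in the note is simple to sketch (the helper name is hypothetical): wins are scaled by 82 over the games actually played, which is how the lockout-shortened 1999 Bulls end up with the fractional 21.3 shown above.

```python
def prorate_to_82(wins, games_played):
    """Scale a team's win total to an 82-game schedule."""
    return wins * 82.0 / games_played

# The 1999 Bulls won 13 games in the 50-game lockout season:
print(round(prorate_to_82(13, 50), 1))  # 21.3
```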
The causes of these collapses are more varied and complex: Michael Jordan et al.'s departure from the Bulls is easily the single biggest mass exodus of talent from any team in NBA history, while other franchises crumbled due to key injuries (Robinson's Spurs, Pau Gasol's Grizzlies), isolated personnel losses (Wilt Chamberlain leaving the Warriors), or just plain old attrition ('85 Knicks, '08 Heat). Catastrophes like these are, for the most part, easier to predict than the pleasant surprises that filled the previous section, assuming you have all the facts in hand: it's always risky to guess how a player will fit within a new team's system, but it doesn't take a rocket scientist to see that a team is going to fall apart without its superstar player(s).
That said, what kind of expectations does the model set for the 2008-09 season? Here's what happens when we apply the equation to last year's SRS numbers:
Year  Team  SRS_Y-1  xWins
2009  BOS    9.307   58.5
2009  LAL    7.344   54.8
2009  UTA    6.867   53.9
2009  DET    6.671   53.5
2009  NOH    5.464   51.3
2009  PHO    5.138   50.7
2009  SAS    5.104   50.6
2009  HOU    4.835   50.1
2009  ORL    4.788   50.0
2009  DAL    4.702   49.8
2009  DEN    3.739   48.0
2009  TOR    2.469   45.6
2009  GSW    2.381   45.5
2009  PHI    0.188   41.4
2009  POR   -0.520   40.0
2009  CLE   -0.525   40.0
2009  WAS   -0.605   39.9
2009  SAC   -1.854   37.5
2009  IND   -1.864   37.5
2009  ATL   -2.228   36.8
2009  CHI   -3.191   35.0
2009  CHA   -4.484   32.6
2009  NJN   -5.146   31.3
2009  MEM   -5.752   30.2
2009  MIN   -6.254   29.2
2009  NYK   -6.543   28.7
2009  LAC   -6.561   28.7
2009  MIL   -6.912   28.0
2009  OKC   -8.037   25.9
2009  MIA   -8.530   25.0
Just eyeballing the list, it looks like Denver and at least one of the Phoenix/Dallas/San Antonio triad are good bets to underperform their expected win totals, while Miami, Cleveland, Philadelphia, and (especially) Houston could exceed these expectations.
As an aside, I wonder how these expected records would stand up in Erich Doerr's projection comparison? I have a hunch that they'd do surprisingly well, since most of the competitors in last year's APBRmetrics prediction challenge (including myself) had standard errors well above 9.0. Which just goes to show that you can have the most sophisticated projection system in the world, but there's a good chance it won't predict the standings any better than last year's Simple Ratings.
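For reference, the "standard error" benchmark being compared here is essentially the root-mean-square error of a set of win projections against actual win totals. A quick sketch of the computation, using made-up toy numbers rather than any actual projections:

```python
import math

def projection_rmse(projected, actual):
    """Root-mean-square error of a set of win projections."""
    n = len(projected)
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(projected, actual)) / n)

# Toy numbers just to illustrate the calculation:
proj = [58.5, 54.8, 25.0]
act = [62, 65, 15]
print(round(projection_rmse(proj, act), 2))
```

A system with a standard error "well above 9.0" is, on average, missing each team's win total by more than nine games.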
October 31st, 2008 at 11:06 am
If you do a regression with both y-1 wins and srs, do you still get an r^2 around .44 since they're so highly correlated?
October 31st, 2008 at 11:10 am
One variable to consider adding would be the previous year's win share weighted average age. That could be one number that might pick up the direction a team's headed in.
October 31st, 2008 at 12:48 pm
Yeah, that's exactly what happens. When I regress SRS_Y-1 and Win%_Y-1 on Win%_Y, the r-squared is still 0.44, because they're so correlated with each other (the r-value between SRS and winning % is .96). In fact, in the multivariate regression, the Win%_Y-1 variable isn't even significant at the 5% level, which just reinforces SRS' predictive superiority over straight-up winning %.
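The collinearity effect described here is easy to see with simulated data (synthetic numbers, not the actual historical seasons; `r_squared` is a hypothetical helper): when two predictors carry nearly the same information, adding the second one barely moves the r-squared.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate 300 team-seasons where SRS drives next year's winning %,
# and prior-year Win% is itself just noisy SRS (so the two predictors
# are very highly correlated with each other).
srs = rng.normal(0, 4.5, 300)
wpct_prev = 0.5 + 0.03 * srs + rng.normal(0, 0.02, 300)
wpct_next = 0.5 + 0.03 * srs + rng.normal(0, 0.06, 300)

def r_squared(predictors, y):
    """R^2 of an OLS fit of y on the given predictor columns."""
    X = np.column_stack([np.ones(len(y))] + list(predictors))
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1 - resid.var() / y.var()

print(round(r_squared([srs], wpct_next), 3))             # SRS alone
print(round(r_squared([srs, wpct_prev], wpct_next), 3))  # nearly identical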
October 31st, 2008 at 1:09 pm
That's a good idea, adding age as a variable, except I'd prefer to weight by minutes or possessions used (or both), to get at the size of a player's role on his team. I wouldn't use Win Shares, though, because they can be negative for really bad players even if they have big roles.
Let's take the 1992-93 Mavericks as an example, since they had several notably negative players by WS playing large roles... If you look at their average age when weighted by minutes played and possessions used, it's basically the same (24.9 by MP, 25.0 by Poss). But if you weight by Win Shares, their average age ends up being 31.2, because they had 7 players with negative WS, including 22-year-old Jim Jackson with the absolute worst single-season WS total of any NBA player for the years we're able to calculate the stat.
But I do think minute- or possession-weighted age could improve the model. Between 2 equally bad Y-1 teams, one old and one young, the younger team should obviously be expected to do better in year Y, and right now our simple model doesn't make that distinction.
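The negative-weight problem is easy to demonstrate with a sketch. The roster below is made up to illustrate the point, not the actual '93 Mavericks numbers: a young player with heavy minutes but negative Win Shares gets a negative weight, which drags the WS-weighted average age far away from reality.

```python
def weighted_avg_age(players, weight_key):
    """Average age weighted by a given stat (minutes, possessions, WS, ...)."""
    total = sum(p[weight_key] for p in players)
    return sum(p["age"] * p[weight_key] for p in players) / total

# Hypothetical roster: the 22-year-old plays the most but has negative WS.
roster = [
    {"age": 22, "mp": 2800, "ws": -1.5},
    {"age": 25, "mp": 2000, "ws": 1.0},
    {"age": 33, "mp": 1200, "ws": 2.5},
]
print(weighted_avg_age(roster, "mp"))  # 25.2 -- sensible
print(weighted_avg_age(roster, "ws"))  # 37.25 -- absurdly old
```

The negative weight effectively subtracts the young player's age from the numerator, which is exactly the distortion described in the 1992-93 Mavericks example above.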
October 31st, 2008 at 4:21 pm
If the previous year is the only year that "really matters in terms of establishing expectations," I wonder what you'd find if you weighted the previous year's performance by month, giving somewhat greater weight to later months. That would more closely capture trade impacts, rookie development, coach/team adjustments, etc. I know the most recent month or game isn't necessarily more predictive on its own, but given those factors it might be worth running some alternatives to see whether the prediction can be improved even further. I'm not sure how much the NBA schedules conference and division games toward the later months; that could be a confounding factor, but perhaps one that could be adjusted for.
November 1st, 2008 at 10:40 am
I hadn't followed all the changes to win shares. I knew decimals were added, but I didn't know it could be negative now. If you just zero out the negative contributors, you probably get good results. Minutes and possessions would, of course, be reasonable too.
November 1st, 2008 at 1:49 pm
If previous year is only year that “really matters in terms of establishing expectations” would that suggest a change from
"Give the 2007-08 season a weight of 5, the 2006-07 season a weight of 4, and the 2005-06 season a weight of 3 and calculate the weighted sum of minutes played."
to
"Give the 2007-08 season a weight of 5, the 2006-07 season a weight of 2-3, and the 2005-06 season a weight of 1-2"?
November 1st, 2008 at 1:57 pm
Team regression doesn't necessarily imply a change for players, but I wonder whether these player-season weights work better than Marcel's. It could be checked, if it hasn't already been.
November 2nd, 2008 at 12:46 am
I don't know exactly where Tango's Marcel weights come from (though I remember Bill James coming up with a similar weighting system for projections), but I do know that player performance is a lot more stable from year to year than team performance. While Y-1 might be the only significant variable in predicting teams, think about how much personnel turnover happens from year to year -- between trades, the draft, free agency, etc., a team's "true talent level" can drastically change over the course of 2+ years.
Players, on the other hand, largely retain their skillsets from year to year; they may improve certain aspects of their game or get older and decline, but, barring injury, "true talent" doesn't change anywhere near as quickly and dramatically at the individual level as it does for teams. That's why I think the Simple Projection System's approach of weighting performance over multiple seasons is appropriate for individual players, even if Y-1 is the only past season that matters when creating expectations at the team level.
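The 5/4/3 weighting quoted earlier in the thread reduces to a simple weighted average, most recent season first (the helper name is hypothetical):

```python
def weighted_stat(seasons, weights=(5, 4, 3)):
    """Weight a player's last three seasons 5/4/3, most recent first,
    as in the minutes-played rule quoted above."""
    return sum(w * s for w, s in zip(weights, seasons)) / sum(weights)

# Minutes played in 2007-08, 2006-07, 2005-06:
print(weighted_stat([3000, 2400, 1800]))  # 2500.0
```

Testing alternative weight sets, like the 5 / 2-3 / 1-2 suggested above, is just a matter of swapping in a different `weights` tuple.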
November 2nd, 2008 at 2:40 am
I shouldn't have tied the two together, but I'd still wonder whether a different weight set does better. If not as big a shift as I suggested, maybe 5, 3.5, 2.5-3. A small difference, but if the goal is nudging the overall result, it might be worth a look.
November 2nd, 2008 at 3:28 pm
"tied"
or perhaps some improvement could be obtained by discounting outlier games to some extent