MLB Forum
Negative correlation between results
Posted By: Colin Caster In Response To: Refinements (James)
Date: 12 Aug 01, 10:26 pm
The actual matchups remaining should have a large effect on the playoff qualifying, although it may turn out that assuming all games independent is a "good enough" approximation.
But when Houston has 10 of their last 48 games against the Cubs, there is a strong anti-correlation between their results. If nothing else, I can guarantee they won't both go 44-4 or better... By late September, the anti-correlation effect will be huge in the NL West, as LA plays 13 of their last 16 against SF and Arizona.
Good point. With baseball's 162-game season it's easy (for me) to forget that there is a modest built-in negative correlation between the results of any two teams that play one another. An otherwise good model that fails to account for this negative correlation should still produce good estimates of each team's win total taken on its own, but may not produce good estimates of one team's wins relative to another's.
This may be an important distinction, because the flawed model could be useful for predicting whether Team A goes over X wins for the season, but not for predicting whether Team A beats Team B out for the division title.
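A rough sketch of that distinction, assuming every remaining game is an independent coin flip (an assumption for illustration only, not necessarily the model used in the sim). The helper name `win_spread` is made up for this example. It compares the variance of one team's win total with the variance of the win difference between two teams that share some head-to-head games.

```python
def win_spread(n_games, shared, p=0.5):
    """Return (variance of one team's wins, variance of the difference A - B).

    Assumptions: both teams have n_games left, `shared` of them against each
    other, and every game is won independently with probability p.
    """
    var_game = p * (1 - p)
    # A single team's total is a sum of n_games independent games either way.
    var_team = n_games * var_game
    # A non-shared game moves the difference by 1; a head-to-head game moves
    # it by 2, because one team's win is the other team's loss.
    var_diff = 2 * (n_games - shared) * var_game + 4 * shared * var_game
    return var_team, var_diff


print(win_spread(44, 0))   # no head-to-head games
print(win_spread(44, 10))  # 10 head-to-head games
```

Under these assumptions the per-team variance is the same in both cases, while the variance of the difference is larger when games are shared, which is why over/under estimates can hold up even when division-race estimates do not.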
OTOH, I do not believe that the negative correlation we're talking about is sizable enough to make a substantial difference in the simulation. I will try it both ways and report back, but note that sharing 10 games out of a 44-game remainder, which is about the highest current proportion of shared games between any two teams' schedules, should produce a correlation of only about -.20 to -.25. (I calculated this empirically; if someone knows a general solution, I'd be very interested.)
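One candidate for a general solution, under simplifying assumptions that are not part of the original sim: if both teams have n games left, k of them against each other, and every game is an independent 50/50 coin flip, then the covariance of the two win totals is -k/4 and each variance is n/4, so the correlation is -k/n. For k = 10 and n = 44 that is about -.23, inside the range quoted above. The sketch below checks this by brute force; `simulate_corr` is a hypothetical name.

```python
import random


def simulate_corr(n_games=44, shared=10, trials=20000, seed=1):
    """Estimate corr(wins_A, wins_B) when A and B meet `shared` times.

    Assumption: every game is an independent 50/50 coin flip.
    """
    rng = random.Random(seed)
    a_wins, b_wins = [], []
    for _ in range(trials):
        # A's wins in the head-to-head games; B gets the rest of those games.
        head_to_head = sum(rng.random() < 0.5 for _ in range(shared))
        a_other = sum(rng.random() < 0.5 for _ in range(n_games - shared))
        b_other = sum(rng.random() < 0.5 for _ in range(n_games - shared))
        a_wins.append(head_to_head + a_other)
        b_wins.append((shared - head_to_head) + b_other)
    # Pearson correlation computed by hand to avoid extra dependencies.
    mean_a = sum(a_wins) / trials
    mean_b = sum(b_wins) / trials
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(a_wins, b_wins))
    var_a = sum((a - mean_a) ** 2 for a in a_wins)
    var_b = sum((b - mean_b) ** 2 for b in b_wins)
    return cov / (var_a * var_b) ** 0.5


estimate = simulate_corr()
closed_form = -10 / 44
print(round(estimate, 3), round(closed_form, 3))
```

With unequal team strengths the per-game variances differ and the simple -k/n form no longer holds exactly, so treat it as a rule of thumb rather than a replacement for the simulation.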
So, again, I don't think this will make a difference, but I will include it (it's easy to include a population correlation matrix in my cool new sim program) along with some of the other excellent suggestions made in this thread, and report back. Thanks!
MathBoy and other stat-heads: any other ideas for running these simulations?
CC
MLB Forum is maintained by Pi Yee Press