Scoring my 2019 Predictions

David Manheim
Dec 31, 2019

Here’s my Annual Review.

I mark each True / False, and sometimes include [notes at the end.] Obviously, not all predictions can be scored yet, but this is for the items I do know. (I will remove/modify this line after I’ve updated again.)

My PRELIMINARY overall Brier score for the year is: 0.1487
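For readers who want to reproduce a number like this, here is a minimal sketch. It assumes the common binary form of the Brier score: the mean squared difference between each stated probability and the 0/1 outcome, so lower is better and always guessing 50% scores 0.25. The function name `brier_score` is illustrative. The three-item sample is taken from predictions in this post, so its result will not match the full-year figure above.

```python
def brier_score(forecasts):
    """Mean squared error between stated probabilities and 0/1 outcomes.

    `forecasts` is a list of (probability, came_true) pairs, where
    probability is the chance assigned to the statement being true.
    """
    if not forecasts:
        raise ValueError("need at least one resolved forecast")
    total = sum((p - float(came_true)) ** 2 for p, came_true in forecasts)
    return total / len(forecasts)


# Illustrative sample: three resolved items from this post.
sample = [
    (0.96, True),   # Trump still president at end of year
    (0.65, False),  # The United Kingdom will leave the European Union
    (0.70, True),   # Modi continues as Indian prime minister
]

print(round(brier_score(sample), 4))
```

Unresolved items are simply left out of the list, which is why the score stays preliminary until everything has been scored.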

My PRELIMINARY calibration for the year is:

[50–60%) / (40–50%] 2 Correct, 0 Incorrect (100%)
[60–70%) / (30–40%] 10 Correct, 3 Incorrect (77%)
[70–80%) / (20–30%] 8 Correct, 2 Incorrect (80%)
[80–90%) / (10–20%] 8 Correct, 1 Incorrect (89%)
[90–95%) / (5–10%] 1 Correct, 0 Incorrect (100%)
[95–99%) / (1–5%] 4 Correct, 1 Incorrect (80%)
[99–100%] / [0–1%] 0 Correct, 0 Incorrect (N/A)
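The bucket pairs in the table above suggest that forecasts below 50% are folded onto their complement: a 35% forecast that comes out False counts as a correct 65% forecast. Here is a minimal sketch of that tally, assuming this reading of the table. The names `fold` and `calibration` are illustrative, and probabilities are whole percentages so that bucket edges are not affected by floating-point rounding.

```python
from bisect import bisect_right
from collections import Counter

# Lower edges of the confidence buckets, in percent, and their labels.
EDGES = [50, 60, 70, 80, 90, 95, 99]
LABELS = ["50-60", "60-70", "70-80", "80-90", "90-95", "95-99", "99+"]


def fold(pct, came_true):
    """Restate a forecast as confidence >= 50% in the favored side."""
    if pct < 50:
        return 100 - pct, not came_true
    return pct, came_true


def calibration(forecasts):
    """Return {bucket label: (correct, total)} for non-empty buckets."""
    hits, totals = Counter(), Counter()
    for pct, came_true in forecasts:
        conf, correct = fold(pct, came_true)
        label = LABELS[bisect_right(EDGES, conf) - 1]
        totals[label] += 1
        hits[label] += int(correct)
    return {lab: (hits[lab], totals[lab]) for lab in LABELS if totals[lab]}


# Illustrative sample: four resolved items from this post.
sample = [
    (96, True),   # Trump still president at end of year
    (65, False),  # The United Kingdom will leave the European Union
    (15, False),  # Congress funds a border wall of at least $5.7bn
    (35, False),  # Jewish Home gets 6 seats
]

for label, (correct, total) in calibration(sample).items():
    print(f"{label}%: {correct} correct of {total}")
```

A well-calibrated forecaster's hit rate in each bucket should land inside that bucket's range, although with only a handful of items per bucket the percentages are noisy.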

My 3-year (2017–2019) calibration curve:

Politics

It’s a lot easier to do these predictions in an election year, but I have a few easy-to-quantify things I will mention. Also, since Israeli elections are coming up (in April) and I’m in Israel, I’m going to risk making a fool of myself and predict a few things on those. [Israeli elections are going to their third round... I wouldn’t have forecast that.]

True/True/True/False/False — Trump’s RCP average approval rating on 1/1/20 is above 30%/35%/40%/45%/50%, respectively: 95% / 85% / 50% / 40% / 5%. [Low of 40.8 in February, High of 45.3 in December. I’d give myself more credit if I thought this hadn’t been fairly obvious.]

True — Trump still president at end of year: 96% {90%} (Note: I was predicting this question before VFP, Vox’s Future Perfect, but they included it. Their probabilities are the ones in braces.) [Impeachment happened, removal is unlikely, but I win by default due to timing.]

False — VFP: No Democratic presidential candidate will become a clear frontrunner (Predictwise probability of nomination >50%) in the political prediction markets at any point in 2019: 75% {60%} [Warren got there before crashing, which is really annoying. But she isn’t anymore. Still, Dylan already admitted defeat here.]

True — VFP: The US will not enter a recession: 65% {80%} (My scoring assumes we use NBER’s retrospective peak month. They usually delay announcing for about a year, so this likely can’t be scored until 2020.) [Edit: Vox says no, which is likely true, so I’ll score it.]

True — VFP: Congress will not authorize funding for a full-length border wall: 98% {95%} (“Full length” is cheating.)

False — Added Q: Congress will authorize funding for a border wall of at least $5.7bn: 15%

True — VFP: US homicides will decline: 75% {80%} [We won’t know this with high reliability for a while (World Bank Global Development). Edit: Vox says they did, which is likely true, so I’ll score it.]

International:

False — VFP: The United Kingdom will leave the European Union: 65% {80%} (I think an extension past end of March is likely, and cancellation or extensions pushing past Jan 1, 2020 are possible.) [I did better than Vox, but still missed this.]

True — Added: Brexit will be delayed past March 29th (or cancelled): 51%

True — VFP: Narendra Modi will continue as Indian prime minister after the 2019 elections: 70% {60%} (I’m not better informed than Dylan and Kelsey, but I have a stronger trust in polls + stronger prior that dislike of the opposition will translate into a win.)

True — VFP: Neither India nor China will enter a recession: 80% {70%} (Similar to Dylan’s reasoning, but stronger. But joint questions are annoying.)

True — Added: India will not enter a recession: 85%

True — Added: China will not enter a recession: 85%

Israel

(Resolving this is a bit weird; I’m giving myself some slack and scoring only on the basis of the first election. And given that, I scraped by with many of these)

True[-ish] — Netanyahu is prime minister after the Israeli elections: 80% (N/A)

True — Netanyahu’s party gets the most votes: 85%. [26.4% to 26.1%. I was fairly lucky here. I didn’t really think about coalition formation and how votes get split. If Meretz or Labor had joined Blue and White, his coalition could have won easily, and his party still could have gotten beaten handily.]

True/False — Jewish Home passes threshold / gets 6 seats: 60% / 35% [5 seats]

True — Arab parties (total) seats decline from 11: 70%. (Splitting is dumb, but seems inevitable.) [They split, ended up with a combined 7.82%, and 10 seats. In the second election they got 10.60% of the vote, and 13 seats.]

EA-Related:

True — VFP: Malaria deaths will decrease: 75% {80%} (Strongly based on their guess; they know more than I do about this.) [Unresolved: the report that covers 2018 came out December 2019. Edit: Vox says true, which is likely correct, so I’ll score it.]

True — VFP: No additional countries will adopt a universal basic income: 80% {90%} (There are lots of countries that might do something, and the idea is gaining traction, so I’m hedging.)

(Likely) True — VFP: More animals will be killed for US human consumption in 2019 than in 2018: 75% {60%} (The trend is strong, the economy is fine. I’m confused that they have their probability so low.) [I don’t know where to find the information to resolve this. Edit: Vox says “Likely Correct”, so I’ll score it.]

Tech:

False — VFP: Impossible Burger meat will be sold in at least one national grocery chain: 95% {95%} [Wegmans is a national store, I think…?] [Edit: Vox says the stores that sell it everywhere are not national chains. I got caught on the same technicality they did.]

True — VFP: Fully autonomous self-driving cars will not be commercially available as taxis or for sale: 70% {90%} (Even if they aren’t price competitive, there’s a huge cachet in being first to market. Someone wants to do it, even if the tech is still too expensive. But they say “real commercial product,” so they might hedge if it’s offered but far too expensive, etc.) [Not there yet. I was optimistic.]

True — VFP: DeepMind will release an AlphaZero update, or new app, capable of beating humans and existing computer programs at a task in a new domain: 60% {50%} (AlphaGo was October 2015, AlphaZero was Dec. 2017. I assume they have more projects in the works, but it’s unclear if they will release them.) [Resolution: Yes, and this was quick! AlphaStar was released Jan 24.]

Environment:

True — VFP: Average world temperatures will increase relative to 2018: 65% {60%} [As of November, it’s a big difference, so this is effectively certain.]

True — VFP: Global carbon emissions will increase: 80% {80%} [Growth rate of atmospheric CO2 is up since 2018, but awaiting data on emissions. Seems clearly to be True, but waiting for now. Edit: Vox called it, so I’ll score it.]

My earlier long-term predictions:

(2017) There will be a Republican primary challenger getting >10% of the primary vote in 2020 (conditional on Trump running) — 60% (was 70%. I’m thinking about total popular vote, and given the structure of primaries, this is a higher bar than I initially thought. Still, there are a LOT of Republicans who hate him, and many more public figures who would switch over if they weren’t scared of what happens when Trump wins.)
(I was optimistic, but we’ll see.) [Changing my mind for next year — this isn’t happening]

(2017) The stock market [Edit: S&P] will go down under President Trump (Conditional on him having a 4 year term, Inauguration-Inauguration) — 60% (no change, was 60%. But I’m affirming because, while a split Congress usually means markets go up, I have greater concerns. I’m updating based partly on results so far, with markets up, and partly on my suspicions that the current gyrations will get worse, and that the current economic mismanagement really is a big problem.)
(Update — This looks very unlikely.) [Yeah, I was wrong here too.]

(2018) The retrospective consensus of economists about the 2017 tax bill will be:
Unresolved — …didn’t increase GDP growth more than 0.2%: 96% (was: 95%)
Unresolved — …that, after accounting for growth, it increased the 10-year deficit more than $1tr / $1.2tr / $1.5tr, respectively: 93% / 80% / 45% (was: 90% / 70% / 40%) [But see recent article about how poorly it’s working out. Here also, “The Fed spent most of last year concerned that Trump’s tax cuts would spur a hot economy and rising inflation. Fed leaders anticipated they would need to raise interest rates twice in 2019 to tap the brakes on the economy. None of that came to pass.”]

True — (2018) The House will vote to impeach Trump before the end of his current term: 75% (was 65%) Note: 50% vote needed.
[Update: Called it.]

Unresolved — (2018) Conditional on impeachment, the Senate will convict: 10% (was 20%) Note: 67% vote needed. (Most uncertainty is if he does something additionally crazy, crazy enough to prompt short term worries about safety/stability.) [We’ll see. But I was underestimating the crazy that people would be OK with, and also underestimating how likely he was to do something that’s objectively and clearly illegal misuse of power.]

(2018; I neglected to include this before, but now carried over for scoring.) AI wins a Real Time Strategy game (RTS: StarCraft, etc.) in full-mode against the best human players before the end of:
False — 2019: 45% [OpenAI Five for DOTA didn’t play full mode. And the early-2019 “victory” of AlphaStar in StarCraft was an automation win based on superspeed, not strategy. That was possible years ago. The more reasonable play in the StarCraft ladder clearly showed that it can’t beat the *best* human players quite yet, though it’s VERY close. But I’m biased here, so maybe this should be true?]
Unresolved — 2020: 60% [I now think this is too low. I’ll update for next year’s predictions.]
Unresolved — Within Byun Hyun Woo’s Lifetime: 98%

Personal

True — Still living in Israel at end of year: 97%

True — I have (some) official academic affiliation: 60%.

True/False/False/True — I have an affiliation with: F-O/CC-C/Te-I/Other: 40% / 40% / 30% / 20%

True — My multi-Agent Goodhart paper is accepted into the special issue: 60%
[Eventually. This took FOREVER.]

True/True/True — I publish or submit pre-prints of at least 1/2/3 more papers: 90%/80%/60%.
[4 have been submitted, two rejected, another already submitted to a new journal. I forgot how ridiculously long this takes.]

False/False/False — My Google Scholar H-index hits 7 / 8 / 9: 65% / 35% / 5% [No. Argh. Scholar coverage is annoying, and citations come in slowly; I’m a single citation on one paper away from 7. (I could Goodhart this and combine 2 different publications on the same project to hit 7, but I won’t.)]

True/True — My actual (no self-cites, includes non-Google sources) H-index hits 6 / 7: 70% / 30%
