EA - My take on What We Owe the Future by elifland

The Nonlinear Library: EA Forum - A podcast by The Nonlinear Fund



Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: My take on What We Owe the Future, published by elifland on September 1, 2022 on The Effective Altruism Forum.

Cross-posted from Foxy Scout

Overview

What We Owe The Future (WWOTF) by Will MacAskill has recently been released with much fanfare. While I strongly agree that future people matter morally and that we should act based on this, I think the book isn't clear enough about MacAskill's views on longtermist priorities, and to the extent that it is, it presents a mistaken view of the most promising longtermist interventions. I argue that MacAskill:

Underestimates risk of misaligned AI takeover. [more]
Overestimates risk from stagnation. [more]
Isn't clear enough about longtermist priorities. [more]

I highlight and expand on these disagreements partly to contribute to the debate on these topics, but also to make a practical recommendation. While I like some aspects of the book, I think The Precipice is a substantially better introduction for potential longtermist direct workers, e.g. as a book given away to talented university students. For instance, I'm worried that people will feel bait-and-switched if they get into EA via WWOTF, then do an 80,000 Hours call or hang out around their EA university group and realize that most people think AI risk is the biggest longtermist priority, many thinking this by a large margin. [more]

What I disagree with[1]

Underestimating risk of misaligned AI takeover

Overall probability of takeover

In endnote 2.22 (p. 274), MacAskill writes [emphasis mine]:

I put that possibility [of misaligned AI takeover] at around 3 percent this century. I think most of the risk we face comes from scenarios where there is a hot or cold war between great powers.

I think a 3% chance of misaligned AI takeover this century is too low, with 90% confidence.[2] The claim that most of the risk comes from scenarios with hot or cold great power wars may be technically true if one thinks a war between the US and China is >50% likely soon, which might be reasonable under a loose definition of cold war. That being said, I strongly think MacAskill's claim about great power war gives the wrong impression of the most probable AI takeover threat models. My credence in misaligned AI takeover this century is 40%, of which not much depends on a great power war scenario. Below I explain why my best-guess credence is 40%: the biggest input is a report on power-seeking AI, but I'll also list some other inputs and then aggregate them.

Power-seeking AI report

The best analysis estimating the chance of existential risk (x-risk) from misaligned AI takeover that I'm aware of is Is Power-Seeking AI an Existential Risk? by Joseph Carlsmith.[3] Carlsmith decomposes a possible existential catastrophe from AI into six steps, each conditional on the previous ones:

1. Timelines: By 2070, it will be possible and financially feasible to build APS-AI: systems with advanced capabilities (they outperform humans at tasks important for gaining power), agentic planning (they make plans and then act on them), and strategic awareness (their plans are based on models of the world good enough to overpower humans).
2. Incentives: There will be strong incentives to build and deploy APS-AI.
3. Alignment difficulty: It will be much harder to build APS-AI systems that don't seek power in unintended ways than ones that would, but are superficially attractive to deploy.
4. High-impact failures: Some deployed APS-AI systems will seek power in unintended and high-impact ways, collectively causing >$1 trillion in damage.
5. Disempowerment: Some of this power-seeking will, in aggregate, permanently disempower all of humanity.
6. Catastrophe: The disempowerment will constitute an existential catastrophe.

I'll first discuss my component probabilities for a catastrophe by 2100 rather than 2070,[4] then discuss the implications of Carlsmith's own assessment as well as reviewers of his...
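To make the arithmetic of this decomposition explicit, here is a minimal worked sketch. Carlsmith's framework combines the six steps multiplicatively: the overall probability is the product of the six conditional probabilities above. The letter abbreviations below (T = timelines, I = incentives, A = alignment difficulty, F = high-impact failures, D = disempowerment, C = catastrophe) are shorthand introduced here, not notation from the report:

\[
P(\text{existential catastrophe}) = P(T) \cdot P(I \mid T) \cdot P(A \mid T, I) \cdot P(F \mid T, I, A) \cdot P(D \mid T, I, A, F) \cdot P(C \mid T, I, A, F, D)
\]

For illustration only, hypothetical placeholder values of 0.6, 0.8, 0.5, 0.5, 0.6 and 0.9 (not estimates from Carlsmith's report or from this post) would multiply to 0.6 × 0.8 × 0.5 × 0.5 × 0.6 × 0.9 ≈ 0.065, i.e. roughly a 6.5% overall probability; the component probabilities I actually use are the ones discussed next.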
