AXRP - the AI X-risk Research Podcast

A podcast by Daniel Filan

59 episodes

35 - Peter Hase on LLM Beliefs and Easy-to-Hard Generalization
Published: 24.8.2024
34 - AI Evaluations with Beth Barnes
Published: 28.7.2024
33 - RLHF Problems with Scott Emmons
Published: 12.6.2024
32 - Understanding Agency with Jan Kulveit
Published: 30.5.2024
31 - Singular Learning Theory with Daniel Murfet
Published: 7.5.2024
30 - AI Security with Jeffrey Ladish
Published: 30.4.2024
29 - Science of Deep Learning with Vikrant Varma
Published: 25.4.2024
28 - Suing Labs for AI Risk with Gabriel Weil
Published: 17.4.2024
27 - AI Control with Buck Shlegeris and Ryan Greenblatt
Published: 11.4.2024
26 - AI Governance with Elizabeth Seger
Published: 26.11.2023
25 - Cooperative AI with Caspar Oesterheld
Published: 3.10.2023
24 - Superalignment with Jan Leike
Published: 27.7.2023
23 - Mechanistic Anomaly Detection with Mark Xu
Published: 27.7.2023
Survey, store closing, Patreon
Published: 28.6.2023
22 - Shard Theory with Quintin Pope
Published: 15.6.2023
21 - Interpretability for Engineers with Stephen Casper
Published: 2.5.2023
20 - 'Reform' AI Alignment with Scott Aaronson
Published: 12.4.2023
Store, Patreon, Video
Published: 7.2.2023
19 - Mechanistic Interpretability with Neel Nanda
Published: 4.2.2023
New podcast - The Filan Cabinet
Published: 13.10.2022

AXRP (pronounced axe-urp) is the AI X-risk Research Podcast where I, Daniel Filan, have conversations with researchers about their papers. We discuss the paper, and hopefully get a sense of why it's been written and how it might reduce the risk of AI causing an existential catastrophe: that is, permanently and drastically curtailing humanity's future potential. You can visit the website and read transcripts at axrp.net.