Episode from the podcast AXRP - the AI X-risk Research Podcast

20 - 'Reform' AI Alignment with Scott Aaronson

Released Wednesday, 12th April 2023

How should we scientifically think about the impact of AI on human civilization, and whether or not it will doom us all? In this episode, I speak with Scott Aaronson about his views on how to make progress in AI alignment, as well as his work on watermarking the output of language models, and how he moved from a background in quantum complexity theory to working on AI.

Note: this episode was recorded before this story (vice.com/en/article/pkadgm/man-dies-by-suicide-after-talking-with-ai-chatbot-widow-says) emerged of a man dying by suicide after conversations with a language-model-based chatbot that included discussion of the possibility of him killing himself.

Patreon: https://www.patreon.com/axrpodcast

Ko-fi: https://ko-fi.com/axrpodcast

Topics we discuss, and timestamps:

- 0:00:36 - 'Reform' AI alignment

- 0:01:52 - Epistemology of AI risk

- 0:20:08 - Immediate problems and existential risk

- 0:24:35 - Aligning deceitful AI

- 0:30:59 - Stories of AI doom

- 0:34:27 - Language models

- 0:43:08 - Democratic governance of AI

- 0:59:35 - What would change Scott's mind

- 1:14:45 - Watermarking language model outputs

- 1:41:41 - Watermark key secrecy and backdoor insertion

- 1:58:05 - Scott's transition to AI research

- 2:03:48 - Theoretical computer science and AI alignment

- 2:14:03 - AI alignment and formalizing philosophy

- 2:22:04 - How Scott finds AI research

- 2:24:53 - Following Scott's research

The transcript: axrp.net/episode/2023/04/11/episode-20-reform-ai-alignment-scott-aaronson.html

Links to Scott's things:

- Personal website: scottaaronson.com

- Book, Quantum Computing Since Democritus: amazon.com/Quantum-Computing-since-Democritus-Aaronson/dp/0521199565/

- Blog, Shtetl-Optimized: scottaaronson.blog

Writings we discuss:

- Reform AI Alignment: scottaaronson.blog/?p=6821

- Planting Undetectable Backdoors in Machine Learning Models: arxiv.org/abs/2204.06974

From The Podcast

AXRP - the AI X-risk Research Podcast

AXRP (pronounced axe-urp) is the AI X-risk Research Podcast where I, Daniel Filan, have conversations with researchers about their papers. We discuss the paper, and hopefully get a sense of why it's been written and how it might reduce the risk of AI causing an existential catastrophe: that is, permanently and drastically curtailing humanity's future potential. You can visit the website and read transcripts at axrp.net.
