Dan Hendrycks on Catastrophic AI Risks
January 6, 2024
Dan Hendrycks joins the podcast again to discuss X.ai, how AI risk thinking has evolved, malicious use of AI, AI race dynamics between companies and between militaries, making AI organizations safer, and how representation engineering could help us understand AI traits like deception.
Timestamps:
00:00 X.ai - Elon Musk's new AI venture
02:41 How AI risk thinking has evolved
12:58 AI bioengineering
19:16 AI agents
24:55 Preventing autocracy
34:11 AI race - corporations and militaries
48:04 Bulletproofing AI organizations
1:07:51 Open-source models
1:15:35 Dan's textbook on AI safety
1:22:58 Rogue AI
1:28:09 LLMs and value specification
1:33:14 AI goal drift
1:41:10 Power-seeking AI
1:52:07 AI deception
1:57:53 Representation engineering