Published on Sep 18, 2024
This lecture was delivered at the 2024 Cooperative AI Summer School. For more information, please visit https://www.cooperativeai.com/summer-...
Rachel Freedman is a PhD student at the Center for Human-Compatible AI at UC Berkeley, where she researches misspecification problems in RLHF, model interpretability and control, and dangerous capabilities evals for foundation models.