I work on AI Safety and Reinforcement Learning at the Future of Humanity Institute, University of Oxford. I also lead a project on "Inferring Human Preferences" with Andreas Stuhlmüller of Ought.org. I've published papers at NIPS and AAAI, and an online interactive textbook at agentmodels.org. My recent collaboration surveying AI experts is forthcoming in the Journal of AI Research. My PhD is from MIT, where I worked on cognitive science, probabilistic programming, and philosophy of science.