Richard Dazeley
Why do you care about AI Existential Safety?
As AI systems are increasingly integrated into everyday society, I am concerned with ensuring that they integrate safely and ethically, and that they can articulate their behavior clearly to build trust. Safety is always a trade-off between accomplishing the required task, not impacting the environment negatively, and behaving in a way that is predictable to others in that environment. An autonomous car that behaves entirely safely will never go anywhere, so some risk is required. Determining and controlling the level of acceptable risk, and ensuring the system can explain its behavior against these conflicting requirements, is essential to emerging strong-AI systems.
Please give one or more examples of research interests relevant to AI existential safety:
My research over the last few years has focused on expanding our work in multi-objective reinforcement learning (MORL) to automate the trade-off between task accomplishment, safety, ethical behavior, and consideration of other actors in an environment. This has led to agents that can identify the negative impacts of their own actions and avoid repeating them, and it is now feeding into work on identifying causality in dynamic environments so that an agent can prevent temporally distant impacts of its behavior. We have put forward a framework for maintaining a dialog with stakeholders to ensure that these safe and ethical trade-offs are clearly articulated and justified, and we have even proposed an apologetic framework that can identify a user's safety-oriented preferences autonomously.
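To give a flavor of how such a trade-off can be automated, here is a minimal, hypothetical sketch of multi-objective action selection in which a safety objective constrains a task objective. The Q-values, the `SAFETY_THRESHOLD`, and the `select_action` helper are all illustrative assumptions, not the authors' actual method; thresholded selection over a vector of objectives is just one common MORL scheme.

```python
import numpy as np

# Hypothetical Q-values for three actions over two objectives: [task_reward, safety].
# The numbers are purely illustrative.
q_values = np.array([
    [10.0, -5.0],   # fast but risky
    [ 7.0, -1.0],   # moderate progress, small impact
    [ 0.0,  0.0],   # do nothing: perfectly safe, no progress
])

SAFETY_THRESHOLD = -2.0  # assumed minimum acceptable expected safety return

def select_action(q, threshold):
    """Pick the action with the best task value among those meeting the
    safety threshold; fall back to the safest action if none qualify."""
    acceptable = np.where(q[:, 1] >= threshold)[0]
    if acceptable.size == 0:
        return int(np.argmax(q[:, 1]))          # no safe-enough option: minimise harm
    return int(acceptable[np.argmax(q[acceptable, 0])])

print(select_action(q_values, SAFETY_THRESHOLD))  # -> 1: progress without excessive risk
```

The point of the sketch is that the agent neither maximizes task reward blindly (action 0) nor refuses to act (action 2); the threshold encodes the level of acceptable risk discussed above.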