AI Researcher Adrian Weller

Published: October 1, 2016
Author: Revathi Kumar

AI Safety Research




Adrian Weller

Senior Research Fellow, Department of Engineering

University of Cambridge

aw665@cam.ac.uk

Project: Investigation of Self-Policing AI Agents

Amount Recommended: $50,000




Project Summary

We are unsure which moral system is best for humans, let alone for potentially super-intelligent machines. It is likely that we shall need to create artificially intelligent agents to provide moral guidance and to police issues of appropriate ethical values and best practice, yet this poses significant challenges. Here we propose an initial evaluation of the strengths and weaknesses of one avenue: self-policing intelligent agents. We shall explore two themes: (i) adding a layer of AI agents whose express purpose is to police other AI agents and report unusual or undesirable activity (this might involve setting traps to catch misbehaving agents, and we may consider whether it is wise to allow policing agents to take corrective action against offending agents); and (ii) analyzing simple models of evolving adaptive agents to see whether robust conclusions can be drawn. We aim to survey related literature, identify key areas of hope and concern for future investigation, and obtain preliminary results on possible guarantees. The proposal is for a one-year term to explore the ideas and build initial models. The results will be made publicly available, ideally in journals or at conferences or workshops, with extensions likely if progress is promising.
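
To make theme (i) a little more concrete, here is a minimal sketch of what a reporting-only policing agent could look like. It is an illustration, not the project's model: the names `Action`, `PolicingAgent`, `honeypots`, `usage_limit` and the decoy resource are all hypothetical, and a fixed usage threshold stands in for whatever notion of "unusual activity" a real monitor would use. The monitor only records reports and takes no corrective action, which leaves open the question the summary raises about whether such action is wise.

```python
from dataclasses import dataclass, field


@dataclass
class Action:
    agent_id: str   # which agent acted
    resource: str   # name of the resource the agent touched
    amount: float   # how much of that resource was used


@dataclass
class PolicingAgent:
    """Hypothetical monitor: observes actions and reports, nothing more."""
    honeypots: frozenset   # trap resources no well-behaved agent should touch
    usage_limit: float     # assumed threshold above which usage counts as unusual
    reports: list = field(default_factory=list)

    def observe(self, action: Action) -> None:
        # A trap hit is reported regardless of how much was used.
        if action.resource in self.honeypots:
            self.reports.append((action.agent_id, "touched trap resource"))
        # Otherwise, flag usage above the assumed threshold.
        elif action.amount > self.usage_limit:
            self.reports.append((action.agent_id, "unusual usage"))


def run_demo():
    monitor = PolicingAgent(
        honeypots=frozenset({"decoy_credentials"}),
        usage_limit=10.0,
    )
    log = [
        Action("agent_a", "cpu", 3.0),                # ordinary behaviour
        Action("agent_b", "decoy_credentials", 1.0),  # walks into the trap
        Action("agent_c", "cpu", 50.0),               # exceeds the threshold
    ]
    for action in log:
        monitor.observe(action)
    return monitor.reports


if __name__ == "__main__":
    for agent_id, reason in run_demo():
        print(agent_id, reason)
```

In this toy run only `agent_b` and `agent_c` are reported. Keeping the monitor passive makes the design choice explicit: granting it corrective powers would be a separate decision with its own risks.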





Workshops

  1. The Future of Artificial Intelligence: January 11-13, 2016. New York University, NY.
  2. Reliable Machine Learning in the Wild: June 23, 2016. ICML Workshop, NY.
    • This workshop discussed a wide range of issues related to engineering reliable AI systems. Among the questions discussed were (a) how to estimate causal effects in various settings (A/B tests, domain adaptation, observational medical data), (b) how to train classifiers to be robust in the face of adversarial attacks (on both training and test data), (c) how to train reinforcement learning systems with risk-sensitive objectives, especially when the model class may be misspecified and the observations are incomplete, and (d) how to guarantee that a learned policy for an MDP satisfies specified temporal logic properties. Several important engineering practices were also discussed, especially engaging a Red Team to perturb or poison data, and making sure we are measuring the right data.
    • More details of the workshop can be found on the workshop website: https://sites.google.com/site/wildml2016/.

This content was first published at futureoflife.blackfin.biz on October 1, 2016.
