Why do you care about AI Existential Safety?

During my undergraduate degree in philosophy and computer science, I began engaging with arguments about existential risks from superintelligent AI systems around 2015, and I have followed the literature and arguments closely since then. The topic of my PhD—agent-environment interactions during planning—was chosen with regard to the safety of deep RL agents (which seemed like the dominant paradigm in AI in 2019). Since then, the urgency of aligning AI has increased. In short, I believe that the future might be extremely good, but that the chance that all future value will be lost is considerable (~40% in this century), and I want to do what I can to ensure the long-term flourishing of humanity.

I have a good understanding of the field, having completed the AGI Safety Course, attended philosophy courses on existential risk, organized groups and journal clubs on the topic, and engaged with researchers in the field.

Please give at least one example of your research interests related to AI existential safety:

My broad interest in AI safety is best described as high-level interpretability/evaluations: constructing experiments to elicit behaviors that can tell us about the inner workings of frontier models even if the inner states of the model are hard to interpret. An ability that I care about (and that I am an expert in) is planning: search over a world model to come up with intelligent actions in never-before-seen situations.

I have studied this in the context of agents embedded in a physical environment in my PhD (see my publications).
There, I investigated how the visual structure of the environment might allow for subgoal decomposition, a crucial puzzle piece in the ability to plan efficiently. In a simulated physical environment, agents with different planning strategies built towers, demonstrating that visual subgoal decomposition can indeed mitigate the computational demands of planning (paper). Behavioral experiments show that humans on the same task engage in visual subgoal decomposition, and that their subgoal choices are explained by planning cost (paper—full writeup is in preparation).
Planning requires a model of the world. To understand human and AI physical world models, I contributed to the creation of a large dataset and ran a benchmarking study to test AI models' physical reasoning using ThreeDWorld. This work—Physion—was accepted at NeurIPS. The comparison to humans uses the Cognitive AI Benchmarking framework, which I maintain. I co-organized a workshop on Cognitive AI Benchmarking, which develops best practices for comparing the behavior of humans and AI systems, with a focus on representational alignment: the degree to which humans and AIs operate on similar mental representations.

Currently, I am interested in understanding how planning and reasoning in LLMs can be investigated and supervised.
I recently developed and ran an evaluation of steganography in AI. Following this, I am working on investigating which optimization pressures might lead to the emergence of planning behaviors in models without built-in explicit search over a world model.

Why do you care about AI Existential Safety?

AI is unlike other technologies in that, instead of just making a powerful tool for us to use, we aim to create systems that learn and develop on their own with emergent capabilities that we will not by default control. We need much more work on building safe and beneficial AI, rather than just plunging headfirst into an era of more and more powerful systems. The future turns on the extent and manner in which we’re able to embed social values in these unprecedentedly powerful systems.

Please give at least one example of your research interests related to AI existential safety:

The increasing capabilities and autonomy of AI, particularly large language models, is leading to radically new forms of human-AI interaction. I argue that we can understand such dynamics by building on the longstanding scaffolding in human-computer interaction theory, including computers as social actors, moral attribution, mind perception, and anthropomorphism. Namely, I characterize “digital minds” as having or appearing to have mental faculties (e.g., agency, intent, reasoning, emotion) that circumscribe interaction. We can mitigate existential risk through operationalizing these faculties and researching the complex systems that emerge between digital minds and humans, as well as digital minds with each other.

Why do you care about AI Existential Safety?

The AI systems of today have already transformed the world in a myriad of ways, good and bad. Our uncertainty about the capacity of future systems necessitates a cautious approach. Work on AI safety now will reduce path-dependent risks both from possible rogue agents and from the everyday havoc that can come from systems we don't understand.

Please give at least one example of your research interests related to AI existential safety:

We want AIs to do what we want, to abide by our values. Thus I propose homing in on value alignment—getting a better picture of what we mean by values and what we mean by alignment. Clarifying these conceptual gaps and formalizing fuzzy notions of value will help develop technical and conceptual countermeasures against AI risks.

One way to do so is by identifying which values to learn in any downstream system. I have done so in part by formalizing the values people hold and the strategies they take. Still, we must further operationalize those values in AI systems in order to align to them. In ongoing work I measure the degree of alignment between LLMs and humans on moral disagreements.

Why do you care about AI Existential Safety?

I deeply care about AI existential safety given the rapid pace at which AI is transforming education. This transformation is occurring faster than governments and educational bodies can adapt. The ethical challenges posed by AI in education and the risks of student data misuse are significant. Despite the immense benefits of AI in education, transparency is crucial at all levels.

Existential threats also arise from a changing global economy and society. Current students may be disadvantaged due to their lack of experience. These students are likely to be among the first to experience job displacement unless mitigating measures are implemented. This would require an overhaul of traditional education systems and a focus on emotional development, innovation, and collaborative problem-solving for global issues.

Please give at least one example of your research interests related to AI existential safety:

As for my research interests related to AI existential safety, they currently lie in AI ethics and in implementing a new skills curriculum to support future generations. I am focused on mitigating risks and promoting transparency for students. I have studied AI ethics with the London School of Economics, sit on a panel for AI in Education chaired by Sir Anthony Seldon and on the Microsoft AI in Education panel, supporting their ethical decision-making. I am also collaborating with Bristol, Oxford, Google and Glasgow University on AI research in education.

Why do you care about AI Existential Safety?

It seems likely that AGI, or even superintelligent AI, will be developed in the next few years or, if we are lucky, in the next few decades. There are strong incentives to do so, and AI companies are pouring billions of dollars into it. But aligning such AGIs with our values and goals is currently an unsolved problem, and even if we succeed in aligning AI, bad actors could deliberately give an AI harmful goals. This could lead to an existential catastrophe for humanity. So I think we should devote significantly more resources and intellectual effort towards solving these challenges.

Please give at least one example of your research interests related to AI existential safety:

I am mainly interested in understanding the inner workings of AI systems so that we can detect misaligned goals or deceptive behaviour. Currently, I am working on a Lie Detector for Large Language Models, with the goal of making it reliable and able to generalize well to various contexts.

Why do you care about AI Existential Safety?

I am a PhD researcher focused on developing methods to build moral alignment into the core of AI systems. While many researchers are working on advancing task-specific capabilities in AI models (e.g., improving on baselines), I believe it is essential to address a more fundamental dimension of intelligence – social and moral reasoning – so that when these models get deployed, they do not cause collateral damage to human society.
* I believe that working on building morality into systems is essential to building robust aligned AI systems in the future, especially as the end product becomes less and less interpretable by human researchers, and controlled post-hoc fine-tuning becomes less feasible.
* This is why day-to-day I work on technical implementations of ideas from moral philosophy & psychology into learning agents (i.e., simulated models of future adaptive AI systems). My research so far has centered around Multi-Agent Reinforcement Learning simulations and moral fine-tuning of Language Models.

Please give at least one example of your research interests related to AI existential safety:

My best work so far is my PhD research on developing moral reasoning in learning agents. I have summarised this in 1 conceptual paper and 1 published experimental paper. In the conceptual paper (see ‘AI Safety Manifesto’ pdf attached), we analyse the space of approaches to moral alignment on a spectrum from fully top-down imposed rules (e.g. logic-based rules or constrained learning) to the more recent fully bottom-up inferred values (e.g. RLHF & Inverse RL). After reviewing existing works along this continuum, we argue that the middle of this range (i.e., a hybrid space) is too sparsely populated, and motivate the use of a combination of interpretable top-down quantitative definitions of moral objectives, based on existing frameworks in fields such as Moral Philosophy / Economics / Psychology, with the bottom-up advantages of trial-and-error learning from experience via RL. This hybrid methodology provides a powerful way of studying and imposing control on an AI system while enabling flexible adaptation to dynamic environments. We review 3 case studies combining moral principles with learning (namely, RL in social dilemmas with intrinsic rewards, safety-shielded RL, & Constitutional AI), providing a proof-of-concept for the potential of this hybrid approach in creating more prosocial & cooperative agents. My experimental paper then implements the intrinsic rewards method in simulated RL agents, demonstrating relative pros and cons of learning via different moral frameworks in multi-agent social dilemmas. I next plan to extend insights from this work towards implementing moral preferences in LLM-based agents.
Links:
* conceptual paper
* experimental paper, code & video
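
A minimal sketch of the intrinsic-reward idea behind the experimental paper (illustrative only: the function names, the utilitarian example, and the weighting parameter below are assumptions, not the paper's actual implementation):

def shaped_reward(env_reward: float, moral_reward: float, beta: float = 1.0) -> float:
    """Combine the extrinsic task reward with a top-down moral intrinsic reward.

    Illustrative sketch: moral_reward is computed from a quantitative moral
    framework; beta trades off task success against the moral objective.
    """
    return env_reward + beta * moral_reward

def utilitarian_intrinsic(own_payoff: float, other_payoff: float) -> float:
    """Example moral framework: a utilitarian agent values everyone's payoff."""
    return own_payoff + other_payoff

# Example step in a two-player social dilemma: the agent gained 1.0 while its
# partner lost 0.5, so the utilitarian term adds beta * (1.0 - 0.5) = 0.25.
print(shaped_reward(env_reward=1.0, moral_reward=utilitarian_intrinsic(1.0, -0.5), beta=0.5))

The shaped reward can then be fed to any standard RL update, which is what makes the approach a hybrid of top-down moral definitions and bottom-up learning from experience.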

Why do you care about AI Existential Safety?

Even without fast takeoffs and Artificial General Intelligence within the next few years, it is likely that we will see transformative effects of advanced AI systems on our societies, the economy and our political systems. These transformations harbour enormous potential for scientific discovery, economic growth and human empowerment, but they also bring with them significant risks from misuse to loss-of-control scenarios and structural risks as a result of race dynamics. Stewarding this transformation – harnessing the benefits while mitigating the risks – is a crucial challenge for humanity in the 21st century. To be successful, we will need solid and sustainable governance mechanisms and institutions that require joint international and cross-sectoral collaboration.

Please give at least one example of your research interests related to AI existential safety:

I have previously published on safety standards and risk management practices for AI development and deployment, informed by a number of case studies of other high-risk or dual-use technologies. During my time in Taiwan, I investigated the (inter-)national security implications of bottlenecks in the semiconductor supply chain. These days, I think a lot about the direct and indirect effects that advanced AI systems could have on democracy, what other challenges will emerge during a potential intelligence explosion (incl. rapid economic growth), and how to manage these challenges, including how to develop robust collective decision making procedures.

Why do you care about AI Existential Safety?

I believe that even if there is less than a 0.01% chance of AI wiping out humanity, it's worth every effort to avoid it. The recent advances in generative AI and its integration with technology in every sphere of human life are a testament to the dangers it poses. The least we can do is create awareness.

I was recently at an ed-tech panel at the Ed-tech Summit in Birmingham, where every panelist admitted that they can't prevent students from using LLMs, yet none of them were willing to admit that AI has already penetrated our education system in ways that we can't control. There was a clear contradiction between these two positions, and I was surprised how casually they ignored the dangers of AI.

Please give at least one example of your research interests related to AI existential safety:

I am passionate about two main research interests:
Firstly, the power of AI to manipulate humans into taking actions that might be harmful. My PhD thesis on the Transparency of AI revealed that our limited understanding of AI systems can lead to trusting these systems and ignoring their drawbacks.
Secondly, AI going rogue and taking actions that could lead to human extinction. This could be intentional, driven by bad actors, or the result of a mistake in which an AI optimises for its end goal in ways that prove catastrophic.

Why do you care about AI Existential Safety?

I care about AI existential safety because the development of sophisticated AI technologies presents both opportunities and dangers. It's vital to align AI's goals with human values and priorities to avert negative consequences stemming from divergent objectives. This is particularly crucial as AI's abilities approach or exceed those of humans. Prioritizing existential safety means actively working to protect our future, maintain our independence, and ensure a peaceful coexistence with AI. By doing so, we aim to direct AI's vast potential towards enhancing human existence and tackling major global issues, rather than allowing it to evolve into a force of unpredictable and possibly detrimental outcomes. This involves a comprehensive understanding of AI's capabilities, robust ethical frameworks, effective governance strategies, and continuous monitoring to ensure that AI systems do not deviate from intended ethical guidelines. Additionally, there's a need for international cooperation in setting standards for AI development and use, ensuring that these technologies are managed responsibly on a global scale.

Please give at least one example of your research interests related to AI existential safety:

I have worked at Rolls-Royce contributing to the safety-critical software development for aircraft engines. Following that, I started my PhD research at the University of Oxford. My research, including a paper on uncertainty quantification in Remaining Useful Life prediction for aircraft engines, aligns with AI safety. It emphasizes the importance of estimating uncertainties in AI/Deep Learning models for safe AI development. Another focus of my research is on Generative AI, specifically regarding AI safety. Foundation models, a type of generative model, present unique risks if their development does not reflect human values. My work explores methods to enhance these models for greater alignment with user goals, especially through Computer Vision tasks. This research aims to refine the application of foundation models, ensuring they are more versatile and practical for diverse needs.
My publication list can be found here.

Why do you care about AI Existential Safety?

I had extensive experience working in academic hospitals, where I interacted with clinicians and patients on a daily basis. In the clinical world, decisions were made by clinicians through conversations with patients and families, examinations (e.g. radiology, pathology, genetics), and consultation with clinicians across departments – often a complicated, concerted effort. Practicing clinicians have gone through long professional training, which entails not only profound knowledge and hands-on practice but also a solemn oath to respect human life, ethics, and the discipline of medicine. Both are fundamental to our trust in doctors. The recent surge of AI is transformative: in much fiction and many forecasts, AI is expected to take over from doctors because it can be more knowledgeable and meticulous than any human being, especially with large models. This leads to my concern about AI's path of deployment in healthcare, and more generally about existential safety. While AI is centered on optimization, healthcare transcends mere optimization, emphasizing human values. Without vigilant oversight, AI might prioritize optimization at the expense of the trust fundamental to our healthcare systems.

Please give at least one example of your research interests related to AI existential safety:

I design trustworthy AI methods that harness the power of computation and data to solve real-world clinical problems, from diagnosis to intervention. I can provide two examples of my current research related to AI existential safety:

1. Uncertainty of AI prediction
I approach trustworthiness through the lens of uncertainty. We use the classical Bayesian inference framework but adapt it to be computationally tractable for huge AI models. The idea is to infer the posterior probability distribution of the AI prediction: if it has large entropy, the prediction is highly uncertain and the results need to be carefully checked by human experts. We have very exciting results showing that our uncertainty estimation is very effective at detecting MRI segmentation errors made by a trained AI model when run on massive population datasets. This research also addresses the 'over-confidence' issue of deep learning, which is one of the fundamental threats to safety.
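
As a minimal sketch of this entropy-based flagging idea (assuming an ensemble or Monte Carlo dropout model that yields several softmax predictions per voxel; the function names, array shapes, and threshold below are illustrative assumptions, not the actual clinical pipeline):

import numpy as np

def predictive_entropy(prob_samples: np.ndarray) -> np.ndarray:
    """Entropy of the mean predictive distribution.

    Illustrative sketch only. prob_samples: softmax outputs of shape
    (n_samples, n_classes, H, W), e.g. from Monte Carlo dropout passes
    or an ensemble of models. Returns a per-pixel entropy map (H, W).
    """
    mean_probs = prob_samples.mean(axis=0)  # approximate posterior predictive
    return -(mean_probs * np.log(mean_probs + 1e-12)).sum(axis=0)

def flag_for_expert_review(prob_samples: np.ndarray, threshold: float = 0.5) -> bool:
    """Flag a scan for human review if its mean voxel entropy is high."""
    return bool(predictive_entropy(prob_samples).mean() > threshold)

# Illustrative usage with random numbers standing in for model outputs:
rng = np.random.default_rng(0)
logits = rng.normal(size=(20, 4, 64, 64))  # 20 stochastic passes, 4 classes
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
print(flag_for_expert_review(probs))

High-entropy cases are routed to human experts rather than trusted blindly, which is what counters the over-confidence problem described above.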

2. Inductive Bias into AI
A second example is my ongoing research on inductive bias. Designing an efficient way to instil inductive bias into AI is crucial to overcoming the unexplainable errors typical of purely data-driven AI black boxes. Learning rules and principles from data is far from trivial, even with extremely large models and datasets. With proper regularization, which encodes prior knowledge (e.g., human anatomy, epidemiological distribution, etc.), we can more effectively ensure the plausibility of AI-generated predictions. A simple example is that we encode an anatomical model of the heart to remove wrong AI predictions of heart shape from artefact-corrupted MRI exams.
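
A minimal sketch of how such prior knowledge can enter the training objective as a regularizer (the atlas probability map and weighting below are hypothetical placeholders for a richer anatomical shape model, not the method used in the actual work):

import torch
import torch.nn.functional as F

def atlas_prior_penalty(pred_logits: torch.Tensor, atlas_probs: torch.Tensor) -> torch.Tensor:
    """Penalize predicted soft masks that deviate from an anatomical atlas.

    Illustrative sketch only. pred_logits: (B, C, H, W) network outputs.
    atlas_probs: (C, H, W) prior probability map encoding expected anatomy
    (a hypothetical stand-in for a full anatomical shape model).
    """
    pred_probs = F.softmax(pred_logits, dim=1)
    return ((pred_probs - atlas_probs.unsqueeze(0)) ** 2).mean()

def regularized_loss(pred_logits, target, atlas_probs, lam=0.1):
    """Data-fit term plus a prior-knowledge regularizer."""
    data_term = F.cross_entropy(pred_logits, target)
    return data_term + lam * atlas_prior_penalty(pred_logits, atlas_probs)

The data term fits the labels while the prior term pulls predictions towards anatomically plausible shapes, which is the general pattern behind encoding inductive bias through regularization.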

Collectively, the two research trajectories — viewed from both posterior and prior perspectives — underscore my belief that, by integrating classical statistical learning theories, we can imbue AI with a certain mathematical rigor. This will be instrumental in addressing unforeseen risks, thus boosting the intrinsic safety of AI. I look forward to the opportunity to collaborate with the Future of Life community, to advance the frontier with like-minded, forward-thinking scientists.

Why do you care about AI Existential Safety?

AI is going to be part of decision-making in many real-life applications. Hence, it is important to develop fair and safe AI decision-making; without safety guarantees, it might even be dangerous to deploy AI in real-life applications. For example, consider large language models, which can generate very good responses and even code. How can one ensure that those responses are safe and fair? It is therefore very important to consider safety and fairness alongside building accurate models. As a second example, AI is now being used in robots and drones. The question is how one can ensure that their trajectories are safe. I am interested in developing AI algorithms that are safe.

Please give at least one example of your research interests related to AI existential safety:

My current research lies in safe reinforcement learning, or constrained Markov decision processes (CMDPs). Many constrained sequential decision-making problems, such as safe AV navigation, fair multi-agent learning, wireless network control, safe transportation control, and safe edge computing, can be cast as CMDPs. Reinforcement learning (RL) algorithms have been used to learn optimal policies for unknown unconstrained MDPs. Extending these RL algorithms to unknown CMDPs brings the additional challenge of not only maximizing the reward but also satisfying the constraints; see the standard formulation below. My research provides theoretical guarantees for RL algorithms that ensure safety while maximizing the objective, covering both online and offline settings. Recently, I developed the first model-free RL algorithm for large state spaces with a provable sample-complexity guarantee: it guarantees both feasibility and optimality using the smallest possible number of samples. My research also addresses non-stationarity; towards this end, I developed the first safe RL algorithm that provably adapts to non-stationarity in the MDP. I also developed the first safe offline RL algorithm with a provable performance guarantee, which can have tremendous impact because it learns safe policies using only an offline dataset. Finally, I am also working on distributionally safe RL algorithms that learn policies with high-probability safety guarantees in practical setups.
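
For reference, the standard CMDP formulation mentioned above (a generic statement of the problem, with c_i denoting cost functions encoding the safety or fairness constraints and d_i their budgets, not a formulation specific to any of my papers):

\begin{aligned}
\max_{\pi}\quad & \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t}\, r(s_t, a_t)\right] \\
\text{s.t.}\quad & \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t}\, c_i(s_t, a_t)\right] \le d_i, \qquad i = 1, \dots, m.
\end{aligned}

When the MDP is unknown, the learner must estimate both the reward and the cost signals from samples while keeping constraint violations controlled, which is what makes sample-efficient safe RL challenging.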

In terms of applications, my research also concerns developing safe RL algorithms for safe robot navigation. In this regard, I am collaborating with the Ohio State University on implementing such safe RL algorithms in practical setups, where we are trying to develop algorithms that do not violate safety even during training. In collaboration with Northeastern University, we are developing a safe beamforming algorithm that can quickly identify directions for sending signals while minimizing interference at neighboring terminals. The proposed approach can also adapt to time-varying channel conditions, and can hence be applied safely in next-generation wireless networks.

Why do you care about AI Existential Safety?

My concern for AI existential safety is based on the immense impact AI will have—both positive and negative. From my early work pioneering reinforcement learning from human feedback (RLHF), I’ve mostly focused on how humans can communicate what they want to AI. This is not merely a technical challenge but a moral imperative. In my research, I’ve seen firsthand how misaligned AI poses significant risks and fear the consequences of misalignment with the ever-increasing agency and capabilities of AI. Examples like reward functions in autonomous driving that misinterpret human preferences illustrate the gravity of these risks. My transition from industry to academia, specifically to focus on AI alignment, further demonstrates my concern for ensuring AI advancements contribute positively to society. I believe that without meticulous research and thoughtful consideration in designing and deploying AI systems, we risk unforeseen consequences that could have far-reaching impacts. Consequently, my dedication to AI existential safety research comes from a desire to prevent potential catastrophic outcomes and to steer AI development towards beneficial and safe applications for humanity.

Please give at least one example of your research interests related to AI existential safety:

The majority of my research career has focused on AI alignment, which appears to be critical for reducing x-risk from AI. My dissertation at UT Austin pioneered the reinforcement learning from human feedback (RLHF) approach, now a key training step for large language models (LLMs). I concluded my dissertation defense by highlighting RLHF’s potential to empower humans in teaching AI to align with their interests (video). During a postdoc at MIT, I led the first course worldwide on interactive machine learning and co-authored the field’s most cited overview. At this stage, while not explicitly focused on existential risk, I was already contemplating human-AI alignment.

Later, I returned to academia as the co-lead of a lab. I was hired to work on reinforcement learning (RL) for autonomous driving but was drawn to a niche: how the AI could know humans' driving preferences and how existing research communicates such preferences. In RL, these preferences are communicated with reward functions.

So began my current alignment research. The results so far are listed here. I have focused on human-centered questions critical to alignment but too interdisciplinary for most technical alignment researchers. I have considered how reward functions could be inferred from implicit human feedback such as facial expressions (website). I have identified issues resulting from the ad hoc trial-and-error reward function design used by most RL experts (website). And I have found that published reward functions for autonomous driving can catastrophically misspecify human preferences by ranking a policy that crashes 4000 times more often than a drunk US teenager over a policy that safely chooses not to drive (website). I also have published three papers that begin with questioning the previously unexamined assumption within contemporary RLHF of what drives humans to give preferences (see the final three papers on this page). This research has all been published in or accepted to top AI venues.

Most computer science researchers make mathematically convenient assumptions about humans without questioning them, which can provide a misaligned foundation upon which they derive algorithms that result in misaligned AI. I instead take an interdisciplinary approach—blending computer science with psychology and economics, as demonstrated by the projects above—which positions me to provide unique insights at the intersection of AI and the human stakeholders its decisions affect, insights that are complementary to and serve as a check on pure computer science research on alignment.

Why do you care about AI Existential Safety?

For the same reasons as in William MacAskill's book "What We Owe the Future".

Please give at least one example of your research interests related to AI existential safety:

I believe that it is very hard to compete with a superior intelligence, but it may be feasible to design mechanisms that make it aligned. I’m interested in regret minimization in games, as it pertains to safety and alignment.

Why do you care about AI Existential Safety?

It falls at the intersection of “one of the biggest problems of humanity” and “things I can reasonably work on”, thus, given my intellectual interests and my position, it seems the most rational and important thing to spend my time on.

Please give at least one example of your research interests related to AI existential safety:

Two examples, one conceptual, one technical.
Conceptually, I am interested in (and working on) defining and operationalizing existential risk in such a way that it can be considered in guidelines (e.g. ISO 31000:2018) for risk management systems that will inevitably be adopted by legislation (e.g. the EU AI Act). Such frameworks are usually focused only on small risks deriving from product safety regulations, and they are not conceptually fit to deal with agents and existential risk(s). This builds on years of research into jurisprudence, formal ethics, policy, and normative risk.
Technically, I am working on alignment by trying to integrate reasons (as in normative, not explanatory, reasons) in reinforcement learning. It is well-known that conduct alone is not enough to ensure alignment or compliance (cf goal misgeneralization, scheming AIs). Reasons, I argue, would provide a way to make sure that agents act in a certain way because that is what they intend to do. This builds on my years of research on deontic logic, formal ethics, and formal approaches to reasons.

Why do you care about AI Existential Safety?

I am very much involved in applications of AI, in start-ups, and in advising big tech (e.g. Google, Apple, Microsoft). I also teach AI to graduate students at Oxford – a world leader in this area – where I run a large machine learning group. In all these activities I would like to raise the profile of AI safety.

Please give at least one example of your research interests related to AI existential safety:

I have always been interested in social equality and social justice. The coming AI revolution, combined with a revolution in robotics, will radically change the balances of power in society, with the danger that we might slip into a totalitarian state or a big-tech oligarchy. Through my applications of AI, start-ups, collaborations with big tech (e.g. Google, Apple, Microsoft), my teaching at Oxford, and the large machine learning group I run there, I would like to raise the profile of AI safety and work towards a future in which the benefits of AI are fully realised for all of society.

Imane Bello is FLI’s Multilateral Engagement Lead. Previously, Ima worked as a legal and policy counsel on AI. In this capacity, she advised ML companies, NGOs and other stakeholders on compliance in artificial intelligence (governance, risk management, ethics and human rights).

Ima holds the Paris Bar Exam Certificate, a master’s degree in Global Governance Studies from Sciences Po Paris and bachelor’s degrees in social sciences, law and political sciences from Sciences Po Paris, Nancy II and the Free University of Berlin. She speaks French, English, German and a decent amount of Spanish.

One year ago today, the Future of Life Institute put out an open letter that called for a pause of at least six months on “giant AI experiments” – systems more powerful than GPT-4. It was signed by more than 30,000 individuals, including pre-eminent AI experts and industry executives, and made headlines around the world. The letter represented the widespread and rapidly growing concern about the massive risks presented by the out-of-control and unregulated race to develop and deploy increasingly powerful systems.

These risks include an explosion in misinformation and digital impersonation, widespread automation condemning millions to economic disempowerment, enablement of terrorists to build biological and chemical weapons, extreme concentration of power into the hands of a few unelected individuals, and many more. These risks have subsequently been acknowledged by the AI corporations’ leaders themselves in newspaper interviews, industry conferences, joint statements, and U.S. Senate hearings. 

Despite admitting the danger, aforementioned AI corporations have not paused. If anything they have sped up, with vast investments in infrastructure to train ever-more giant AI systems. At the same time, the last 12 months have seen growing global alarm, and calls for lawmakers to take action. There has been a flurry of regulatory activity. President Biden signed a sweeping Executive Order directing model developers to share their safety test results with the government, and calling for rigorous standards and tools for evaluating systems. The UK held the first global AI Safety Summit, with 28 countries signing the “Bletchley Declaration”, committing to cooperate on safe and responsible development of AI. Perhaps most significantly, the European Parliament passed the world’s first comprehensive legal framework in the space – the EU AI Act.

These developments should be applauded. However, the creation and deployment of the most powerful AI systems is still largely ungoverned, and rushes ahead without meaningful oversight. There is still little-to-no legal liability for corporations when their AI systems are misused to harm people, for example in the production of deepfake pornography. Despite conceding the risks, and in the face of widespread concern, Big Tech continues to spend billions on increasingly powerful and dangerous models, while aggressively lobbying against regulation. They are placing profit above people, while often reportedly viewing safety as an afterthought.

The letter’s proposed measures are more urgent than ever. We must establish and implement shared safety protocols for advanced AI systems, which must in turn be audited by independent outside experts. Regulatory authorities must be empowered. Legislation must establish legal liability for AI-caused harm. We need public funding for technical safety research, and well-resourced institutions to cope with incoming disruptions. We must demand robust cybersecurity standards, to help prevent the misuse of said systems by bad actors.

AI promises remarkable benefits – advances in healthcare, new avenues for scientific discovery, increased productivity, and more. However, there is no reason to believe that vastly more complex, powerful, opaque, and uncontrollable systems are necessary to achieve these benefits. We should instead identify and invest in narrow and controllable general-purpose AI systems that solve specific global challenges.

Innovation needs regulation and oversight. We know this from experience. The establishment of the Federal Aviation Administration facilitated convenient air travel while ensuring that airplanes are safe and reliable. On the flipside, the 1979 meltdown at the Three Mile Island nuclear reactor effectively shuttered the American nuclear energy industry, in large part due to insufficient training, safety standards and operating procedures. A similar disaster would do the same for AI. We should not let the haste and competitiveness of a handful of companies deny us the incredible benefits AI can bring.

Regulatory progress has been made, but the technology has advanced faster. Humanity can still enjoy a flourishing future with AI, and we can realize a world in which its benefits are shared by all. But first we must make it safe. The open letter referred to giant AI experiments because that’s what they are: the researchers and engineers creating them do not know what capabilities, or risks, the next generation of AI will have. They only know they will be greater, and perhaps much greater, than today’s. Even AI companies that take safety seriously have adopted the approach of aggressively experimenting until their experiments become manifestly dangerous, and only then considering a pause. But the time to hit the car brakes is not when the front wheels are already over a cliff edge. Over the last 12 months developers of the most advanced systems have revealed beyond all doubt that their primary commitment is to speed and their own competitive advantage. Safety and responsibility will have to be imposed from the outside. It is now our lawmakers who must have the courage to deliver – before it is too late.
