Forecasting potential misuses of language models (Generative AI) for disinformation campaigns, and how to reduce risk.
Purpose of the research: OpenAI and the other co-authors believe it is critical to analyze the threat of AI-enabled influence operations and outline steps that can be taken before language models are used for influence operations at scale. They hope their research will inform policymakers who are new to the AI or disinformation fields, and spur in-depth research into potential mitigation strategies for AI developers, policymakers, and disinformation researchers.
Background: In recent years, artificial intelligence (AI) systems have significantly improved and their capabilities have expanded. In particular, AI systems called “generative models” have made great progress in automated content creation, such as images generated from text prompts. One area of particularly rapid development has been generative models that can produce original language, which may have benefits for diverse fields such as law and healthcare.
However, there are also possible negative applications of generative language models, or “language models” for short. For malicious actors looking to spread propaganda—information designed to shape perceptions to further an actor’s interest—these language models promise to automate the creation of convincing and misleading text for use in influence operations, rather than relying on human labor. For society, these developments bring a new set of concerns: the prospect of highly scalable—and perhaps even highly persuasive—campaigns by those seeking to covertly influence public opinion.
This report aims to assess how language models might change influence operations and what steps can be taken to mitigate these threats. The task is inherently speculative, as both AI and influence operations are changing quickly.
Summary Report here.
Full Report here.
All of OpenAI’s excellent research reports here.