Who is Responsible for AI-Generated Harm: Users or Creators?

On the final day of February, Gary Marcus of NYU published an essay titled “The Threat of Automated Misinformation is Only Getting Worse.” He highlighted how easily misinformation can be generated with Bing using the right prompts. Shawn Oakley, whom Marcus describes as a “jailbreaking expert,” asserted that basic techniques are enough to exploit this capability, a sign that the risk of large-scale AI-generated misinformation is indeed growing.

Marcus shared his insights on Twitter, prompting a response from Mike Solana of Founders Fund. As I read Solana’s sarcastic reply, his point is that calling an AI model a dangerous misinformation machine is flawed if you have to deliberately circumvent its safeguards to make it produce misinformation. The issue, he implies, lies not with the tool itself but with its misuse, so the blame falls on the user rather than on the company that built the tool. His comparison between Bing Chat and a text editor misses a crucial difference: Microsoft Word cannot generate persuasive misinformation at scale; Bing can.

Beneath the insults lies a deeper debate that deserves careful consideration. At its core, this is about identifying who bears responsibility for AI-generated harm. Solana’s tweet suggests that users hold all the blame: ChatGPT and Bing are merely tools, and misusing them is no different from misusing any other tool (like a text editor). Conversely, Marcus argues that companies must accept responsibility for releasing products that are not adequately prepared, which complicates efforts to combat online misinformation.

I align with Marcus's viewpoint but believe it's unwise to completely dismiss Solana’s perspective, even if his argument is weak. Blaming either users or companies in absolute terms is unproductive. To better frame the issue of AI-generated harm, we need a balanced analysis, which I aim to provide here. The question I wish to address is: “What must we establish to definitively determine who is at fault when harm occurs from using an AI system?”

I propose a two-level evaluation: first, examining the nature of AI systems (“Are AI systems comparable to other tools?”), and second, considering the usage of these systems (“What constitutes proper or improper use of AI?”). Let's delve into this.

AI Systems vs. Non-AI Tools: The Absence of Guidance

Solana’s argument falters partly because Bing Chat and Microsoft Word are not directly comparable. While one can write misinformation in a text document and publish it, the capacity to do so at scale and with persuasive power—as Marcus notes—is a defining characteristic of AI tools.

How unique are these tools, really? ChatGPT is hardly the only tool that can cause harm; cars are another example (even if not at scale). Why do we not hold car manufacturers accountable? Although the comparison is imperfect, it raises an important question: why do we criticize OpenAI for commercializing ChatGPT but not Toyota for producing vehicles that result in numerous fatalities every day?

The key difference is that, unlike with many consumer products (such as cars), generative AI systems lack a manual—there is no intended use, no guidance on appropriate application, and no clear instructions for avoiding misuse.

If someone is harmed by a car, we can typically point to a misuse: the car has an intended function, and there are rules for operating it. In contrast, systems like ChatGPT are released without clear expectations. They can be beneficial or harmful, but companies do not explicitly communicate how they should be used; they prefer users to discover this themselves so the companies can gather valuable real-world feedback. Cars come with manuals that explain how they work, and we receive training to drive safely and avoid causing harm.

This is not the situation with ChatGPT. Users cannot reliably predict what ChatGPT will output for a given prompt: the system’s opacity hinders understanding, and the inherent randomness in how it selects each token makes accurate prediction impossible. Nor can users foresee its failure modes. What would happen if a car manufacturer sold vehicles whose behavior no one could predict? Or cars with faulty brakes or unreliable steering?
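To make the point about randomness concrete, here is a minimal, hypothetical sketch of temperature-based token sampling, the general mechanism by which language models pick the next word. The vocabulary and scores below are invented for illustration, and this is not OpenAI’s actual implementation; it only shows why the same prompt can yield different outputs on different runs.

```python
import math
import random

def sample_next_token(logits, vocab, temperature=0.8):
    """Sample one token from a softmax distribution over hypothetical logits.

    With temperature > 0 the choice is stochastic, so repeated calls
    with identical inputs can return different tokens.
    """
    # Scale logits by temperature: lower values sharpen the distribution,
    # higher values flatten it (more randomness).
    scaled = [score / temperature for score in logits]
    # Softmax (subtract the max for numerical stability).
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Draw one token according to the probabilities.
    return random.choices(vocab, weights=probs, k=1)[0]

# Toy example: the same "prompt" (fixed logits) yields varying continuations.
vocab = ["reliable", "unpredictable", "harmless", "dangerous"]
logits = [2.0, 1.8, 0.5, 0.3]  # made-up model scores, not real ones

for run in range(5):
    print(run, sample_next_token(logits, vocab))
```

Run it a few times and the printed tokens change from run to run; only as the temperature approaches zero does the sampling become effectively greedy and deterministic. This is a simplified picture of why users cannot anticipate exactly what a given prompt will produce.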

The absence of a manual for generative AI models does not stem from companies' reluctance to create one; rather, they lack the knowledge to do so. On March 3, MIT Technology Review published an exclusive interview with ChatGPT's creators, revealing that they did not anticipate either ChatGPT's success or the extent of the issues it would present to millions of users. One striking comment from Jan Leike was:

“I think it’s very difficult to really anticipate what the real safety problems are going to be with these systems once you’ve deployed them. So we are putting a lot of emphasis on monitoring what people are using the system for, seeing what happens, and then reacting to that. This is not to say that we shouldn’t proactively mitigate safety problems when we do anticipate them. But yeah, it is very hard to foresee everything that will actually happen when a system hits the real world.”

Car manufacturers understand precisely where each component belongs, the correct tire pressure, and the importance of seat belts. AI firms, however, cannot answer fundamental questions such as “What are the failure modes, and under what conditions do they occur?” I acknowledge that generative AI is a new technology. When automobiles were first invented, their limitations were unknown. However, car manufacturers did not grant universal access to their products without a comprehensive understanding of the potential risks involved—our understanding of cars developed alongside their use.

In contrast, AI firms have made generative AI tools widely accessible from the start. ChatGPT was available to everyone from day one, only to be jailbroken shortly thereafter. Microsoft’s Sydney surfaced unexpectedly just as Bing Chat was being announced for imminent rollout “to millions.” And Meta’s Galactica demo had to be taken down after three days, the backlash revealing a disconnect between its stated capabilities and reality.

It is unreasonable to expect users to work out the rules through trial and error. The attempts at writing comprehensive manuals (e.g., by Gwern and Janus) only underscore how complex these models are. AI companies should reconsider releasing products that demand a depth of expertise they themselves do not provide.

Good Use vs. Misuse: The Need for Regulation

Even if such a manual existed, it would not fully resolve this debate. A driving manual would not prevent someone from deliberately causing harm with a vehicle. The manual merely establishes the parameters for using the object and its intended function. Another critical element that, alongside a well-crafted manual, can help determine fault in any situation is regulation.

The law clearly states that if you intentionally harm someone with your vehicle, you are at fault, not the car manufacturer. But what about AI? How can we discuss appropriate versus inappropriate use of generative AI models when regulations are mostly absent and there is no established jurisprudence? When Microsoft’s Sydney misled and threatened beta testers, was that a failure of the user or the company for not preventing it? Without defined rules, no one can answer.

This lack of regulation extends beyond usage. Companies can collect data via web scraping, design unreliable products, and sell them without clarity on their applications: Should we permit users to develop emotional attachments to virtual avatars owned by companies that can terminate them at any time? Should individuals without access to traditional medicine be directed to an “AI medical advisor”?

No frameworks currently exist to hold companies accountable. Legislative proposals like the EU AI Act or the US Blueprint for an AI Bill of Rights are in the works but have not yet been enacted. Lawsuits against companies like Stability AI and Midjourney are emerging, but resolving them may take years.

As AI technology advances rapidly, legal frameworks lag behind. During the interval between the emergence of these AI models and the establishment of proper regulations, companies and their products operate in a realm of non-accountability. Regulation is essential to define the boundaries of AI systems, enabling us to categorize use as either appropriate or inappropriate.

Conclusion

Reflecting on the initial question: “What must we do to accurately ascertain who is responsible when harm results from using an AI system?” My response is that companies should provide users with manuals detailing proper and improper uses of their products, along with their limitations. Additionally, policymakers should implement suitable regulations to hold both companies and users accountable based on the circumstances.

Returning to the original debate that inspired this essay clarifies why Solana’s comparison is limited (not solely due to scale issues but also because ChatGPT lacks a user manual, unlike text editors) and why Marcus stresses corporate responsibility, even if users may share some blame. Without regulation, companies can operate without constraints.

Under the conditions I propose, people could fairly compare AI models to other tools and attribute blame appropriately: to companies in cases of poor design, to users in cases of misuse. In other words, we must make it possible to qualify an AI product as deficient, or a use as improper, by establishing clear technical and legal standards. If we fail to do this, unproductive online debates will persist without any party being held accountable for AI-generated harm.
