In our daily lives, making mistakes is as human as breathing. We forget appointments, misread instructions, or occasionally put salt in our coffee instead of sugar. Over thousands of years, we’ve developed sophisticated systems to catch and correct these very human errors – from the simple “measure twice, cut once” to complex legal appeals processes.
But as artificial intelligence increasingly weaves itself into the fabric of our society, we’re encountering a fundamentally different kind of mistake-maker. While both humans and AI systems make errors, the nature of these mistakes reveals fascinating patterns of both similarity and difference. Understanding these patterns is crucial as we integrate AI into critical aspects of our lives.
The Predictable Unpredictability of Human Error
Human mistakes follow patterns we’ve learned to anticipate. Think about someone solving calculus problems – their errors typically cluster around the edges of their knowledge. When we’re tired, stressed, or distracted, we’re more likely to make mistakes. Most importantly, humans usually know when they’re out of their depth. Someone struggling with calculus will readily admit their uncertainty about complex mathematical concepts.
These predictable patterns have allowed us to develop effective safeguards. Hospitals mark surgical sites to prevent wrong-side operations. Casinos rotate dealers to maintain alertness. Double-entry bookkeeping catches financial errors. These systems work because they’re designed around how humans typically fail.
AI’s Bizarre Blunders
Artificial intelligence, particularly large language models (LLMs), presents a unique set of challenges. AI mistakes don’t follow the familiar human patterns. Instead of clustering around difficult topics, they appear randomly across all subjects. An AI might perfectly solve complex equations one moment, then confidently declare that cabbages eat goats the next.
Perhaps most disconcertingly, AI systems maintain unwavering confidence even when spectacularly wrong. Unlike a human who might hesitate or express doubt, an AI will state absurdities with the same assurance as established facts. This combination of randomness and unshakeable confidence makes it particularly challenging to trust AI systems with complex, multi-step tasks.
Surprising Similarities: Where Human and AI Errors Align
Despite their differences, AI and human errors share some intriguing commonalities:
- Context Sensitivity: Just as humans give different answers depending on how a question is phrased in surveys, AI exhibits “prompt sensitivity” – producing varying responses based on slight changes in input.
- Availability Bias: Both humans and AI tend to default to familiar or frequently encountered information. When asked about locations, for example, AI models often fall back on familiar places like “America” even when the question concerns somewhere far less common. This mirrors how humans reach for the most readily available examples rather than thinking through all possibilities – just as we might immediately recall recent news stories when asked about current events, AI systems gravitate toward the most common examples in their training data.
- Memory Patterns: Like humans, AI models often remember information from the beginning and end of long documents better than details in the middle – a pattern similar to human memory’s primacy and recency effects. Encouragingly, researchers have found that training models on more examples of information retrieval from long texts leads to more uniform performance across document lengths (a simple probe for this effect is sketched just after this list).
- Incomplete Information: Both humans and AI systems can make errors when working with partial or ambiguous information, though they handle this uncertainty differently.
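To make the memory-pattern point above concrete, here is a minimal sketch of a positional-recall probe: it plants the same fact at the start, middle, and end of a long filler document and checks whether a model retrieves it from each position. The ask_model function is a placeholder for whatever LLM API you actually use, and the filler text, fact, and keyword check are purely illustrative.

```python
FACT = "The access code for the archive room is 7342."
QUESTION = "What is the access code for the archive room?"
FILLER_SENTENCES = ["The quarterly report discussed routine logistics and scheduling."] * 200

def ask_model(prompt: str) -> str:
    """Placeholder: in practice, send `prompt` to a real LLM API and return its reply."""
    return ""  # stubbed so the sketch runs end to end; swap in an actual model call

def build_document(position: str) -> str:
    """Insert the target fact at the requested position within the filler document."""
    sentences = list(FILLER_SENTENCES)
    index = {"start": 0, "middle": len(sentences) // 2, "end": len(sentences)}[position]
    sentences.insert(index, FACT)
    return " ".join(sentences)

def recalled(position: str) -> bool:
    """Ask the model about the planted fact and check whether its answer contains it."""
    prompt = build_document(position) + "\n\nQuestion: " + QUESTION
    return "7342" in ask_model(prompt)

for position in ("start", "middle", "end"):
    print(position, "recalled:", recalled(position))
```

Against a real model, a dip in recall for the “middle” position would reproduce the lost-in-the-middle pattern described above; uniform results would suggest the model handles long contexts more evenly.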
The Peculiar World of AI Vulnerabilities
One area where AI mistakes diverge dramatically from human ones is in how they can be manipulated or “jailbroken” – convinced to bypass their built-in safety constraints. While some jailbreaking techniques mirror human social engineering (like pretending to be an authority figure or claiming something is “just a joke”), others are uniquely AI-specific.
For instance, researchers discovered that presenting questions in ASCII art – pictures made from text characters – could bypass AI safety filters. An AI might refuse to explain how to build dangerous devices when asked directly but willingly provide the information when the same question is presented as a pattern of symbols. No human would be fooled by such a simple transformation, highlighting the alien nature of AI cognition.
Building Better Guardrails
This mix of familiar and alien error patterns suggests two potential paths forward. We could either:
- Engineer AI systems to make more human-like mistakes, making them easier to catch with existing safeguards
- Develop entirely new error-detection systems tailored to AI’s unique failure modes
Some progress is already being made on both fronts. Reinforcement learning from human feedback (RLHF) – the technique behind ChatGPT’s success – helps align AI behavior more closely with human expectations. Meanwhile, new AI-specific error-checking methods are emerging, such as asking the same question multiple times in different ways and cross-referencing the responses, as sketched below.
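As a rough illustration of that second idea, here is a minimal sketch that poses the same question in several phrasings and compares the answers: broad agreement raises confidence, while disagreement flags the response for human review. The ask_model function is a stand-in for whatever model API you actually call, and the 70% agreement threshold is an arbitrary illustrative choice.

```python
from collections import Counter

def ask_model(prompt: str) -> str:
    """Placeholder: in practice, send `prompt` to a real LLM API and return its reply."""
    return ""  # stubbed so the sketch runs end to end; swap in an actual model call

def cross_check(phrasings: list[str], agreement_threshold: float = 0.7):
    """Ask every phrasing, then report the most common answer and how often it appeared."""
    answers = [ask_model(p).strip().lower() for p in phrasings]
    top_answer, count = Counter(answers).most_common(1)[0]
    agreement = count / len(answers)
    return top_answer, agreement, agreement >= agreement_threshold

variants = [
    "What year did Apollo 11 land on the Moon?",
    "In which year did humans first walk on the lunar surface?",
    "Apollo 11 touched down on the Moon in what year?",
]
answer, agreement, trusted = cross_check(variants)
print(f"answer={answer!r} agreement={agreement:.0%} trusted={trusted}")
```

Disagreement across phrasings doesn’t prove an answer is wrong, but it is a cheap signal that the response deserves a second look before anyone acts on it.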
Moving Forward Thoughtfully
As we continue integrating AI into our world, understanding these similarities and differences in error patterns becomes crucial. While humans occasionally make random or incomprehensible mistakes, such behavior is rare and often signals deeper issues. We typically don’t put people exhibiting such inconsistency in charge of important decisions.
The same principle should apply to AI systems. We must carefully match AI applications to their actual capabilities, always keeping in mind both the predictable and unpredictable ways they might fail. As these systems become more sophisticated, we’ll need to develop new frameworks for understanding, predicting, and preventing their mistakes – frameworks that account for both the human-like and uniquely artificial aspects of AI errors.
The future of human-AI collaboration depends not just on improving AI capabilities, but on deeply understanding how and why these systems fail. Only then can we build the safeguards needed to use them responsibly and effectively.
References
- AI Mistakes Are Very Different From Human Mistakes – https://spectrum.ieee.org/ai-mistakes-schneier
- How are Prompts Different in Terms of Sensitivity? – https://arxiv.org/pdf/2311.07230
- Do Large Language Models Show Decision Heuristics Similar to Humans? A Case Study Using GPT-3.5 – https://arxiv.org/pdf/2305.04400
- LLM In-Context Recall is Prompt Dependent – https://arxiv.org/html/2404.08865v1