#004 — Perception

Perception is the interpretation of sensory input. It feels like a recording but is closer to a guess, both for human eyes and for machine vision. Both can be fooled in systematic ways, and AI now generates a growing share of what humans see, which adds a third layer to the failure modes.

The pattern: confident interpretation of sensory input that diverges from reality. Humans see things that aren’t there and miss things that are. Machines do the same — and the things they generate now exploit both sides simultaneously.

🧠 In humans

Perception isn’t a recording of the world; it’s a best-guess synthesis under heavy priors. When the priors collide with the data, you get optical illusions: the Müller-Lyer arrows, the Kanizsa triangle, the Rotating Snakes illusion. These persist even after you know the trick; perception doesn’t update on knowledge.
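One way to make "best guess under heavy priors" concrete is the textbook Gaussian case, where a prior expectation and a noisy observation are averaged according to how reliable each one is. This is a minimal sketch under that assumption; the function name and the numbers are illustrative, not a model of any particular illusion.

```python
def posterior_estimate(prior_mean, prior_var, obs, obs_var):
    """Combine a Gaussian prior with one Gaussian observation.

    Returns the posterior mean and variance. All names and values here
    are hypothetical and chosen only for illustration.
    """
    # Weight on the observation: large when the prior is vague
    # (big prior_var) or the sensor is precise (small obs_var).
    w_obs = prior_var / (prior_var + obs_var)
    mean = (1 - w_obs) * prior_mean + w_obs * obs
    var = prior_var * obs_var / (prior_var + obs_var)
    return mean, var


# A confident prior (variance 1.0) meets a noisy observation (variance 4.0).
mean, var = posterior_estimate(prior_mean=0.0, prior_var=1.0, obs=10.0, obs_var=4.0)
print(mean, var)
```

With these numbers the observation says 10, the prior says 0, and the estimate lands at 2: the percept sits much closer to the expectation than to the data. In this toy picture, an illusion is what that weighting looks like when the prior is wrong for the scene in front of you.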

A more consequential family is inattentional blindness and its cousin, change blindness. Simons & Chabris’s “invisible gorilla” study (1999) showed that many observers counting basketball passes failed to notice a person in a gorilla suit walking through the scene. Simons & Levin (1998) had the experimenter replaced by a different person mid-conversation, during a brief interruption; roughly half the participants didn’t notice. The world that reaches awareness has already been heavily filtered.

And then there’s pareidolia: the visual system is so tuned for faces, animals, and agents that it finds them in clouds, toast, and Martian topography. The false-positive rate on agency-detection is a feature, not a bug — it was cheaper, evolutionarily, to flinch at a stick than to be eaten by the snake.
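The "feature, not a bug" argument is an expected-cost calculation, and it can be sketched in a few lines. The costs below are invented for illustration (a wasted flinch costs 1 unit, a missed snake costs 1000); only the asymmetry matters.

```python
def should_flinch(p_snake, cost_false_alarm, cost_miss):
    """Return True when flinching has the lower expected cost."""
    # Flinching only costs something when there was no snake.
    expected_cost_flinch = (1 - p_snake) * cost_false_alarm
    # Ignoring only costs something when there was a snake.
    expected_cost_ignore = p_snake * cost_miss
    return expected_cost_flinch < expected_cost_ignore


def flinch_threshold(cost_false_alarm, cost_miss):
    """Probability of 'snake' above which flinching is the cheaper action."""
    return cost_false_alarm / (cost_false_alarm + cost_miss)


# Hypothetical costs: a false alarm is cheap, a miss is catastrophic.
print(flinch_threshold(cost_false_alarm=1.0, cost_miss=1000.0))
print(should_flinch(p_snake=0.01, cost_false_alarm=1.0, cost_miss=1000.0))
```

Under these assumed costs the break-even probability is roughly one in a thousand, so a detector tuned to minimize expected cost will fire on evidence that is wrong about 99% of the time. A high false-positive rate is the expected behavior of such a system, not a malfunction.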

Canonical experiments: Simons & Chabris (1999); Simons & Levin (1998); Rensink, O’Regan & Clark (1997) on change blindness; McGurk & MacDonald (1976) on audio-visual integration (the McGurk effect).

🤖 In machines

Computer vision has its own taxonomy of perceptual failure. Adversarial examples (Szegedy et al., 2014; Goodfellow et al., 2015) showed that perturbations too small for a human eye to notice could flip an image classifier from “panda” to “gibbon” with 99% confidence. The pattern was not a bug in a particular model but a feature of high-dimensional classifiers trained on finite data.
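The high-dimensional part of that claim can be seen in a toy setting. The sketch below applies one step of the fast gradient sign method (the attack introduced in Goodfellow et al., 2015) to a hand-built logistic-regression "classifier" with 10,000 inputs. The weights, inputs, and epsilon are made up for illustration; this is not the ImageNet model from the paper.

```python
import math


def predict(weights, bias, x):
    """Probability of class 1 under a logistic-regression model."""
    logit = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


def sign(value):
    """Return -1, 0, or 1 according to the sign of value."""
    return (value > 0) - (value < 0)


def fgsm(weights, bias, x, label, eps):
    """One FGSM step: x + eps * sign(d loss / d x), cross-entropy loss."""
    p = predict(weights, bias, x)
    # For logistic regression, d(cross-entropy)/d(x_i) = (p - label) * w_i.
    grad = [(p - label) * w for w in weights]
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]


# Hypothetical toy model: 10,000 "pixels" with tiny alternating weights.
dim = 10_000
weights = [0.01 if i % 2 == 0 else -0.01 for i in range(dim)]
bias = 0.0
x = [0.51 if i % 2 == 0 else 0.49 for i in range(dim)]  # clean input, class 1

eps = 0.02  # maximum change per pixel on a 0-to-1 scale
x_adv = fgsm(weights, bias, x, label=1, eps=eps)

p_clean = predict(weights, bias, x)
p_adv = predict(weights, bias, x_adv)
print(round(p_clean, 3), round(p_adv, 3))
```

No pixel moves by more than 0.02, yet the prediction crosses the decision boundary, because ten thousand tiny nudges all push the logit in the same direction. That accumulation is the mechanism Goodfellow et al. point to, shown here only in its simplest linear form.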

Texture bias (Geirhos et al., 2019) revealed that CNNs were largely classifying by surface texture rather than shape — an elephant’s skin pattern stretched over a cat silhouette gets called “elephant.” This wasn’t visible until someone looked for it; the models worked well on benchmarks for reasons orthogonal to how humans recognize objects.

In the wild, distribution shift is the more common failure: a model trained on clean datasets fails on rotated, occluded, low-light, or unusually-composed real-world images. Vision-language models add a new attack surface — image-encoded prompt injection, where instructions hidden in pixels override the model’s safety training.

Canonical papers: Szegedy et al. (2014); Goodfellow, Shlens & Szegedy (2015); Geirhos et al. (2019); Bagdasaryan et al. (2023) on multimodal prompt injection.

🤝 In hybrid systems — AI as perceptual mediator

This is where the frame shifts. With Calibration, the hybrid case was emergent failure between human and machine working on the same task. With Perception, the hybrid case is something different and arguably more consequential: AI now produces and filters a growing share of what humans perceive.

Deepfakes are the cleanest example. Synthetic faces, voices, and video aren’t a failure of human perception or machine perception alone — they’re a failure of the system that AI generation creates: AI exploiting human perceptual priors (faces, voices, plausible motion) with content that other AIs are barely able to detect. The 2024 election cycle saw both real-world deception and the inverse problem now called the “liar’s dividend” — genuine recordings dismissed as deepfakes because the category exists.

In medical imaging, AI-assisted radiology was supposed to reduce miss rates. Several studies (including Lehman et al., 2019) have found the picture is messier: human readers who use AI assistance can become less sensitive to abnormalities the AI flags inconsistently, and can miss findings the AI fails to highlight. The human’s perceptual attention is partly outsourced — and the outsourcing’s gaps become the system’s gaps.

Autonomous driving has its own version: humans-in-the-loop fail to override AI misperceptions in time precisely because the AI is correct most of the time. Attentional vigilance decays under reliable automation, which is exactly when the rare misperception lands.

And there’s a slower, less dramatic failure: observational deskilling. Tourists experience landmarks through phone screens and AI summaries; satellite analysts learn to trust segmentation models and lose the ability to read raw imagery; birdwatchers offload identification to Merlin and stop building the priors that made identification possible. AI doesn’t fool perception here — it stands in for it.

Canonical studies: Lehman et al. (2019) on AI-assisted mammography; Bansal et al. (2021) on complementarity in human-AI teams; Vaccaro & Waldo (2019) on automation-induced attentional decay; recent work on the liar’s dividend in political communication.

↔ Where they converge

All three fail confidently. Human perception, machine vision, and AI-mediated perception all report their outputs as straightforward observation rather than synthesis.
All three are prior-driven. Humans see faces in noise because evolution; CNNs see textures because training data; AI-mediated systems propagate the priors of whichever model is in the loop.
All three have an adversarial regime — small, targeted perturbations that exploit the synthesis. Optical illusions, adversarial examples, and deepfakes are the same family of attack at different layers.

⤨ Where they diverge

Human illusions are largely involuntary and shared; you and I see the Müller-Lyer arrows the same way. Machine adversarial examples are crafted against a particular model, though many transfer across architectures; they do not transfer across modalities. Hybrid failures are situational: they depend on the deployment context (medical, legal, civic) in ways the component failures don’t.
The human and machine cases are bounded by the agent’s own perceptual system. The hybrid case has no such bound: synthetic media is a new perceptual category that didn’t exist before, and the failure mode is generative, not just receptive.
For the first two, you can train the agent to be more robust. For the third, robustness is a societal problem (provenance standards, watermarking, media literacy), not an agent-level one.

🌀 Open question

We have decades of work on human perceptual robustness and a growing literature on machine adversarial robustness. We have almost nothing on system-level perceptual robustness — how a society of humans + AI tools + AI-generated content maintains contact with reality at all. The C2PA provenance standard, watermarking proposals, and detection-arms-race papers are early attempts, none yet decisive.
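For readers unfamiliar with what "provenance" means mechanically, the core idea is to bind a signed claim to the exact bytes of a piece of media so that any later edit is detectable. The sketch below is a toy illustration of that idea only. It is not the C2PA format: C2PA uses certificate-based signatures and a structured manifest, whereas this example uses a shared-secret HMAC from the Python standard library, and every name in it is hypothetical.

```python
import hashlib
import hmac


def make_claim(media_bytes, signing_key):
    """Hash the media and sign the hash (toy stand-in for a provenance claim)."""
    digest = hashlib.sha256(media_bytes).hexdigest()
    signature = hmac.new(signing_key, digest.encode(), hashlib.sha256).hexdigest()
    return {"sha256": digest, "signature": signature}


def verify_claim(media_bytes, claim, signing_key):
    """Check that the media still matches the claim and the claim is authentic."""
    digest = hashlib.sha256(media_bytes).hexdigest()
    expected = hmac.new(signing_key, digest.encode(), hashlib.sha256).hexdigest()
    return digest == claim["sha256"] and hmac.compare_digest(
        expected, claim["signature"]
    )


key = b"hypothetical-publisher-key"     # assumption: a secret held by the publisher
original = b"raw bytes of a photograph"  # placeholder for real image data
claim = make_claim(original, key)

tampered = original + b"\x00"            # any change to the bytes, however small
print(verify_claim(original, claim, key), verify_claim(tampered, claim, key))
```

The check answers a narrow question: are these the bytes the signer vouched for? It says nothing about whether the signer told the truth, and it fails closed on any re-encoding or crop, which is part of why provenance alone has not settled the problem.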

The deepest version of the question: if a growing fraction of what people perceive is AI-mediated, what’s the perceptual analog of epistemic hygiene at a civilizational scale? (Open as of mid-2026.)

📡 Recent entries (auto-fed)

Week 2026-W21

2026-05-18 — An AI-themed investment scam campaign was tracked across more than 15,000 cloaked domains, abusing the Keitaro ad-tracking platform to route victims toward deepfaked impersonations of trusted public figures. The operation illustrates deepfake fraud now running at infrastructure scale, with cloaking used to evade security tooling while reaching ordinary users (AIID #1236).

2026-05-18 — A woman in Guelph, Ontario lost $14,000 in cryptocurrency after clicking a social media ad featuring an apparent Mr. Beast endorsement of an investment scheme, which police attributed to a fraudulent celebrity endorsement. The case is a concrete instance of synthetic or impersonated-celebrity media converting directly into financial loss for a non-expert viewer (AIID #1485).

2026-05-10  arXiv:2505.xxxxx     "Vision-language models inherit the texture bias of their image encoders"
2026-05-07  AI Incident DB #862  Facial recognition false positive leads to wrongful arrest (Detroit, 4th case)
2026-05-02  Reuters/Factcheck    "Liar's dividend used to dismiss authentic hospital footage from Sudan"
2026-04-28  C2PA quarterly       Provenance adoption rates across major platforms
2026-04-19  Tesla NHTSA filing   Autopilot perception failure in low-sun conditions, three incidents

Week 2026-W22

2026-05-22 — Sequential Difference Maximization reframes the adversarial objective as maximizing the gap between the top non-ground-truth class probability and the ground-truth probability, with a three-layer optimization loop that reports stronger attack success against image classifiers than prior gradient-based baselines (arXiv:2605.20308).

2026-05-25 — AI image-authenticity tools labeled a genuine photograph of freshly dug graves, reportedly for schoolgirls killed in a bombing in Iran, as not real, illustrating how forensic classifiers can launder documentary evidence into doubt (AIID #1493).

2026-05-25 — An Iranian outlet circulated an AI-generated satellite image purporting to show a destroyed US base in Qatar during US-Iran wartime tensions, with the fabricated imagery consumed as authentic and demonstrating synthetic overhead imagery as an active disinformation vector (AIID #1494).

2026-05-25 — A pro-Iran propaganda network circulated a grainy video appearing to show blindfolded young girls paraded past Donald Trump, intercut with Trump-Epstein footage, logged as synthetic or manipulated media gaining traction inside an influence operation (AIID #1496).

2026-05-25 — Federal prosecutors charged two men with using AI to generate nude videos and photos of female celebrities, among the first cases brought under a newly enacted statute targeting deepfake pornography (AIID #1500).

2026-05-25 — An emerging-markets lead received a WhatsApp message purporting to be from a colleague, followed by a video call showing a deepfaked likeness of that colleague, with subtle wrongness during the exchange prompting suspicion rather than compliance (AIID #1502).