Category: Blog

  • Investigating LLM Visibility Factors: Could the 499 Response Code Matter?

    A note on what this is (and isn’t)

    This piece is best read as a working hypothesis, not a confirmed best practice. The analysis below comes mainly from small-scale testing across our own portfolio of client sites.

    That includes one paired experiment where we left 499-related issues in place on one site and deliberately eliminated them on a comparable site of similar authority.

    It also draws on public commentary from SEO thought leaders who have been writing about AI search behavior. Sample sizes are small, attribution is messy, and the underlying systems (AI search agents, indexers, retrieval pipelines) are largely opaque. So treat what follows as a direction worth investigating in your own audits, not a settled ranking factor.

    Why we started looking at this

    If you’ve been doing technical SEO for a while, you already live in a world of crawl budgets, render budgets, Core Web Vitals, and server logs. Recent shifts in how people search (AI Overviews, ChatGPT search, Perplexity, enterprise assistants) raise a question worth taking seriously: are at least some of these systems fetching information live, under tight latency constraints, while a user is waiting for an answer?

    We don’t think the answer is uniformly yes. A lot of AI-generated answers appear to be drawn from pre-indexed content the model or its retrieval layer already has.

    But we also can’t assume it’s uniformly no. There’s enough evidence (in observed crawler behavior, in vendor documentation, and in our own logs) to suggest real-time fetching is part of the mix for at least some queries and some systems.

    That uncertainty is what made one server-log detail interesting to us:

    The 499 response code.

    Our hypothesis (still a hypothesis): Sustained 499 patterns may quietly hurt visibility in real-time AI retrieval scenarios, because your server is effectively recording that the client gave up before you finished responding. Below is what 499 is, why we think it could matter, what we’ve seen, and how to investigate this on your own properties.

    What is a 499 response code?

    499 is a non-standard status code introduced by NGINX. It means:

    Client Closed Request. The client terminated the connection before the server finished responding.

    Important nuance: your server doesn’t “send” a 499 the way it sends a 200 or 404. NGINX records a 499 when the client bailed before the response was complete.

    A second nuance worth calling out up front: 499 is an NGINX convention. If you’re on Apache, IIS, Cloudflare Workers, Vercel, Fastly, or another stack, you may not see 499 in your logs at all, although some NGINX-derived platforms do report it.

    You’ll see different signals: 408s, 503s, 524s, or just dropped connections recorded differently. The underlying behavior (client gave up mid-response) can still happen; the log code just won’t be 499.

    If your stack isn’t NGINX, the equivalent signal is whatever your edge or origin uses to record client-initiated disconnects.

    Common reasons 499 happens

    • A user hits stop or closes the tab, or their mobile connection drops.
    • A client (browser, app, bot) has a timeout and gives up.
    • An upstream system (proxy, edge, or potentially an AI fetcher) abandons the request to stay within a latency budget.
    • Your backend is slow (database, API, rendering) and the client leaves before NGINX can finish.

    In classic SEO, 499s were largely treated as a “users bounced” signal.

    The question we’re asking is whether, in a world where some retrieval traffic comes from automated agents on tight clocks, sustained 499s on the right templates also represent missed inclusion opportunities.

    Why this might be more than a server curiosity

    A reasonable counterargument to everything below is: “Most AI answers come from pre-indexed content, so live fetch latency doesn’t matter.” For a lot of queries, that’s probably right. But two things keep us from dismissing the live-fetch case:

    1. Some AI search products clearly fetch in real time, at least for certain query types. Perplexity’s product behavior, ChatGPT’s browsing/search mode, and various enterprise assistants document or visibly perform live retrieval. The proportion varies by product and query, but it isn’t zero.
    2. Even crawlers that aren’t “real-time” still have timeouts. A slow or unreliable origin can underperform in a generic index too.

    So the working assumption is: real-time retrieval is a meaningful subset of how LLMs source content, and even where it isn’t, fast-and-reliable origins help. That makes server-side timing worth paying attention to.

    The concurrent fetch pattern (where this matters most)

    Where we think 499 risk is highest is in products that do concurrent live retrieval for a single user query. Example query:

    “What’s the difference between 5G SA and NSA for industrial robotics, and what should a Canadian manufacturer buy this year?”

    A retrieval-style system might, in parallel:

    • Fetch a few authoritative explainer pages.
    • Pull vendor documentation for SA vs NSA capabilities.
    • Grab a couple of recent articles.
    • Look for a decision guide or checklist.
    • Compare latency and reliability claims across sources.

    If your page is in that candidate set but the fetch doesn’t complete inside the agent’s budget, you’re probably not in the final answer for that query.

    Being late isn’t literally the same as being missing. The agent may have alternates and may try you again later. But for that specific synthesis, you’re out.
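
    To make the budget idea concrete, here is a minimal sketch of a client that starts several fetches at once and keeps only the ones that finish in time. It is a simulation under stated assumptions, not a description of how any real AI product works: `fake_fetch`, `gather_within_budget`, the example URLs, and the delays are all hypothetical, and the "fetch" just sleeps instead of touching the network.

```python
import asyncio


async def fake_fetch(url, delay):
    """Stand-in for an HTTP request: sleeps for `delay` seconds, then returns the URL."""
    await asyncio.sleep(delay)
    return url


async def gather_within_budget(sources, budget):
    """Start every fetch at once and keep only those that finish inside `budget` seconds."""
    tasks = [asyncio.create_task(fake_fetch(url, delay)) for url, delay in sources.items()]
    done, pending = await asyncio.wait(tasks, timeout=budget)
    for task in pending:
        # The client walks away. On an NGINX origin, a connection dropped
        # like this is the kind of event that gets logged as 499.
        task.cancel()
    await asyncio.gather(*pending, return_exceptions=True)
    return sorted(task.result() for task in done)


# Hypothetical candidate set: URL -> simulated response time in seconds.
SOURCES = {
    "https://fast.example/guide": 0.01,
    "https://slow.example/hub": 0.5,
}

if __name__ == "__main__":
    print(asyncio.run(gather_within_budget(SOURCES, budget=0.1)))
```

    The slow source is not penalized in any lasting way here. It simply never makes it into the list that the synthesis step would see for this turn.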

    A simplified retrieval funnel

    A rough mental model for live retrieval looks like this:

    1. Discovery: the agent decides your URL might have the information.
    2. Fetch: it requests your page.
    3. Extract: it parses the content and pulls relevant passages.
    4. Synthesize: it combines passages from multiple sources into an answer.

      A 499 (or the equivalent on a non-NGINX stack) breaks step 2. No fetch, no extraction, no inclusion, at least for that turn. That’s not a “ranking factor” in the traditional sense, but if it happens often enough on your money pages, it’s plausibly a visibility tax.

      What we tested (and the caveats)

      Inside our own portfolio we ran an informal paired comparison:

      • Site A: a client property where we had identified a non-trivial pattern of 499s on key editorial and collection templates, and (with the client’s knowledge) left the underlying performance issues in place for a defined observation window.
      • Site B: a comparable property of similar authority, topic mix, and template structure, where we addressed TTFB, edge caching, and the specific endpoints producing 499s before the same window began.

      Over the window, Site B showed more frequent appearance in AI Overview–style results and citations in chat-based search for the topics it should have been competitive on. Site A’s presence was patchier and noisier. That’s suggestive, not conclusive. Some honest caveats:

      • Two sites is a tiny sample, and “similar authority” is doing a lot of work in that sentence.
      • We can’t cleanly attribute 499s on Site A to AI agents specifically. Some are clearly humans on flaky mobile networks.
      • AI search surfaces are themselves moving targets, so changes during the window could be product-side, not site-side.
      • We didn’t isolate variables tightly enough to claim causation. We made a basket of performance fixes on Site B, not just 499 mitigation.

      Independently, public commentary from practitioners like Michael King has pushed the broader idea that AI systems are sensitive to retrieval performance in ways traditional SEO didn’t emphasize. Our test isn’t a replication of anyone else’s work. It’s a small directional check on our own properties, consistent with that broader line of thinking.

      Net: we think there’s enough signal to make 499 patterns worth investigating, and not enough to claim a confirmed mechanism.

      Where 499 spikes tend to show up

      In our logs, the templates most prone to 499s are also the ones most likely to matter for retrieval:

      Heavy dynamic pages

      • Collection or hub pages assembled from multiple APIs.
      • Product comparison pages.
      • Filter- and facet-heavy URLs (tags, search endpoints).

      These usually combine slow backend queries, cache misses, and long TTFB.

      Overbuilt editorial templates

      • A dozen JavaScript bundles.
      • Third-party widgets and personalization calls.
      • Hero video, heavy fonts, client-side rendered content.

      Humans will often wait three to five seconds. Automated clients are less patient.

      “It’s fine on my laptop”

      • Edge misconfiguration.
      • Origin slow under concurrency.
      • Heavy TLS negotiation.
      • Backend queueing during peaks.

      Bursty access patterns (several quick hits across many properties) expose fragility that a single browser session won’t.

      Why we’re adding this lens to technical audits

      Traditional technical audits focus on crawlability, indexation, performance for users, and structured data. We’re adding a fifth lens, tentatively, around retrievability under real-time constraints. The reasoning:

      • Users are asking longer, more complex questions.
      • They expect synthesized answers, not just blue links.
      • At least some systems pull from multiple sources concurrently and live.
      • On those queries, “slow to respond” looks a lot like “not included.”

      Even if real-time retrieval turns out to be a smaller share of traffic than we think, the fixes for 499 patterns largely overlap with performance work you should be doing anyway. The downside of investigating is low.

      How to audit 499s on your own properties

      1) Confirm where 499 (or its equivalent) is logged

      On NGINX you’ll have access logs with status codes. On other stacks, find the equivalent client-disconnect signal. You’re looking for patterns across URLs, user agents, time of day, and upstream response time. A single 499 isn’t a problem. A sustained pattern on important templates is.
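
      As a starting point, a short script can pull 499s out of an access log and count them per path. This is a rough sketch that assumes NGINX's default "combined" log format; if your `log_format` differs, the pattern needs adjusting. The function name and the sample lines are made up for illustration.

```python
import re
from collections import Counter

# Matches NGINX's default "combined" log format. If your log_format differs,
# this pattern is an assumption you will need to adjust.
COMBINED = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<bytes>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def count_499_by_path(lines):
    """Return a Counter of request paths that were logged with status 499."""
    counts = Counter()
    for line in lines:
        match = COMBINED.match(line)
        if match and match.group("status") == "499":
            # Drop the query string so variants of one URL group together.
            counts[match.group("path").split("?", 1)[0]] += 1
    return counts


# Made-up sample lines, for illustration only.
SAMPLE = [
    '203.0.113.7 - - [10/Mar/2026:10:00:01 +0000] "GET /collections/robots?page=2 HTTP/1.1" 499 0 "-" "ExampleBot/1.0"',
    '203.0.113.8 - - [10/Mar/2026:10:00:02 +0000] "GET /blog/5g-sa-vs-nsa HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
    '203.0.113.9 - - [10/Mar/2026:10:00:03 +0000] "GET /collections/robots HTTP/1.1" 499 0 "-" "Mozilla/5.0"',
]

if __name__ == "__main__":
    for path, n in count_499_by_path(SAMPLE).most_common():
        print(n, path)
```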

      2) Segment by URL type

      Break the 499s down across blog posts, category/collection pages, search endpoints, API routes, auth/redirect chains, and CDN vs origin. Concentration on a few templates is good news, because you can fix it surgically.
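
      One way to do that segmentation is to map each path to a template name and compute the share of requests that ended in 499. The routing rules below are hypothetical; swap in patterns that match your own URL structure.

```python
import re
from collections import Counter

# Hypothetical URL-to-template rules; replace with your own site's routing.
TEMPLATE_RULES = [
    ("search", re.compile(r"^/search")),
    ("api", re.compile(r"^/api/")),
    ("collection", re.compile(r"^/(collections|category|tag)/")),
    ("blog", re.compile(r"^/blog/")),
]


def classify(path):
    """Return the first template whose pattern matches the path, else 'other'."""
    for name, pattern in TEMPLATE_RULES:
        if pattern.search(path):
            return name
    return "other"


def rate_499_by_template(records):
    """records: iterable of (path, status). Returns {template: share of requests logged as 499}."""
    total, cancelled = Counter(), Counter()
    for path, status in records:
        template = classify(path)
        total[template] += 1
        if status == 499:
            cancelled[template] += 1
    return {template: cancelled[template] / total[template] for template in total}


# Made-up records, for illustration only.
RECORDS = [
    ("/blog/a", 200),
    ("/blog/b", 200),
    ("/collections/x", 499),
    ("/collections/y", 200),
    ("/search?q=5g", 499),
]

if __name__ == "__main__":
    for template, rate in sorted(rate_499_by_template(RECORDS).items()):
        print(f"{template}: {rate:.0%}")
```

      Looking at the rate rather than the raw count keeps a high-traffic template from looking worse than it is.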

      3) Correlate with TTFB and upstream latency

      Sustained 499s usually travel with slow backend responses, slow database calls, origin overload, and cache misses. If your logs include $request_time and $upstream_response_time, you’ll often see high values for both alongside the 499. Translation: the client wasn’t willing to wait.
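
      To check that correlation, you can compare median timings for 499s against everything else. The sketch below assumes the combined log format with `rt=$request_time urt="$upstream_response_time"` appended to each line, which is our own convention for this example and not an NGINX default. NGINX can write `-` or several comma- or colon-separated values for the upstream time, so the parser handles those cases.

```python
import re
from statistics import median

# Assumes the combined format plus a custom tail:
#   rt=$request_time urt="$upstream_response_time"
LINE = re.compile(
    r'"[^"]*" (?P<status>\d{3}) \S+ "[^"]*" "[^"]*" '
    r'rt=(?P<rt>[\d.]+) urt="(?P<urt>[^"]*)"'
)


def parse_upstream(value):
    """Sum the upstream times in a field that may be '-', one number, or several."""
    parts = re.split(r"\s*[,:]\s*", value.strip())
    numbers = [float(p) for p in parts if p and p != "-"]
    return sum(numbers) if numbers else None


def timing_summary(lines):
    """Return {'499': (median rt, median urt), 'other': (...)} in seconds."""
    buckets = {"499": ([], []), "other": ([], [])}
    for line in lines:
        match = LINE.search(line)
        if not match:
            continue
        key = "499" if match.group("status") == "499" else "other"
        buckets[key][0].append(float(match.group("rt")))
        upstream = parse_upstream(match.group("urt"))
        if upstream is not None:
            buckets[key][1].append(upstream)
    return {
        key: (median(rt) if rt else None, median(urt) if urt else None)
        for key, (rt, urt) in buckets.items()
    }


def _line(status, rt, urt):
    """Build a made-up log line for the example below."""
    return (
        f'203.0.113.7 - - [10/Mar/2026:10:00:01 +0000] "GET /collections/robots HTTP/1.1" '
        f'{status} 0 "-" "ExampleBot/1.0" rt={rt} urt="{urt}"'
    )


SAMPLE = [
    _line(499, "8.000", "7.5"), _line(499, "9.000", "-"), _line(499, "10.000", "9.5"),
    _line(200, "0.250", "0.25"), _line(200, "0.500", "0.5"), _line(200, "0.750", "0.125, 0.125"),
]

if __name__ == "__main__":
    print(timing_summary(SAMPLE))
```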

      4) Look at who is canceling

      Not all 499s are the same. Look at known bots, proxy ranges, user agents associated with AI fetchers (these change frequently, so don’t overfit), and spikes that match known AI traffic patterns. Even when you can’t cleanly label an agent as “LLM,” a concentrated 499 pattern on retrieval-relevant pages is worth treating as a signal.
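
      A crude first pass at "who is canceling" is to group 499s by user agent. The substrings below come from user agents that vendors have publicly documented, but treat the list as an assumption: names change, and user-agent strings are trivially spoofed, so this is a triage aid rather than proof of identity.

```python
from collections import Counter

# Substrings from publicly documented AI-related user agents. This list is an
# assumption for illustration; vendors add and rename agents, so verify it.
AI_AGENT_HINTS = ("GPTBot", "ChatGPT-User", "OAI-SearchBot", "ClaudeBot", "PerplexityBot")


def label_agent(user_agent):
    """Bucket a user-agent string into a coarse group."""
    ua = user_agent.lower()
    if any(hint.lower() in ua for hint in AI_AGENT_HINTS):
        return "ai-fetcher"
    if "bot" in ua or "crawler" in ua or "spider" in ua:
        return "other-bot"
    return "browser-or-unknown"


def cancels_by_group(records):
    """records: iterable of (status, user_agent). Counts 499s per agent group."""
    return Counter(label_agent(ua) for status, ua in records if status == 499)


# Made-up records, for illustration only.
RECORDS = [
    (499, "Mozilla/5.0 (compatible; GPTBot/1.2; +https://openai.com/gptbot)"),
    (499, "Mozilla/5.0 (compatible; ExampleCrawler/2.0)"),
    (499, "Mozilla/5.0 (iPhone; CPU iPhone OS 17_0 like Mac OS X)"),
    (200, "Mozilla/5.0 (compatible; PerplexityBot/1.0)"),
    (499, "Mozilla/5.0 AppleWebKit/537.36 (compatible; ClaudeBot/1.0)"),
]

if __name__ == "__main__":
    for group, n in cancels_by_group(RECORDS).most_common():
        print(n, group)
```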

      5) Compare with 408 / 504 / 524

      • 499: client left early.
      • 408: server timed out waiting for client.
      • 504: gateway timeout (proxy didn’t get a response from upstream).
      • 524 (Cloudflare): origin took too long.

      499 clustered with 504/524 suggests a performance reliability problem, not just isolated disconnects.
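
      One way to look for that clustering is to bucket status codes into fixed time windows and flag the windows where 499s and gateway timeouts both show up. The five-minute window and the threshold of two are arbitrary choices for this sketch, and the function names are hypothetical.

```python
from collections import Counter, defaultdict
from datetime import datetime


def bucket_statuses(records, minutes=5):
    """records: iterable of (datetime, status). Returns {window_start: Counter of statuses}.

    `minutes` should divide 60 evenly so windows line up with the hour.
    """
    buckets = defaultdict(Counter)
    for ts, status in records:
        start = ts.replace(minute=ts.minute - ts.minute % minutes, second=0, microsecond=0)
        buckets[start][status] += 1
    return buckets


def overlapping_windows(buckets, threshold=2):
    """Windows where 499s and gateway timeouts (504 or 524) both reach the threshold."""
    return sorted(
        start
        for start, counts in buckets.items()
        if counts[499] >= threshold and counts[504] + counts[524] >= threshold
    )


# Made-up records, for illustration only.
RECORDS = [
    (datetime(2026, 3, 10, 10, 1), 499),
    (datetime(2026, 3, 10, 10, 2), 499),
    (datetime(2026, 3, 10, 10, 3), 504),
    (datetime(2026, 3, 10, 10, 4), 524),
    (datetime(2026, 3, 10, 10, 7), 499),
    (datetime(2026, 3, 10, 10, 8), 200),
]

if __name__ == "__main__":
    for start in overlapping_windows(bucket_statuses(RECORDS)):
        print(start.isoformat())
```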

      What to do about it

      Most of these are standard performance hygiene. We’re not claiming they’re uniquely “LLM-specific.” They’re things good technical SEOs already know to do. The argument is that real-time retrieval raises the cost of not doing them.

      A) Make important pages fast to start

      Retrieval agents need usable text quickly, not a fully hydrated app. Prioritize TTFB, server-side rendering or server-side content availability, and cached HTML for key pages. If your content only appears after client-side JS, you’re betting every agent will execute your app like a browser. That’s an uncomfortable bet.

      B) Cache smarter, closer to the edge

      • Cache HTML at the CDN/edge for content pages where possible.
      • Reduce cache fragmentation (avoid unnecessary query strings).
      • Pre-warm caches for high-value pages.

      Many fetchers may hit a URL once. If every hit is an origin miss, you pay the latency every time.
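
      Cache fragmentation is easiest to see with a normalized cache key. The sketch below strips parameters that do not change the page and sorts the rest, so URL variants collapse to a single key. The ignored-parameter list is a hypothetical example, and a real CDN would apply the same idea in its own configuration language rather than in Python.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical list of parameters that do not change the page content.
IGNORED_PARAMS = {
    "utm_source", "utm_medium", "utm_campaign", "utm_term", "utm_content",
    "gclid", "fbclid",
}


def cache_key(url):
    """Return a normalized URL: tracking params removed, the rest sorted, no fragment."""
    parts = urlsplit(url)
    kept = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in IGNORED_PARAMS
    )
    return urlunsplit((parts.scheme, parts.netloc.lower(), parts.path, urlencode(kept), ""))


if __name__ == "__main__":
    print(cache_key("https://Example.com/blog/post?utm_source=x&b=2&a=1#top"))
    print(cache_key("https://example.com/blog/post?utm_medium=email"))
```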

      C) Reduce backend complexity per request

      • Eliminate N+1 queries.
      • Move expensive personalization off the critical path.
      • Precompute popular pages.
      • Optimize database indexes.
      • Add application-level caching (Redis or similar).

      D) Reduce payload and blocking work

      • Enable compression (Brotli/Gzip).
      • Remove unnecessary third-party scripts from content templates.
      • Lazy-load non-critical components.
      • Trim heavy fonts and hero media.

      E) Tune timeouts intentionally

      You can’t force a client not to cancel, but you can avoid long hangs, fail fast when upstream is unhealthy, and avoid leaving requests queued until the client gives up. This is one of the places DevOps and SEO genuinely need to collaborate.

      F) Make content easy to extract once fetched

      This is the most genuinely LLM-flavored item on the list. Once an agent does fetch your page, give it the cleanest possible path to the answer:

      • Clear, descriptive headings.
      • Short answer blocks near the top of relevant sections.
      • Bullet lists where structure helps.
      • Definitions early in the document.
      • Appropriate schema markup.

      A quick self-check

      For your top retrieval-relevant pages, ask:

      • Can the main content be retrieved quickly without running a large JS bundle?
      • Is the first meaningful text visible early in the HTML?
      • Does the page depend on multiple upstream calls before content is available?
      • Do you see 499 (or equivalent) spikes during peaks?

      If you hesitate on any of these, an audit is probably worth your time.
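
      The first two questions can be roughly approximated with a script that counts how many words of visible text appear in the first few kilobytes of raw HTML, without running any JavaScript. This is a heuristic sketch using Python's standard-library parser. The 2048-byte cutoff is an arbitrary assumption, and the sample pages are made up.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collects text that appears outside <script>, <style>, and <title>."""

    SKIP = {"script", "style", "title"}

    def __init__(self):
        super().__init__()
        self.skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth and data.strip():
            self.chunks.append(data.strip())


def visible_text(html):
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def words_in_first_bytes(html, limit=2048):
    """Count visible words in the first `limit` bytes of the raw HTML."""
    head = html.encode("utf-8")[:limit].decode("utf-8", errors="ignore")
    return len(visible_text(head).split())


# Made-up pages: one server-rendered, one that only ships an empty app shell.
SSR_PAGE = (
    "<html><head><title>T</title><style>body{}</style></head><body>"
    "<h1>5G SA vs NSA</h1><p>Standalone 5G uses a 5G core.</p></body></html>"
)
CSR_PAGE = (
    '<html><head><script>var app = "x".repeat(10);</script></head><body>'
    '<div id="root"></div><script src="/bundle.js"></script></body></html>'
)

if __name__ == "__main__":
    print(words_in_first_bytes(SSR_PAGE), words_in_first_bytes(CSR_PAGE))
```

      A count near zero on an important page suggests the content only exists after client-side rendering, which is the bet described in section A above.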

      Closing: a hypothesis worth investigating

      We’re not making the strong claim that 499s are a confirmed LLM ranking signal, or that every canceled request is a lost answer. The systems on the other end are too opaque, and our own evidence is too small, to talk like that.

      What we are saying: real-time retrieval is part of how AI search works, we can’t assume an LLM will always answer from a pre-indexed copy of your page, and sustained 499 patterns on retrieval-relevant templates are a plausible visibility risk in that mode. The fixes are the same fixes that make your site better for everyone else, so the cost of investigating is low and the upside, if our reading is right, is real.

      Treat this as an experiment to run on your own properties, not a checklist item to check off. And if your testing contradicts ours, we’d like to hear about it.

    • Can AI Actually Write a Great Novel? Here’s What I Found Out.

      Months of trying to make a machine write something Tolstoy-tier. Here’s where it works, where it falls apart, and what surprised me.

      Okay, so. For the past few months I’ve been doing something kind of ridiculous. I’ve been trying to use AI to write fiction. Not a parody, not a clone. Real literature. The kind of book that gets read in two hundred years, and could hold its own against the likes of Dostoevsky, Shakespeare, Tolstoy, Faulkner, Hemingway, and so on.

      I know how that sounds. I’m not under the illusion that I cracked it. But the experiment was never really about producing the masterpiece on attempt one. It was about figuring out where exactly this stuff breaks. Where’s the seam? At what point does it stop being writing and start being a very smooth impression of writing?

      This is just me sharing notes from the trenches. No hot takes, no manifestos. Just stuff I’ve actually noticed, in the order I noticed it. If you’re playing with this too, I’d love to hear what you’re seeing.

      AI is a mirror, not a muse

      This is the biggest thing, and honestly the thing I tell every writer friend who asks. AI doesn’t write your story. It writes your story back at you, in your voice, but only if you give it enough of your voice to work with.

      First time I sat down with Claude Opus 4.7 and just asked it cold to write me an opening, the result was fine. Like, technically fine. Clean prose. It moved. It also belonged to nobody. It was the literary equivalent of stock photography. Looked good, said nothing, could’ve been written for anyone.

      Then I tried something different. I wrote the first three to four pages myself first, and only then handed it over. Completely different result. Suddenly the model was picking up my rhythm, my weird pacing habits, the way I lean on certain kinds of clauses. It started feeling less like a co-author and more like a really attentive friend who’d been reading my drafts for years and was trying to keep me sounding like me while I stepped away.

      Which reframes the whole thing for me. AI isn’t generating original literature. It’s amplifying voice you already have. If you don’t have a voice yet, it can’t help you find one. It’ll just give you back the average of every voice it has ever read, which is exactly what generic AI prose feels like.

      So if you’re trying to write fiction with AI, my advice is dead simple: write the opening yourself. Write enough that the model has something real to grip onto. Then iterate. And expect to push back. A lot. Five, six rounds on tone, on a phrase that feels off, on a beat that came out wrong. The voice doesn’t survive on autopilot. You’re constantly correcting drift.

      Where it absolutely shines: structural editing

      If voice is where the model has gotten weirdly good, structure is where it’s just flat-out useful. Right now. Today. This is the unglamorous strength nobody writes essays about, but it’s probably the most valuable thing this tech does for a working writer.

      Hand the model a messy draft. Tangled plot, two arcs that contradict each other, a subplot that goes nowhere, a scene you kept only because the dialogue was funny. It will find all of it in one read. It’ll tell you which threads are doing real work and which are decorative. It’ll suggest cuts that hurt to make and are basically always right. It catches the inconsistencies you’ve stopped seeing because you’ve read your own thing thirty times.

      It’s not creative work in the romantic sense. It’s editorial work. But editorial work is what turns a draft into a book, and most of us don’t have access to a sharp, patient editor who’ll read the whole thing in one go and tell us the truth. The model is that editor. Three a.m., never tired, never offended.

      One trap though. Don’t confuse “structurally cleaner” with “actually better.” The model can give you a more polished, more coherent version of your draft. That’s not always a better one. Some of the greatest novels in the canon are messy on purpose. Make sure you’re accepting cuts because they serve the story, not because the manuscript looks tidier afterward. Coherence isn’t the same as art.

      Where it still struggles: actual human emotion

      Now the harder part. The part I’m least optimistic about in the short term.

      AI doesn’t really get human emotion the way a great novelist does. The newer models are surprisingly good at the surface of feeling. They can write grief, jealousy, longing, shame. The prose looks right. What they struggle with, and what every revision pass exposes, is the gravitational pull between two people.

      Here’s the failure mode I keep hitting. The model writes two characters who are supposed to love each other, or be quietly destroying each other, and the words on the page are technically correct, but something is off. The dialogue is plausible. The interiority is plausible. The relationship isn’t. It feels like two people performing a relationship instead of being inside one. You can read it and tell.

      When I push back, it usually improves on the second or third pass, but the improvement has a particular flavor. It starts borrowing moves from existing literature. The held silence, the betraying gesture, the line of dialogue that says one thing and means three. Sophisticated recombination of techniques rather than a fresh act of feeling. Sometimes that’s enough. A lot of the time, the seam shows if you’re paying attention.

      The deeper version of this problem: AI is bad at subtext. It’s bad at what’s left unsaid. Half of what makes Tolstoy or Chekhov great lives in the gap between what a character says and what they mean. The silences. The misread look. The line that ends one beat too early. Models trained to be helpful and complete are basically biased against withholding. You have to fight them constantly to let a scene stay ambiguous, let a character stay unknowable, let a conversation just end without resolving.

      Same problem shows up with endings. The model wants to close loops. It wants to land beats. It wants every chapter to feel earned. But great literature regularly refuses to do that. The Brothers Karamazov doesn’t really tie itself off. Hamlet leaves you with a corpse and a bunch of unresolved interior. Left alone, AI ties the bow every single time. You have to keep telling it not to.

      The architecture problem

      Here’s a failure that only shows up in long work, and it took me a while to catch.

      The model can write a beautiful chapter. It can write a beautiful run of chapters. What it can’t reliably do, even with a ton of context, is the kind of architectural patience that defines a great novel. By that I mean the way Dostoevsky plants a tiny detail in chapter two that detonates in chapter forty. The way Tolstoy lets one image come back, transformed, three hundred pages later. The way you finish a novel and realize the whole thing was secretly about something the first chapter only hinted at.

      That kind of long-range intentionality requires holding the entire book in your head as one object. The model holds context, sometimes a lot of it, but it doesn’t seem to hold the work as an object the way a writer does. It writes locally. It nails the next great paragraph. It doesn’t seem to know which paragraph is doing load-bearing work for a payoff three hundred pages out.

      Until that changes, the architectural soul of a long novel has to come from a person. The model can help you execute. It can’t yet hold the whole thing.

      Voice vs. worldview (these are not the same thing)

      Quick distinction worth making, because people collapse these and they shouldn’t.

      Voice, at the sentence level, is the texture. Diction, syntax, rhythm, the small recurring habits. Models pick this up fast. Three to five pages and Claude is producing sentences that sound like mine.

      Worldview is something else. Worldview is the moral and philosophical lens that holds a body of work together. Tolstoy isn’t great because of his syntax. He’s great because of his lifelong obsession with moral awakening, with how to live, with the specific weight he gives to dying men and peasants and aristocrats. That worldview is inseparable from his actual life. His crises. His late-life religious turn.

      Models don’t have a worldview. They have a statistical absorption of every worldview they’ve ever read. Ask one to write inside a worldview and it performs one. Sometimes the performance is excellent. It’s rarely the kind of unified moral vision that animates a real book from inside, because that vision tends to come from somebody who actually lived a life and arrived at convictions about it.

      This isn’t obviously a problem you fix with bigger models or more data. It might be a different kind of limitation.

      The “necessity” problem

      There’s a thing critics call necessity. The feeling that a sentence had to exist exactly this way. That if you swapped it out, something would be lost.

      AI prose, even the good stuff, mostly doesn’t have it. It’s well-formed. It’s appropriate. It moves the plot. But you can usually picture ten other versions of the same paragraph that would do basically the same job. The sentence doesn’t feel inevitable. It feels picked from a menu.

      Great writers do something else. They write sentences that, once you’ve read them, you can’t unread. There’s almost a fingerprint at the level of word choice. AI tends to produce sentences that feel like a consensus of fingerprints. That gap, between inevitable and merely appropriate, is now the main thing I look for when I’m editing model output. I cut everything appropriate. I keep only what feels like it had to be there.

      It always sounds like 2026

      One more thing on style. AI defaults to contemporary literary fiction voice. Even when you ask it for nineteenth-century cadence or Elizabethan diction, the gravity pulls back toward something polished and present-day. You can win individual battles. You can feel it resisting the whole time. On its own, it doesn’t write like Melville or Faulkner or Woolf. It writes like a thoughtful contemporary writer doing an impression of them.

      This matters more than it sounds. Part of why the canon is the canon is that those books are stylistically irreducible to any other era. They couldn’t have been written at any other time, by any other person. AI tends to produce work that feels like right now, which is a strange thing to say about a model trained on centuries of literature, but that’s what I see.

      The friction question (this is the one that keeps me up)

      Last thing, and the one I genuinely don’t have an answer for.

      A lot of what we call great literature was made under absurd amounts of friction. Dostoevsky wrote in debt, in mourning, sometimes mid-seizure, sometimes for his life. Tolstoy revised War and Peace by hand for years. Shakespeare worked under commercial pressure inside an industry. The friction wasn’t incidental to the work. It shaped it.

      AI removes friction. That’s the whole point. Blank page is less scary. Fifth draft arrives faster. Structural problems get diagnosed in minutes instead of months.

      So here’s the question I keep circling: can art survive the removal of friction? Maybe friction was always romantic mythology and the work is what matters, full stop. Maybe shorter feedback loops just mean better art, faster, because we get more iterations per lifetime. That’s a totally defensible read.

      But maybe not. Maybe some of what makes great literature great is the trace of a person who paid for every paragraph in years of their life. Maybe readers can feel that cost without being able to name it. Maybe a frictionless work, however polished, lacks the specific gravity that comes from being wrung out of someone.

      I don’t know which is right. I suspect both are partly right. What I know is the question isn’t going away, and anyone using these tools seriously is going to have to come up with their own answer.

      Where this leaves us

      After all of it, here’s where I’ve landed for now.

      AI cannot, today, write a novel that belongs next to Tolstoy. Voice is shallow without a writer behind it. Emotional connection between characters is performed, not felt. Subtext gets suppressed by training that wants clarity. Long-range architecture is beyond the model’s grip. Necessity is missing. Worldview is borrowed. The era keeps leaking through.

      AI can, today, make a serious writer faster, sharper, less stuck, and more structurally rigorous than they would otherwise be. That’s not nothing. That’s actually a huge deal. It just turns out to be a different deal than producing literature that lasts.

      My current bet: the great novels of the next twenty years will still be written by people. People who use AI heavily, in ways earlier writers didn’t have access to, but the central acts, the voice, the worldview, the emotional truth, the architectural intention, will stay stubbornly human. If that ever stops being true, that’s going to be one of the more interesting boundaries this technology crosses. I’m watching for it. I’ll keep poking at it.

      Experiment continues. If you’re doing this too, message me. Genuinely curious what you’re seeing.

    • Goodbye, Top of the Funnel. It’s Not You, It’s AI. 

      AI search didn’t just change SEO. It quietly removed the economic reason most informational content existed in the first place — and that should worry the LLMs more than it worries anyone else.

      “There are some decades where nothing happens, and then there are weeks when decades happen.” — Lenin

      Lenin was talking about revolutions. I think about that quote every time I open ChatGPT or Claude now.

      For about twenty years, the open web ran on a deal that almost no one wrote down but everyone in marketing understood. A user types a question into Google. Google sends them to a page. The page answers the question, and on the way to the answer, the publisher gets a chance to show an ad, build brand affinity, capture an email, or pitch a product. Informational content was the entry ticket. Monetization came later in the funnel.

      That deal is breaking in front of us, and most companies are still pretending it isn’t.

      The trade that built the web

      If you’ve ever been told to write a blog post titled “What is SEO?” or “What is an LLM?” or “How are footballs made?”, you have lived inside the trade. The post itself rarely paid for itself directly. Nobody buys software off a definitional explainer. The post existed because it ranked, and ranking pulled strangers into a property where the company could finally introduce itself.

      The classic move was the listicle. “Best CRM software in 2024.” “Top 10 project management tools.” The company writing the post was, conveniently, also #1 on the list. It was a soft con, but a productive one: the user got a real comparison, the publisher got a real lead, the search engine got a real answer to index. Every party in the system had an incentive to keep producing more of it.

      Multiply that across every B2B SaaS company, every DTC brand, every media site, every affiliate. That is most of the written internet of the last two decades. Top-of-funnel content was never really about education. It was a customer acquisition channel that happened to be educational.

      What AI search actually broke

      ChatGPT, Claude, Perplexity, Gemini — they all do the same thing from the user’s perspective. They answer the question. Inline. With citations sometimes, often without. The user never lands on the page that wrote the answer.

      This sounds like a small UX change. It is not. It is a structural break in the economics of informational content.

      If a user can ask “what’s the difference between RAG and fine-tuning” and get a competent two-paragraph answer in the chat window, the marginal value of writing the 47th blog post titled “RAG vs. Fine-Tuning Explained” collapses. The traffic doesn’t arrive. The lead doesn’t get captured. The product never gets pitched. The post becomes a write-only exercise, useful as training data for the model, useless as a customer acquisition channel for the author.

      The incentive that funded the production of informational content for twenty years has been quietly removed. Not banned, not deprecated. Removed.

      The competition layer disappears too

      The thing people miss is that this isn’t just bad for individual publishers. It is bad for the long-run quality of the answers themselves.

      The reason there are good explainers on the web is that ten companies fought to write the best one. Competition for the click is what produced the second draft, the better example, the clearer diagram, the updated 2024 numbers. When the click goes away, so does the competition. So does the second draft.

      Informational content won’t die completely. Universities will still publish, hobbyists will still write, and a handful of companies with strong brand budgets will keep doing it for reasons unrelated to immediate ROI. But the volume and the rate of improvement both fall off a cliff. The web stops getting better at explaining itself.

      The existential problem this poses for the LLMs

      Here is the part that should be keeping the model labs up at night.

      LLMs are, in a real sense, trained on the output of the very system they are dismantling. The reason a chatbot can give you a competent answer about RAG, or footballs, or the Treaty of Westphalia, is that thousands of humans were paid, directly or indirectly, to write good explainers, because the click economy made it worth their while. Take away the click economy and you take away the production line.

      The first generation of frontier models was trained on a web that was being aggressively cultivated for human readers. The next generation will be trained on a web where the incentive to cultivate has been pulled. You can see where this goes. Models trained on models trained on models, with fewer and fewer fresh human explainers entering the corpus. The technical term is model collapse. The marketing term is “why do all the answers feel the same now.”

      The chatbots need the open web more than the open web needs the chatbots, and the open web is figuring that out slowly.

      What this means if you’re a marketer right now

      If you’re running content for a company in 2026, the honest answer is that a lot of what worked in 2022 doesn’t anymore, and pretending otherwise is expensive. A few things I’d push on:

      Stop measuring TOFU content by traffic alone. Generic explainers that used to pull search traffic will keep losing it. The pages that still earn their keep are the ones with a point of view, original data, or a perspective the model genuinely cannot synthesize from the rest of the web.

      Take brand authority seriously as a distribution strategy. When a user asks an LLM “what’s the best tool for X,” the model tends to surface the brands it has the strongest, most consistent signal about. Getting mentioned in the answer is the new ranking. That mention is earned the way brand has always been earned: PR, original research, podcast appearances, partnerships, and customers talking about you in places models read. The payoff is just much sharper than before.

      Write less, and write things only you can write. A weekly explainer of yesterday’s news loses to the chatbot every time. A field report from your customer base, a benchmark you ran, a contrarian take from someone with skin in the game: those still work, because they’re primary sources. The model needs them. So do humans.

      Treat being cited by LLMs as a real channel and instrument it. Track which models mention you, in what contexts, with what framing. This is the new SERP and almost no one is monitoring it yet.
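
      For a sense of what instrumenting that channel could look like, here is a minimal Python sketch. It assumes you have already saved model answers to a set of prompts you care about (how you collect them is up to you and is not shown). The function names, the sample answers, and the brand names "Acme" and "Globex" are all hypothetical placeholders, not a real dataset or tool.

```python
import re
from collections import Counter, defaultdict


def _brand_pattern(brand):
    # Whole-word, case-insensitive match so "Acme" does not match "Acmeology".
    return re.compile(r"(?<!\w)" + re.escape(brand) + r"(?!\w)", re.IGNORECASE)


def tally_mentions(answers, brands):
    """Count, per model, how many saved answers mention each brand.

    answers: iterable of (model_name, prompt, answer_text) tuples
    brands:  list of brand names to look for
    """
    patterns = {brand: _brand_pattern(brand) for brand in brands}
    counts = defaultdict(Counter)
    for model, _prompt, text in answers:
        for brand, pattern in patterns.items():
            if pattern.search(text):
                counts[model][brand] += 1
    return counts


def mention_context(text, brand, width=60):
    """Return the text surrounding the first mention of brand, or None."""
    match = _brand_pattern(brand).search(text)
    if match is None:
        return None
    start = max(0, match.start() - width)
    end = min(len(text), match.end() + width)
    return text[start:end].strip()


# Hypothetical saved answers: (model, prompt, answer text).
sample_answers = [
    ("model-a", "best tool for X", "Popular options include Acme and Globex."),
    ("model-a", "tools for Y", "Many teams use Globex for this."),
    ("model-b", "best tool for X", "Acme is a common choice."),
]

counts = tally_mentions(sample_answers, ["Acme", "Globex"])
for model, brand_counts in counts.items():
    print(model, dict(brand_counts))
```

      Counting mentions only covers the "which models" part. The `mention_context` helper pulls the surrounding text so a human can read the framing, which is the part a simple tally cannot judge.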

      The honest summary

      Top-of-funnel content didn’t die because AI is better at writing it. It died because the click, the thing that paid for it, stopped arriving. The chatbots ate the entry ticket and left the rest of the funnel intact, which is the part nobody seems to be saying clearly.

      It’s a strange moment. The companies that own the chatbots are also the ones quietly eroding the supply of training data they depend on. The companies that produce the training data are watching their distribution model evaporate. The users are getting faster answers and a thinner web. Nobody is obviously winning except, in the short term, the inference layer.

      Lenin’s line keeps coming back to me because this really is one of those weeks-where-decades-happen moments. The mechanics of how content gets produced, distributed, and rewarded on the internet are being rewritten in real time, and the playbooks built for the previous regime aren’t going to survive contact with this one.

      If you’re trying to figure out what your content and brand strategy should look like in the AI-search era — what to keep doing, what to kill, and how to actually get cited by the models that are now sitting between you and your customers — that’s the question we work on at Zeta Theorem. I’d be happy to talk.