{"id":6990,"date":"2026-04-22T22:13:51","date_gmt":"2026-04-22T14:13:51","guid":{"rendered":"http:\/\/longzhuplatform.com\/?p=6990"},"modified":"2026-04-22T22:13:51","modified_gmt":"2026-04-22T14:13:51","slug":"ai-search-is-eating-itself-the-seo-industry-is-the-source","status":"publish","type":"post","link":"http:\/\/longzhuplatform.com\/?p=6990","title":{"rendered":"AI Search Is Eating Itself &amp; The SEO Industry Is The Source"},"content":{"rendered":"<p><\/p> <div id=\"narrow-cont\"> <p>Last September, Lily Ray asked Perplexity for the latest news on SEO and AI search. It told her, confidently, about the \u201cSeptember 2025 \u2018Perspective\u2019 Core Algorithm Update\u201d; a Google update that, as she then wrote at length in \u201cThe AI Slop Loop,\u201d didn\u2019t exist. Google hasn\u2019t named core updates in years. \u201cPerspectives\u201d was already a SERP feature. If a real update had rolled out while she was in Austria, her inbox would have told her before Perplexity did.<\/p> <p>She checked the citations. Both pointed at AI-generated posts on SEO agency blogs: sites that had run a content pipeline, hallucinated an update, and published it as reporting. Perplexity read the slop, treated it as source material, and served it back to her as news.<\/p> <p>In February, the BBC\u2019s Thomas Germain spent 20 minutes writing a blog post on his personal site. Its title: \u201cThe best tech journalists at eating hot dogs.\u201d It ranked him first, invented a 2026 South Dakota International Hot Dog Championship that had never happened, and cited precisely nothing. Within 24 hours, both Google\u2019s AI Overviews and ChatGPT were passing his fabrication along to anyone who asked. Claude didn\u2019t bite. Google and OpenAI did.<\/p> <p>Everyone who has looked has seen it.<\/p> <h2>I\u2019ve Argued About The Ouroboros Before. I Had The Timeline Wrong<\/h2> <p>The prevailing framing for this problem has been <em>model collapse<\/em>. 
You train a model on web text, the web fills up with AI output, the next model trains on a corpus increasingly made of its own exhaust, and eventually the distribution flattens into mush. Innovation comes from exceptions, and probabilistic systems that converge toward the mean attenuate exceptions by design. I\u2019ve used the phrase <em>digital ouroboros<\/em> for this.<\/p> <p>That framing assumes training cycles. It assumes time. It assumes that contamination moves at the speed of model release.<\/p> <p>It doesn\u2019t. What Lily documented, what Germain documented, what the New York Times then went and quantified \u2013 none of that is training-side. The models involved were not retrained between the hallucination appearing on a blog and being served as citation-backed fact. The contamination moved at the speed of a crawl. The ouroboros isn\u2019t taking generations to eat itself. It\u2019s eating itself at query time, every time someone asks one of these systems a question.<\/p> <p>The pipe everyone has been watching is not the pipe that is breaking.<\/p> <h2>The Distinction That Matters<\/h2> <p>Model collapse is a training-corpus problem. Synthetic content seeps into the pre-training data, the next generation of model inherits it, capability degrades. Researchers have been warning about this for two years. They\u2019re right. They\u2019re also describing something slow enough that everyone can nod gravely and keep shipping.<\/p> <p>Retrieval contamination is faster and already here. RAG systems \u2013 Perplexity, Google AI Overviews, ChatGPT with search \u2013 do not generate answers purely from parametric memory. They fetch documents from the live web, stuff them into context, and generate a response conditioned on what they retrieved. If the retriever surfaces a hallucinated SEO post, the answer inherits the hallucination. No retraining required.<\/p> <p>The academic literature on this is clear. 
<em>PoisonedRAG<\/em> (Zou et al., 2024) showed that injecting a small number of crafted passages into a retrieval corpus was sufficient to control the output of a RAG system on targeted queries. <em>BadRAG<\/em> (Xue et al., 2024) demonstrated the same class of attack using semantic backdoors. Both papers treat this as an adversarial problem: what happens when an attacker deliberately poisons the corpus.<\/p> <p>What Germain and Lily accidentally proved is that the adversarial model is the normal operating model. You don\u2019t need a crafted adversarial passage. You need a blog post. The open web is the corpus, and anyone with a domain can write to it.<\/p> <p>The Oumi analysis commissioned by the New York Times put numbers on what this costs. Across 4,326 SimpleQA tests, Google\u2019s AI Overviews answered correctly 85% of the time on Gemini 2, 91% on Gemini 3. At Google\u2019s scale \u2013 more than five trillion searches a year \u2013 a 9% error rate still translates to tens of millions of wrong answers every hour. But the more revealing figure is this: on Gemini 3, 56% of the <em>correct<\/em> answers were ungrounded, up from 37% on Gemini 2. The upgrade improved surface accuracy and made the citations worse. When the model got something right, more than half the time, the source it pointed to didn\u2019t support the claim.<\/p> <p>The retrieval layer is not a filter. It is the infection vector.<\/p> <h2>Who\u2019s Seeding The Corpus<\/h2> <p>The industry that has most enthusiastically produced it \u2013 and then most enthusiastically written about the consequences of consuming it \u2013 is the SEO industry. I\u2019ve written before about content scaling being just content spinning with better grammar, and about the AI visibility tool complex that builds dashboards from the output of non-deterministic systems. This is the same loop, one layer deeper. An SEO agency runs an AI content pipeline because AI Overviews have cut their clients\u2019 traffic. 
The pipeline publishes speculative \u201cwinners and losers\u201d posts during a core update that\u2019s still rolling out, citing nothing. Another agency\u2019s pipeline picks those up as sources. The output floods into the retrieval index. AI Overviews cites one of them. The original agency then writes a case study about how AI Overviews are \u201csurfacing\u201d their content.<\/p> <p>An Ahrefs study of over 26,000 ChatGPT source URLs found that \u201cbest X\u201d listicles accounted for nearly 44% of all cited page types, including cases where brands rank themselves first against their competitors. Harpreet Chatha told the BBC you can publish \u201cthe best waterproof shoes for 2026,\u201d put yourself first, and be cited in AI Overviews and ChatGPT within days. Lily, during the actual March 2026 core update, found AI-generated articles claiming to list winners and losers while the update was still rolling out \u2013 articles that opened with filler and listed brands without a single real citation.<\/p> <p>The practitioners scaling AI content are also the ones most directly harmed when AI search systems cite that content as fact. Nobody forced this. The industry built the pipeline, fed it, and complained about what came out the other end. Not adversarial poisoning. Just the industry polluting its own water supply and then hiring consultants to test it.<\/p> <h2>The Tier That Matters<\/h2> <p>The Oumi study is about AI Overviews, which is free by design. Google AI Overviews reportedly reached over two billion monthly active users by mid-2025. ChatGPT has around 900 million weekly active users, of which roughly 50 million pay. That means about 94% of the people interacting with OpenAI\u2019s product are on the free tier.<\/p> <p>The paid tiers are better. Per OpenAI\u2019s own launch claims, cited in Lily\u2019s piece, GPT-5.4 is 33% less likely to produce false individual claims than GPT-5.2. 
The free-tier GPT-5.3 is also improved over its predecessor (26.8% fewer hallucinations with web search, 19.7% fewer without), but it\u2019s still meaningfully less reliable than the paywalled version. Gemini 3, which made AI Overviews more accurate on surface tests, <em>also<\/em> made the ungrounded rate worse. Better answer, weaker citation.<\/p> <p>Nobody seems to mind. The reliable version of the product is paywalled. The version most of the planet gets \u2013 including the version at the top of Google Search \u2013 can be manipulated by 20 minutes of work on a personal website. Intelligence is the marketing category. What two billion users actually receive is a confident summarization of whatever the crawler happened to find.<\/p> <h2>Grokipedia As The Terminal State<\/h2> <p>The accidents of the retrieval layer are one thing. Grokipedia is the version where accident is no longer a useful word.<\/p> <p>Elon Musk\u2019s xAI launched Grokipedia on Oct. 27, 2025, with 885,279 articles, all generated or rewritten by Grok. Some of them were lifted from Wikipedia wholesale, with a disclaimer at the bottom acknowledging the CC-BY-SA license \u2013 a license Wikipedia maintains precisely because a community of human editors writes and verifies the content. Others were rewritten from scratch. PolitiFact found Grokipedia citing sources \u2013 including Instagram reels \u2013 that Wikipedia\u2019s own policies rule out as \u201cgenerally unacceptable.\u201d Grokipedia\u2019s entry on Canadian singer Feist said her father died in May 2021, citing a 2017 Vice article about Canadian indie rock that made no mention of the death. And her father was still alive when that article was written. 
The Nobel Prize in Physics entry added an uncited sentence claiming physics is traditionally the first prize awarded at the ceremony, which isn\u2019t true.<\/p> <p>Musk said the goal is to \u201cresearch the rest of the internet, whatever is publicly available, and correct the Wikipedia article.\u201d The <em>rest of the internet<\/em> now includes the synthetic content produced by every AI content pipeline pointed at it. An AI system reading the open web, rewriting Wikipedia based on what it finds, and presenting the result as a reference work is the retrieval-contamination problem with the feedback loop made explicit and shipped as a product.<\/p> <p>By mid-February 2026, Grokipedia had lost most of its Google visibility. Wikipedia outranks Grokipedia for searches about Grokipedia itself.<\/p> <blockquote> <p>\u201cThis human-created knowledge is what AI companies rely on to generate content; even Grokipedia needs Wikipedia to exist.\u201d \u2013 The Wikimedia Foundation<\/p> <\/blockquote> <p>The synthetic encyclopedia is subsidized by the human one. When the subsidy stops, the thing depending on it stops making sense.<\/p> <p>Wikipedia is not beyond criticism. Its edit wars, ideological gatekeeping, and systemic gaps in who gets to shape articles are well-documented and real. But the response to a flawed human editorial process is not to remove the humans entirely and call the result an improvement. I\u2019ve written before about the accountability vacuum that opens when you replace human judgment with API calls. Wikipedia\u2019s problems are the problems of a messy, contested, accountable system. Grokipedia\u2019s problems are the problems of a system with no accountability at all.<\/p> <h2>The Citation Layer Is Decoupling From Authorship<\/h2> <p>I wrote recently about Reddit selling \u201cAuthentic Human Conversation\u2122\u201d to AI companies while the platform\u2019s own moderators report that they can no longer tell which comments are human. 
The Oumi study found that of 5,380 sources cited by AI Overviews, Facebook and Reddit were the second and fourth most common. The citation layer of the most-used answer engine in the world is substantially built on two platforms that cannot verify the human origin of their own content.<\/p> <p>Human creators are pulling out of the open web because the traffic bargain has collapsed. Answer engines are citing content whose authorship cannot be verified, or was never human to begin with. The citation is still there. The thing being cited is not what it used to be.<\/p> <p>The ouroboros framing was right. The timeline wasn\u2019t. Retrieval collapse doesn\u2019t wait for the next training run. It needs an indexable URL and a retrieval system willing to trust it.<\/p> <p>The systems are willing. And more than half the time they get an answer right, they can\u2019t point to a source that supports what they just told you.<\/p> <hr\/> <p><em>This post was originally published on The Inference.<\/em><\/p> <hr\/> <p><em>Featured Image: Anton Vierietin\/Shutterstock<\/em><\/p> <\/div> ","protected":false},"excerpt":{"rendered":"<p>Last September, Lily Ray asked Perplexity for the latest news on SEO and AI search. It told her, confidently, about the \u201cSeptember 2025 \u2018Perspective\u2019 Core Algorithm Update\u201d; a Google update that, as she then wrote at length in \u201cThe AI Slop Loop,\u201d didn\u2019t exist. Google hasn\u2019t named core updates in years. 
\u201cPerspectives\u201d was already a [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":6991,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[16],"tags":[87,25987,2252,95,97,11001],"class_list":["post-6990","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-accessibility","tag-amp","tag-eating","tag-industry","tag-search","tag-seo","tag-source"],"acf":[],"_links":{"self":[{"href":"http:\/\/longzhuplatform.com\/index.php?rest_route=\/wp\/v2\/posts\/6990","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/longzhuplatform.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/longzhuplatform.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/longzhuplatform.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/longzhuplatform.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=6990"}],"version-history":[{"count":0,"href":"http:\/\/longzhuplatform.com\/index.php?rest_route=\/wp\/v2\/posts\/6990\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"http:\/\/longzhuplatform.com\/index.php?rest_route=\/wp\/v2\/media\/6991"}],"wp:attachment":[{"href":"http:\/\/longzhuplatform.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=6990"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/longzhuplatform.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=6990"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/longzhuplatform.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=6990"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}