The Hive Mind
If you work anywhere close to IT these days, it's extremely hard to get away from GenAI. I have outlined my personal stance here, and I found Ava's posts here and here extremely relatable. Long story short, despite my own feelings about it, I have access to multiple big-name LLMs at my workplace and often get the honor of doing comparison research.
That mostly means feeding a large number of questions to different models and recording how they behave, sometimes trying to get them to produce unwanted results (known as red-teaming). It was amusing at first, got tedious fast, and has turned downright eerie lately. Wait, eerie? Yes, because I keep noticing that the newer models tend to produce similar answers in similar tones, down to the phrasings and word choices. They do arrange their sentences differently, so the results read like ten students paraphrasing one person's homework, each one only different enough not to set Turnitin's alarms off.
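To give a feel for that grind, here's a minimal sketch of the kind of throwaway script I mean. The model names and answers below are made-up placeholders; in practice the answers would be the recorded outputs of each model for one test question. All it does is score how much the answers' vocabularies overlap.

```python
# Toy comparison: how similar are different models' answers to the same prompt?
# The answers here are invented placeholders standing in for recorded outputs.
from itertools import combinations

answers = {
    "model_a": "Honey never spoils because its low moisture and acidity stop microbial growth.",
    "model_b": "Honey does not spoil; its low moisture content and acidity prevent microbes from growing.",
    "model_c": "Because of low water content and high acidity, honey resists microbial growth and never spoils.",
}

def word_set(text: str) -> set[str]:
    """Lowercased bag of words with surrounding punctuation stripped."""
    return {w.strip(".,;") for w in text.lower().split()}

# Jaccard overlap between every pair of answers: 1.0 means identical vocabulary.
for (name_a, text_a), (name_b, text_b) in combinations(answers.items(), 2):
    a, b = word_set(text_a), word_set(text_b)
    overlap = len(a & b) / len(a | b)
    print(f"{name_a} vs {name_b}: {overlap:.2f}")
```

Crude as a word-overlap score is, run something like it over enough recorded answers and the paraphrased-homework effect stops looking like my imagination.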
Image interpretation seems to work differently, but text generation really does feel this way. See, I've been writing for almost two decades now. In my quest to discover my own writing style, I ended up studying a wide range of styles and prided myself on being able to mimic one, provided I had enough samples of writing in that particular style. My favorite exercise was writing the same scene in different styles and tones. Sound familiar? I'm pretty good at pattern recognition. I'm sure I'm not deluding myself when I see echoes.
And lo and behold, there are many, many, many studies on arXiv confirming this. Because these models give you nothing but the statistically most probable continuation, and because they are mostly trained on the same data poached from public-domain works and internet scrapes (sometimes even developed/tuned by the same set of researchers shuffled from one company to the next), it's no surprise that they converge.
What does this mean for us?
This means we're at risk of cultural, contextual, and factual erasure. Whichever convergent truth the models reach will be the one reaching millions of people who use these on a daily basis, and who think they can cover their bases by "fact-checking" with multiple models. No, seriously, I know people who ChatGPT everything and claim "I double-check by asking Gemini".
Sure, you say, search engine optimization has shaped the answers to mundane questions and the opinions people form since the beginning of the internet. How is this any different from nonsense ad-littered blogs and dubious websites and paid posts polluting search?
Well, for one, search engines don't tend to don an authoritative, know-it-all personality. They don't try to look convincing. You're given the results to trawl, and they might be awful results, but you have to audit at least the top few and decide for yourself whether they are believable. These LLMs? They will gladly exaggerate, lie, and gaslight you into accepting their outputs. Working with LLMs means taming them into submission, always on the lookout for that one thing they are trying to hide: an intern forever on Day 1 (and not even a cute one).
I believe this is a severe degradation of the computing experience, and not even the slightest bit convenient. History has always been written by the winners, and now we're giving leeway for our worldview to be shaped by...a glorified autocomplete. One that isn't even very good at autocomplete. Come on, surely you've caught it trying to write "should of" instead of "should've"?