Do I use LLMs for writing?
Effectively, no. I don't.
"Oh, but, why on earth not?!" The collective hype train asks. "Writing is what AI is good at, maybe even the best! It can generate a nearly infinite amount of coherent text in an instant! The writing is super high quality! It does your research for you!" etc., etc...
Here's the thing: it does do all that reasonably well. Sometimes extraordinarily well. LLMs can be, and are, generally useful (less so than CEOs are spending millions to make us think, though). Gemini is becoming the best search engine ever created, and Cursor is a total godsend when it comes to scientific exploratory coding. But for the process of solidifying my thoughts into the idea that I want to convey, through writing? Hmm...
I genuinely have tried to make it useful, I promise. I thought using LLMs would help me increase output and improve quality, all while (and this is the most important part) maintaining my original voice. I'm not going to write or create something if it doesn't feel like I created it, and, well, it still doesn't. It can get close in some cases, but generally it still misses the mark. Beyond this, hallucinations are still a thing, which I both understand and am perplexed is still a problem, but we'll get to that.
There are places where I do still use LLM-based products, and yes, they are transformative in their own right. But for an application that LLMs should excel at (if the medium of interaction is any indicator), they feel net annoying and not that helpful. Here's why.
What I tried: The good ol' college effort
Getting the obvious attempts out of the way first: yes, I gave it examples of my own writing (finished essays) for a "few-shot" approach, which is pretty standard. I use both Gemini (the product) and gemini-cli for their respective strengths. I've used a few different models (some through Perplexity), though these days they blend together: I can still tell the difference between them, but there's no real functional change. I've created extensive markdown files as preprompts. I've provided it with very complicated and detailed instructions about how we should work together. I've given it simple instructions for how we should work together.
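If you're curious what that looks like in practice, here's a rough sketch of the kind of few-shot preprompt I'd assemble, written in Python purely for illustration. The folder layout, file names, and style notes are hypothetical stand-ins, not my exact setup.

```python
# A minimal sketch of the few-shot "preprompt" idea: style notes plus
# finished essays, concatenated into one context file. Everything here
# (paths, notes, output name) is a hypothetical example.
from pathlib import Path

STYLE_NOTES = """\
You are helping me edit my own essays.
- Preserve my voice: conversational, first person, a little wry.
- Suggest edits; do not rewrite whole paragraphs.
- Never add facts, quotes, or references I did not supply.
"""

def build_preprompt(example_dir: str, out_file: str = "preprompt.md") -> str:
    """Concatenate style notes with finished essays as few-shot examples."""
    parts = [STYLE_NOTES, "\n## Examples of my finished writing\n"]
    for essay in sorted(Path(example_dir).glob("*.md")):
        parts.append(f"\n### {essay.stem}\n")
        parts.append(essay.read_text(encoding="utf-8"))
    preprompt = "\n".join(parts)
    Path(out_file).write_text(preprompt, encoding="utf-8")
    return preprompt

if __name__ == "__main__":
    # Point this at a folder of finished essays, then paste or load the
    # result as the context/system prompt in whatever LLM product you use.
    build_preprompt("essays/")
```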
Beyond this I have tried, and still do to a certain degree, to use LLM products as research assistants, which, it turns out, is quite a bit more frustrating than you might think. Additionally, I've used the technology as a sounding board, which is useful less in an "educated-friend-whose-opinion-you-respect" sort of way, and more in an "ask-your-weird-cousin-about-something-controversial-for-posterity's-sake" sort of way. Honestly, the most useful application for an LLM by far is as an interactive thesaurus, but more on that in a bit.
This technology is useful, just not that useful, thanks to (you guessed it) hallucinations and the strawberry Jell-o flavoring it puts into everything. Ugh.
Hallucination as canary
The hallucination. I'm not sure if another word has been co-opted so quickly, so diversely, so universally across the entire world as this one. The problem hallucinations present is so pervasive that at the bottom of every response produced by one of Google's LLMs, Google includes a warning stating that "AI can make mistakes".
Among the veritable plethora of legitimate concerns around AI, the hallucination holds a special place in my heart as the most obnoxious modern endemic technological failure.
"But it's a new technology, we're all just working out the bugs!" the collective hype train consoles the general public.
Yes, and: the hallucination, until proven otherwise, is a clear symptom of a fundamentally limited underlying technology. In spite of this, the technology's implementers routinely apply it at or beyond that fundamental limit. The limit? These models cannot know whether they are right or wrong, because they don't know anything. Transformer models predict text.
A tale of two case studies
In researching a recent essay, I was using an LLM as a search engine, shielding me from having to interact with the broader internet (my favorite application). I knew that the author Adam Grant had written some pithy statement about using AI for writing, and I wanted to find his quote. I pretty quickly found one that now sits at the top of the Lowpass main page on my site, but I thought I remembered a different one that was punchier, so I used Gemini to search more "thoroughly". Forty-five minutes later, both Gemini and I are neck-deep in a heated argument about whether this "punchier" quote exists. The catch? I'm the one convinced it doesn't exist, while Gemini has clearly chosen this hill to die on.
Over the course of this debate it hallucinated dozens of convincing articles, titles, and full reference links mentioning the punchier quote. Not only that, but as I called it out for being wrong (very straightforwardly), it began to push conspiracy theories: somehow this quote had been scrubbed from the internet all at once, from dozens of sites, and replaced with the currently existing, less punchy version.
Now, I did recognize the hallucination early on in the conversation, but not as soon as I would have liked. The problem is that the LLM's confidence is not tied to any observed fact, because it doesn't have legitimate reasoning capabilities. There are no abstract concepts of object, objective, function, or purpose.
The problem?
LLMs cannot know when they are wrong.
LLMs, when wrong, will lean into their deceptive conclusion full bore.
A similar event happened while looking for a quote from a MythBusters episode I remember watching when I was younger. The LLM very convincingly made up all sorts of junk to tell me that a specific version of a quote existed. It devised entire fake chapters of a real book, one that thankfully I actually had on hand to confirm its deception. It referenced real podcasts, with real links to the real transcripts, as the source of the quote, but checking those transcripts showed no such thing.
All my experience in religious apologetics has taught me that just because someone provides a reference, it doesn't mean they are using it correctly. Many times references are used to convey confidence, but are ultimately deceptive and don't support the claim made in the document. As such, I checked these references all the way down to the claimed quote, which clearly didn't exist. Now, I have the habit of thoroughly checking references, but how many typical users do that? Some, but probably not a high percentage.
The result? I'm quickly finding myself using LLMs less and less for research. Each heated argument over an error significantly decreases my confidence in the product. Yes, these are a couple of related case studies, but mine are far from the only ones. Hallucinations occur in more insidious ways, too, like the many reported instances of LLMs claiming extraordinary conspiracy theories or pushing users toward psychosis. Relevantly, a group of researchers recently described their experience trying to get LLMs to be research assistants in well-known and highly productive scientific fields:
"In our evaluation, no [science-focused LLM product] completed a full research cycle from literature understanding through computational execution to validated results and scientific paper writing. While all systems showed competence in conceptual tasks such as planning and summarization, they consistently failed at robust implementation. Every framework produced sophisticated hallucinations. Deployment proved demanding, requiring substantial debugging and technical expertise, which undermines common claims about the democratization of science with AI." - Can AI Conduct Autonomous Scientific Research? Case Studies on Two Real-World Tasks
A search engine that occasionally, deftly, and aggressively promotes insanity has a relatively high risk associated with it. The FDA would not approve. Originally, search engines were a complicated search mechanism for the internet. Now, with an LLM tacked onto the side, it's an even more effective search mechanism, filtered through an occasional pathological liar.
In sum, it wastes just about as much of my time as it saves, and lies to me in the process.
It's strawberry Jell-o, *shudders*, everywhere
Ah, but there's more! The second, less psychosis-inducing issue I have with LLMs involves the claim that they can mimic our writing style. Yes, they can get close in some cases, especially when given examples of the writing style in question. But as close as the hype claims? Not in my experience.
Another recent study worked to quantify how well LLMs can mimic a user's voice and writing style:
"Results show that while LLMs can partially emulate user style in more structured formats like news and email, they struggle with nuanced, informal expression in domains such as blogs and forums. Generated outputs often default to an average, generic tone and remain readily detectable as AI-written. Moreover, increasing the number of demonstrations offers limited gains in stylistic alignment. These findings underscore the current limitations of in-context personalization and highlight the need for more effective methods to achieve truly personalized generation, an essential step toward democratizing LLM-based writing tools." - Catch Me If You Can? Not Yet: LLMs Still Struggle to Imitate the Implicit Writing Styles of Everyday Authors (emphasis mine)
I worked with LLM-based tools for some time when refining my essays. Since a lot of the ideas for articles I put on this website have already been fleshed out, I thought, "this will be perfect for an LLM! They'll help me quickly get this ready to share publicly." And they do produce coherent, reasonably faithful text, but it all sounds the same. The real kicker is that it doesn't sound like me, even when I provide example writing.
It's like hiring a professional cook to improve your home meals. At first, they come in and quickly produce some phenomenal food; like magic! They are so good at their job. Then, after some time of eating their meals, you become a little suspicious about their work. You can sense a common theme among them, but can't quite put your finger on it. How can a pot roast taste the same as an alfredo pasta? They're clearly different foods, and yet...
Watching over their shoulder, you see them adding a powder to the shrimp pad thai they're making. Their response?
"Oh, it's strawberry Jell-o! I think it adds a nice, tangy sweetness. I put it in everything!"
Yuck.
And they have been adding it to everything, even your grandma's heirloom pecan pie! The horror.
When I first wrote with AI, having it make subtle suggestions, maybe adding a sentence or two here or there, it seemed decently good at its job. However, when I returned to the work after some time (it's good to leave and come back with fresh eyes), I could easily tell which spots the AI had touched. I really didn't like that it had added strawberry Jell-o flavoring to essays I had already nearly finished.
How do I use LLMs for writing now?
LLM-based products make a phenomenal interactive thesaurus. I love this functionality. I'll often want to find a term or word for a feeling I have about something, but can't quite put my finger on it. This is a great use case for the LLM, because I can tell it about the idea, describe what it means or feels like, then converse with it to nail down the right term. Often the LLM doesn't find the right word or phrasing itself, but it surfaces enough options to let me solve the problem on my own. Other times I can say, "that's good, but it needs to sound more urgent, give me 10 more options that lean urgent." This is a very useful application. Is an interactive thesaurus worth $100 billion in investment? Probably not, but hey, as long as it exists I'll use it.
Yes, LLMs are a transformative technology, but the slop factor is real, and I'm not a huge fan of refining my essay ideas with an occasional psychosis-inducing pathological liar. I still use LLMs every day for coding and scientific development, but I can do without the strawberry Jell-o, thanks.

