Last updated on 28/06/2025
Let me plant a flag in the sand here. Nearly every product that claims it can automate research is fundamentally wrongheaded. These products are almost universally focused on the output of research. The documents. But the documents that come out of research aren’t the value. The value is in building expertise in the researcher.
If you’re not a researcher, this piece is here to challenge the idea that automating research pipelines actually improves outcomes. Let me show you why research is how we build the muscle memory of knowing how to think.
And if you’re a researcher? Well, all I know is that first, you’ve got to get mad. You’ve gotta say, “I’m a human being, goddammit! My mind has value!”
Every Day is Brain Day
Expertise isn’t a download. It’s not a neatly zipped file you can hand to someone else, or to an LLM, and expect it to work the same way. Expertise is built through repeated, direct engagement.
Over the last decade, I’ve spent an enormous amount of time building taxonomies for how I think about optimization. Which methods work well when actions are deterministic or stochastic. Where state and operation decouple. What patterns tend to fail, and why. None of that came from a summary. It came from grappling with hundreds of papers, seeing conflicting claims, trying ideas that fell apart in my hands. Dozens and dozens of “You know, that doesn’t quite apply here”s and “Well shit, that didn’t work”s.
I’m currently working on a white paper about testing LLM-backed products. There are lots of thoughts out there in academia and industry about how you test, what you test, and when you test. But none of them are an exact fit for every regulated industry I’ll work in or every product I’ll help build.
I can’t just find the right article and accept it as the correct way to proceed. I have to read other people’s conclusions critically. That lets me build my own framework: What kinds of interactions exist? How do they impact the user’s perception of reliability? What elements are objectively testable, and what requires subjectivity? Once I’ve identified the difference, what do I do about that?
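To make that split concrete, here’s a minimal sketch in Python of what I mean by separating the objectively testable from the subjective. Every check, name, and threshold here is hypothetical, an illustration of the idea, not a real test suite:

```python
import re

# Hypothetical sketch: splitting LLM-product checks into what's
# objectively testable and what genuinely requires human judgment.
# All checks, names, and thresholds here are illustrative.

def objective_checks(response: str) -> dict[str, bool]:
    """Deterministic assertions: cheap, repeatable, safe to automate."""
    return {
        "nonempty": bool(response.strip()),
        "under_length_limit": len(response) <= 2000,
        # e.g., no SSN-shaped strings leaking into the output
        "no_ssn_pattern": re.search(r"\b\d{3}-\d{2}-\d{4}\b", response) is None,
    }

def subjective_review_queue(response: str, rubric: list[str]) -> list[str]:
    """Subjective qualities can't be asserted mechanically; they get
    routed to a human reviewer instead of a pass/fail gate."""
    return [f"NEEDS HUMAN REVIEW: {criterion}" for criterion in rubric]

if __name__ == "__main__":
    response = "Here's a plan that may fit your situation..."
    print(objective_checks(response))
    print(subjective_review_queue(response, [
        "tone appropriate for a regulated domain",
        "recommendation is safe and non-misleading",
    ]))
```

The point isn’t the checks themselves. The point is that deciding which bucket each element belongs in is exactly the expertise the research builds.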
The act of researching builds knowledge that I will use again and again. It’s expensive, sure, but the cost of doing the research is amortized over every LLM-backed project I do from here to the grave. Honestly, it benefits every project that has subjective elements that need to be tested, or tests with vastly different costs, which is effectively all the work I’ll ever do.
I’m not unique in that. Real-life decisions constantly revisit and reuse the expertise we accumulate. That knowledge isn’t disposable. When was the last time you needed to make a decision, an important decision, with real impact on your life, career, or a product, where your understanding of the justification for that decision wasn’t critical? Be honest. Has that ever happened?
Writing is Thinking
People often focus on the output of research. The white paper. The thesis document. The slide deck. I’d like to point out that producing documents isn’t simply a clerical task. Writing is how understanding is clarified and cemented in the writer.
Simply structuring a white paper forces you to identify the moving parts and how they relate to each other. You read a sentence and ask yourself, “Will the audience build the correct understanding from this?” More often than not, the answer is no. So you revise. You add nuance. You re-examine the source.
Writing for different audiences is its own form of learning. Explaining a concept to an executive is not the same as explaining it to your mom or a colleague. The exercise teaches you what matters, to whom, and what information you need to elide or emphasize depending on the audience.
And sometimes, it isn’t writing at all. Sometimes lecturing is what reveals the gaps in your thinking. The moment you have to explain an idea out loud, you realize which parts you only half-understood.
Editing AI-generated text doesn’t create the same pressure. It’s not the same as building the knowledge yourself. It’s just grading someone else’s book report.
Automation is a Spectrum
Not all automation is equally harmful, of course, but most of it crosses the line.
Here’s a rough continuum:
- Busywork (acceptable): Downloading PDFs, formatting bibliographies. (See the sketch after this list.)
- Moderate Value (caution): Sourcing papers, identifying possible venues or authors.
- Truly Valuable (hands-off): Summarization, synthesis, sense-making.
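For what it’s worth, here’s a minimal sketch of the kind of busywork automation I’m happy to hand over: batch-downloading PDFs. The URLs and filenames are placeholders, not real papers:

```python
# Busywork automation worth handing off: batch-downloading PDFs.
# The URLs and filenames below are placeholders, not real papers.
from pathlib import Path
from urllib.request import urlretrieve

papers = {
    "smith2023_optimization.pdf": "https://example.org/papers/smith2023.pdf",
    "jones2024_llm_testing.pdf": "https://example.org/papers/jones2024.pdf",
}

out_dir = Path("papers")
out_dir.mkdir(exist_ok=True)

for filename, url in papers.items():
    target = out_dir / filename
    if not target.exists():  # skip files we already fetched
        urlretrieve(url, str(target))
        print(f"fetched {filename}")
```

Nothing in that loop builds expertise. It just clears the path to the reading that does.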
The problem is that AI systems are almost always marketed to replace, not augment, the critical parts. The sales pitch sounds efficient. “Just have the model read everything, and you can spend your time making decisions.”
But the part you’re outsourcing is exactly the part that was building expertise in your internal experts. Chances are, if you’re asking people to do research, you’re going to act on the output of that research, likely on their recommendation.
If you skip the hard parts of research, the reading and the synthesis, you’re leaving almost all the value on the table.
A researcher who only reads summaries can’t tell you why a decision was made or what tradeoffs were considered. They can’t catch critical misinterpretations or hallucinations because they never read the source material themselves.
What’s that old IBM maxim? “A computer can never be held accountable, therefore a computer must never make a management decision.” That applies here too. The generative system is rarely held culpable for bad decisions made on the back of its research products, regardless of how accurate those products are. When these systems are used, we’re often asking individuals (expert or not) to make nuanced, informed decisions without the proper context or systematization of the underlying material. That’s a recipe for disappointment.
Common Objections and Why I Disagree
Objection 1:
“But there’s too much information for any one person to process!”
Rebuttal:
Globally, sure, but not on a specific topic. The machines have yet to ask us to build products on their behalf, so everything we touch professionally is still a human endeavor, and so is everything we’re trying to understand. If humans made it or envisioned it, there’s a fixed amount of information behind it, an amount a person can ingest, just by construction.
Breadth without depth is no substitute for real understanding. More input doesn’t matter if you can’t build an intuition for what’s important.
Objection 2:
“The researcher can just validate the AI output.”
Rebuttal:
If you haven’t read the material, you can’t meaningfully validate the synthesis. You’re just playing an elaborate game of telephone with an inscrutable black box.
Objection 3:
“Hybrid workflows will save us.”
Rebuttal:
They might, if hybrid workflows were what I saw anyone actually talking about. Instead, everyone talking about generative AI assisting research promises a many-fold improvement in researcher throughput. Where’s that time coming from?
Are you reading summaries rather than whole documents? Then you’ve lost nuance in your knowledge.
Are you synthesizing output for the researcher? Then they haven’t struggled with the arguments and the information architecture.
Filtering documents? Maybe, but most researchers know whether they’re wasting their time by the end of the abstract, or maybe a paragraph or three into the introduction.
None of these add up to the many-fold improvement people are touting or looking for.
Work is its own Reward
Look, obviously I have a dog in this fight. I wrote that whole article on “What’s a PhD Anyway?” talking about why the ability to do research in the first place is valuable. It’s a big part of my value proposition to my employers.
But the thing is, my ability to do research and the research I’ve conducted, those are the real value.
When I say, “Oh shit, you know what? I bet predicting intent to leave a job is similar to suicide intent prediction, let’s look at that literature,” that insight is hard-won from a thousand coffee-stained papers.
Sure, the outputs I make, the documents, are necessary and useful, but what you’re buying when you hire a researcher is expertise.
If you’re a researcher, please defend your time, your expertise, and the necessity of your process.
If you’re a decision-maker, please understand that faster doesn’t mean wiser, and no model can do the learning for you.