
Challenging AI generated code from first principles

February 22, 2025
~3 mins

Prototyping new features and fixing bugs has become so much faster now that we have coding copilots like Cursor and LLMs (Large Language Models). We can generate boilerplate code in seconds, skipping the repetitive work it used to take. AI suggestions often feel like magic—type a quick prompt, accept a snippet, and plug it in.

However, this speed comes with a downside: it’s too easy to bypass deep thinking. When every problem becomes a quick “ask AI → copy → paste” routine, we risk turning our development process into a mindless code factory. Sure, the code looks perfect, but without that crucial step of reasoning from first principles, we’re more likely to miss the subtle-but-important details that come back to bite us later.

Let’s look at an example:

events.where("data LIKE ?", "%#{search_term}%")

Here, an LLM might suggest this simple SQL LIKE query to search a system’s events by a search term. At first glance, it looks perfectly fine—it’s a common Rails pattern that many of us have written countless times. But accepting this without deeper consideration means missing critical issues: on PostgreSQL, LIKE is case-sensitive, so searching for “error” won’t match “Error”; the leading % wildcard prevents a standard B-tree index from being used, forcing a full table scan on large tables; and the query puts no bound on how many rows come back.

A more robust solution would need to consider database-specific operators like ILIKE for case-insensitive searches, a proper indexing strategy for substring matches, and perhaps pagination with limit and offset to handle large datasets efficiently.
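To make that concrete, here is a minimal sketch of what a more deliberate version might look like. This is not the original snippet: it assumes PostgreSQL, Rails, and an Event model with a text column named data, and the scope name and the page/per_page parameters are illustrative. sanitize_sql_like escapes % and _ so user input matches literally instead of acting as wildcards.

class Event < ApplicationRecord
  # Case-insensitive substring search with escaped wildcards and a
  # bounded result set. page and per_page are illustrative defaults.
  scope :search_data, ->(term, page: 0, per_page: 50) {
    where("data ILIKE ?", "%#{sanitize_sql_like(term)}%")
      .limit(per_page)
      .offset(page * per_page)
  }
end

Called as Event.search_data(params[:q], page: 2). And because the leading wildcard rules out a regular B-tree index, a trigram index is one common way to keep ILIKE '%...%' queries fast; a hypothetical migration, assuming the pg_trgm extension is available on the database:

class AddTrigramIndexToEventsData < ActiveRecord::Migration[7.1]
  def change
    enable_extension "pg_trgm"  # PostgreSQL trigram matching
    add_index :events, :data, using: :gin, opclass: :gin_trgm_ops,
              name: "index_events_on_data_trgm"
  end
end

Even a sketch like this mostly surfaces questions (offset versus keyset pagination, whether a trigram index suits the table’s write load) that the one-line suggestion quietly skips.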

There are many other subtle examples like this, and the risk of these kinds of oversights is, I think, particularly high in distributed systems. Even as a prompter, it can be challenging to fully articulate all the possible scenarios for an LLM to translate into truly robust, battle-tested code.

What I am mostly reflecting on is that the urge to move fast and get on to the next thing, combined with how quickly LLMs hand us feedback, makes it easy to skip the critical thinking that writing software demands.

While these tools boost productivity, they’re not a replacement for critical thinking. Taking the time to understand why something works (or breaks) and building strong mental models isn’t just busy work—it’s a good defense against shipping code that you don’t understand yourself.

The progress LLMs have made in the last year is hard to overlook, and I love how productive tools like Cursor make it all feel like magic. All that said, I think we should keep using LLMs to make us ship faster while maintaining a healthy balance—challenging the seemingly perfect code they generate and forcing ourselves to think through problems from first principles.
