sql ·databases ·software-design ·declarative-programming

The Smartest Thing SQL Does Is Play Dumb

4/15/2026

6 minutes read

The Relationship the Database Already Knows

You have two tables. customer and rental. There is a foreign key from rental.customer_id to customer.customer_id. The database knows this relationship. It enforces it on every insert, update, and delete. It rejects orphaned rows and maintains referential integrity around the clock.

And when you write a query to join those two tables, you have to tell it how they relate. Again. Explicitly. Every time.

sql
SELECT c.first_name, c.last_name, r.rental_date
FROM customer c
  INNER JOIN rental r
    ON c.customer_id = r.customer_id;

This bothers people. Developers coming from object-oriented languages are used to navigable relationships, where you call customer.rentals and the framework figures out the rest. SQL makes you spell out a relationship the database already knows. It feels redundant. It feels like a missing feature.

It is not.

The Sakila Experiment

MySQL's Sakila sample database is a good place to test this assumption. The customer table has a primary key customer_id. The rental table has a foreign key column also named customer_id that references it. The constraint is defined. The engine knows. So NATURAL JOIN should work, right?

sql
SELECT first_name, last_name, rental_date
FROM customer
  NATURAL JOIN rental;

NATURAL JOIN joins two tables automatically based on shared column names. No ON clause. The engine finds columns with matching names and joins on them. An explicit join on customer_id returns 16,044 rows. NATURAL JOIN returns zero.

Zero. Not a wrong number of rows. Not a subtle off-by-one. Zero.

The reason: both tables share two column names, not one. customer_id and last_update. NATURAL JOIN matches on all of them. It silently added AND customer.last_update = rental.last_update to the join condition, and since the timestamps in those two tables almost never coincide, the result set is empty. The engine did not consult the foreign key to figure out which shared column was the meaningful one. It cannot. It just matched names.

Now rename the foreign key column to confirm that the engine is truly blind to the constraint:

sql
ALTER TABLE rental CHANGE customer_id cust_id SMALLINT UNSIGNED NOT NULL;

The foreign key constraint survives the rename. Run SHOW CREATE TABLE rental and you will see fk_rental_customer now pointing cust_id at customer.customer_id. The relational metadata is fully intact.

But NATURAL JOIN is now wrong in a completely different way. The only shared column is last_update, so the query joins entirely on timestamps, ignoring the customer relationship altogether. The foreign key had no say in either outcome.

Foreign keys are integrity constraints. They are not query logic. NATURAL JOIN does not read them. It matches names, blindly, and it will match the wrong ones without hesitation.

Declarative Means Self-Contained

SQL is a declarative language. You describe what you want, not how to get it. The optimizer decides the execution plan: which index to use, what order to join, whether to do a nested loop or a hash join. That part is implicit, and deliberately so. The query author should not need to think about it.

But the join condition is not an execution detail. It is part of the question itself. "Give me customers with their rentals" is not the same question as "give me customers with their payments," even if both involve customer_id. The join condition defines what you are asking. Making it implicit does not simplify the question. It hides it.

This is why each SQL query is designed to be readable on its own. You should be able to look at a query and understand what data it produces without checking the schema, without knowing what constraints exist, without wondering if someone renamed a column last Tuesday.

A self-contained query is a predictable query. And in a language where one missing WHERE clause can update every row in a table, predictability is not a nice-to-have.

The Proposal That Proves the Point

If SQL really should leverage foreign keys for joins, someone would have proposed it. Someone did.

A community discussion piece proposed a JOIN FOREIGN syntax that would use FK constraints to determine join conditions automatically:

sql
SELECT c.first_name, c.last_name, r.rental_date
FROM customer c
  JOIN FOREIGN rental r;

No ON clause needed. The engine would look up the foreign key between the two tables and use it as the join condition.

The criticism writes itself. If someone adds, drops, or modifies a foreign key constraint, the behavior of every JOIN FOREIGN query touching those tables changes silently. No syntax error. No warning. The query returns different data because a DBA made a schema change that never touched a single line of application code.

This is the same reason SELECT * is discouraged in production. It couples your query to the current shape of the table. JOIN FOREIGN would couple your query to the current shape of the constraints. Both create a class of bug where the code did not change but the behavior did.

The proposal is a well-reasoned thought experiment. It also demonstrates exactly why SQL does not work this way.

Stay in Your Lane

The SQL case is an instance of a broader principle in software design: a tool should stay true to its design philosophy rather than feature-creep into another paradigm for the sake of convenience.

SQL is declarative. Adding implicit schema-dependent join behavior would blur the line between what the query says and what the database does behind the scenes. Once you start treating constraints as query logic, where do you stop? Should NOT NULL constraints influence result filtering? Should CHECK constraints appear in WHERE clauses? The line is clear right now precisely because SQL does not cross it.

The same tension shows up elsewhere. CSS is declarative by design, but it has been growing steadily more expressive: calc(), clamp(), custom properties, :has(), container queries. These are still declarative, not control flow, but each one widens the surface area of what a stylesheet is expected to express. Whether CSS is expanding capability or expanding purpose depends on who you ask, and the long-term cost is still being tallied. HTML resists logic at the language level, though frameworks like JSX blur that boundary at the application layer. The Unix philosophy, do one thing well, captures the entire principle in four words.

The counterargument is always convenience. And convenience is a real benefit. But convenience that breaks the mental model of the tool, that makes its behavior depend on context you cannot see in the code, is a net loss. You save thirty seconds writing the query and spend thirty minutes debugging why it silently changed.

When the Line Gets Crossed

Sometimes the principle gets violated. The results range from controversial to dead.

Adobe Flash started as a simple animation tool called FutureSplash in 1996. Then it became a scripting engine. Then a video platform. Then a full application runtime. Then a mobile runtime. Each expansion was driven by a real market need. And each one brought problems the original tool was never built to handle: security vulnerabilities from the plugin architecture, performance problems from running a full VM in the browser, accessibility failures from rendering outside the DOM. HTML5, CSS3, and JavaScript replaced it, each doing one thing properly instead of one tool doing everything poorly. Adobe killed Flash in 2020.

SOAP started its life as the Simple Object Access Protocol. Then the enterprise stack arrived. WSDL for service descriptions. XSD for type schemas. UDDI for service discovery. WS-Security, WS-Addressing, WS-ReliableMessaging. What began as a messaging protocol became so complex that developers needed expensive tooling just to make a basic API call. REST replaced it by doing less, just HTTP verbs and whatever representation the client needs, and won precisely because of that simplicity.

BlackBerry built its identity on physical keyboards and enterprise security, and dominated. Then smartphones became consumer entertainment devices, and BlackBerry decided it needed to be one too. The BlackBerry Storm in 2008, their first touchscreen phone, was so broken that Verizon saw return rates unlike anything the carrier had dealt with before. BlackBerry lost the consumer market they chased and the enterprise market they neglected in the process. They stopped making phones in 2016.

In each case, the failure was not technical. It was philosophical. A tool was asked to become something it was not designed to be, and the accumulated tradeoffs eventually exceeded the benefits.

Expanding Capability Is Not Expanding Purpose

Foreign keys are a capability of the database engine. Their purpose is data integrity: preventing orphaned rows, enforcing valid references, maintaining relational consistency. Using them as query logic would expand their purpose beyond what they were designed for. It would make them serve two masters, the storage engine and the query language. And when those masters disagree, when a constraint is modified for integrity reasons but it breaks a query, nobody wins.

SQL making you write ON c.customer_id = r.customer_id every time is not the language being obtuse. It is the language keeping your query honest. The join condition is in the query because the join condition is part of the question.

A tool that knows its limits is not a limited tool.