When Databases Made Sense: The 1995 Checkout Problem
There's a moment in web development history that everyone forgets: the exact reason we all started using MySQL and PostgreSQL. It wasn't because databases were cool or because they were "best practice." It was because flat files had a very specific, very real problem with concurrent writes.
The Problem: Double-Selling Inventory
Picture this: It's 1995. You're running an e-commerce site with CGI scripts and flat files. You have one widget left in stock. Two users click "Buy Now" at the exact same moment.
The Race Condition
# User A's CGI script
qty=$(grep '^widget_42' inventory.txt | cut -f2)   # Reads: 1
# User B's CGI script runs here!
qty=$(grep '^widget_42' inventory.txt | cut -f2)   # Also reads: 1
# User A writes the decremented count back (whole file rewritten for brevity)
printf 'widget_42\t0\n' > inventory.txt
# User B writes the same thing
printf 'widget_42\t0\n' > inventory.txt
# Result: you just sold 2 widgets but only had 1!
This wasn't a theoretical problem. This was happening in production. People were overselling inventory, double-charging credit cards, corrupting order data. The web was getting real traffic, and flat files couldn't handle it.
Why MySQL Won
MySQL (and PostgreSQL, and others) solved this with transactions and row-level locking. The solution was elegant:
The Database Solution
-- User A's transaction
BEGIN;
UPDATE inventory
SET quantity = quantity - 1
WHERE product_id = 42
AND quantity > 0;
-- Returns: 1 row affected
COMMIT;
-- User B's transaction (starts at the same moment)
BEGIN;
UPDATE inventory
SET quantity = quantity - 1
WHERE product_id = 42
AND quantity > 0;
-- Blocks on User A's row lock, then re-checks the WHERE clause
-- Returns: 0 rows affected (no inventory left!)
ROLLBACK;
The database handled the locking, the atomicity, the consistency. It was free (MySQL/PostgreSQL were open source), relatively easy to set up, and it just worked. The trade-off was obvious: add a 100MB daemon and some connection pooling overhead, but never worry about race conditions again.
The Trade-Off That Made Sense
In 1995-2005, using a database was clearly the right choice:
- Real Problem: Race conditions were causing actual business problems
- Easy Solution: Install MySQL, change a few lines of code
- Low Cost: 100MB RAM and setup time < lost revenue from race conditions
- Free: Open source databases were available and mature
The Cargo Cult Begins
But then something interesting happened. By 2005, using a database had become "best practice." Everyone was doing it. Tutorials assumed it. Frameworks were built around it. And we forgot why we started doing it in the first place.
The conversation shifted from:
1995: Making a Trade-off
- Question: "Do I have concurrent write problems?"
- If yes: "Use a database"
- If no: "Flat files are fine"
To:
2005: Following Best Practices
- Assumption: "Professional web apps use databases"
- Result: Everyone uses PostgreSQL
- Cost: Setup complexity, RAM overhead, connection pooling
- Benefit: Solving a problem you might not have
Do You Actually Have The Problem?
Let's be honest about what most web applications actually are:
Most Web Apps in 2025
reality_check = {
    "Blog": {
        "Concurrent writers": 1,  # Just you
        "Concurrent readers": "Doesn't matter, reads don't conflict",
        "Need database?": "No",
    },
    "Internal tool": {
        "Concurrent users": 5,
        "Writes per second": 0.1,
        "Need database?": "Probably not",
    },
    "SaaS app": {
        "Concurrent users": 100,
        "Writes per user": "Their own data",
        "Conflicts": "Rare",
        "Need database?": "Maybe?",
    },
    "E-commerce checkout": {
        "Shared inventory": True,
        "Concurrent writes": "Common",
        "Need database?": "YES - this is the original problem!",
    },
}
What Changed: File Locking Actually Works
Here's the thing nobody talks about: Unix had file locking in 1995. We just didn't use it properly in CGI scripts. Modern approaches can handle the original problem just fine:
Modern Flat File Solution
#!/bin/bash
# Atomic inventory update with file locking.
# Lock a dedicated lock file (not inventory.txt itself, since the rename
# below would replace the locked inode) so only one process runs the
# read-modify-write at a time.
(
  flock -x 200
  qty=$(grep '^widget_42' inventory.txt | cut -f2)
  if [ "$qty" -gt 0 ]; then
    # Decrement with awk, write to a temp file, then rename atomically
    awk -F'\t' -v OFS='\t' '$1 == "widget_42" { $2-- } 1' inventory.txt > inventory.tmp
    mv inventory.tmp inventory.txt   # Atomic replace
    echo "Success"
  else
    echo "Out of stock"
  fi
) 200>inventory.lock
# This actually solves the 1995 problem!
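The same idea ports straight to Python's standard library. Here is a minimal sketch using fcntl.flock, assuming the same one-product-per-line TSV layout as above; the buy_widget function and the inventory.lock file name are illustrative, not part of any library:

import fcntl
import os

def buy_widget(product_id: str, inventory_path: str = "inventory.txt") -> bool:
    # Hold an exclusive lock on a dedicated lock file for the whole read-modify-write
    with open("inventory.lock", "w") as lock:
        fcntl.flock(lock, fcntl.LOCK_EX)
        with open(inventory_path) as f:
            rows = [line.rstrip("\n").split("\t") for line in f if line.strip()]
        for row in rows:
            if row[0] == product_id and int(row[1]) > 0:
                row[1] = str(int(row[1]) - 1)
                break
        else:
            return False  # unknown product or out of stock
        # Write a temp file, then atomically rename it over the original
        tmp_path = inventory_path + ".tmp"
        with open(tmp_path, "w") as f:
            f.writelines("\t".join(row) + "\n" for row in rows)
        os.replace(tmp_path, inventory_path)
        return True
        # Lock is released when the "with open" block closes the lock file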
Alternative Approaches
Even if you can't or don't want to use file locking, there are other patterns that avoid the problem entirely:
Append-Only Logs
# No concurrent update conflicts - every write is an append!
printf '%s\tsale\twidget_42\tuser_123\n' "$(date +%s)" >> sales.log
printf '%s\tsale\twidget_42\tuser_456\n' "$(date +%s)" >> sales.log
# Derive inventory from the log: current stock = starting stock minus units sold
grep -c widget_42 sales.log   # Units sold so far
Optimistic Locking
# Add a version column to your TSV:
# widget_42 5 v123
def update_inventory(product_id, expected_version):
    # Read current state (read_inventory/write_inventory are your own TSV helpers)
    qty, version = read_inventory(product_id)
    # Check the version hasn't changed since the caller read it
    if version != expected_version:
        return False  # Someone else updated it first
    # Write the decremented quantity with a new version
    write_inventory(product_id, qty - 1, generate_new_version())
    return True
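The caller pairs this with a read-and-retry loop. A minimal sketch, reusing the assumed read_inventory helper and the update_inventory function from above:

def buy_with_retry(product_id, max_attempts=5):
    # Read, attempt a conditional write, and retry if another writer got there first
    for _ in range(max_attempts):
        qty, version = read_inventory(product_id)
        if qty <= 0:
            return False  # out of stock
        if update_inventory(product_id, expected_version=version):
            return True   # our write won
        # Version changed under us: loop and re-read the fresh state
    return False  # gave up after repeated conflicts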
Single Writer Pattern
# Producers queue writes by appending; appends never conflict
echo "decrement widget_42" >> write_queue.txt
# A single consumer process applies writes sequentially
while true; do
  if [ -s write_queue.txt ]; then
    cmd=$(head -1 write_queue.txt)
    execute_write "$cmd"   # your function that applies one queued command
    # Drop the processed line. In practice, guard this rotation (and the
    # producers' appends) with flock so an append can't land between
    # the tail and the mv and get lost.
    tail -n +2 write_queue.txt > write_queue.tmp
    mv write_queue.tmp write_queue.txt
  fi
  sleep 0.1
done
When You Actually Need a Database
Let's be clear: databases aren't wrong. They're just solving a specific problem. You actually need PostgreSQL/MySQL when you have:
- High-frequency concurrent writes: Multiple users modifying the same data >100 times per second
- Complex multi-table transactions: Updates that must span multiple tables atomically (see the sketch after this list)
- E-commerce checkout: The original 1995 problem - shared inventory with concurrent purchases
- Financial transactions: Where correctness is critical and conflicts are common
- Complex relational queries: JOINs across many tables with complex conditions
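To make the multi-table case concrete, here's a minimal sketch using Python's built-in sqlite3 module (chosen only because it ships with Python; the table names and columns are invented for illustration). The point is what a transaction buys you: the inventory decrement and the order insert either both happen or neither does.

import sqlite3

conn = sqlite3.connect("shop.db")
conn.executescript("""
    CREATE TABLE IF NOT EXISTS inventory (product_id INTEGER PRIMARY KEY, quantity INTEGER);
    CREATE TABLE IF NOT EXISTS orders (order_id INTEGER PRIMARY KEY AUTOINCREMENT,
                                       product_id INTEGER, user_id TEXT);
    INSERT OR IGNORE INTO inventory VALUES (42, 1);
""")

def checkout(product_id: int, user_id: str) -> bool:
    # "with conn" wraps the block in a transaction: commit on success, rollback on error
    with conn:
        cur = conn.execute(
            "UPDATE inventory SET quantity = quantity - 1 "
            "WHERE product_id = ? AND quantity > 0",
            (product_id,),
        )
        if cur.rowcount == 0:
            return False  # out of stock; nothing to roll back
        conn.execute(
            "INSERT INTO orders (product_id, user_id) VALUES (?, ?)",
            (product_id, user_id),
        )
    return True

print(checkout(42, "user_123"))  # True
print(checkout(42, "user_456"))  # False - the decrement and the order stay consistent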
When Flat Files Are Fine
For most applications, you don't have the 1995 problem:
- Blogs and content sites: One writer (you), many readers (no conflicts)
- Analytics and logs: Append-only data (no concurrent updates)
- Configuration management: Rare updates, version control is more important than transactions
- Personal tools: Single user or very low concurrency
- Prototypes and MVPs: Optimize for speed of development, not theoretical scale
The Absurdity of Modern Defaults
Here's what we've normalized:
For a 46-Article Blog
"enterprise" = {
"Data size": "500KB",
"Writers": 1,
"Concurrent writes": "Never",
"Solution": {
"PostgreSQL daemon": "100MB RAM idle",
"Connection pool": "10+ connections",
"Query parser": "Complex optimization",
"Network sockets": "Even for localhost",
"ORM layer": "Additional abstraction",
},
"Alternative": {
"TSV files": "500KB on disk",
"grep": "Native C code",
"Total overhead": "~0MB",
},
"Ratio": "200:1 complexity for zero benefit"
}
It's like renting a semi-truck to transport a sandwich. Sure, the truck works, and it would be necessary if you were transporting 10,000 sandwiches. But you're not.
What We're Building
This realization led me to create dbbasic-tsv: a TSV-based database that's honest about what it's solving:
- File locking for atomic operations
- Optimistic locking for update conflicts
- Append-only patterns where possible
- Single-writer queues when needed
- Zero setup because there's no daemon
- grep-friendly because text files are debuggable
Performance Reality
Pure Python TSV operations achieve 163,000 inserts/second - more than enough for any blog, most internal tools, and many production applications. When you actually need more speed, optional Rust acceleration provides 600,000+ ops/sec. For comparison, your blog probably receives fewer than 100 writes per day.
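If you want a rough sanity check of numbers like these on your own machine, timing raw TSV appends takes a few lines. This is an illustrative micro-benchmark of plain file writes (the bench.tsv name is arbitrary), not dbbasic-tsv's own benchmark, and your figures will vary:

import time

ROWS = 100_000

# Time raw TSV appends - the operation a flat-file insert ultimately boils down to
start = time.perf_counter()
with open("bench.tsv", "w") as f:
    for i in range(ROWS):
        f.write(f"{i}\tarticle\ttitle-{i}\t2025-01-01\n")
elapsed = time.perf_counter() - start

print(f"{ROWS / elapsed:,.0f} rows/sec")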
Conclusion: Remember Why
Databases made sense in 1995 because of a specific problem: concurrent writes to shared data causing race conditions in e-commerce checkouts. The trade-off was clearly worth it.
But somewhere along the way, we forgot to ask: Do I have the 1995 problem?
Most of us don't. We're running PostgreSQL for blogs with one writer. We're using connection pools for internal tools with five users. We're optimizing for problems we don't have.
The good news: you can go back. Flat files work fine when you don't have concurrent write conflicts. And when you do? Modern file locking, append-only logs, and optimistic concurrency control can handle far more than you think.
The Question to Ask
Before you install PostgreSQL, ask yourself:
- Do I have multiple users updating the same data simultaneously?
- Is this actually happening, or am I just planning for it?
- Could I solve this with file locking or append-only logs?
- Am I optimizing for a problem I don't have yet?
Simple > Complex. Even in 1995, we knew that. We just forgot.