Dogfooding dbbasic-tsv: Migrating This Blog From JSON to TSV
There's a special kind of satisfaction that comes from using your own tools in production. Today I want to share the story of how dbbasic-tsv went from concept to PyPI package to powering this very blog you're reading - all in the same day.
The Launch: dbbasic-tsv 1.0.1
First, let me introduce the tool. dbbasic-tsv is a Python library that treats TSV (Tab-Separated Values) files as a database. It provides a simple API for inserts, queries, updates, and deletes - while keeping all your data in human-readable text files.
Why TSV Files as a Database?
- Zero setup: No server to configure, no migrations to run
- Human-readable: You can grep, cat, or edit files directly
- Git-friendly: Text files mean meaningful diffs and version control
- Fast enough: 163K inserts/sec in pure Python, 600K/sec with Rust
- Simple: The whole library is under 1000 lines of code
The core philosophy is simple: most applications don't need PostgreSQL. They need something between "47 individual JSON files" and "enterprise database cluster."
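The "human-readable, greppable" claim is easy to demonstrate with nothing but the standard library. A toy sketch in plain Python (not dbbasic-tsv itself):

```python
from pathlib import Path

# A TSV "table" is just text: a header row, then one row per record.
path = Path("users.tsv")
path.write_text(
    "id\tname\temail\n"
    "1\tAlice\talice@example.com\n"
    "2\tBob\tbob@example.com\n"
)

# "grep Alice users.tsv", in Python: no driver, no server, just lines.
matches = [line for line in path.read_text().splitlines() if "Alice" in line]
print(matches)  # ['1\tAlice\talice@example.com']
```

Any tool that understands lines of text — grep, awk, cut, git diff — works on the database for free.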
The Dogfooding Moment
After publishing to PyPI, I looked at my own blog's architecture. This site was storing 47 articles as individual JSON files in content/articles/. Each article had metadata (title, date, author, tags) and content blocks. The Flask app would open and parse these JSON files on every request.
It worked fine. But there was a certain irony in having just published a tool for managing structured data... and not using it myself.
The Decision
"Could this blog run on dbbasic-tsv?"
More importantly: Should it?
The answer to both was yes. If the tool couldn't handle 47 blog articles, how could I recommend it for real applications?
The Migration: JSON → TSV
I wrote a migration script that would read all 47 JSON files and convert them to TSV format. The schema was straightforward:
TSV Schema Design
articles_db = TSV(
    "articles",
    columns=[
        "slug",          # URL-friendly identifier
        "title",         # Article title
        "date",          # Publication date
        "author",        # Author name
        "category",      # Primary category
        "description",   # Meta description
        "tags",          # Comma-separated tags
        "content_json"   # Blocks as JSON string
    ],
    data_dir=Path("data")
)
The clever bit: storing the content blocks (paragraphs, headings, cards, lists) as a JSON string in the content_json
column. This preserved the complex nested structure while keeping the rest of the data queryable as flat fields.
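The round trip for that column is just json.dumps on the way in and json.loads on the way out. A minimal sketch of the pattern (the block shapes here are illustrative, not the blog's exact schema):

```python
import json

# Nested content blocks, as they existed in the per-article JSON files.
blocks = [
    {"type": "heading", "text": "Why TSV?"},
    {"type": "paragraph", "text": "Most apps don't need PostgreSQL."},
]

# On insert: flatten the nested structure into one string field.
# json.dumps emits no raw tabs or newlines, so the field stays TSV-safe.
row = {"slug": "why-tsv", "title": "Why TSV?", "content_json": json.dumps(blocks)}

# On load: a single json.loads restores the full nested structure.
restored = json.loads(row["content_json"])
assert restored == blocks
```

The flat fields stay queryable; only the one opaque column needs a decode step.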
Running the migration:
Migration Results
$ python migrate_to_tsv.py --execute
✓ TSV database initialized at: data/articles.tsv
✓ Found 47 JSON articles
✓ Migrated: evolution-internet-clients
✓ Migrated: when-databases-made-sense
✓ Migrated: unix-foundation-web-dev
...
============================================================
Migration complete:
  Migrated: 47
  Skipped:  0
  Errors:   0

Database stats:
  Total articles: 47
  File size: 950 KB
  Location: data/articles.tsv
47 separate JSON files → 1 TSV file. 950KB total. Human-readable. Greppable.
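For the curious, the heart of such a migration fits in a page. This is a hedged reconstruction of the idea, not the actual migrate_to_tsv.py; field names follow the schema above, and the `meta` key is an assumption about the old JSON layout:

```python
import json
from pathlib import Path

# Column order matches the TSV schema described above.
COLUMNS = ["slug", "title", "date", "author",
           "category", "description", "tags", "content_json"]

def migrate(src_dir: Path, dest: Path) -> int:
    """Convert per-article JSON files into one TSV file; return row count."""
    count = 0
    with dest.open("w") as f:
        f.write("\t".join(COLUMNS) + "\n")  # header row
        for path in sorted(src_dir.glob("*.json")):
            article = json.loads(path.read_text())
            meta = article.get("meta", {})  # hypothetical old-JSON layout
            fields = [
                article.get("slug", path.stem),
                article.get("title", ""),
                meta.get("date", ""),
                meta.get("author", ""),
                meta.get("category", ""),
                meta.get("description", ""),
                ",".join(meta.get("tags", [])),
                # json.dumps emits no raw tabs/newlines, so it's TSV-safe
                json.dumps(article.get("blocks", [])),
            ]
            f.write("\t".join(fields) + "\n")
            count += 1
    return count
```

One pass, one output file, and the result is diffable against the next run.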
The Safe Deployment: Fallback Strategy
My first instinct was defensive programming. I updated the Flask app to try TSV first, but fall back to JSON files if anything went wrong:
Initial Implementation (With Safety Net)
def load_article(slug):
    # Try TSV first
    if TSV_ENABLED and articles_db:
        try:
            row = articles_db.query_one(slug=slug)
            if row:
                blocks = json.loads(row['content_json'])
                return {
                    'slug': row['slug'],
                    'title': row['title'],
                    'meta': {...},
                    'blocks': blocks
                }
        except Exception as e:
            print(f"TSV load failed: {e}, falling back to JSON")

    # Fallback to JSON files
    json_path = f'content/articles/{slug}.json'
    if os.path.exists(json_path):
        with open(json_path, 'r') as f:
            return json.load(f)
This felt responsible. It added 64 lines of code: TSV loading logic, error handling, JSON fallback, TSV_ENABLED
flag checking.
I deployed it at midnight. Checked the logs:
Production Logs
✓ TSV DATABASE ACTIVE: 47 articles loaded
[TSV] Loaded article: unix-foundation-web-dev
[TSV] Loaded article: evolution-internet-clients
[TSV] Loaded article: when-databases-made-sense
Every single article was loading from TSV. The JSON fallback code never executed. Not once.
The Complexity Question
Looking at the diff, I realized something: I had increased complexity by 64 lines. The goal was simplification, but defensive programming had made the codebase more complex, not less.
The Problem with Safety Nets
- 64 lines of fallback code that never runs
- Dual-system complexity (TSV and JSON)
- "What happens if TSV fails?" mental overhead
- Two sources of truth to maintain
- More branches in control flow
The irony was thick. I had just written an article about questioning whether you need a database. Now I was questioning whether I needed fallback code.
Going All In: Removing the Safety Net
After verifying TSV worked perfectly in production, I made the decision: remove all JSON fallback code.
Final Simplified Version
# Initialize (no error handling)
articles_db = TSV(
    "articles",
    columns=["slug", "title", "date", "author",
             "category", "description", "tags", "content_json"],
    data_dir=Path("data")
)

# Load article (no fallback)
def load_article(slug):
    row = articles_db.query_one(slug=slug)
    if not row:
        return None
    blocks = json.loads(row['content_json'])
    return {
        'slug': row['slug'],
        'title': row['title'],
        'meta': {...},
        'blocks': blocks
    }

# RSS feed (no fallback)
def generate_rss_posts():
    posts = []
    for article in articles_db.all():
        if article.get('date'):
            year, month, day = article['date'].split('-')
            posts.append({...})
    return sorted(posts, key=lambda x: x['date'], reverse=True)
The results:
Code Reduction
- Removed: 105 lines
- Added: 41 lines
- Net reduction: 64 lines (1836 → 1772 lines)
- load_article(): 40 lines → 20 lines
- generate_rss_posts(): 61 lines → 31 lines
- Cyclomatic complexity: ~8 → ~2
More importantly:
- Single source of truth (TSV only)
- No "what if TSV fails" mental overhead
- Simpler control flow (no try-catch, no if-enabled checks)
- Easier to understand for future maintainers
- Actually practicing the "Simple > Complex" philosophy
What We Learned
1. Dogfooding reveals truth
You can claim your tool is simple, but using it in production forces you to confront reality. If I weren't willing to run my own blog on dbbasic-tsv, why would anyone else trust it?
2. Safety nets can be complexity nets
Defensive programming adds code. Fallback logic adds branches. Error handling adds mental overhead. Sometimes the simplest code is code that just works - with no backup plan.
3. Production verification before simplification
The two-step deployment was actually smart: first add TSV with a fallback, verify it works in production, then remove the fallback. This gave me the confidence to simplify without fear.
4. Text files are legitimately fast
The site "feels very fast, as if it's static." That's because TSV loads the entire file into memory on startup and queries are just dictionary lookups. For 47 articles (950KB), this is instant.
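The "feels static" effect falls out of that access pattern. Here's a sketch of the idea (my own toy version, not dbbasic-tsv's actual internals): parse the file once at startup, index rows by their first column, and every lookup afterwards is a dict access with zero disk I/O.

```python
from pathlib import Path

def load_index(path: Path) -> dict:
    """Parse the TSV once at startup; key rows by their first column."""
    header, *rows = path.read_text().splitlines()
    columns = header.split("\t")
    return {
        fields[0]: dict(zip(columns, fields))
        for fields in (line.split("\t") for line in rows)
    }

# A "query" after startup is a plain dict lookup: no parsing, no I/O.
# article = index["unix-foundation-web-dev"]
```

At 950KB for 47 articles, the whole database fits comfortably in memory, so there's nothing for a request to wait on.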
5. The 1995 problem still matters
As I wrote in When Databases Made Sense, the question is: "Do I have the 1995 checkout race condition problem?" For a blog with one author and no concurrent writes, the answer is no. TSV is perfect.
Try It Yourself
Want to try dbbasic-tsv? It's on PyPI:
Installation
pip install dbbasic-tsv
Basic usage:
Quick Start
from dbbasic.tsv import TSV
from pathlib import Path
# Create a table
users = TSV("users", ["id", "name", "email"], data_dir=Path("data"))
# Insert data
users.insert({"id": "1", "name": "Alice", "email": "[email protected]"})
# Query data
user = users.query_one(email="[email protected]")
print(user) # {'id': '1', 'name': 'Alice', 'email': '[email protected]'}
# Your data is just a text file!
# $ cat data/users.tsv
# id name email
# 1 Alice [email protected]
Check out the GitHub repo for documentation, benchmarks, and examples.
The Bottom Line
This blog now runs on a "toy" TSV database. The entire article database is a single 950KB text file. You can grep it, diff it, version control it, or edit it in vim.
It's simpler than the JSON file approach (one file vs 47). It's simpler than PostgreSQL (zero setup). It's simpler than the code I wrote yesterday (64 fewer lines).
Most importantly: it works. You're reading proof right now.
The Dogfooding Test
If you wouldn't use your own tool in production, why should anyone else?
Now we can confidently say: dbbasic-tsv powers real websites. Including this one.
Simple > Complex. Proven in production. At midnight. On a Tuesday.