Vaults

By Marcos

I run 60 sites in production.
I make 100+ infra decisions a year.
I built this for myself first.

Half my time used to go to asking ChatGPT, then Claude, then Gemini the same question and switching tabs to compare. Then I'd second-guess whichever one sounded most confident.

So I built Vaults — one prompt, six open-weight LLMs answering in parallel on Cloudflare's edge, with a synthesizer pointing out where they agreed and where one probably hallucinated. The cross-check that should have always existed.

— Marcos · Recife, Brasil · @marquinhos1904

Marcos's Take

Things I'm right about that most are still wrong about.

Cost

Claude is going to triple in price by 2027.

The big labs are burning $500M+/quarter on training. They subsidize current pricing to grab market share. Once they consolidate, the bill comes due.

What I do: I'm pricing my SaaS now on the assumption that inference will cost 3x what it does today. If you build assuming today's prices, you're cooked.
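
A rough sketch of the arithmetic behind that 3x assumption. Every number here (the $29 price, the $4 inference cost, the $3 of other costs, the 70% target margin) is made up for illustration; only the 3x multiplier comes from the text above.

```python
def gross_margin(price: float, inference_cost: float, other_costs: float) -> float:
    """Gross margin as a fraction of price, for one customer-month."""
    return (price - inference_cost - other_costs) / price


def min_price(inference_cost: float, other_costs: float, target_margin: float) -> float:
    """Lowest price that still reaches target_margin at the given costs."""
    return (inference_cost + other_costs) / (1 - target_margin)


# Hypothetical figures for one customer-month, in dollars.
PRICE = 29.0
INFERENCE_TODAY = 4.0
OTHER_COSTS = 3.0
MULTIPLIER = 3  # the 3x assumption from the text

margin_today = gross_margin(PRICE, INFERENCE_TODAY, OTHER_COSTS)
margin_at_3x = gross_margin(PRICE, INFERENCE_TODAY * MULTIPLIER, OTHER_COSTS)
price_needed = min_price(INFERENCE_TODAY * MULTIPLIER, OTHER_COSTS, target_margin=0.70)

print(f"margin today:      {margin_today:.0%}")
print(f"margin at 3x:      {margin_at_3x:.0%}")
print(f"price for 70% @3x: ${price_needed:.2f}")
```

With these invented inputs the margin falls from roughly 76% to roughly 48%, and holding a 70% margin would take a price near $50. The point is the shape of the check, not the figures.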

Hallucination

RAG is a band-aid. Embeddings alone won't save you.

Throwing your docs into a vector DB and calling it "grounded" is how 80% of AI products break in production. The retrieval layer matters more than the model.

What I do: Layered context — structured prompt skeleton + curated examples + verified primary sources, then the LLM. Not the other way around.
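
One way the layering could look in code, as a minimal sketch. The `Source` type, the `SKELETON` template, and `build_prompt` are hypothetical names, and the `verified` flag is assumed to be set by a person or a checker before the model is ever called.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    title: str
    excerpt: str
    verified: bool  # assumed to be set upstream, never by the model


# Structured skeleton: fixed slots, filled in a fixed order.
SKELETON = """\
Role: {role}
Rules: answer only from the sources below; otherwise say "not in sources".

Examples:
{examples}

Sources:
{sources}

Question: {question}
"""


def build_prompt(role: str, examples: list, sources: list, question: str) -> str:
    """Assemble skeleton + curated examples + verified sources into one prompt."""
    verified = [s for s in sources if s.verified]
    if not verified:
        # Fail loudly instead of sending an ungrounded prompt.
        raise ValueError("no verified sources; refusing to build a prompt")
    return SKELETON.format(
        role=role,
        examples="\n".join(f"- {e}" for e in examples),
        sources="\n".join(
            f"[{i}] {s.title}: {s.excerpt}" for i, s in enumerate(verified, 1)
        ),
        question=question,
    )
```

Unverified material never reaches the prompt, and an empty verified set stops the call before any tokens are spent.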

Stack

Cloudflare is the most undervalued stack in 2026.

Workers, D1, R2, Durable Objects, Pages, Queues, Workers AI — you can ship a real SaaS for $0/mo idle. I run 60 sites + 2 SaaS products on it. Zero servers.

What I do: Default to CF for everything. Only leave when there's a real reason. Most "we need AWS" decisions don't survive 5 minutes of scrutiny.

Tools

"Compare Tools" comparisons are mostly affiliate slop.

Anyone can list features and pricing. The valuable signal is: which tool did the operator actually keep paying for after 6 months?

What I do: I publish the live list of what I pay for monthly. When I cancel something, I write why.

Workflow

Most "AI agents" are 4 prompts in a trenchcoat.

Real agentic workflows need structured handoffs, retry logic, observability, and a way to not nuke your budget when one step loops. Most "agent frameworks" skip 3 of those 4.

What I do: Boring deterministic glue first. AI calls only where they earn their keep. Loud failure beats silent hallucination.
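
A minimal sketch of that glue, assuming each step is a plain function taking and returning a dict and that a flat per-attempt cost is known up front. `Budget`, `BudgetExceeded`, and `run_step` are hypothetical names, not part of any framework.

```python
import logging
from dataclasses import dataclass
from typing import Callable

log = logging.getLogger("workflow")


class BudgetExceeded(RuntimeError):
    """Raised before a call that would push spend past the limit."""


@dataclass
class Budget:
    limit_usd: float
    spent_usd: float = 0.0

    def charge(self, amount: float) -> None:
        if self.spent_usd + amount > self.limit_usd:
            raise BudgetExceeded(
                f"would spend {self.spent_usd + amount:.2f} of {self.limit_usd:.2f} USD"
            )
        self.spent_usd += amount


def run_step(
    name: str,
    fn: Callable[[dict], dict],
    payload: dict,
    budget: Budget,
    cost_usd: float,
    max_attempts: int = 3,
) -> dict:
    """Run one step with retries, logging every attempt and charging each one."""
    last_error = None
    for attempt in range(1, max_attempts + 1):
        budget.charge(cost_usd)  # retries cost money too, so charge every attempt
        try:
            result = fn(payload)  # structured handoff: dict in, dict out
            log.info("step=%s attempt=%d ok", name, attempt)
            return result
        except Exception as exc:
            last_error = exc
            log.warning("step=%s attempt=%d failed: %s", name, attempt, exc)
    # Loud failure: surface the last error instead of returning a guess.
    raise RuntimeError(f"step {name!r} failed after {max_attempts} attempts") from last_error
```

A step that loops hits `BudgetExceeded` as soon as the next attempt would overshoot the limit, whatever `max_attempts` says.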

Content

You can't beat AI slop with more AI slop.

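Generating 1000 articles with GPT and praying Google ranks them is a 2023 strategy that died in March 2024. Google's core update that month folded the helpful content system into core ranking and shipped a spam policy against scaled content abuse. That killed it.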

What I do: Process matters more than volume. Information gain per article. Specificity that doesn't generalize. Editorial voice the model can't fake.

Dev

You don't need 12 SaaS subscriptions. You need 3.

Notion + Linear + Slack + Figma + 8 others = $400/mo and still confusion. Pick the 3 that match your actual workflow, kill the rest.

What I do: Cursor (or Claude Code) + Linear + 1 voice channel. Everything else is a distraction tax.

Truth

The AI tool you "should" use ≠ the one you'll actually use.

Best-in-benchmark tools die in production because the friction doesn't fit the workflow. The boring 80% solution that fits your habits beats the 100% solution you avoid.

What I do: Test for 1 week minimum, real work, real stakes. If I'm avoiding it by day 3, it's out.

The Research Desk

Built around three operator archetypes.

Each archetype reflects a real workflow operators face when picking AI tools. The six LLMs are tuned around how these perspectives think — so the cross-check feels native to the problem.

Indie Builder

Daniel Park

Seoul · Solo SaaS · 4 yrs shipping

Tests every new AI tool against one question: does this earn the seat in my workflow after 30 days, or is it churn? Cancels more than he keeps.

Marketing Ops

Adrián Reyes

Barcelona · Content & growth

Runs content ops for early-stage teams. Treats AI tool stacks the way an editor treats sources — cross-check or it doesn't ship.

Product Design

Lukas van Bergen

Amsterdam · Senior product designer

Keeps a pen sketchbook open while running Cursor and Figma. AI is in the loop; it never replaces the loop.

Personas are illustrations of the analytical perspectives the six-LLM consensus engine is tuned around.

Built independently · No vendor partnerships

What's inside

One brain hallucinates.
Six brains catch it.

The Verdict

Ask one question. 6 open-weight LLMs answer in parallel on Cloudflare's edge — Llama 70B, DeepSeek R1, Qwen 2.5, Gemma 3, Mistral, Llama 8B. A judge model summarizes the consensus and scores agreement. The cross-check before any decision you can't undo.

~3 sec · 1 credit · Permalink
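
The fan-out-and-compare idea can be sketched in a few lines. This is not how Vaults is implemented: the model calls are stubs that return canned answers, the model names are used only as labels, and a simple majority vote over normalized answers stands in for the judge model. `make_stub` and `cross_check` are hypothetical names.

```python
import asyncio
from collections import Counter
from typing import Awaitable, Callable

ModelFn = Callable[[str], Awaitable[str]]


def make_stub(answer: str) -> ModelFn:
    """Build a fake model that always returns `answer` (placeholder for a real call)."""
    async def call(prompt: str) -> str:
        await asyncio.sleep(0)  # stands in for network latency
        return answer
    return call


async def cross_check(prompt: str, models: dict) -> dict:
    """Ask every model in parallel, then score agreement by majority vote."""
    names = list(models)
    results = await asyncio.gather(
        *(models[n](prompt) for n in names), return_exceptions=True
    )
    # Keep only the calls that succeeded; one failing model should not sink the rest.
    answers = {n: r for n, r in zip(names, results) if not isinstance(r, BaseException)}
    if not answers:
        raise RuntimeError("every model call failed")
    votes = Counter(a.strip().lower() for a in answers.values())
    consensus, count = votes.most_common(1)[0]
    return {
        "consensus": consensus,
        "agreement": count / len(answers),  # share of models backing the consensus
        "dissenters": sorted(
            n for n, a in answers.items() if a.strip().lower() != consensus
        ),
    }


MODELS = {
    "Llama 70B": make_stub("Yes"),
    "DeepSeek R1": make_stub("yes"),
    "Qwen 2.5": make_stub("Yes"),
    "Gemma 3": make_stub("yes "),
    "Mistral": make_stub("Yes"),
    "Llama 8B": make_stub("No"),
}

result = asyncio.run(cross_check("Is D1 built on SQLite?", MODELS))
print(result)
```

Exact-match voting only works for short, closed answers. Free-form answers need something that can judge meaning, which is presumably why the product uses a judge model instead.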

Multi-Agent

For decisions one snapshot can't crack. 4 agents with opposing roles — Pessimist, Optimist, Engineer, Strategist — debate your dilemma across 3 rounds, rebut each other live, and a synthesizer closes the verdict. You watch it stream. The boardroom you don't have.

~25 sec · 5 credits · Live stream
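
A minimal sketch of the debate loop, assuming agents speak one at a time in a fixed order and each sees everything said before its turn. The role names and round count come from the description above; `debate`, `stub_speak`, and `stub_synthesize` are hypothetical, and the stubs return canned strings where real LLM calls would go.

```python
from typing import Callable

ROLES = ["Pessimist", "Optimist", "Engineer", "Strategist"]
ROUNDS = 3

# (role, dilemma, transcript so far) -> that role's next message
SpeakFn = Callable[[str, str, list], str]
# (dilemma, full transcript) -> closing verdict
SynthFn = Callable[[str, list], str]


def debate(dilemma: str, speak: SpeakFn, synthesize: SynthFn,
           roles: list = ROLES, rounds: int = ROUNDS):
    """Run `rounds` rounds over `roles`, then hand the transcript to a synthesizer."""
    transcript = []  # list of (round_no, role, message) tuples
    for round_no in range(1, rounds + 1):
        for role in roles:
            # Each speaker gets the whole transcript, so it can rebut earlier turns.
            message = speak(role, dilemma, transcript)
            transcript.append((round_no, role, message))
    return transcript, synthesize(dilemma, transcript)


def stub_speak(role: str, dilemma: str, transcript: list) -> str:
    """Placeholder for an LLM call: opens the debate or answers the previous speaker."""
    if not transcript:
        return f"{role} opens on: {dilemma}"
    _, previous_role, _ = transcript[-1]
    return f"{role} rebuts {previous_role}"


def stub_synthesize(dilemma: str, transcript: list) -> str:
    """Placeholder for the synthesizer model."""
    return f"Verdict on {dilemma!r} after {len(transcript)} turns"


transcript, verdict = debate("Migrate off AWS?", stub_speak, stub_synthesize)
for round_no, role, message in transcript:
    print(f"[round {round_no}] {message}")
print(verdict)
```

Four roles over three rounds gives twelve turns before the synthesizer runs, which is why this mode is slower and costs more than a single parallel snapshot.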

Daily Brief

2-3x per week, my operator-voice take on what shifted in AI infra and tooling. No "10 best tools" listicles. Specific incidents, specific numbers, what to do about each. The narration alongside the machine.

2-3/week · Free · Email + Web


Stop guessing.
Get a verdict.
