Mission · Runestone Labs

Every month the models get better. Every month the conversation about them gets a little less interesting. The sharp question moved a while ago. It isn't which model you use. It's what happens when that model decides to run a shell command, write to your disk, send an email, or move money. Most people don't have a good answer yet. That gap is the thing we work on.

The parts of the stack that matter over the long run are the slow parts. Policy. Approval. Audit. Memory with access control. Boring words, but they're the reason an agent is safe to leave running while you sleep. They're also what makes the next model, and the one after that, drop-in replaceable. You get to keep the guardrails and the logs when whatever's underneath changes in six months.
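To make those boring words concrete, here is a minimal sketch of what a policy check with an audit trail could look like. It is an illustration only, not Gatekeeper's actual API: the tool names, the `Decision` type, the `policy` table, and the `decide` function are all hypothetical, and a real deployment would persist the log rather than hold it in memory.

```typescript
// Hypothetical sketch. None of these names come from Gatekeeper itself.
type Decision = "allow" | "deny" | "needs_approval";

interface ToolCall {
  tool: string;                  // e.g. "shell.exec" (illustrative name)
  args: Record<string, unknown>; // whatever the agent wants to pass
}

interface AuditEntry {
  at: string;        // ISO timestamp of the decision
  call: ToolCall;    // the call as the agent requested it
  decision: Decision;
}

// Illustrative policy table. A Map avoids accidental hits on
// Object.prototype keys such as "constructor".
const policy = new Map<string, Decision>([
  ["fs.read", "allow"],
  ["shell.exec", "needs_approval"],
  ["payments.send", "deny"],
]);

// In-memory audit log, for the sketch only.
const auditLog: AuditEntry[] = [];

// Decide what happens to a tool call and record the decision.
// Anything the policy does not mention is denied by default.
function decide(call: ToolCall): Decision {
  const decision: Decision = policy.get(call.tool) ?? "deny";
  auditLog.push({ at: new Date().toISOString(), call, decision });
  return decision;
}
```

Nothing in the sketch mentions a model. The check sees only the tool call, which is why the model underneath can change while the policy and the log stay put.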

How we sequence what gets built.

Infrastructure is a slow game. We're not building everything at once. The rule we follow: pick the thing that hurts the most right now, ship it as OSS, see if anyone actually uses it. If they do, and they have opinions, we build the next layer. If they don't, we stop. The roadmap doesn't have dates on it. It has evidence gates. Gatekeeper exists because we kept running agents that would have been a lot safer with it. The next primitive will exist for the same reason, or not at all.

Local-first, for a strategic reason.

One reason we self-host is principle. The bigger reason is strategic. Every tool call an agent makes is metadata about how a company actually works. The arguments, the paths, the URLs, which actions got approved, which got blocked. Whoever ends up holding all of that ends up being very important to everyone else. We'd rather not be that. We'd rather be the thing you run on your own hardware and forget about. If we ever do a hosted tier, it will sit on top of the same OSS code, not replace it. You can always take the stack and go home.

What we're not building.

Some things we get asked about often enough to be explicit:

Not an agent framework. Gatekeeper doesn't care which framework you use. Plain HTTP. Any language.
Not a model. We use Claude. We'll use whatever's best next year. The model isn't the product.
Not a prompt-security wall. Those exist. They solve a different problem. Prompt-injection defense and tool-call enforcement are not the same thing.
Not an evals company. Important category. Not ours.
Not a product that only works if you use our cloud. If we're not useful on your laptop first, we're not useful.

Where we are.

We're small and early. Gatekeeper is on GitHub and on npm as @runestone-labs/gatekeeper-client. If you want to use it, break it, integrate it somewhere real, or argue about what we're getting wrong, the contact page has working addresses. We answer our own mail.

One side note. The same people behind this also run a couple of smaller consumer-facing AI projects, on separate brands with separate audiences. We keep them off this site on purpose. It'd muddy what Runestone Labs is for. If you're curious and you find them, that's fine. Mostly we want one sentence to set expectations here: this company is about the infrastructure, not the content layer.

The interesting part of an agent isn't the model.
It's where the model meets the world.

Boundary primitives.

Policy
Approval
Audit
Memory with a boundary
