FAQ

Questions, answered.

Parity with RAG is our bar. We measure cartridge answers head-to-head against RAG on the same model, questions, and retrieval, and only ship a config once it matches — cheaper-but-worse isn't the product.

Only the affected shards retrain — minutes, not a full rebuild. Incremental sync detects changes from your connected sources.

Open-weight LLMs you control (e.g. Qwen3), in your own cloud. Cartridges inject trainable KV into the frozen model — no per-token lock-in to a frontier vendor.

Inside your own cloud account, with no public ingress. Your documents never train a shared model and never leave your perimeter.

Fine-tuning bakes facts into weights — hard to update, prone to hallucination, no sources. Everyday keeps knowledge in a swappable, grounded cartridge that updates per-shard.

A one-time training step (self-study + distillation) fanned out across cheap GPUs. After that, serving is flat and cheap, and break-even vs RAG is typically a few hundred queries.

Still have questions?

We'll run a demo on your own corpus and answer them with your numbers.

Book a demo