Safebots, Inc. · Investment Thesis
Restricted · For prospective investors

The next layer of AI infrastructure is sovereign, verifiable, and shared.

Frontier-class open-weight models meet a sealed, attested execution substrate. Enterprises bring their data and keep it. Auditors verify by cryptography rather than questionnaire. Costs collapse — and what was operating expense becomes infrastructure other people pay to use.

Series: Seed extension / Series A
Stack: Qbix · Intercoin · Safebots · Safebox
Companion: safebots.ai/costs.html

I · The moment

Three things just became simultaneously true.

For the first time, an open-weight model with a permissive licence beats the leading proprietary frontier model on the benchmark that matters most for production agent work. GLM-5.1, MIT-licensed, leads SWE-Bench Pro ahead of GPT-5.4 and Claude Opus 4.6. Within the same month, four other frontier-class open-weight families shipped — DeepSeek V4, Qwen 3.6, Llama 4, Gemma 4. The argument that open-weights run two years behind the frontier is now empirically wrong.

At the same time, infrastructure for running those models has become turnkey. A single AMI deploys an attested inference cluster on AWS, GCP, or Azure in eight hours instead of the hundred and sixty hours a hand-rolled vLLM stack used to take. KV-cache reuse, per-tenant cache partitioning, and explicit cache flushing are first-class controls — capabilities the commercial APIs cannot offer because they multiplex across customers.

And demand has arrived from the enterprises that have so far refused to send their data to a hosted model. The result is an opening: institutional AI infrastructure, owned by the customer, serving the workload at a fraction of the API price, with safety properties auditors can verify by cryptography rather than questionnaire.

SHIFT I

Open-weight models reach the competitive frontier on real production tasks.

GLM-5.1 (MIT) at 58.4% SWE-Bench Pro beats GPT-5.4 and Claude Opus 4.6. DeepSeek V4 at 80.6% SWE-Bench Verified is within 0.2 points of Opus. Qwen 3.6 runs on a single consumer GPU at 73.4%. The training cost story has been told; the deployment story is now the one to tell.

SHIFT II

Local inference gives operators KV-cache control that the APIs cannot.

Per-tenant cache tags. Explicit flush by tenant, model, or scope. Per-request cache mode. vLLM with prefix caching reuses prompt prefixes across multi-turn agents at near-100% hit rates; long-running workloads converge on a 10× cost reduction the APIs offer only at their own margin.
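The controls named above can be pictured with a small sketch. It is illustrative only: `TenantPrefixCache`, its `mode` values, and the `flush` arguments are hypothetical names, not vLLM's API and not the Safebox implementation. It assumes the cache key is the tenant tag, the model, and a hash of the prompt prefix, so one tenant's warm prefix is never visible to another.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class TenantPrefixCache:
    """Toy prefix cache partitioned by (tenant, model). Hypothetical sketch."""
    _entries: dict = field(default_factory=dict)
    hits: int = 0
    misses: int = 0

    @staticmethod
    def _key(tenant: str, model: str, prefix: str) -> tuple:
        # The tenant tag is part of the key, so entries never cross tenants.
        digest = hashlib.sha256(prefix.encode("utf-8")).hexdigest()
        return (tenant, model, digest)

    def lookup_or_fill(self, tenant: str, model: str, prefix: str,
                       mode: str = "reuse") -> bool:
        """Return True on a cache hit. mode='bypass' skips the cache entirely."""
        if mode == "bypass":
            return False
        key = self._key(tenant, model, prefix)
        if key in self._entries:
            self.hits += 1
            return True
        self.misses += 1
        self._entries[key] = True  # stand-in for the stored KV blocks
        return False

    def flush(self, tenant: str = None, model: str = None) -> int:
        """Drop entries matching the tenant and/or model; return how many."""
        doomed = [k for k in self._entries
                  if (tenant is None or k[0] == tenant)
                  and (model is None or k[1] == model)]
        for k in doomed:
            del self._entries[k]
        return len(doomed)
```

Because the tenant tag is part of the key, an identical system prompt sent by two tenants is a miss for the second one. That is the price of isolation in this sketch, and `flush` gives the operator an explicit way to evict one tenant's entries without touching the rest.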

SHIFT III

Sealed attestation makes compliance a cryptographic property.

AWS Nitro attestation proves which code is running. M-of-N governance enforces separation of duties. Append-only stream logs are the audit trail. The recurring annual cost of proving a control existed is replaced with a one-time signature any auditor can verify.
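To make the audit-trail idea concrete, here is a minimal sketch of an append-only log in which every entry commits to the hash of its predecessor. It is an assumption about how such a log could be structured, not the Safebox format: `append`, `verify`, and the `GENESIS` constant are hypothetical names, and a real deployment would additionally sign the chain head with an attested key.

```python
import hashlib
import json

GENESIS = "0" * 64  # hypothetical anchor value for the first entry


def entry_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous entry's hash together with a canonical payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append(log: list, payload: dict) -> None:
    """Add an entry that commits to everything before it."""
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"prev": prev, "payload": payload,
                "hash": entry_hash(prev, payload)})


def verify(log: list) -> bool:
    """Recompute the chain; any edited or reordered entry breaks it."""
    prev = GENESIS
    for entry in log:
        if entry["prev"] != prev:
            return False
        if entry["hash"] != entry_hash(prev, entry["payload"]):
            return False
        prev = entry["hash"]
    return True
```

An auditor holding only the final hash can rerun `verify` over the exported log; changing any earlier payload changes every hash after it.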

SHIFT IV

An open-source movement creates demand without absorbing it.

The OpenClaude wave proved enterprises will run their own inference if the path is short. What it did not do is solve the safety, governance, and multi-tenancy problems that enterprises actually buy. The market is asking for the next layer; we are building it.

II · The model layer

What the benchmarks actually say.

The deck does not ask investors to take "open-weight is competitive" on faith. The numbers are public, recent, and from the benchmark families that production agent teams care about — software engineering, terminal-task completion, agentic reliability. We quote the leading variant of each family.

Model                     SWE-Pro   Term-2.0   Licence     Self-host
GLM-5.1 · Z.ai            58.4%     -          MIT         2× H100
DeepSeek V4-Pro           57.7%     67.9%      MIT         8× H100
Claude Opus 4.6 (closed)  57.3%     65.4%      -           API only
GPT-5.4 (closed)          57.7%     -          -           API only
Qwen 3.6-35B-A3B          ~52%      -          Apache 2    1× RTX 4090
Llama 3.3 70B             ~46%      -          Llama 3.3   2× A100

Source: Lushbinary, BuildFastWithAI, BenchLM.ai · April 2026 · Benchmarks vary by harness; absolute scores are less interpretable than the proximity of leading open-weight to leading closed-weight.

Two cautions. The first: benchmark scores diverge from real-world agentic reliability. Independent reproduction tests in April 2026 found that of fifteen leading models, only four (the two Claude Opus versions and the two GLM-5 versions) produced agent code that actually ran without inventing nonexistent APIs. The SWE-Pro score is necessary, not sufficient. We architect Safebots for multi-model deployment with frontier-class fallback for the cases where local inference falls short.

The second caution: training costs to push models to this level have become extreme — DeepSeek's V4 training reportedly used Huawei Ascend chips, and the next generation of frontier models is expected to require investment unavailable to most labs. Our bet is not that open-weights will continue closing the gap forever. It is that the gap is small enough today that production workloads can be served from local inference with cloud fallback for the long tail. That bet is good for as long as it needs to be.

III · The substrate

Why the data cannot leave the Safebox.

The Safebox is an attested execution environment that runs on standard cloud infrastructure but enforces architectural properties that ordinary cloud deployments cannot enforce. The enterprise buys a property; the property is verifiable. Auditors do not ask whether it holds; they verify the signature.

Architecture · simplified
SAFEBOX · ATTESTED PERIMETER · NITRO ENCLAVE · M-OF-N GOVERNED

DATA · Streams & relations
Append-only · per-tenant ZFS dataset
Sealed in tenant volume · never copied out

EXECUTION · Sandboxed capabilities
Read-only by default · writes via proposal
SHA-256 verified at load · no live network

INFERENCE · Local model runner
vLLM · Unix socket · UID-bounded
KV cache scoped per tenant tag

Action queue · M-of-N voting · policy graph
Every state mutation is proposed · is voted · is logged. Auditor reads the log; auditor verifies the chain of signatures.

Inference log
Per-call · token counts · cache hits
Source of truth for billing · rebates

Nitro attestation · TPM-bound boot · whole-image SHA verified at every cold start
A signed measurement at boot proves to any external party which code is executing, including the model weights and the sandbox runtime.

Tenant data, capability execution, and inference live in the same attested perimeter. Nothing leaves.
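The "SHA-256 verified at load" step in the diagram can be sketched in a few lines. This assumes the expected digest comes from a trusted, signed manifest; `load_capability` and that manifest are hypothetical, and only Python's standard `hashlib` and `hmac` modules are used.

```python
import hashlib
import hmac


def load_capability(bundle: bytes, expected_sha256: str) -> bytes:
    """Return the bundle only if its SHA-256 digest matches the expected one.

    expected_sha256 is assumed to come from a trusted manifest (hypothetical).
    """
    actual = hashlib.sha256(bundle).hexdigest()
    # compare_digest avoids leaking how many leading characters matched.
    if not hmac.compare_digest(actual, expected_sha256.lower()):
        raise ValueError("capability digest mismatch; refusing to load")
    return bundle
```

The design choice worth noting is that the check fails closed: a bundle that does not match is never handed to the sandbox at all.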

Four properties that compose to provable safety.

IV · Compliance, recast

From annual evidence-gathering to continuous proof.

The recurring spend on PCI, HIPAA, GDPR, and SOC 2 audits is overwhelmingly the cost of producing evidence — interview notes, screenshots, control-effectiveness samples, vendor questionnaires. The substrate replaces most of that with continuous cryptographic evidence the auditor can verify directly. Audits do not vanish; auditors still review process, training, and incident response. The cost of meeting the technical control requirements collapses.

SOC 2 · CC6.1

Logical access enforced.

Was: quarterly access reviews, role mapping spreadsheets, sampled evidence of approval workflows.

Now: M-of-N policy attested at the substrate. Every write carries the signatures of the keyholders that approved it. The audit trail is the substrate; the auditor verifies signatures.
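A minimal sketch of the M-of-N check follows. It is an illustration under stated assumptions, not the Safebox mechanism: HMAC with per-keyholder secrets stands in for the asymmetric signatures an enclave would use, and `sign`, `quorum_met`, and the keyring layout are hypothetical names.

```python
import hashlib
import hmac


def sign(key: bytes, message: bytes) -> str:
    """Stand-in signature: HMAC-SHA256 over the proposed write."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def quorum_met(message: bytes, approvals: dict, keyring: dict, m: int) -> bool:
    """True if at least m distinct known keyholders validly approved message.

    approvals maps keyholder name -> signature; keyring maps name -> key.
    """
    valid = set()
    for holder, signature in approvals.items():
        key = keyring.get(holder)
        if key is None:
            continue  # unknown keyholders never count toward the quorum
        if hmac.compare_digest(sign(key, message), signature):
            valid.add(holder)
    return len(valid) >= m
```

With a keyring of three and `m=2`, a single compromised operator cannot reach quorum alone, which is the separation-of-duties property the control asks for.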

PCI DSS · 3.5

Cryptographic key management.

Was: key custodian roster, dual-control attestations, HSM access logs reviewed quarterly.

Now: keys never leave the enclave; quorum is M-of-N. Compromise of any single operator does not produce a valid signature. The enclave attestation is the key-isolation proof.

HIPAA · §164.312

Technical safeguards on PHI.

Was: BAAs with every subprocessor; encryption-at-rest evidence; access-log reviews.

Now: PHI is in the tenant's ZFS volume; capabilities that touch it run in the enclave; outbound network is policy-restricted. Every subprocessor is already shielded from the data, so there is none left to sign a BAA with.

GDPR · Art. 28, 32

Processor obligations and data residency.

Was: per-region contracts; data-flow diagrams; transfer-impact assessments.

Now: deploy the same AMI in any region. The substrate guarantees data does not leave the enclave; jurisdiction is a deployment parameter, not a contractual obligation chain.

The deck is careful with the framing. Compliance is also legal, procedural, and human. Cryptography is a powerful audit tool, not a substitute for a compliance program. The claim is not that the program disappears. The claim is that the recurring annual cost of producing technical-control evidence approaches zero, because the evidence is generated continuously and verified mechanically. Enterprises that spend seven figures per year on this category will save the majority of that spend.

V · The economics

What the customer's line item looks like.

The cost case has been documented at length on safebots.ai/costs.html. The summary, with figures unchanged from the public page:

Organisation                        API path                  Safebox path        Annual saving
Mid-market · $1M annual             900M tokens               $50K infra          $950K · 95.0%
Enterprise · $10M annual            Custom contract           $240K multi-tenant  $9.76M · 97.6%
Fortune 500 · $1B annual            Dedicated infrastructure  $240K multi-tenant  $999.76M · 99.98%
Three-year cumulative · mid-market  $5.19M                    $150K               $5.04M · 97.1%
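The savings column can be reproduced from the other figures in the table. The short check below takes the annual API spend from the Organisation column and the Safebox cost from the Safebox path column; the function name `saving` and the `ROWS` layout are ours, for illustration only.

```python
from fractions import Fraction

# (API-path cost, Safebox-path cost) in dollars, as read from the table.
ROWS = {
    "Mid-market": (1_000_000, 50_000),
    "Enterprise": (10_000_000, 240_000),
    "Fortune 500": (1_000_000_000, 240_000),
    "Three-year cumulative, mid-market": (5_190_000, 150_000),
}


def saving(api_cost: int, safebox_cost: int) -> tuple:
    """Return (dollars saved, percent saved) for one row."""
    dollars = api_cost - safebox_cost
    percent = Fraction(dollars * 100, api_cost)  # exact before converting
    return dollars, float(percent)


if __name__ == "__main__":
    for name, (api, safebox) in ROWS.items():
        dollars, percent = saving(api, safebox)
        print(f"{name}: ${dollars:,} saved ({percent:.2f}%)")
```

Each row is simply the API cost minus the Safebox cost, divided by the API cost; the percentages quoted in the table are these values rounded.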

Caching, reusing, and the asymptote.

The headline number understates the case. Two effects compound:

The flagship case for an institution buying this is not the 95% saving versus their current API bill. It is what they spend the saving on. A mid-market customer redirects $950K per year into engineering headcount or product. A Fortune 500 customer redirects nine hundred and ninety-nine million.

VI · The market structure

How SAFE and SafeBux actually flow.

The customer pays in dollars. The infrastructure operator earns SafeBux by serving requests. SafeBux are sold through a bonding curve, and revenue from those sales flows to staked SAFE tokens. The structure is simple and the incentives align cleanly.

SafeBux — the unit of compute.

Anyone can deploy a Safebox AMI in AWS, GCP, or Azure. When the instance serves an inference request, an artifact fetch, a capability invocation, or a storage operation, it earns SafeBux proportional to the work done. Cache hits earn at a discounted rate; the original cache-warmer is rebated. The network therefore rewards both serving requests and producing artifacts that other tenants reuse.

  • Earned by infrastructure operators, per request served.
  • Spent by tenants, to pay for compute and storage.
  • Discount on cache hits; rebate to the warmer.
  • Decentralized issuance — no single operator controls supply.
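One way to picture the discount-and-rebate rule is the sketch below. The rates are placeholders, not published parameters: `CACHE_HIT_DISCOUNT_PCT` and `WARMER_REBATE_PCT` are hypothetical, as is the assumption that the rebate is carved out of the discounted price. Amounts are integers in the smallest SafeBux unit to avoid rounding drift.

```python
from dataclasses import dataclass

CACHE_HIT_DISCOUNT_PCT = 50  # hypothetical: a cache hit costs half the base price
WARMER_REBATE_PCT = 20       # hypothetical: share of that price paid to the warmer


@dataclass(frozen=True)
class Settlement:
    tenant_pays: int      # what the requesting tenant spends
    operator_earns: int   # what the serving operator earns
    warmer_rebate: int    # what the original cache-warmer receives


def settle(base_price: int, cache_hit: bool) -> Settlement:
    """Split one request's price between operator and cache-warmer."""
    if not cache_hit:
        return Settlement(base_price, base_price, 0)
    discounted = base_price * (100 - CACHE_HIT_DISCOUNT_PCT) // 100
    rebate = discounted * WARMER_REBATE_PCT // 100
    return Settlement(discounted, discounted - rebate, rebate)
```

Under these placeholder rates, a request priced at 1000 units settles as 500 paid, 400 earned by the serving operator, and 100 rebated to the warmer. Whatever the real rates turn out to be, the invariant to preserve is that what the tenant pays equals what the operator and warmer receive.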

SAFE — the security token.

SAFE tokens represent equity-like rights to the cashflows generated when SafeBux are purchased on the bonding curve. Investors stake SAFE; tenants buy SafeBux; bonding-curve revenue distributes to stakers proportional to their stake. Liquid secondary markets exist; staking is opt-in.

  • Cashflows from SafeBux purchases via bonding curve.
  • Distribution proportional to stake.
  • Tradable on secondary markets.
  • Issued under the Unblockers SAFE-token framework with a custodial agreement and Issuer Encumbrance Compliance Agreement — built for tokenized securities, not vapor.
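"Distribution proportional to stake" can be sketched as a pro-rata split. This is an illustration, not the contract logic: `distribute` is a hypothetical name, revenue is counted in integer cents, and the largest-remainder rule for leftover cents is our assumption so that payouts always sum exactly to the revenue.

```python
def distribute(revenue_cents: int, stakes: dict) -> dict:
    """Split revenue across stakers in proportion to staked SAFE.

    stakes maps staker name -> staked amount (positive integers).
    """
    total = sum(stakes.values())
    if total <= 0:
        raise ValueError("no SAFE staked")
    # Floor share for everyone first.
    payouts = {who: revenue_cents * amt // total for who, amt in stakes.items()}
    leftover = revenue_cents - sum(payouts.values())
    # Hand out leftover cents one at a time, largest remainder first;
    # ties are broken by name so the result is deterministic.
    order = sorted(stakes,
                   key=lambda who: (-(revenue_cents * stakes[who] % total), who))
    for who in order[:leftover]:
        payouts[who] += 1
    return payouts
```

A staker holding three quarters of the staked SAFE receives three quarters of each distribution; nothing in the split depends on token price.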

The reason this works as an investment vehicle, not as a speculative coin offering, is the directness of the cashflow. SafeBux are bought because compute is needed. The bonding curve is the price-discovery mechanism for that compute. Stakers do not need belief in token narratives; they need belief in compute demand. We have spent a decade building the substrate that makes that demand structural.

VII · Why this team

Fifteen years of building what most teams discover they need.

Qbix has shipped a streams-based collaborative substrate since 2011 and reached over seven million app downloads. Intercoin, founded in 2018 and deployed on eight blockchain mainnets, has been iterating on the M-of-N governance, sealed attestation, and chilling-effect consensus primitives that compose into Safebox. The Magarshak Architecture papers (Magarshak Machine, Grokers, Context) have formalised what was previously implementation folklore into theorems with proofs.

Five active patent applications cover the components that matter: sealed-computation execution, reactive capability partitioning, KV-cache-aware deterministic context assembly, cross-domain state-transition verification, and the fleet-learning inference acceleration system. The provisional has been filed ahead of the public arXiv release of the Context paper, preserving international rights.

i. The model layer has caught up. The infrastructure layer is the new battleground. Customers will not pay frontier margins to vendors who cannot prove what runs on their data.

ii. We have spent a decade building the streams, governance, sandboxing, and attestation layers that competitors will need at least three years to assemble.

iii. The economics work without a token narrative. Customers save 95–99% on infrastructure cost. SafeBux are denominated in compute, not speculation.

iv. The market timing is narrow. Within three years the API price normalisation will have happened, the open-weight models will have settled, and the compliance category will have absorbed verifiable-execution as a baseline expectation. We launch into that window.

VIII · What could go wrong

The four risks an honest deck must name.

Frontier models pull away again.

Open-weight has caught up; that does not mean it stays caught up. If the next-generation frontier reopens a clear gap on agentic reasoning, the cost case loses its sharpest edge.

Mitigation: the substrate is model-agnostic. Customers route commodity workloads to local inference and frontier-class workloads to API fallback within the same architecture. Both pay through the substrate; the substrate's value is the substrate, not the specific model.
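The routing described in this mitigation can be sketched as a small dispatcher. Everything here is hypothetical: `route`, `Route`, and the `needs_frontier` predicate are illustrative names, and the two backends are passed in as plain callables rather than real model clients.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Route:
    backend: str  # "local" or "api-fallback"
    answer: str


def route(prompt: str,
          local: Callable[[str], Optional[str]],
          fallback: Callable[[str], str],
          needs_frontier: Callable[[str], bool] = lambda p: False) -> Route:
    """Try local inference first; use the API only when local cannot serve.

    local returns None to signal that it could not produce an answer.
    """
    if not needs_frontier(prompt):
        answer = local(prompt)
        if answer is not None:
            return Route("local", answer)
    return Route("api-fallback", fallback(prompt))
```

Because both paths return through the same function, the substrate can log and bill every call in one place regardless of which model answered.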

Hyperscalers vertically integrate the same offering.

AWS, Azure, and GCP could ship attested-AI-runtime products that compress our margin against their captive customers.

Mitigation: the deployment surface is each hyperscaler's own AMI image. We are a thin layer above their primitives, not a competing cloud. Their incentive is to enable workloads on their infrastructure; ours is to make those workloads verifiable and inter-operable across their boundaries.

Token regulatory regime shifts unfavourably.

SAFE-token cashflow rights are being issued under the Unblockers framework, but securities regulation in this category remains in flux globally.

Mitigation: Unblockers is a custodial-agreement-and-IECA structure built specifically to address this regulatory category, not a coin-offering wrapper. Cashflows are real and attributable. The structure has been designed to survive the regulatory normalisation we expect, not to evade it.

Enterprise sales are slow.

Procurement and security review for an attested infrastructure substrate is not measured in weeks. Slow sales cycles can starve a company before adoption compounds.

Mitigation: turnkey AMI deployment shortens the technical review by an order of magnitude. The free-tier deployment is useful as is; expansion follows usage rather than purchase. We have already shipped through Qbix-Groups and have early institutional pilots in negotiation.

IX · The ask

What a check buys, and what it builds.

The proceeds finance the next twelve months of engineering on Safebots, Safebox, and the inference layer; the bonding-curve deployment for SafeBux issuance; and the institutional sales motion against three named enterprise pilots. We are raising on the strength of the substrate and the timing — both of which are described above and verifiable independently.

For institutional investors

A direct equity allocation in Safebots, Inc., or a SAFE-token position with attached cashflow rights via the bonding curve. Diligence packet on request — including the patent portfolio, engineering roadmap, and the three institutional pilot prospectuses.

For strategic partners

Co-deployment programmes for hyperscaler partners, financial institutions evaluating substrate-level compliance, or domain-specific operators who want to deploy a sovereign substrate for their tenants. The AMI is shippable today.

For the curious

The codebase is open source on GitHub. The architecture papers are on arXiv. The cost calculator is on safebots.ai/costs.html. All claims in this deck are reproducible from primary sources.