Billing & Usage
HX-SDP meters all API operations in Compute Units (CUs). Each tenant has a monthly CU quota determined by their billing tier.
CU cost table
| Operation | Endpoint | CU cost | Description |
|---|---|---|---|
| put | POST /v1/put | 1.0 | Ingest + QTT-SVD compression |
| put_cores | POST /v1/put_cores | 1.0 | Direct TT-core upload |
| put_cores_batch | POST /v1/put_cores_batch | 2.0 | Batch TT-core upload |
| get | GET /v1/get/{ns}/{key} | 0.1 | Metadata retrieval (no decompress) |
| serve | POST /v1/serve (dense) | 0.5 | Dense array reconstruction |
| serve_gpu | POST /v1/serve (gpu) | 2.0 | Load TT-cores to GPU VRAM |
| search | POST /v1/search | 0.5 | Metadata-filtered search |
| delete | DELETE /v1/delete/{ns}/{key} | 0.1 | Soft-delete entry |
| list | GET /v1/list/{ns} | 0.1 | List keys in namespace |
| query_similarity | POST /v1/query/similarity | 1.0 | Pairwise core inner product |
| query_topk | POST /v1/query/topk | 1.0 | Top-K similarity search |
| query_vector | POST /v1/query/vector | 1.0 | External vector top-K search |
The X-HX-CUs response header reports the cost charged for each request.
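
To get a feel for how the per-operation costs add up, the sketch below totals the CUs for a workload. The cost values are copied from the table above; the `estimate_cus` helper and the sample request counts are hypothetical and only illustrate the arithmetic.

```python
# CU costs copied from the CU cost table, keyed by operation name.
CU_COSTS = {
    "put": 1.0, "put_cores": 1.0, "put_cores_batch": 2.0, "get": 0.1,
    "serve": 0.5, "serve_gpu": 2.0, "search": 0.5, "delete": 0.1,
    "list": 0.1, "query_similarity": 1.0, "query_topk": 1.0, "query_vector": 1.0,
}


def estimate_cus(workload):
    """Return the total CUs for a mapping of operation name -> request count."""
    unknown = set(workload) - set(CU_COSTS)
    if unknown:
        raise ValueError(f"unknown operations: {sorted(unknown)}")
    return sum(CU_COSTS[op] * count for op, count in workload.items())


# Hypothetical monthly request counts, for illustration only.
monthly = {"put": 2_000, "get": 10_000, "query_topk": 5_000, "serve": 4_000}
total = estimate_cus(monthly)
print(f"estimated CUs: {total:.1f}")
```

Comparing an estimate like this with the tier quotas gives a rough idea of which tier a workload needs. The X-HX-CUs header on real responses remains the authoritative figure.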
Billing tiers
Managed (SaaS)
| Tier | Price | Monthly CUs | Rate limit | Overage |
|---|---|---|---|---|
| Starter | Free | 1,000 | 10 req/min | — |
| Builder | $49/mo | 50,000 | 100 req/min | $0.05/CU |
| Pro | $299/mo | 500,000 | 1,000 req/min | $0.03/CU |
| Enterprise | Custom | Custom | Custom | Custom |
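
As a rough illustration of how the tier price and overage rate combine, the sketch below estimates a month's charge. It assumes overage is billed as (CUs above quota) × (overage rate) on top of the base price, which this page does not spell out; the `monthly_bill` helper is hypothetical, and Enterprise is left out because its terms are custom.

```python
from decimal import Decimal

# Values copied from the Managed (SaaS) table. Starter lists no overage rate.
TIERS = {
    "starter": {"price": Decimal("0"), "quota": 1_000, "overage": None},
    "builder": {"price": Decimal("49"), "quota": 50_000, "overage": Decimal("0.05")},
    "pro": {"price": Decimal("299"), "quota": 500_000, "overage": Decimal("0.03")},
}


def monthly_bill(tier, cus_used):
    """Estimate one month's charge in USD (assumption: overage = excess CUs * rate)."""
    plan = TIERS[tier]
    if plan["overage"] is None:
        # No overage rate listed: assume nothing is charged beyond the base price.
        return plan["price"]
    excess = max(Decimal(str(cus_used)) - plan["quota"], Decimal("0"))
    return plan["price"] + excess * plan["overage"]


print(monthly_bill("builder", 60_000))  # base price plus 10,000 excess CUs
print(monthly_bill("pro", 450_000))     # under quota, base price only
```

Under this assumption, a Builder tenant that uses 55,000 CUs in a month pays the same as the Pro base price, so sustained usage above that level is cheaper on Pro.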
Self-hosted
Self-hosted deployments have no CU limits by default. Enable metering via HX_GATE_BILLING_ENABLED=true if you want to track usage across internal tenants.
Checking usage
Via API

    # All tenants (admin)
    curl https://gate.holonomx.com/gate/admin/usage \
      -H "Authorization: Bearer $SERVICE_KEY"

    # Single tenant (admin)
    curl https://gate.holonomx.com/gate/admin/usage/acme-corp \
      -H "Authorization: Bearer $SERVICE_KEY"

Via Console
The Console dashboard shows a real-time usage bar for the authenticated tenant.
Response example

    {
      "tenant_id": "acme-corp",
      "period": "2026-01",
      "cus_used": 12450.5,
      "monthly_quota": 500000,
      "utilization": 0.0249,
      "breakdown": {
        "put": 5000.0,
        "get": 250.0,
        "query_topk": 3200.0,
        "serve": 2000.0,
        "search": 1500.0,
        "delete": 500.5
      }
    }

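
The fields in the response example relate to each other in a simple way: `utilization` is `cus_used` divided by `monthly_quota`, and the `breakdown` values sum to `cus_used`. The sketch below checks this on the sample payload; the `summarize_usage` helper is hypothetical and assumes the response shape shown above.

```python
import json

# The sample payload from the response example.
report = json.loads("""
{
  "tenant_id": "acme-corp",
  "period": "2026-01",
  "cus_used": 12450.5,
  "monthly_quota": 500000,
  "utilization": 0.0249,
  "breakdown": {"put": 5000.0, "get": 250.0, "query_topk": 3200.0,
                "serve": 2000.0, "search": 1500.0, "delete": 500.5}
}
""")


def summarize_usage(report):
    """Derive remaining CUs, utilization, and the costliest operation from a usage report."""
    breakdown = report["breakdown"]
    return {
        "remaining": report["monthly_quota"] - report["cus_used"],
        "utilization": round(report["cus_used"] / report["monthly_quota"], 4),
        "breakdown_total": sum(breakdown.values()),
        "top_operation": max(breakdown, key=breakdown.get),
    }


summary = summarize_usage(report)
print(summary)
```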
Quota enforcement
When a tenant exceeds their monthly CU quota:
- Requests return 503 Service Unavailable.
- The tenant can upgrade their tier immediately via the Console or API.
- Enterprise tenants can configure soft limits with overage billing.
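
A client can keep its own running total from the X-HX-CUs header to anticipate the quota cut-off rather than discovering it through failed requests. The sketch below is one way to do that; the `QuotaTracker` class is hypothetical, it expects headers as a plain dict with the exact key `X-HX-CUs`, and it treats a 503 as quota-related only when its own count says the quota is exhausted, since a 503 can have other causes.

```python
class QuotaTracker:
    """Client-side running total of CUs, fed from X-HX-CUs response headers."""

    def __init__(self, monthly_quota, already_used=0.0):
        self.monthly_quota = monthly_quota
        self.used = already_used

    def remaining(self):
        """CUs left this period according to the local count (never negative)."""
        return max(self.monthly_quota - self.used, 0.0)

    def record(self, status_code, headers):
        """Update the total from one response.

        Returns False when the response looks like a quota rejection
        (503 and no CUs left by our count), True otherwise.
        """
        if status_code == 503 and self.remaining() <= 0:
            return False
        self.used += float(headers.get("X-HX-CUs", 0.0))
        return True


# Hypothetical sequence: a Starter tenant about to run out of CUs.
tracker = QuotaTracker(monthly_quota=1_000, already_used=999.5)
ok_first = tracker.record(200, {"X-HX-CUs": "0.5"})
ok_second = tracker.record(503, {})
```

The local count is only an approximation; the usage endpoint remains the source of truth, for example when several clients share one tenant.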
Stripe integration
Managed deployments use Stripe for subscription management:
- Checkout: POST /gate/checkout creates a Stripe Checkout session.
- Provisioning: On successful payment, a webhook (POST /webhooks/stripe) automatically provisions the tenant with the corresponding tier.
- Upgrades/downgrades: Use POST /gate/onboard/update-tier, or let the customer manage their subscription via the Stripe Customer Portal.
Required environment variables:
| Variable | Description |
|---|---|
| HX_GATE_STRIPE_SECRET_KEY | Stripe secret key (sk_live_... or sk_test_...) |
| HX_GATE_STRIPE_PUBLISHABLE_KEY | Stripe publishable key (pk_live_...) |
| HX_GATE_STRIPE_WEBHOOK_SECRET | Webhook signing secret (whsec_...) |
| HX_GATE_STRIPE_PRICE_BUILDER | Stripe Price ID for Builder tier |
| HX_GATE_STRIPE_PRICE_PRO | Stripe Price ID for Pro tier |
| HX_GATE_BILLING_ENABLED | Set to true to enforce billing |
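
Before starting a managed deployment, it can help to confirm that every variable in the table is set. The sketch below is a hypothetical pre-flight check, not part of HX-SDP: it reports unset variables and, for the secret key and webhook secret, checks the prefixes given in the table.

```python
import os

# Variable names copied from the table above.
REQUIRED_VARS = [
    "HX_GATE_STRIPE_SECRET_KEY",
    "HX_GATE_STRIPE_PUBLISHABLE_KEY",
    "HX_GATE_STRIPE_WEBHOOK_SECRET",
    "HX_GATE_STRIPE_PRICE_BUILDER",
    "HX_GATE_STRIPE_PRICE_PRO",
    "HX_GATE_BILLING_ENABLED",
]

# Prefixes as listed in the table; only these two variables are checked.
EXPECTED_PREFIXES = {
    "HX_GATE_STRIPE_SECRET_KEY": ("sk_live_", "sk_test_"),
    "HX_GATE_STRIPE_WEBHOOK_SECRET": ("whsec_",),
}


def check_stripe_config(env=None):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    env = os.environ if env is None else env
    problems = []
    for name in REQUIRED_VARS:
        value = env.get(name, "")
        if not value:
            problems.append(f"{name} is not set")
        elif name in EXPECTED_PREFIXES and not value.startswith(EXPECTED_PREFIXES[name]):
            problems.append(f"{name} has an unexpected prefix")
    return problems


for problem in check_stripe_config():
    print(problem)
```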
Billing period
- CU counters reset on the 1st of each calendar month (UTC).
- Overage charges are calculated at the end of the billing period.
- Unused CUs do not roll over.
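
The reset rule above can be sketched as a small date calculation. The helpers below are hypothetical; they assume the reset happens at 00:00 UTC on the 1st (the page gives the date but not the time of day) and that period labels follow the `YYYY-MM` form seen in the usage response.

```python
from datetime import datetime, timezone


def current_period(now):
    """Return the billing period label (YYYY-MM, UTC) for a timezone-aware datetime."""
    return now.astimezone(timezone.utc).strftime("%Y-%m")


def next_reset(now):
    """Return the next counter reset, assumed to be 00:00 UTC on the 1st of next month."""
    now = now.astimezone(timezone.utc)
    if now.month == 12:
        year, month = now.year + 1, 1
    else:
        year, month = now.year, now.month + 1
    return datetime(year, month, 1, tzinfo=timezone.utc)


# Pass timezone-aware datetimes; naive ones would be interpreted as local time.
moment = datetime(2026, 1, 15, 12, 0, tzinfo=timezone.utc)
print(current_period(moment), next_reset(moment).isoformat())
```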