Concepts

Core concepts behind HX-SDP's architecture and compression pipeline.


QTT compression

HX-SDP stores all data in Quantized Tensor Train (QTT) format. Instead of storing raw arrays, values are decomposed into a chain of small 3D tensors called TT-cores:

$$ A(i_1, i_2, \ldots, i_d) = G_1(i_1) \cdot G_2(i_2) \cdots G_d(i_d) $$

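Each core $G_k$ is a 3D tensor of size $r_{k-1} \times 2 \times r_k$, where $r_k$ is the bond dimension (rank) at position $k$ and $r_0 = r_d = 1$. For a fixed binary index $i_k \in \{0, 1\}$, the slice $G_k(i_k)$ is an $r_{k-1} \times r_k$ matrix, so the product above is a $1 \times 1$ matrix, i.e. a single value.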
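
The formula can be made concrete with a small, standard example that is independent of HX-SDP: the ramp $A(i) = i$ has an exact QTT representation with bond dimension 2. The sketch below builds those cores as nested lists and evaluates single elements by multiplying the selected core slices. The function names (`linear_ramp_cores`, `tt_element`, `to_bits`) and the most-significant-bit-first ordering are assumptions made for this illustration, not part of the HX-SDP API.

```python
def linear_ramp_cores(d):
    """Exact QTT cores for A(i) = i with i in [0, 2**d), MSB first.

    Layout (illustrative): cores[k][a][bit][b], shape r_{k-1} x 2 x r_k,
    with ranks (1, 2, 2, ..., 2, 1).
    """
    if d < 2:
        raise ValueError("this construction needs at least two cores")
    cores = []
    for k in range(d):
        w = float(2 ** (d - 1 - k))  # place value of bit k
        if k == 0:
            # 1 x 2 x 2: start the row vector (1, partial_sum)
            core = [[[1.0, 0.0], [1.0, w]]]
        elif k == d - 1:
            # 2 x 2 x 1: collapse (1, partial_sum) into the final value
            core = [[[0.0], [w]], [[1.0], [1.0]]]
        else:
            # 2 x 2 x 2: keep the 1, add this bit's weight to the sum
            core = [[[1.0, 0.0], [1.0, w]], [[0.0, 1.0], [0.0, 1.0]]]
        cores.append(core)
    return cores


def to_bits(i, d):
    """Binary digits of i, most significant first, padded to d digits."""
    return [(i >> (d - 1 - k)) & 1 for k in range(d)]


def tt_element(cores, bits):
    """Evaluate G_1(i_1) * G_2(i_2) * ... * G_d(i_d) for one index."""
    vec = [1.0]  # 1 x r_0 row vector, r_0 = 1
    for core, bit in zip(cores, bits):
        r_next = len(core[0][bit])
        vec = [
            sum(vec[a] * core[a][bit][b] for a in range(len(vec)))
            for b in range(r_next)
        ]
    return vec[0]  # r_d = 1, so a single number remains


d = 10
cores = linear_ramp_cores(d)
stored = sum(len(c) * 2 * len(c[0][0]) for c in cores)
print(tt_element(cores, to_bits(777, d)), stored)
```

For $d = 10$ the ten cores hold 72 numbers in total, compared with 1024 values in the dense array.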

Why this matters:

  • A 1,000,000-element array stored dense = 8 MB (float64)
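  • The same array in QTT with max rank 16 can shrink to ~5 KB, a 1,600× compression ratio, when most bond dimensions stay well below the cap. If every interior bond saturates at rank 16 (with $d = 20$ cores), the storage formula under TT-cores gives about 55 KB, roughly 150×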
  • All operations (similarity, search, reconstruction) work directly on the compressed cores

TT-cores

A TT-core is a small 3D tensor. Together, the cores for one entry reproduce the original data to within the compression tolerance.

Core 1      Core 2      Core 3           Core d
[1×2×r₁] × [r₁×2×r₂] × [r₂×2×r₃] × ... × [r_{d-1}×2×1]

Key properties:

  • Bond dimension (rank): Controls compression vs. fidelity trade-off. Higher rank = higher fidelity, more storage.
  • Number of cores ($d$): For an array of length $2^d$, there are $d$ cores. A 1024-element array → 10 cores.
  • Storage: Total bytes = $\sum_{k=1}^d r_{k-1} \times 2 \times r_k \times 8$ (float64). Scales as $O(d \cdot r^2)$, not $O(2^d)$.
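
The storage formula can be checked with a few lines of arithmetic. The sketch below assumes the usual worst-case bound on QTT ranks, $r_k \le \min(2^k, 2^{d-k}, r_{\max})$, and sums the core sizes for float64 values. The helper names are illustrative, and the result is an upper bound for a given rank cap; the ranks an actual decomposition settles on may be considerably smaller.

```python
def qtt_rank_bounds(d, max_rank):
    """Upper bounds on bond dimensions r_0..r_d for an array of 2**d elements."""
    return [min(2 ** k, 2 ** (d - k), max_rank) for k in range(d + 1)]


def qtt_storage_bytes(ranks, itemsize=8):
    """Total bytes = sum over cores of r_{k-1} * 2 * r_k * itemsize."""
    return sum(ranks[k - 1] * 2 * ranks[k] * itemsize for k in range(1, len(ranks)))


d = 20                                  # 2**20 = 1,048,576 elements
ranks = qtt_rank_bounds(d, max_rank=16)
compressed = qtt_storage_bytes(ranks)   # every bond at its upper bound
dense = (2 ** d) * 8                    # float64 dense storage

print(ranks)
print(compressed, dense, round(dense / compressed, 1))
```

With every bond at its bound this gives 54,592 bytes against 8,388,608 bytes dense. Doubling the array length adds one core, which is the $O(d \cdot r^2)$ scaling in practice.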

Oracle

Every PUT operation passes through the Oracle — a fidelity evaluator that compares the compressed representation against the original:

| Verdict | Meaning |
|---|---|
| exact | Relative error < $10^{-12}$ (effectively lossless) |
| safe | Relative error < $10^{-6}$ (suitable for most applications) |
| weak | Relative error < $10^{-3}$ (lossy but structure-preserving) |
| passthrough | Compression ratio below threshold (stored uncompressed) |

The Oracle verdict is returned with every PUT response and stored in metadata.
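
As a rough illustration of how the verdict thresholds could be applied, the sketch below classifies a relative error and a compression ratio. Several details are assumptions, since the text does not specify them: the error is measured as a relative L2 norm, the passthrough threshold is a hypothetical `min_ratio` parameter, and errors of $10^{-3}$ or more are treated as passthrough. It is not the Oracle's actual implementation.

```python
import math


def relative_error(original, reconstructed):
    """Relative L2 error ||a - b|| / ||a|| (the choice of norm is an assumption)."""
    if len(original) != len(reconstructed):
        raise ValueError("length mismatch")
    num = math.sqrt(sum((a - b) ** 2 for a, b in zip(original, reconstructed)))
    den = math.sqrt(sum(a * a for a in original))
    if den == 0.0:
        return 0.0 if num == 0.0 else math.inf
    return num / den


def oracle_verdict(rel_err, compression_ratio, min_ratio=1.0):
    """Map an error and a ratio to a verdict name.

    min_ratio is a hypothetical parameter; the real threshold is not documented here.
    """
    if compression_ratio < min_ratio:
        return "passthrough"   # compression does not pay off
    if rel_err < 1e-12:
        return "exact"
    if rel_err < 1e-6:
        return "safe"
    if rel_err < 1e-3:
        return "weak"
    return "passthrough"       # assumption: too lossy, keep the raw data


print(oracle_verdict(relative_error([3.0, 4.0], [3.0, 4.0]), 120.0))
print(oracle_verdict(2.5e-8, 40.0))
```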

Namespaces

Namespaces provide logical isolation within a single HX-SDP instance:

  • Each tenant can access one or more namespaces
  • Keys are unique within a namespace
  • Cross-namespace queries are not supported (by design — isolation guarantee)
  • Default namespace: default

Versions

Every PUT to the same key creates a new version. Versions are immutable — the engine never overwrites data:

sensor_001 v1 → Original reading
sensor_001 v2 → Updated reading (v1 still accessible)
sensor_001 v3 → Latest reading

GET and SERVE return the latest version by default. Pass version=N to retrieve a specific version.
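
The namespace and version rules can be modeled with a small in-memory store. This is a sketch of the semantics only (append-only versions, latest by default, keys scoped to a namespace); the class and method names are hypothetical and do not describe the HX-SDP client or engine.

```python
class VersionedStore:
    """Toy model: each PUT appends a version, nothing is overwritten."""

    def __init__(self):
        self._data = {}  # (namespace, key) -> list of payloads, index 0 is v1

    def put(self, key, payload, namespace="default"):
        versions = self._data.setdefault((namespace, key), [])
        versions.append(payload)
        return len(versions)  # version number just created

    def get(self, key, namespace="default", version=None):
        versions = self._data[(namespace, key)]  # KeyError if the key is unknown
        if version is None:
            return versions[-1]  # latest version by default
        if not 1 <= version <= len(versions):
            raise KeyError(f"{key} has no version {version}")
        return versions[version - 1]


store = VersionedStore()
store.put("sensor_001", "original reading")
store.put("sensor_001", "updated reading")
print(store.get("sensor_001"))             # latest
print(store.get("sensor_001", version=1))  # v1 is still accessible
```

Because the lookup key is the pair (namespace, key), the same key name in two namespaces refers to two unrelated entries.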

Tenants

In multi-tenant deployments (via HX-Gate), each tenant has:

  • A unique API key (SHA-256 hashed at rest)
  • A namespace ACL controlling which namespaces they can access
  • A billing tier with CU quota and rate limits
  • An audit trail of all operations
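
One way to picture the first two items is a lookup that hashes the presented key with SHA-256 and then checks the namespace ACL. The sketch below uses only the Python standard library; the `Tenant` record and the `authorize` function are hypothetical names for illustration and say nothing about how HX-Gate is implemented internally.

```python
import hashlib
import hmac
from dataclasses import dataclass, field


def hash_api_key(api_key: str) -> str:
    """SHA-256 hex digest; only this value is kept at rest."""
    return hashlib.sha256(api_key.encode("utf-8")).hexdigest()


@dataclass
class Tenant:
    name: str
    key_hash: str
    namespaces: frozenset = field(default_factory=frozenset)  # namespace ACL


def authorize(tenants, api_key, namespace):
    """Return the matching tenant if the key is valid and the ACL allows
    the namespace, otherwise None."""
    presented = hash_api_key(api_key)
    for tenant in tenants:
        # Constant-time comparison avoids leaking how many characters matched.
        if hmac.compare_digest(tenant.key_hash, presented):
            return tenant if namespace in tenant.namespaces else None
    return None


tenants = [Tenant("acme", hash_api_key("example-key"), frozenset({"default"}))]
print(authorize(tenants, "example-key", "default"))
print(authorize(tenants, "example-key", "research"))
```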

Compute Units (CUs)

Every API operation costs a defined number of Compute Units. CUs are metered per tenant per billing period. See Billing & Usage for the complete cost table.
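
Per-tenant metering can be sketched as a counter that is charged on every operation and reset at the start of each billing period. The cost values below are placeholders invented for the example (the real figures are in the Billing & Usage cost table), and `CUMeter` is a hypothetical name, not an HX-Gate component.

```python
class CUMeter:
    """Toy per-tenant CU counter with a single quota per billing period."""

    def __init__(self, costs, quota):
        self.costs = dict(costs)  # operation name -> CUs (supplied by caller)
        self.quota = quota
        self.used = {}            # tenant -> CUs spent this period

    def charge(self, tenant, operation):
        """Record one operation and return the tenant's remaining CUs."""
        cost = self.costs[operation]
        spent = self.used.get(tenant, 0)
        if spent + cost > self.quota:
            raise RuntimeError("CU quota exceeded")
        self.used[tenant] = spent + cost
        return self.quota - self.used[tenant]

    def reset_period(self):
        """Start a new billing period."""
        self.used.clear()


# Placeholder costs for illustration only, not the documented cost table.
meter = CUMeter({"PUT": 5, "GET": 1}, quota=10)
print(meter.charge("tenant-a", "PUT"))
print(meter.charge("tenant-a", "GET"))
```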

Architecture overview

Client → HX-Gate (proxy) → HX-Engine (GPU worker) → QTT Store (disk)
            │                      │
            ├─ Auth + ACL          ├─ TT-SVD decomposition
            ├─ Rate limiting       ├─ Oracle fidelity check
            ├─ CU metering         ├─ Core-native similarity
            └─ Audit logging       └─ GPU VRAM serving

  • HX-Gate (port 8080): Stateless reverse proxy. Handles authentication, rate limiting, CU metering, and routing. Scales horizontally.
  • HX-Engine (port 8000): GPU-accelerated compute worker. Runs TT-SVD, core operations, and PDE solvers. Scales vertically (GPU count).
  • Redis (port 6379): Optional shared state for multi-instance Gate deployments.
  • QTT Store: On-disk storage of TT-cores. One directory per namespace, one file per key-version.
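
The QTT Store layout (one directory per namespace, one file per key-version) could map to paths along the following lines. The root directory, the `.v{N}.qtt` naming scheme and the function name are assumptions made for illustration; the actual file naming is not specified here.

```python
from pathlib import PurePosixPath


def core_file_path(root, namespace, key, version):
    """Hypothetical layout: <root>/<namespace>/<key>.v<version>.qtt"""
    if version < 1:
        raise ValueError("versions start at 1")
    for part in (namespace, key):
        # Reject anything that could escape the namespace directory.
        if not part or "/" in part or part in (".", ".."):
            raise ValueError(f"invalid path component: {part!r}")
    return PurePosixPath(root) / namespace / f"{key}.v{version}.qtt"


print(core_file_path("/data/qtt", "default", "sensor_001", 2))
```

Keeping the namespace as a directory boundary mirrors the isolation rule: a lookup never needs to look outside one namespace's directory.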