Compression Strategies
Gilial compresses vectors using TurboQuant — a random-rotation scalar quantisation algorithm. Instead of deleting vectors, it reduces each vector’s storage footprint by lowering the number of bits used per dimension. All vectors are retained; only their precision changes.
Named Strategies
Section titled “Named Strategies”Balanced (default)
Section titled “Balanced (default)”4 bits per dimension, ~8× compression. The safe production default.
result = client.compress(strategy="balanced", dry_run=False)| Property | Value |
|---|---|
| Bits/dim | 4 |
| Compression ratio | ~8× |
| Typical recall@10 | ≥ 98% |
| Use case | Production, safety-first |
Aggressive
Section titled “Aggressive”2 bits per dimension, ~16× compression. Maximum savings.
result = client.compress(strategy="aggressive", dry_run=False)| Property | Value |
|---|---|
| Bits/dim | 2 |
| Compression ratio | ~16× |
| Typical recall@10 | ≥ 93% |
| Use case | Cost optimisation, non-critical indexes |
Always validate recall when using Aggressive on production:
result = client.compress( strategy="aggressive", dry_run=False, validate_recall=True, recall_n_queries=200,)print(result.recall)High Quality
Section titled “High Quality”6 bits per dimension, ~5× compression. Near-lossless.
result = client.compress(strategy="high_quality", dry_run=False)| Property | Value |
|---|---|
| Bits/dim | 6 |
| Compression ratio | ~5× |
| Typical recall@10 | ≥ 99.5% |
| Use case | Sensitive indexes, fine-tuned models |
Custom Bit Depth
Section titled “Custom Bit Depth”Override the strategy default with bits_per_dim (2–8). Useful when you want something between two named strategies.
# Between balanced (4) and high_quality (6)result = client.compress(bits_per_dim=5, dry_run=False)
# Or combine with a named strategy as a starting pointresult = client.compress(strategy="aggressive", bits_per_dim=3, dry_run=False)How TurboQuant Works
Section titled “How TurboQuant Works”- Random rotation — vectors are multiplied by a seeded random orthogonal matrix, spreading energy evenly across dimensions
- Scalar quantisation — each rotated dimension is quantised to N bits
- Optional QJL correction — a 1-bit residual stage (enabled by default on Balanced and High Quality) improves reconstruction accuracy
- Decode to float32 — vectors are decoded back before upserting so the index stays compatible with existing query infrastructure
The precision loss from quantisation is the compression mechanism. No vectors are deleted.
Choosing a Strategy
Section titled “Choosing a Strategy”Need maximum savings and can tolerate minor quality loss? → aggressive (2-bit)
Running production queries where quality matters most? → high_quality (6-bit)
Everything else? → balanced (4-bit) ← start here
Want fine-grained control? → bits_per_dim=NAlways use validate_recall=True before committing to aggressive quantisation on a production index.