@OpenLedger Infini-gram takes on massive datasets with suffix-array-based indexing, offering speed, scalability, and precision at token level.
Here’s how it works 👇
🔹 How It’s Built:
All tokens are stored in one continuous sequence.
A suffix array sorts the starting positions of every suffix, so any n-gram can be located by binary search in just O(log N) steps.
Each token needs only ~7 bytes, making even 5 trillion tokens manageable (~35 TB total).
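The build can be pictured with a toy Python sketch: sort suffix start positions once, then count any n-gram with two binary searches. The corpus here is made up, and the real index works over sharded arrays on disk rather than in-memory lists.

```python
# Toy sketch of suffix-array indexing over a token-ID sequence.
# (Illustrative only: a real trillion-token index lives on disk.)

tokens = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5]  # toy token-ID corpus

# Suffix array: start positions, sorted by the suffix beginning there.
sa = sorted(range(len(tokens)), key=lambda i: tokens[i:])

def _bound(ngram, upper):
    """First suffix-array index whose suffix is >= ngram (upper=False)
    or > ngram (upper=True), comparing only the first len(ngram) tokens."""
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        prefix = tokens[sa[mid]:sa[mid] + len(ngram)]
        if prefix < ngram or (upper and prefix == ngram):
            lo = mid + 1
        else:
            hi = mid
    return lo

def count(ngram):
    """Occurrences of ngram in the corpus, via two binary searches."""
    return _bound(ngram, True) - _bound(ngram, False)
```

Because sorting keeps suffixes with the same prefix contiguous, the two bounds bracket exactly the matching occurrences, which is where the O(log N) lookup comes from.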
🔹 Query Execution:
1️⃣ Binary search to locate the context
2️⃣ Count prefix + full n-gram frequency
3️⃣ Estimate probability
4️⃣ Back off to a shorter context if no exact match is found
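The query steps above can be sketched in a few lines of Python: find the longest suffix of the context that occurs in the corpus, then estimate P(token | context) from raw counts. A naive scan stands in for the suffix-array binary search here, and the corpus is a made-up toy.

```python
# Hedged sketch of n-gram probability with backoff over a toy corpus.
# A real engine would replace count() with suffix-array lookups.

def count(corpus, ngram):
    """Naive occurrence count; an empty n-gram matches everywhere."""
    n = len(ngram)
    if n == 0:
        return len(corpus)
    return sum(corpus[i:i + n] == ngram for i in range(len(corpus) - n + 1))

def backoff_prob(corpus, context, token):
    """P(token | context), backing off to ever-shorter context suffixes
    until one with nonzero count is found."""
    for start in range(len(context) + 1):      # try the longest suffix first
        suffix = context[start:]
        denom = count(corpus, suffix)
        if denom > 0:
            return count(corpus, suffix + [token]) / denom
    return 0.0

corpus = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5]
p = backoff_prob(corpus, [4, 1], 5)   # context [4, 1] occurs once, followed by 5
```

With this toy data, the full context [4, 1] matches directly, so no backoff is needed; an unseen context like [99, 1] would fall back to counting occurrences of [1] alone.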
Performance Benchmarks:
5-gram count → ~20 ms
∞-gram probability → ~135 ms
Full distribution → ~180 ms
This architecture transforms AI attribution from theory to real-time, verifiable insight — even at trillion-token scale.