Up to 80% off your storage estate, the part you over-provision, byte for byte verified, on small hardware serving the planet. Your storage tier, not your compute. The cheapest data centre is the one you never build.
Up to 80% smaller storage, at 2.2 GB/s, every byte SHA-256 verified. You keep a fifth of your text, log and database footprint instead of all of it, read it back faster than most drives can even hand you the data, and have it checked byte for byte on the way in. We rebuilt the chunker in Rust to beat our own speed, and it tied, so the number stands.
An investment-grade model of what that takes off a data centre: storage, energy, water, cooling, floor space and capital, down to the net present value. Every assumption is yours to change, and every default is sourced. We measure in servers replaced, the way James Watt sold engines in horses. Five units of text storage become one, measured, and that one is a third smaller than what gzip leaves. And the proof we live by: the whole site runs on our own ElaraZip + ElaraServer, so we run our business on the engine we sell. Continuity is simple at this size: the whole deployment rebuilds from a single archive in minutes, on any box.
The engine behind the headline. It reads the shape of your structured data and squeezes it tighter than every standard codec, brotli and zstd included, measured and byte-for-byte verified. It checks integrity at 30 GB/s, faster than xxHash and 61 times faster than SHA-256. That is a fast integrity check that catches corruption, while SHA-256 stays the cryptographic fingerprint of record in every archive. The two work together; one does not replace the other. On the storage that fills a modern data centre it holds the raw equivalent of up to six to one on a single data type. A single log shrinks 38 times. Records and text, about four. A real mixed estate that still carries backups lands around three to four. Already-dense video and audio are the exception, and we say so plainly. The storage you do not keep is power you do not draw, water you do not evaporate, floor you do not cool, and buildings you do not put up. The cheapest data centre is the one you never build. Don't take our word: drop your own file in the box below and watch it shrink, byte-for-byte verified, in two seconds.
Not a screenshot, and not a third party's word. These are the real numbers off the server rendering this page right now, refreshed every few seconds: the cores it runs on, the load, the memory, the disk it reads, and the engine compressing a live sample and proving the bytes came back identical. Small hardware by design, serving the planet. The squeeze is the engine; the speed scales with your hardware.
This is the proof. Not our word, and not a third party's, because the only number that counts is the one off your own data. No sign-up. It compresses on the very server you are reading from, checks the bytes came back identical, and shows you the number. Drop a log, a database dump, a model checkpoint, a corpus shard, whatever you actually run. Your file is never stored. Up to 100 MB. Drop something incompressible, a zip or a video, and it tells you plainly it cannot shrink it while still proving the bytes are unchanged. Honest even when it loses.
The model everyone talks about is the small part. The data it learns from is the big part. A frontier model holds about 0.8 TB of weights and is trained on roughly 60 TB of text. That is about 75 times more data than model [2]. Add the raw corpus it was filtered from, the cleaned copies, the checkpoints and the run logs, and the training estate dwarfs the model many times over. Shrink that estate by three quarters and you change the size of the building. The weights barely move, and they are the part you do not need to shrink. So when storage gets called a rounding error next to compute, that is the model talking. At training scale the data estate is the bigger line by 75 to 1, and it is the line we cut.
There is a second prize. Smaller data moves faster: fewer bytes through every link, every cache and every disk read means more work out of the compute you already paid for, so the same data-loaders feed the same GPUs from a quarter of the bytes. You do not have to take the throughput on faith either: the serving meter above shows the real encode and decode rate off this very server, live. The squeeze is the engine and it is the same on any hardware; the speed scales with yours.
The savings below are blended from the measured per-type ratios in the model above, on a typical data mix for each industry. Your own mix re-prices it live.
| Industry | What they store most | Typical saving | Why |
|---|---|---|---|
| Banks | Transaction logs, records, databases, compliance archives | 70–75% | Structured and repetitive, our home ground |
| Insurance | Policies, claims, actuarial records | 60–70% | Text and records compress hard; scans less |
| Courier & logistics | Tracking, telemetry, route and parcel data | 75–80% | The structured data we squeeze best |
| Airlines | Operations logs, sensor feeds, schedules, databases | 70–80% | Logs and telemetry are our home ground |
| AI labs | Training data, checkpoints, run logs (weights barely move) | 70–80% | Training data dwarfs the model, and it compresses |
| Media & streaming | Video and audio (already dense) plus metadata, logs, transcripts | 15–30% | The media barely moves; the data around it does |
Data keeps growing across all of them, roughly a quarter more every year [4], so the saving compounds. We state media plainly: already-compressed video and audio do not shrink much, and we do not pretend otherwise.
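The blending behind a table like the one above can be sketched in a few lines. This is a minimal illustration, not the page's own model: the log and records ratios are the ones quoted on this page (38× and about 4×), the media ratio of 1.0 is an assumption standing in for "barely moves", and the example mix is entirely hypothetical.

```python
# Sketch: blend per-type compression ratios into one estate-wide saving.
# Ratios for logs and records/text are quoted on this page; the media ratio
# and the example mix are assumptions made for illustration only.

def blended_saving(mix, ratios):
    """Return the fraction of storage saved for a given data mix.

    mix    -- share of the estate per data type (shares must sum to 1.0)
    ratios -- compression ratio per data type (original size / stored size)
    """
    if abs(sum(mix.values()) - 1.0) > 1e-9:
        raise ValueError("mix shares must sum to 1.0")
    # After compression each type occupies share / ratio of the original estate.
    kept = sum(share / ratios[kind] for kind, share in mix.items())
    return 1.0 - kept


RATIOS = {"logs": 38.0, "records_text": 4.0, "media": 1.0}
EXAMPLE_MIX = {"logs": 0.30, "records_text": 0.50, "media": 0.20}  # hypothetical

if __name__ == "__main__":
    print(f"blended saving: {blended_saving(EXAMPLE_MIX, RATIOS):.1%}")
```

The saving is driven by the types that compress least: a large share of media pulls the blend down far more than a large share of logs pulls it up, which is why the media row sits so far below the others.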
This is a tech page, so here is the race. Every figure measured on one machine, round-trip verified, reproducible at route.elara-cortex.com/benchmarks. We do not cherry-pick: where a tool wins, we say so.
On the structured data a data centre is full of, we squeeze 36% tighter than zstd, the codec most data centres run today (1.86× against 1.37× in the codec table below), and we hash 61× faster than the SHA-256 standard.
The rapidity hash is a fast integrity check (it catches corruption); SHA-256 stays the cryptographic fingerprint of record in every archive header. They work together; one does not replace the other.
| Hash | GB/s | head to head |
|---|---|---|
| ELARA CORTEX rapidity | 30.0 | the fastest there is |
| xxHash (the field speed champion) | 21.0 | we are ~1.4× faster |
| BLAKE2b | 0.78 | we are ~38× faster |
| SHA-256 (the world standard) | 0.49 | we are 61× faster |

| Codec | ratio | head to head |
|---|---|---|
| ElaraServer4 turbo (shape-first, then best of four) | 1.86× | tightest, and it picks the winner for you |
| brotli-11 (the strongest standard codec) | 1.66× | we are 12% tighter |
| zstd-22 long (the data-centre default) | 1.37× | we are 36% tighter |
| gzip-9 (the old default) | 1.27× | we are 46% tighter |
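The head-to-head columns in the two tables above can be re-derived from the raw figures. The sketch below assumes that "N× faster" means the ratio of throughputs and that "N% tighter" means the ratio of compression ratios minus one; the page does not state its definitions, so treat both as assumptions. The function names are illustrative.

```python
# Sketch: recompute the head-to-head columns from the figures in the tables.
# Definitions of "faster" and "tighter" are assumptions, see the text above.

HASH_GBPS = {"rapidity": 30.0, "xxHash": 21.0, "BLAKE2b": 0.78, "SHA-256": 0.49}
CODEC_RATIO = {
    "ElaraServer4 turbo": 1.86,
    "brotli-11": 1.66,
    "zstd-22 long": 1.37,
    "gzip-9": 1.27,
}


def speedup(ours, theirs):
    """How many times faster `ours` is, as a ratio of throughputs."""
    return ours / theirs


def percent_tighter(ours, theirs):
    """How much tighter `ours` is, as a percentage gain in compression ratio."""
    return (ours / theirs - 1.0) * 100.0


if __name__ == "__main__":
    for name, gbps in HASH_GBPS.items():
        if name != "rapidity":
            print(f"{name}: {speedup(HASH_GBPS['rapidity'], gbps):.1f}x faster")
    for name, ratio in CODEC_RATIO.items():
        if name != "ElaraServer4 turbo":
            gain = percent_tighter(CODEC_RATIO["ElaraServer4 turbo"], ratio)
            print(f"{name}: {gain:.0f}% tighter")
```

Under those definitions the hash figures come out at roughly 1.4×, 38× and 61×, and the codec figures at roughly 12%, 36% and 46%.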
The honest line a CTO respects: on the raw shrink of a single everyday file, the best standard codecs are excellent, and we run them underneath and pick the winner for you, so you never lose to the field. The place we pull clear ahead is the structured data a data centre is full of, where the shape-first step beats every single codec, and integrity, where nothing comes close. On already-dense video and audio nothing shrinks, and we say so.
It is a library and a small native binary, not another service to operate. It drops in as one layer in the image you already ship, runs inside your own pods, and the data never leaves them. No new cluster to run, no data egress, no outside toolkit to install.
Observability is built in: scrape /v1/meter_live (the live meter above) into Prometheus and Grafana for cores, I/O and live compression per pod. The engine ships as a multi-arch image layer, so the same artefact runs on your nodes and on ours, byte for byte.
Start with what the standard tools do, measured on our own files. gzip, the old default, gets data about 60 to 70% smaller. zstd, what most modern data centres run, about 74 to 83% and fast. brotli and lzma, the strongest but slower, about 76 to 84%. ElaraServer4 turbo lands at the top of that range, and then goes further on structured data. How: before it compresses, it reads the shape of your data and lines the patterns up, a step the standard tools skip. Then it runs zstd, brotli, lzma and PPMd underneath and keeps the smallest, so you always get the best result without choosing or tuning a codec. On the structured data a data centre is full of (logs, metrics and columns of numbers), that shape-first step pulls clear ahead of every single tool: 36% tighter than zstd, 12% tighter than brotli, measured and byte-for-byte verified. On already-dense video and audio nothing shrinks, and we say so plainly.
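The "run several codecs and keep the smallest" step can be illustrated with nothing but the Python standard library. This is a sketch of the general technique only: it uses zlib, bz2 and lzma as stand-ins for the zstd, brotli, lzma and PPMd named above, it does not attempt the shape-first step, and the names `compress_best` and `decompress` are hypothetical, not the product's API.

```python
# Sketch: best-of-N codec selection with a round-trip check.
# zlib/bz2/lzma are stand-in codecs from the standard library; the real
# engine's codecs and its shape-first pre-pass are not reproduced here.
import bz2
import lzma
import zlib

# name -> (encode, decode)
CODECS = {
    "zlib": (lambda b: zlib.compress(b, 9), zlib.decompress),
    "bz2": (lambda b: bz2.compress(b, 9), bz2.decompress),
    "lzma": (lzma.compress, lzma.decompress),
}


def compress_best(data: bytes):
    """Try every codec, verify each round trip, and keep the smallest output.

    Returns (codec_name, blob). If no codec shrinks the input, the raw bytes
    are kept under the name "stored", so the result is never larger than the input.
    """
    best_name, best_blob = "stored", data
    for name, (encode, decode) in CODECS.items():
        blob = encode(data)
        if decode(blob) != data:
            raise RuntimeError(f"{name} failed its round-trip check")
        if len(blob) < len(best_blob):
            best_name, best_blob = name, blob
    return best_name, best_blob


def decompress(name: str, blob: bytes) -> bytes:
    """Reverse compress_best using the recorded codec name."""
    return blob if name == "stored" else CODECS[name][1](blob)


if __name__ == "__main__":
    sample = b"2024-01-01T00:00:00Z INFO request served status=200\n" * 5000
    name, blob = compress_best(sample)
    print(name, f"{len(sample) / len(blob):.1f}x")
```

Racing every codec costs CPU time in proportion to the number of codecs tried; the design trades encode time for the guarantee that the output is never worse than the best single codec in the set.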
Yes. Every byte comes back identical, proven by a SHA-256 check on every file. Nothing is lost, ever.
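A round-trip SHA-256 check of the kind described above can be sketched as follows. It assumes zlib as a stand-in codec and a simple (digest, blob) pair in place of the real archive header, whose layout is not described on this page; `pack` and `unpack_verified` are hypothetical names.

```python
# Sketch: record a SHA-256 of the original bytes, then verify it on restore.
# zlib stands in for the real codec; the (digest, blob) pair stands in for
# the real archive header, which this page does not specify.
import hashlib
import zlib


def pack(data: bytes):
    """Return (sha256_hex_of_original, compressed_blob)."""
    digest = hashlib.sha256(data).hexdigest()
    return digest, zlib.compress(data)


def unpack_verified(digest: str, blob: bytes) -> bytes:
    """Decompress and refuse to return bytes that do not match the digest."""
    data = zlib.decompress(blob)
    if hashlib.sha256(data).hexdigest() != digest:
        raise ValueError("SHA-256 mismatch: restored bytes differ from the original")
    return data


if __name__ == "__main__":
    original = b"account,amount\n1001,25.00\n1002,13.50\n" * 1000
    digest, blob = pack(original)
    assert unpack_verified(digest, blob) == original
    print("round trip verified:", digest[:16], "...")
```

The digest is taken over the original bytes, not the compressed ones, so it stays valid whichever codec produced the blob.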
On structured data, the kind a data centre is full of, about 1.7 times the squeeze of zstd. We measure 38.73× on a 10 MB log where zstd, at its strongest standard setting, gets 22.38×. It does this by reading the shape of your data and lining the patterns up before it compresses, a step the standard tools skip. On everything else it never loses, by construction: it races the best standard codecs on every file and keeps the smallest, so the floor is the best the field can do. Fresh head-to-head on a varied 10 MB application log, round-trip verified: ours 7.60×, brotli-11 7.01×, zstd-22 6.92×. The more regular your records, the wider the gap. You do not have to take any of it on faith: run the binary's own benchmark on your file and it prints the ratio and the codec it picked, and confirms the bytes came back identical.
Already-compressed video and audio are dense, so they barely move. We say so plainly on this page and never charge you for air that is not there.
You do not take our word for it, and you do not need a third party's either, because you run it yourself. Prove it on your own data, right here in the box above and with the free 14-day binary, so the number in your report comes from your files, not ours. That is exactly why we say try it free: the demo is the benchmark, on your data, in two seconds. The maths is the founder's own work, and he ranks first in the world for computational efficiency on the public GIMPS project, out of about 250 000 people. The discipline is the product.
Nowhere. It runs on your own servers. Your data never leaves your network, and the integrity check is built in, with no outside toolkit to install.
Both. Any bytes go in and identical bytes come back. The ratio follows the data: text, logs, JSON, columns of numbers and database dumps shrink the most, and that is what a data centre is mostly full of. Program binaries and libraries have structure and shrink moderately. Things that are already compressed, a zip, a video, most container base layers, barely move, and it tells you so plainly while still proving the bytes came back unchanged. You never have to sort your data first: it reads each input and keeps the best result.
No. It runs inside your stack, as a library in your own process or a small sidecar in your own pod, and the data never leaves. You point it at a storage tier, or call compress() on what you already write. There is no Elara service to send your data to, and nothing crosses your network boundary.
It runs as an ordinary process. In-process it is a native library your service loads; as a sidecar it is one small process beside your app. It is busy on the CPU only while it is compressing or decompressing, and idle the rest of the time. Memory stays bounded because it works in chunks rather than loading whole files, so a 100 GB archive never needs 100 GB of memory. It uses the cores you give it and no more. The live meter above shows exactly this off our own server: the cores, the CPU load, the memory, and the engine running, right now.
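Working in chunks, as described above, is what keeps memory flat regardless of file size. A minimal sketch of the idea, assuming zlib's streaming interface as a stand-in for the engine and a hypothetical 1 MiB chunk size:

```python
# Sketch: compress a stream chunk by chunk so memory use does not grow with
# file size. zlib stands in for the real engine; CHUNK is an assumed value.
import hashlib
import io
import zlib

CHUNK = 1 << 20  # 1 MiB per read (hypothetical)


def compress_stream(src, dst, chunk_size=CHUNK):
    """Read src in chunks, write compressed bytes to dst.

    Returns the SHA-256 hex digest of the original (uncompressed) bytes.
    Only one input chunk is held at a time, plus the codec's own window.
    """
    hasher = hashlib.sha256()
    compressor = zlib.compressobj(6)
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        hasher.update(chunk)
        dst.write(compressor.compress(chunk))
    dst.write(compressor.flush())  # emit whatever the codec still buffers
    return hasher.hexdigest()


if __name__ == "__main__":
    source = io.BytesIO(b"ts=1700000000 level=info msg=ok\n" * 200_000)
    target = io.BytesIO()
    digest = compress_stream(source, target)
    print(len(target.getvalue()), "bytes written, sha256", digest[:16], "...")
```

In practice `src` and `dst` would be file objects opened in binary mode; `io.BytesIO` is used here only so the example runs without touching disk.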
One binary. Point it at a tier. No agents to roll out, no rewrite, no change to how your applications read or write.
It works behind an object or file tier, so your applications keep reading and writing the way they do today. The archive format is open and the integrity check is a public standard (SHA-256), so your data is never locked to us. If you ever walk away, your data unpacks with the format, not with our company. Email hello@elara-cortex.com for the format specification and a source-escrow arrangement before you sign.
Compression happens before your retention layer, not instead of it. The compressed archive is a file like any other: write it to WORM or immutable storage and your retention lock, legal hold and audit trail apply to it unchanged. The SHA-256 in every archive header gives your auditors a fixed fingerprint of the original bytes for the life of the record.
Our own numbers are measured on our own files and reproducible at route.elara-cortex.com/benchmarks. The reading below is where the wider thesis comes from: data is growing, data centres are expensive to build, power and cool, and for AI the training data dwarfs the model.