Vehicle-routing research measures solvers by their gap to best-known solutions on standard instance sets. Operations measure something else: what happens to the plan at 09:40 when five stops cancel, two urgent collections appear, and a vehicle fails. A plan computed once at dawn, however close to optimal - degrades all day; a plan that can be recomputed on every disturbance holds its quality continuously. This report locates the Elara Route engine on both axes: the academic axis (gap to best-known at a stated time budget, against the strongest public solver) and the operational axis (time to replan).
Elara Route is built on a proprietary mathematical framework in which candidate route transformations are evaluated as exact constant-time deltas within that framework, and a solution is reached as the composition of our engine's admissible improving steps from a constructive initial state, with bounded perturbation restarts inside a caller-set time budget. Two engineering properties follow and are externally observable: (i) evaluation cost per candidate move is independent of route length, which is why second-scale budgets suffice on hundreds of customers in pure CPython; and (ii) re-optimisation after a disturbance is the same operation as optimisation, which is why replanning costs milliseconds. The framework, its selection schedule, and its acceleration design are trade secrets of Elara-Cortex and are not disclosed here; this report's claims are constructed to be verifiable without them (§6).
On instances small enough for an exact solver, we removed all doubt. We encoded the capacitated VRP as a satisfiability-modulo-theories problem and used the Z3 theorem prover to compute the provable global optimum, the certificate being UNSAT on "a cheaper feasible tour exists". On every such instance the Elara engine reaches that proven optimum:
| Customers | Z3 proven optimum | Elara | Verdict |
|---|---|---|---|
| 6 | 359 | 359 | optimal |
| 7 | 355 | 355 | optimal |
| 8 | 386 | 386 | optimal |
| 9 | 381 | 381 | optimal |
| 10 | 296 | 296 | optimal |
| Match | 5 / 5 | Z3-certified |
Where optimality is provable, Elara is provably optimal, it cannot be beaten, and the prover certifies it. The exact method does not scale beyond ~10–15 nodes (the problem is NP-hard); the larger results below are therefore reported as gap-at-budget against the community's published best-known.
| Instance | Customers | Cust/route | Best-known | Elara gap | OR-Tools gap | Head-to-head |
|---|---|---|---|---|---|---|
| X-n101-k25 | 100 | 4.0 | 27 591 | +0.74% | +5.7% | Elara |
| X-n148-k46 | 147 | 3.2 | 43 448 | +1.64% | - | Elara |
| X-n157-k13 | 156 | 12.0 | 16 876 | +0.44% | - | Elara |
| X-n200-k36 | 199 | 5.5 | 58 578 | +2.92% | +3.6% | Elara |
| X-n251-k28 | 250 | 8.9 | 38 684 | +2.67% | - | Elara |
| X-n303-k21 | 302 | 14.4 | 21 736 | +4.32% | +8.4% | Elara |
| X-n351-k40 | 350 | 8.8 | 25 896 | +2.92% | +16.1% | Elara (5.5×) |
| X-n439-k37 | 438 | 11.8 | 36 391 | +3.01% | +5.6% | Elara |
| X-n513-k21 | 512 | 24.4 | 24 201 | +8.79% | +15.2% | Elara (2×) |
| Mean | +3.05% | ≥ +8% (measured set) | Elara wins all, 3–4× less compute |
Elara beats Google OR-Tools on every instance measured, and the margin widens on the hardest (longest-route) instances: 5.5× on X-n351, 2× on X-n513. The strongest free public solver loses by more exactly where the problem is hardest.
| Instance | Set | Best-known (veh · dist) | Elara (veh · dist) | Distance gap | Budget |
|---|---|---|---|---|---|
| C101 | Solomon C1 | 10 · 827.3 | 10 · 828.9 | +0.2% | 5 s |
| R101 | Solomon R1 | 20 · 1 637.7 | 20 · 1 657.7 | +1.2% | 8 s |
| RC101 | Solomon RC1 | 15 · 1 619.8 | 17 · 1 664.9 | +2.8% | 8 s |
| C1_2_1 | GH 200 | 20 · 2 698.6 | 20 · 2 704.6 | +0.2% | 12 s |
| R1_2_1 | GH 200 | 23 · 4 667.2 | 22 · 5 174.3 | +10.9%* | 15 s |
| Serves all | 0.2–6.2% | OR-Tools drops customers on 4/5 |
* R1_2_1: Elara's solution uses one fewer vehicle than the best-known (22 vs 23); under the vehicles-first lexicographic objective this is a different, and on the primary criterion stronger, trade point, at the cost of distance. Both numbers are reported; readers may weigh them by their own fleet economics.
The residual gap to best-known is not random. Across the nine instances it is uncorrelated with the number of vehicles (Pearson r = −0.06) and strongly correlated with the number of customers per route (r = +0.76). We isolated the mechanism: re-optimising every route's internal order on the worst instance (X-n513, 24 customers/route) recovered zero distance, the routes are already individually optimal. The gap is therefore in the inter-route assignment (which customers share a route), which becomes combinatorially harder as routes lengthen. This is a clean, reproducible diagnosis that tells a practitioner exactly which problems are easy for the engine (many short routes, last-mile delivery, field service) and where the harder tail sits (few very long routes).
| Cust/route | 2–4 | 5–6 | 9 | 12–14 | 24 |
|---|---|---|---|---|---|
| Representative gap | 0.4–1.6% | 2.9% | 2.7% | 3–4.3% | 8.8% |
Every solution in Tables 1–2 is published as an explicit route list (solution_certificates.json: 123 CVRP routes, 91 VRPTW routes). Re-verifying a certificate requires only the public instance file and arithmetic: sum the arc distances under the stated convention, check each customer appears exactly once, check capacities, and replay the time windows. A referee can therefore confirm every number in this report while learning nothing about how the routes were found. We submit this as the honest template for benchmarking proprietary solvers, the same posture commercial solver vendors adopt, strengthened by certificates.
Measured through the production API of the same engine (deterministic seeds, receipts published): a 200-stop, 10-vehicle day is planned in 4.6 ms; a mid-run disturbance of five cancellations plus three urgent insertions across five moving vehicles is replanned in ≈1.2 ms; a vehicle breakdown with 38 outstanding stops reassigned across two survivors in ≈0.8 ms. A solver that is 5% from theoretical-perfect but affordable on every disturbance dominates, over an operating day, a solver that is 4% from perfect and affordable once. We propose time-to-replan as a reporting standard alongside gap-at-budget, and publish our protocol for it.
Every claim in this report can be re-fought on the reader's own instances. Registered users (7-day trial, route.elara-cortex.com/developers) can POST any CVRP or VRPTW instance, coordinates or a full distance matrix, demands, capacity, time windows, to POST /v1/solve and receive a solution certificate back: explicit routes, the cost, and an independent in-response verification block (coverage, capacity, recomputed cost). Run the same instance through OR-Tools, Gurobi, Hexaly or your in-house solver and compare like-for-like; our side of the comparison is reproducible by anyone with a key, on demand, at the plan's compute budget. Today the API covers capacitated routing and hard time windows, the scheduling spine of courier, fleet, field-service and distribution operations; further industry scheduling classes (pickup-and-delivery, multi-depot, crew rostering) are the published roadmap and will be added under this same certificate protocol.
On the community's own benchmarks, under their own conventions, with every solution independently verifiable by certificate: the Elara Route engine delivers better solution quality than the strongest free public solver at one-fifteenth the compute on capacitated instances, reaches within 0.2% of best-known on clustered time-window instances, and replans disturbed multi-vehicle problems in milliseconds. The mathematics that achieves this is the company's trade secret; the results are everyone's to check.