Version: 0.10.7

Performance Overview

Measured benchmarks comparing aerospike-py (Rust/PyO3, native async) against the official aerospike Python client (C extension wrapped with loop.run_in_executor). Reproducible from benchmark/ in this repo.
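The two integration styles being compared can be sketched as follows. This is a minimal illustration only: `blocking_get` and `native_get` are hypothetical stand-ins that simulate I/O with sleeps, not calls from either client library. The only real API used is `loop.run_in_executor` from the standard library.

```python
import asyncio
import time


def blocking_get(key: str) -> dict:
    """Hypothetical stand-in for a synchronous client call (not a real API)."""
    time.sleep(0.01)  # simulate a blocking network round trip
    return {"key": key, "bins": {"v": 1}}


async def get_via_executor(key: str) -> dict:
    # Baseline pattern: hand the blocking call to the default thread pool
    # so the event loop is not blocked while the call runs.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, blocking_get, key)


async def native_get(key: str) -> dict:
    """Hypothetical stand-in for a natively async client call."""
    await asyncio.sleep(0.01)  # simulate non-blocking I/O on the event loop
    return {"key": key, "bins": {"v": 1}}


async def main():
    keys = [f"user:{i}" for i in range(8)]
    wrapped = await asyncio.gather(*(get_via_executor(k) for k in keys))
    native = await asyncio.gather(*(native_get(k) for k in keys))
    return wrapped, native


wrapped, native = asyncio.run(main())
```

Both paths return the same records. The difference the benchmarks measure is where the waiting happens: in worker threads for the wrapped call, or on the event loop itself for the native one.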

Legend: for latency, lower is better; for throughput (TPS), higher is better; 🔥 marks an improvement of ≥50%. Default test setup: FastAPI + DLRM + Aerospike CE, k6 with 10 VUs × 60 s.

How fast — cumulative effect on DLRM-serving p95

| Step | p95 | vs original |
|---|---|---|
| Original (official client + Python 3.11 + gather) | 324 ms | baseline |
| + Replace with aerospike-py | 189 ms | −42% |
| + gather(N) → single batch_read(mixed keys) | 126 ms | −61% |
| + Python 3.14t free-threaded | 97 ms | −70% 🔥 (3.3× faster) |

| # | Action | Effect |
|---|---|---|
| 1 | Replace official client → aerospike-py | p95 −42% (Python 3.11) |
| 2 | Move runtime to Python 3.14t free-threaded | p95 −49% more, TPS +47% (no Rust changes) |
| 3 | gather(N) → single batch_read(mixed keys) | p95 −33% (under GIL) |
| 4 | Keep AEROSPIKE_PY_INTERNAL_METRICS=1 always on | E2E overhead ≈ 0, instant per-stage attribution |
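The gather(N) → single batch_read change can be illustrated with a small sketch. `FakeClient` is a hypothetical in-memory stand-in, and its `get` and `batch_read` signatures are assumptions made for illustration rather than the aerospike-py API. Keys are modeled as `(set, id)` tuples, which is also an assumption. The sketch only shows the shape of the change: N concurrent single-key reads collapse into one request.

```python
import asyncio


class FakeClient:
    """Hypothetical in-memory client that counts simulated round trips."""

    def __init__(self, data):
        self.data = data
        self.round_trips = 0

    async def get(self, key):
        self.round_trips += 1  # one request per key
        await asyncio.sleep(0)
        return self.data.get(key)

    async def batch_read(self, keys):
        self.round_trips += 1  # one request for the whole key list
        await asyncio.sleep(0)
        return [self.data.get(k) for k in keys]


async def fetch_with_gather(client, keys):
    # Before: N coroutines, each issuing its own single-key read.
    return await asyncio.gather(*(client.get(k) for k in keys))


async def fetch_with_batch(client, keys):
    # After: one call carrying every key.
    return await client.batch_read(keys)


async def main():
    data = {
        ("users", 1): {"age": 30},
        ("items", 7): {"price": 5},
        ("users", 2): {"age": 41},
    }
    keys = list(data)
    a, b = FakeClient(data), FakeClient(data)
    gathered = await fetch_with_gather(a, keys)
    batched = await fetch_with_batch(b, keys)
    return gathered, batched, a.round_trips, b.round_trips


gathered, batched, gather_trips, batch_trips = asyncio.run(main())
```

In this toy model the results are identical while the request count drops from N to 1. The measured p95 effect depends on the real client and server, which the sketch does not model.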

Environment summary (Python 3.11 + GIL)

The thinner the surrounding stack, the larger the gap. Tail latency advantage survives even in production-shaped workloads.

| Environment | aerospike-py vs official | Detail |
|---|---|---|
| A) Pure DB client (no HTTP/ML) | avg −80% (108→22 ms), TPS +171% (138→374) 🔥 | Benchmarks → A |
| B) uvicorn ASGI (FastAPI + DB, no ML) | mean −21% (290→228 ms), TPS +17% | Benchmarks → B |
| C) uvicorn + DLRM (real serving) | p95 −42% (324→189 ms), avg −19% | Benchmarks → C |
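The percentages on this page appear to be plain relative changes against the official-client baseline, rounded to a whole percent. A short sketch, using only the before/after pairs quoted on this page, reproduces them; the helper name `pct_change` is ours.

```python
def pct_change(before: float, after: float) -> int:
    """Relative change from `before` to `after`, rounded to a whole percent."""
    return round((after - before) / before * 100)


# (before, after) pairs as quoted on this page.
results = {
    "A avg latency (ms)": (108, 22),
    "A TPS": (138, 374),
    "B mean latency (ms)": (290, 228),
    "C p95 latency (ms)": (324, 189),
    "cumulative p95 after batch_read (ms)": (324, 126),
    "cumulative p95 on 3.14t (ms)": (324, 97),
}

changes = {name: pct_change(b, a) for name, (b, a) in results.items()}

# Speedup factor for the full cumulative change (324 ms -> 97 ms).
speedup = round(324 / 97, 1)
```

Negative values are latency reductions and positive values are throughput gains, matching the signs in the tables.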

To reproduce locally, see benchmark/README.md.