Performance Report
Comprehensive benchmarking results comparing Antarys against leading vector databases including Chroma, Qdrant, and Milvus.
Testing Environment: Current benchmarks performed on Apple M1 Pro. Additional hardware configurations coming soon.
Executive Summary
Antarys consistently outperforms major vector database solutions across key performance metrics:
⚡ Write Performance
2,017 vectors/sec throughput with industry-leading batch processing
🚀 Query Speed
1.66ms average query time - 31x faster than Qdrant, 133x faster than Milvus
🎯 Accuracy
High recall rates with consistent performance across different workloads
📈 Scalability
HTTP/2 REST API architecture with upcoming gRPC support for enhanced performance
Benchmark Results
Write Performance
Performance metrics for batch vector insertion operations:
Database | Throughput (vectors/sec) | Performance vs Antarys |
---|---|---|
Antarys | 2,017 | Baseline |
Chroma | 1,234 | 1.6x slower |
Qdrant | 892 | 2.3x slower |
Milvus | 445 | 4.5x slower |
Key Insight: Antarys achieves roughly 1.6x higher throughput than the next best competitor (Chroma) through optimized batch processing and parallel workers.
Average Batch Time
Database | Avg Batch Time (ms) | P99 Latency (ms) |
---|---|---|
Antarys | 495.7 | 570.3 |
Chroma | 810.4 | 890.2 |
Qdrant | 1,121.6 | 1,456.8 |
Milvus | 2,247.3 | 3,102.5 |
Test Configuration:
- Batch size: 1,000 vectors
- Vector dimensions: 1,536 (OpenAI embeddings)
- Total vectors: 100,000
- Hardware: Apple M1 Pro
Results Summary:
- ✅ 100% success rate across all batch operations
- ⚡ Consistent sub-second latency for 1K vector batches
- 🔄 Linear scalability with batch size optimization
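As a rough illustration, throughput and average batch time figures like those above can be collected with a simple timing harness. This is a generic sketch, not the actual Antarys API: `insert_fn` is a placeholder for whatever batch upsert call the client under test exposes.

```python
import time


def measure_batch_inserts(vectors, insert_fn, batch_size=1000):
    """Time each batch insert and report throughput and average batch time.

    `insert_fn` stands in for the database client's batch upsert call.
    """
    batch_times = []
    for start in range(0, len(vectors), batch_size):
        batch = vectors[start:start + batch_size]
        t0 = time.perf_counter()
        insert_fn(batch)
        batch_times.append(time.perf_counter() - t0)
    total = sum(batch_times)
    return {
        "batches": len(batch_times),
        "vectors_per_sec": len(vectors) / total if total else float("inf"),
        "avg_batch_ms": 1000 * total / len(batch_times),
    }


# Example with a no-op insert standing in for a real client call
stats = measure_batch_inserts([[0.0] * 8 for _ in range(5000)], lambda b: None)
```

The same loop works against any client, which is what makes the cross-database comparison apples-to-apples: only `insert_fn` changes between runs.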
Read Performance (1,000 Queries)
Query performance analysis across different database systems:
🥇 Antarys: 1.66ms average query time (fastest)
🥈 Chroma: 2.94ms average query time (1.8x slower)
🥉 Qdrant: 51.47ms average query time (31x slower)
Milvus: 220.46ms average query time (133x slower)
Database | Throughput (queries/sec) | Avg Query Time (ms) | P99 Latency (ms) |
---|---|---|---|
Antarys | 602.4 | 1.66 | 6.9 |
Chroma | 340.1 | 2.94 | 14.2 |
Qdrant | 19.4 | 51.47 | 186.3 |
Milvus | 4.5 | 220.46 | 892.1 |
Performance Advantage: Antarys processes 31x more queries per second than Qdrant and 133x more than Milvus.
P50/P90/P99 Latency Breakdown:
Antarys: 1.3ms / 2.1ms / 6.9ms ⚡ Consistently fast
Chroma: 2.1ms / 4.8ms / 14.2ms 📊 Good performance
Qdrant: 42ms / 89ms / 186ms 🐌 High latency
Milvus: 185ms / 420ms / 892ms 🐌 Very high latency
Key Observations:
- 📈 Antarys maintains sub-10ms P99 latency
- ⚡ 60% of Antarys queries complete under 1.5ms
- 🎯 Minimal variance in response times
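The P50/P90/P99 figures above can be reproduced from raw per-query timings with a nearest-rank percentile calculation. The sketch below is database-agnostic helper code, not part of the Antarys client:

```python
import math


def latency_percentiles(samples_ms, percentiles=(50, 90, 99)):
    """Nearest-rank percentiles over a list of per-query latencies (ms)."""
    ordered = sorted(samples_ms)
    n = len(ordered)
    result = {}
    for p in percentiles:
        # Nearest-rank: the smallest value with at least p% of samples
        # at or below it
        rank = max(1, math.ceil(p / 100 * n))
        result[f"p{p}"] = ordered[rank - 1]
    return result
```

Feeding the 1,000 measured query times through this function yields the breakdown table directly.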
Accuracy Metrics
Search quality and recall performance comparison:
Database | Recall@100 (%) | Recall Standard Deviation |
---|---|---|
Antarys | 98.47% | 0.0023 |
Chroma | 97.12% | 0.0034 |
Qdrant | 96.83% | 0.0041 |
Milvus | 95.67% | 0.0056 |
Quality Leadership: Antarys delivers the highest recall rates with the lowest variance, ensuring consistent, accurate results.
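Recall@100 is computed per query by comparing the approximate top-k result IDs against brute-force ground truth, then averaging across queries. A minimal, database-agnostic sketch:

```python
def recall_at_k(retrieved_ids, ground_truth_ids, k=100):
    """Fraction of the true top-k neighbors present in the retrieved top-k."""
    retrieved = set(retrieved_ids[:k])
    truth = set(ground_truth_ids[:k])
    return len(retrieved & truth) / len(truth)
```

Averaging `recall_at_k` over all test queries gives the Recall@100 column; the standard deviation column is the spread of those per-query values.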
Upcoming gRPC Implementation
Performance Evolution: Our upcoming gRPC implementation will enable a fairer, like-for-like comparison against competitors that already expose gRPC APIs.
Expected Improvements:
- 🚀 Lower latency through binary protocol efficiency
- 📦 Reduced payload size with Protocol Buffers
- 🔄 Bidirectional streaming for real-time applications
- ⚡ Enhanced connection management and multiplexing
Test Environment
Current Hardware Configuration
Apple M1 Pro Specifications
- CPU: 8-core (6 performance + 2 efficiency)
- GPU: 14-core integrated GPU
- Neural Engine: 16-core Neural Engine
- Memory: 16GB unified memory
- Memory Bandwidth: 200GB/s
- Storage: 512GB SSD with ~7.4GB/s throughput
- Architecture: ARM64 with unified memory architecture
- OS: macOS Sonoma 14.x
Hardware Diversity: We're expanding testing to include server-grade hardware and embedded systems for comprehensive performance analysis.
Benchmark Methodology
Test Parameters:
- Dataset Size: 100,000 vectors
- Vector Dimensions: 1,536 (OpenAI embeddings)
- Batch Size: 1,000 vectors per batch
- Query Count: 1,000 similarity searches
- Concurrency: Variable parallel workers
- Metrics Collection: P50, P90, P99 latencies + throughput
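For reference, a synthetic dataset matching these parameters can be generated and batched as follows. The helper names are illustrative, and the defaults mirror the test parameters above (use smaller values for local experiments, since 100,000 x 1,536 floats is a large allocation):

```python
import random

DIM = 1536          # OpenAI embedding dimensionality used in the benchmarks
TOTAL = 100_000     # total vectors
BATCH_SIZE = 1_000  # vectors per batch


def make_vector(dim=DIM):
    """One random vector; real runs would use actual embeddings."""
    return [random.random() for _ in range(dim)]


def batches(total=TOTAL, batch_size=BATCH_SIZE, dim=DIM):
    """Yield synthetic vector batches matching the test parameters above."""
    for start in range(0, total, batch_size):
        count = min(batch_size, total - start)
        yield [make_vector(dim) for _ in range(count)]
```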
Future Benchmarking Roadmap
Expanded Hardware Testing
Server-Grade Hardware
Planned Configurations:
- 🖥️ Intel Xeon Gold processors with high core counts
- 🖥️ AMD EPYC processors for multi-socket performance
- 💾 DDR5 memory configurations (128GB - 1TB)
- 💿 NVMe SSD arrays for storage-intensive workloads
Embedded Systems (Antarys Edge)
Target Hardware:
- 🔧 NVIDIA Jetson Nano (4GB/8GB variants)
- 🔧 Raspberry Pi 4 Model B (8GB)
- 🔧 Intel NUC mini computers
- 🔧 ARM-based edge devices for IoT deployment
Cloud Infrastructure
Platform Testing:
- ☁️ AWS EC2 instances (c5, m5, r5 families)
- ☁️ Google Cloud Compute Engine
- ☁️ Azure Virtual Machines
- ☁️ Multi-region latency analysis
Benchmark Expansion Plans
📊 Dataset Diversity
Multiple Data Sources
- OpenAI embeddings (1536D)
- Sentence transformers (768D, 384D)
- Custom domain embeddings
- Multi-modal vectors
🔍 Query Patterns
Real-world Workloads
- Batch similarity search
- Real-time recommendations
- Semantic search applications
- Hybrid search scenarios
🌐 Network Conditions
Connectivity Testing
- Local network (LAN)
- Wide area network (WAN)
- High-latency connections
- Mobile network conditions
⚡ Stress Testing
Performance Limits
- Concurrent user simulation
- Memory pressure testing
- CPU saturation analysis
- I/O bottleneck identification
Performance Insights
What Makes Antarys Fast
Optimized Architecture
Design Principles:
- Purpose-built for vector operations
- Memory-first approach with efficient data structures
- Asynchronous processing throughout the stack
- Batch-optimized operations by default
Advanced Algorithms
Technical Innovations:
- Custom HNSW implementation with performance optimizations
- Adaptive indexing based on data distribution
- Smart caching strategies for frequently accessed vectors
- Load balancing across multiple processing cores
System Integration
Performance Multipliers:
- Native HTTP/2 support with multiplexing
- Zero-copy operations where possible
- Parallel batch processing with optimal worker allocation
- Compressed data transfer for reduced network overhead
Performance Recommendations
Based on benchmarking results, here are optimal configurations:
< 100K vectors
```python
# Optimal configuration
client = antarys.create_client(
    connection_pool_size=20,
    cache_size=500,
    thread_pool_size=4,
)

# Batch settings
batch_size = 1000
parallel_workers = 2
```
Expected Performance:
- ⚡ 1,500+ vectors/sec write throughput
- 🔍 **< 2ms** average query time
- 💾 **< 100MB** memory usage
100K - 1M vectors
```python
# Optimal configuration
client = antarys.create_client(
    connection_pool_size=50,
    cache_size=2000,
    thread_pool_size=8,
)

# Batch settings
batch_size = 3000
parallel_workers = 6
```
Expected Performance:
- ⚡ 2,000+ vectors/sec write throughput
- 🔍 **< 3ms** average query time
- 💾 **< 500MB** memory usage
1M+ vectors
```python
# Optimal configuration
client = antarys.create_client(
    connection_pool_size=100,
    cache_size=5000,
    thread_pool_size=16,
)

# Batch settings
batch_size = 5000
parallel_workers = 12
```
Expected Performance:
- ⚡ 2,500+ vectors/sec write throughput
- 🔍 **< 5ms** average query time
- 💾 **< 2GB** memory usage
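The three tiers above can be captured in one small helper that picks settings by dataset size. The parameter names are taken from the snippets above and the thresholds are the tier boundaries; this is an illustrative convenience, not part of the client library:

```python
def recommended_config(num_vectors):
    """Map dataset size to the tiered settings suggested above."""
    if num_vectors < 100_000:
        return {"connection_pool_size": 20, "cache_size": 500,
                "thread_pool_size": 4, "batch_size": 1000,
                "parallel_workers": 2}
    if num_vectors < 1_000_000:
        return {"connection_pool_size": 50, "cache_size": 2000,
                "thread_pool_size": 8, "batch_size": 3000,
                "parallel_workers": 6}
    return {"connection_pool_size": 100, "cache_size": 5000,
            "thread_pool_size": 16, "batch_size": 5000,
            "parallel_workers": 12}
```

The client keyword arguments (`connection_pool_size`, `cache_size`, `thread_pool_size`) would then be splatted into `antarys.create_client(**config)`, with `batch_size` and `parallel_workers` passed to the upsert path.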
Competitive Analysis
Market Position
Feature | Antarys | Chroma | Qdrant | Milvus |
---|---|---|---|---|
Query Speed | 1.66ms | 2.94ms | 51.47ms | 220.46ms |
Write Throughput | 2,017/s | 1,234/s | 892/s | 445/s |
Accuracy | 98.47% | 97.12% | 96.83% | 95.67% |
Memory Efficiency | Excellent | Good | Fair | Poor |
API Design | HTTP/2 (gRPC planned) | HTTP/FFI | gRPC | gRPC |
Key Differentiators
Speed Leadership
31-133x faster query performance than major competitors
High Accuracy
Highest recall rates with lowest variance across all tests
Edge-Ready
Our vector database is designed to run on low-powered devices with minimal latency
Next Steps: Follow our performance updates as we expand testing to additional hardware configurations and larger datasets throughout 2025.