Performance Report

Comprehensive benchmarking results demonstrating Antarys's performance advantages over leading vector databases including Chroma, Qdrant, and Milvus.

Testing Environment: Current benchmarks were performed on an Apple M1 Pro. Additional hardware configurations are coming soon.

Executive Summary

Antarys consistently outperforms major vector database solutions across key performance metrics:

⚡ Write Performance

2,017 vectors/sec throughput with industry-leading batch processing

🚀 Query Speed

1.66ms average query time - 31x faster than Qdrant, 133x faster than Milvus

🎯 Accuracy

High recall rates with consistent performance across different workloads

📈 Scalability

HTTP/2 REST API architecture with upcoming gRPC support for enhanced performance

Benchmark Results

Write Performance

Performance metrics for batch vector insertion operations:

Database    Throughput (vectors/sec)    Performance vs Antarys
Antarys     2,017                       Baseline
Chroma      1,234                       1.6x slower
Qdrant      892                         2.3x slower
Milvus      445                         4.5x slower

Key Insight: Antarys achieves roughly 1.6x the throughput of its closest competitor through optimized batch processing and parallel workers.

Average Batch Time

Database    Avg Batch Time (ms)    P99 Latency (ms)
Antarys     495.7                  570.3
Chroma      810.4                  890.2
Qdrant      1,121.6                1,456.8
Milvus      2,247.3                3,102.5

Test Configuration:

  • Batch size: 1,000 vectors
  • Vector dimensions: 1,536 (OpenAI embeddings)
  • Total vectors: 100,000
  • Hardware: Apple M1 Pro

Results Summary:

  • 100% success rate across all batch operations
  • Consistent sub-second latency for 1K vector batches
  • 🔄 Linear scalability with batch size optimization
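A write benchmark with this configuration can be driven by a short harness along the lines of the sketch below. It is a minimal sketch rather than the exact script behind the numbers above: antarys.create_client and its parameters come from the configuration examples later in this report, while the upsert call, the "benchmark" collection name, and the record shape are hypothetical stand-ins for the client's actual insert API.

import asyncio
import time

import numpy as np

import antarys

# Parameters from the test configuration above
DIMENSIONS = 1536
TOTAL_VECTORS = 100_000
BATCH_SIZE = 1_000


async def run_write_benchmark():
    client = antarys.create_client(connection_pool_size=50)
    rng = np.random.default_rng(42)
    batch_times_ms = []

    for start in range(0, TOTAL_VECTORS, BATCH_SIZE):
        vectors = rng.random((BATCH_SIZE, DIMENSIONS), dtype=np.float32)
        records = [
            {"id": str(start + i), "values": vec.tolist()}
            for i, vec in enumerate(vectors)
        ]
        t0 = time.perf_counter()
        # Hypothetical insert call -- substitute the client's actual
        # upsert/add method and collection handling.
        await client.upsert("benchmark", records)
        batch_times_ms.append((time.perf_counter() - t0) * 1000)

    total_seconds = sum(batch_times_ms) / 1000
    print(f"Throughput:     {TOTAL_VECTORS / total_seconds:.0f} vectors/sec")
    print(f"Avg batch time: {np.mean(batch_times_ms):.1f} ms")
    print(f"P99 batch time: {np.percentile(batch_times_ms, 99):.1f} ms")


if __name__ == "__main__":
    asyncio.run(run_write_benchmark())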

Read Performance (1,000 Queries)

Query performance analysis across different database systems:

🥇 Antarys

1.66ms average query time (fastest)

🥈 Chroma

2.94ms average query time (1.8x slower)

🥉 Qdrant

51.47ms average query time (31x slower)

Milvus

220.46ms average query time (133x slower)

Database    Throughput (queries/sec)    Avg Query Time (ms)    P99 Latency (ms)
Antarys     602.4                       1.66                   6.9
Chroma      340.1                       2.94                   14.2
Qdrant      19.4                        51.47                  186.3
Milvus      4.5                         220.46                 892.1

Performance Advantage: Antarys processes 31x more queries per second than Qdrant and 133x more than Milvus.

P50/P90/P99 Latency Breakdown:

Antarys:  1.3ms / 2.1ms / 6.9ms   ⚡ Consistently fast
Chroma:   2.1ms / 4.8ms / 14.2ms  📊 Good performance  
Qdrant:   42ms / 89ms / 186ms     🐌 High latency
Milvus:   185ms / 420ms / 892ms   🐌 Very high latency

Key Observations:

  • 📈 Antarys maintains sub-10ms P99 latency
  • 60% of Antarys queries complete under 1.5ms
  • 🎯 Minimal variance in response times
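The query-side numbers can be approximated with a similar harness. Again this is a hedged sketch: the query call, its keyword arguments, and the collection name are hypothetical stand-ins for the client's actual search method, and only create_client is taken from the documented configuration examples in this report.

import asyncio
import time

import numpy as np

import antarys

DIMENSIONS = 1536
NUM_QUERIES = 1_000
TOP_K = 100


async def run_query_benchmark():
    client = antarys.create_client(connection_pool_size=20)
    rng = np.random.default_rng(7)
    latencies_ms = []

    for _ in range(NUM_QUERIES):
        query_vector = rng.random(DIMENSIONS, dtype=np.float32).tolist()
        t0 = time.perf_counter()
        # Hypothetical search call -- substitute the client's actual
        # similarity-search method and collection name.
        await client.query("benchmark", vector=query_vector, top_k=TOP_K)
        latencies_ms.append((time.perf_counter() - t0) * 1000)

    p50, p90, p99 = np.percentile(latencies_ms, [50, 90, 99])
    print(f"Avg: {np.mean(latencies_ms):.2f} ms | "
          f"P50: {p50:.2f} | P90: {p90:.2f} | P99: {p99:.2f} ms")
    print(f"Throughput: {NUM_QUERIES / (sum(latencies_ms) / 1000):.1f} queries/sec")


if __name__ == "__main__":
    asyncio.run(run_query_benchmark())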

Accuracy Metrics

Search quality and recall performance comparison:

Database    Recall@100 (%)    Recall Standard Deviation
Antarys     98.47             0.0023
Chroma      97.12             0.0034
Qdrant      96.83             0.0041
Milvus      95.67             0.0056

Quality Leadership: Antarys delivers the highest recall rates with the lowest variance, ensuring consistent, accurate results.
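Recall@100 in benchmarks of this kind is normally measured per query against exact nearest neighbors computed by brute force, then averaged, with the standard deviation column reflecting per-query spread. A pure-NumPy sketch of that calculation (cosine similarity is assumed here as the distance metric; this is an assumption, not a documented detail):

import numpy as np


def brute_force_top_k(query, vectors, k=100):
    """Exact top-k neighbors by cosine similarity, used as ground truth."""
    q = query / np.linalg.norm(query)
    v = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    return np.argsort(-(v @ q))[:k]


def recall_at_k(returned_ids, true_ids, k=100):
    """Fraction of the exact top-k neighbors present in the engine's top-k.

    Assumes both ID lists use the same ID space (e.g. integer row indices).
    """
    return len(set(returned_ids[:k]) & set(true_ids[:k])) / k


# The reported figure is the mean recall over all benchmark queries;
# the standard deviation column is np.std over the same per-query values.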

Upcoming gRPC Implementation

Performance Evolution: Our upcoming gRPC implementation will enable a more direct, like-for-like comparison against competitors that already expose gRPC interfaces.

Expected Improvements:

  • 🚀 Lower latency through binary protocol efficiency
  • 📦 Reduced payload size with Protocol Buffers
  • 🔄 Bidirectional streaming for real-time applications
  • Enhanced connection management and multiplexing

Test Environment

Current Hardware Configuration

Apple M1 Pro Specifications

  • CPU: 8-core (6 performance + 2 efficiency)
  • GPU: 14-core integrated GPU
  • Neural Engine: 16-core
  • Memory: 16GB unified memory
  • Memory Bandwidth: 200GB/s
  • Storage: 512GB SSD with ~7.4GB/s throughput
  • Architecture: ARM64 with unified memory architecture
  • OS: macOS Sonoma 14.x

Hardware Diversity: We're expanding testing to include server-grade hardware and embedded systems for comprehensive performance analysis.

Benchmark Methodology

Test Parameters:

  • Dataset Size: 100,000 vectors
  • Vector Dimensions: 1,536 (OpenAI embeddings)
  • Batch Size: 1,000 vectors per batch
  • Query Count: 1,000 similarity searches
  • Concurrency: Variable parallel workers
  • Metrics Collection: P50, P90, P99 latencies + throughput
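Every latency and throughput figure in this report is an aggregate over raw per-operation timings. A small helper along these lines (pure NumPy, no client dependency) reproduces that aggregation; note the throughput line assumes operations ran sequentially, so with parallel workers you would divide by wall-clock time instead:

import numpy as np


def summarize(latencies_ms, total_items):
    """Collapse raw per-operation latencies into the metrics used in this report."""
    return {
        "avg_ms": float(np.mean(latencies_ms)),
        "p50_ms": float(np.percentile(latencies_ms, 50)),
        "p90_ms": float(np.percentile(latencies_ms, 90)),
        "p99_ms": float(np.percentile(latencies_ms, 99)),
        # Assumes sequential execution; for concurrent runs, use wall-clock time.
        "throughput_per_sec": total_items / (float(np.sum(latencies_ms)) / 1000.0),
    }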

Future Benchmarking Roadmap

Expanded Hardware Testing

Server-Grade Hardware

Planned Configurations:

  • 🖥️ Intel Xeon Gold processors with high core counts
  • 🖥️ AMD EPYC processors for multi-socket performance
  • 💾 DDR5 memory configurations (128GB - 1TB)
  • 💿 NVMe SSD arrays for storage-intensive workloads

Embedded Systems (Antarys Edge)

Target Hardware:

  • 🔧 NVIDIA Jetson Nano (4GB/8GB variants)
  • 🔧 Raspberry Pi 4 Model B (8GB)
  • 🔧 Intel NUC mini computers
  • 🔧 ARM-based edge devices for IoT deployment

Cloud Infrastructure

Platform Testing:

  • ☁️ AWS EC2 instances (c5, m5, r5 families)
  • ☁️ Google Cloud Compute Engine
  • ☁️ Azure Virtual Machines
  • ☁️ Multi-region latency analysis

Benchmark Expansion Plans

📊 Dataset Diversity

Multiple Data Sources

  • OpenAI embeddings (1536D)
  • Sentence transformers (768D, 384D)
  • Custom domain embeddings
  • Multi-modal vectors

🔍 Query Patterns

Real-world Workloads

  • Batch similarity search
  • Real-time recommendations
  • Semantic search applications
  • Hybrid search scenarios

🌐 Network Conditions

Connectivity Testing

  • Local network (LAN)
  • Wide area network (WAN)
  • High-latency connections
  • Mobile network conditions

⚡ Stress Testing

Performance Limits

  • Concurrent user simulation
  • Memory pressure testing
  • CPU saturation analysis
  • I/O bottleneck identification

Performance Insights

What Makes Antarys Fast

Optimized Architecture

Design Principles:

  • Purpose-built for vector operations
  • Memory-first approach with efficient data structures
  • Asynchronous processing throughout the stack
  • Batch-optimized operations by default

Advanced Algorithms

Technical Innovations:

  • Custom HNSW implementation with performance optimizations
  • Adaptive indexing based on data distribution
  • Smart caching strategies for frequently accessed vectors
  • Load balancing across multiple processing cores

System Integration

Performance Multipliers:

  • Native HTTP/2 support with multiplexing
  • Zero-copy operations where possible
  • Parallel batch processing with optimal worker allocation
  • Compressed data transfer for reduced network overhead
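The Antarys client manages its own connection pool, but the effect of HTTP/2 multiplexing is easy to see with any HTTP/2-capable Python client. The sketch below uses httpx purely as an illustration; the base URL and the /query route are placeholders rather than documented Antarys endpoints:

import asyncio

import httpx  # pip install "httpx[http2]"


async def multiplexed_queries(vectors, top_k=10):
    # A single HTTP/2 connection carries many concurrent request streams,
    # so these searches share one TCP connection instead of opening N sockets.
    async with httpx.AsyncClient(http2=True, base_url="http://localhost:8080") as http:
        responses = await asyncio.gather(*(
            http.post("/query", json={"vector": v, "top_k": top_k})  # placeholder route
            for v in vectors
        ))
    return [r.json() for r in responses]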

Performance Recommendations

Based on benchmarking results, here are optimal configurations:

< 100K vectors

import antarys

# Optimal configuration
client = antarys.create_client(
    connection_pool_size=20,
    cache_size=500,
    thread_pool_size=4
)

# Batch settings
batch_size = 1000
parallel_workers = 2

Expected Performance:

  • 1,500+ vectors/sec write throughput
  • 🔍 < 2ms average query time
  • 💾 < 100MB memory usage

100K - 1M vectors

import antarys

# Optimal configuration
client = antarys.create_client(
    connection_pool_size=50,
    cache_size=2000,
    thread_pool_size=8
)

# Batch settings
batch_size = 3000
parallel_workers = 6

Expected Performance:

  • 2,000+ vectors/sec write throughput
  • 🔍 < 3ms average query time
  • 💾 < 500MB memory usage

1M+ vectors

import antarys

# Optimal configuration
client = antarys.create_client(
    connection_pool_size=100,
    cache_size=5000,
    thread_pool_size=16
)

# Batch settings
batch_size = 5000
parallel_workers = 12

Expected Performance:

  • 2,500+ vectors/sec write throughput
  • 🔍 < 5ms average query time
  • 💾 < 2GB memory usage
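In each profile above, parallel_workers is only a count; one way to enforce it is to bound in-flight batches with a semaphore, as in the sketch below. The upsert call and collection name are hypothetical stand-ins for the client's actual insert method:

import asyncio

import antarys


async def upsert_in_parallel(records, batch_size=3000, parallel_workers=6):
    """Fan batches out across a bounded number of concurrent tasks."""
    client = antarys.create_client(connection_pool_size=50)
    semaphore = asyncio.Semaphore(parallel_workers)

    async def submit(batch):
        async with semaphore:
            # Hypothetical method name -- substitute the client's real insert call.
            await client.upsert("benchmark", batch)

    batches = [records[i:i + batch_size] for i in range(0, len(records), batch_size)]
    await asyncio.gather(*(submit(batch) for batch in batches))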

Competitive Analysis

Market Position

Feature              Antarys                   Chroma      Qdrant     Milvus
Query Speed          1.66ms                    2.94ms      51.47ms    220.46ms
Write Throughput     2,017/s                   1,234/s     892/s      445/s
Accuracy             98.47%                    97.12%      96.83%     95.67%
Memory Efficiency    Excellent                 Good        Fair       Poor
API Design           HTTP/2 + gRPC (upcoming)  HTTP/FFI    gRPC       gRPC

Key Differentiators

Speed Leadership

Up to 133x faster query performance than major competitors

High Accuracy

Highest recall rates with lowest variance across all tests

Edge-Ready

Our vector database is designed to run on low-powered hardware with minimal latency

Next Steps: Follow our performance updates as we expand testing to additional hardware configurations and larger datasets throughout 2025.