CLI Reference
Complete command-line interface reference for Antarys server
Complete guide to Antarys command-line options for server configuration and model management.
Commands
Start Server
Start the Antarys vector database server with optional flags:
antarys [flags]
Download Manager
Manage ONNX runtime and embedding models:
antarys download <command>
Version
Display the Antarys version:
antarys version
Server Flags
Core Configuration
--port (default: 8080)
- API server port
- Example:
antarys --port 9000
--data-dir (default: ./data)
- Path to vector data storage directory
- Example:
antarys --data-dir /var/lib/antarys/data
--meta-dir (default: ./metadata)
- Path to metadata storage (BadgerDB)
- Example:
antarys --meta-dir /var/lib/antarys/metadata
Performance Tuning
--shards (default: 16)
- Number of shards for parallel processing
- Higher values improve write throughput
- Example:
antarys --shards 32
--query-threads (default: CPU count × 2)
- Number of concurrent query threads
- Adjust based on your workload
- Example:
antarys --query-threads 16
--cache-size (default: 10000)
- Size of result cache (number of entries)
- Larger cache improves query performance
- Example:
antarys --cache-size 50000
--batch-size (default: 5000)
- Default batch size for operations
- Example:
antarys --batch-size 10000
--commit-interval (default: 60)
- Auto-commit interval in seconds
- Lower values increase durability, higher values improve performance
- Example:
antarys --commit-interval 30
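The documented default for --query-threads is CPU count × 2. As a minimal sketch, a launch script can compute that value explicitly, which makes it easy to pin or scale for a particular workload. The helper names cpu_count and default_query_threads are hypothetical; only the flag name comes from this reference. The script prints the command rather than running it.

```shell
#!/bin/sh
# Hypothetical helpers; only the flag name comes from this reference.

# Print the number of online CPUs, falling back to 1 if it cannot be detected.
cpu_count() {
    n=$(getconf _NPROCESSORS_ONLN 2>/dev/null) || n=""
    case "$n" in
        ''|*[!0-9]*) n=1 ;;
    esac
    echo "$n"
}

# Mirror the documented default: CPU count x 2.
default_query_threads() {
    echo $(( $1 * 2 ))
}

cpus=$(cpu_count)
threads=$(default_query_threads "$cpus")

# Print the command instead of running it, so the value can be reviewed first.
echo "antarys --query-threads $threads"
```

Changing the multiplier in default_query_threads is one way to experiment with lower or higher concurrency without hard-coding a thread count per machine.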
Indexing Options
--enable-hnsw (default: true)
- Enable HNSW (Hierarchical Navigable Small World) indexing
- Provides fast approximate nearest neighbor search
- Example:
antarys --enable-hnsw=false
--enable-pq (default: false)
- Enable Product Quantization for memory efficiency
- Reduces memory usage at slight accuracy cost
- Example:
antarys --enable-pq
--similarity (default: 0.7)
- Default similarity threshold (0.0 to 1.0)
- Example:
antarys --similarity 0.85
--max-results (default: 100)
- Maximum number of results per query
- Example:
antarys --max-results 200
GPU Acceleration
GPU support is backed by OpenCL and is still experimental.
--enable-gpu (default: false)
- Enable GPU acceleration for vector operations
- Requires compatible GPU and drivers
- Example:
antarys --enable-gpu
--optimization (default: 3)
- GPU optimization level (0-3)
- Higher values provide better performance
- Example:
antarys --optimization 2
Embedding Support
--enable-embedding (default: false)
- Enable built-in embedding generation
- Requires ONNX runtime and model to be downloaded
- Example:
antarys --enable-embedding
--embedding-model (default: 4)
- Embedding model ID to use (see model list below)
- Example:
antarys --enable-embedding --embedding-model 2
Embedding Models:
- 1: BGE-Base-EN (768 dimensions)
- 2: BGE-Base-EN-v1.5 (768 dimensions)
- 3: BGE-Small-EN (384 dimensions)
- 4: BGE-Small-EN-v1.5 (384 dimensions) - Default
- 5: BGE-Small-ZH-v1.5 (512 dimensions)
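The models in this list produce vectors of different sizes, so scripts that choose a model often need the matching dimension as well. A small lookup, sketched here with a hypothetical helper name (model_dimensions), encodes the list above:

```shell
#!/bin/sh
# Hypothetical helper: map a model ID from the list above to its vector dimension.
model_dimensions() {
    case "$1" in
        1|2) echo 768 ;;
        3|4) echo 384 ;;
        5)   echo 512 ;;
        *)   echo "unknown embedding model ID: $1" >&2; return 1 ;;
    esac
}

model_id=4   # the documented default
dims=$(model_dimensions "$model_id") || exit 1
echo "model ${model_id} produces ${dims}-dimensional vectors"
```

The helper returns a non-zero status for IDs outside 1 to 5, so a calling script can stop before passing an invalid value to --embedding-model.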
Download Commands
Download ONNX Runtime
Download the ONNX runtime required for embedding generation:
antarys download onnx
The runtime is automatically selected for your platform:
- macOS: Universal binary (Intel + ARM)
- Linux ARM64: ARM-optimized runtime
- Linux x64: x86-64 optimized runtime
Example Output:
Downloading ONNX runtime...
/home/user/.antarys/libonnxruntime.1.22.0.dylib
ONNX runtime downloaded successfully
If already downloaded:
ONNX runtime is already downloaded and available
Download Embedding Model
Download a specific embedding model:
antarys download embedding <model_id>
Examples:
# Download default model (ID: 4)
antarys download embedding
# Download BGE-Base-EN-v1.5
antarys download embedding 2
# Download Chinese model
antarys download embedding 5
Output:
Downloading embedding model...
[Progress bar]
Embedding model 2 downloaded successfully
List Available Models
View all embedding models and their status:
antarys download list-embeddings
Output:
Available embedding models:
ID  Name                    Dimensions  Downloaded  Description
--  ----                    ----------  ----------  -----------
1   fast-bge-base-en        768         No          Base English model
2   fast-bge-base-en-v1.5   768         Yes         Improved base model
3   fast-bge-small-en       384         No          Fast English model
4   fast-bge-small-en-v1.5  384         Yes         Default, fast and accurate
5   fast-bge-small-zh-v1.5  512         No          Chinese language model
Remove Cached Model
Delete a downloaded embedding model:
antarys download remove embedding <model_id>
Example:
antarys download remove embedding 2
Output:
Embedding model 2 (fast-bge-base-en-v1.5) removed successfully
Configuration Examples
Development Setup
Basic development server with embedding support:
# Download requirements
antarys download onnx
antarys download embedding 4
# Start server
antarys --enable-embedding --port 8080
Production Setup
High-performance production configuration:
antarys \
  --port 8080 \
  --data-dir /var/lib/antarys/data \
  --meta-dir /var/lib/antarys/metadata \
  --shards 32 \
  --query-threads 32 \
  --cache-size 100000 \
  --batch-size 10000 \
  --commit-interval 30 \
  --enable-hnsw \
  --enable-embedding \
  --embedding-model 4
GPU-Accelerated Setup
Maximum performance with GPU acceleration:
antarys \
  --enable-gpu \
  --optimization 3 \
  --shards 64 \
  --query-threads 64 \
  --cache-size 200000 \
  --enable-hnsw \
  --enable-embedding \
  --embedding-model 2
Memory-Optimized Setup
Configuration for memory-constrained environments:
antarys \
  --shards 8 \
  --query-threads 8 \
  --cache-size 5000 \
  --batch-size 2000 \
  --enable-pq \
  --enable-embedding \
  --embedding-model 3
Storage Locations
All cached files are stored in the user's home directory:
Directory Structure:
~/.antarys/
├── libonnxruntime.1.22.0.dylib    # ONNX runtime (macOS)
├── libonnxruntime_arm64.so        # ONNX runtime (Linux ARM64)
├── libonnxruntime_x64.so          # ONNX runtime (Linux x64)
├── fast-bge-base-en/              # Embedding model
│   ├── config.json
│   ├── model_optimized.onnx
│   ├── ort_config.json
│   ├── tokenizer.json
│   └── vocab.txt
└── fast-bge-small-en-v1.5/        # Embedding model
    ├── config.json
    ├── model_optimized.onnx
    └── ...
Environment Variables
While Antarys primarily uses command-line flags, you can set defaults using environment variables in wrapper scripts:
export ANTARYS_PORT=8080
export ANTARYS_DATA_DIR=/var/lib/antarys/data
export ANTARYS_ENABLE_EMBEDDING=true
Error Messages
Embedding Not Enabled
Error: Embedding generation is not enabled
Solution: Start the server with the --enable-embedding flag.
ONNX Runtime Not Found
Error: ONNX runtime not found. Please run the following command to download it:
antarys download onnx
Solution: Download the ONNX runtime before enabling embeddings.
Model Not Downloaded
Error: Embedding model 2 (fast-bge-base-en-v1.5) is not downloaded. Please run:
antarys download embedding 2
Solution: Download the required model.
Corrupted Files
Error: Corrupted ONNX runtime detected. Removing corrupt file...
Please run the following command to re-download ONNX runtime:
antarys download onnx
Solution: Re-download the corrupted file.
Validation on Startup
When embedding is enabled, Antarys performs automatic validation:
- Platform Check: Verifies OS is supported (macOS/Linux)
- Runtime Check: Validates ONNX runtime exists and is not corrupted
- Model Check: Validates embedding model exists and is not corrupted
- Compatibility Check: Ensures model works with ONNX runtime
If any check fails, the server will not start and will display clear instructions for resolution.
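For deployment scripts, a rough presence-only version of the runtime and model checks can run before the server is started. The sketch below is not the server's own validation: it only tests that the files exist and does not detect corruption or compatibility problems. File names are taken from the Storage Locations section; the helper names runtime_file and preflight are hypothetical.

```shell
#!/bin/sh
# Sketch of a presence-only preflight check. File names are taken from the
# Storage Locations section; the helper names are hypothetical.

# Print the expected runtime file name for an OS/architecture pair.
runtime_file() {
    case "$1/$2" in
        Darwin/*)                   echo "libonnxruntime.1.22.0.dylib" ;;
        Linux/aarch64|Linux/arm64)  echo "libonnxruntime_arm64.so" ;;
        Linux/x86_64)               echo "libonnxruntime_x64.so" ;;
        *)                          return 1 ;;
    esac
}

# Check that the runtime and the model file exist under the cache directory.
# $1 = cache directory, $2 = model directory name, $3 = OS, $4 = architecture
preflight() {
    rt=$(runtime_file "$3" "$4") || { echo "unsupported platform: $3/$4"; return 1; }
    [ -f "$1/$rt" ] || { echo "missing runtime: run 'antarys download onnx'"; return 1; }
    [ -f "$1/$2/model_optimized.onnx" ] || { echo "missing model: $2"; return 1; }
    echo "ok"
}

# Check the default model on the current machine; report but do not abort.
preflight "$HOME/.antarys" "fast-bge-small-en-v1.5" "$(uname -s)" "$(uname -m)" || true
```

Passing the OS and architecture as arguments keeps the function easy to exercise for platforms other than the one running the script.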
Performance Recommendations
For High Throughput:
- Increase --shards (32-64)
- Increase --query-threads (32-64)
- Increase --batch-size (10000+)
- Enable --enable-gpu if available
For Low Latency:
- Increase --cache-size (50000-100000)
- Enable --enable-hnsw
- Keep --commit-interval higher (60-120)
For Memory Efficiency:
- Enable --enable-pq
- Reduce --shards (8-16)
- Reduce --cache-size (5000-10000)
- Use smaller embedding models (ID: 3 or 4)
Important: Always download the ONNX runtime and the required embedding model before starting the server with --enable-embedding. The server will not start if required files are missing or corrupted.
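One way to follow this ordering consistently is a wrapper script that lists the download steps ahead of the start command. The sketch below also shows how the ANTARYS_* variables from the Environment Variables section might be translated into flags. It assumes the wrapper, not the server, reads those variables. ANTARYS_EMBEDDING_MODEL and the launch_plan function are hypothetical names introduced for illustration. The script only prints the commands.

```shell
#!/bin/sh
# Hypothetical wrapper: translate ANTARYS_* variables into flags and list the
# download steps before the start command. It prints the commands only.

launch_plan() {
    port=${ANTARYS_PORT:-8080}
    data_dir=${ANTARYS_DATA_DIR:-./data}
    model=${ANTARYS_EMBEDDING_MODEL:-4}   # assumed variable name, not from this reference
    cmd="antarys --port $port --data-dir $data_dir"
    if [ "${ANTARYS_ENABLE_EMBEDDING:-false}" = "true" ]; then
        # Downloads come first so the startup validation finds the files.
        echo "antarys download onnx"
        echo "antarys download embedding $model"
        cmd="$cmd --enable-embedding --embedding-model $model"
    fi
    echo "$cmd"
}

launch_plan
```

Review the printed plan, then run the lines in order or pipe them to a shell once they look right.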