| 1 | +## Infera's Configuration Guide |
| 2 | + |
| 3 | +Infera supports configuration via environment variables to customize its behavior without code changes. |
| 4 | + |
| 5 | +### Environment Variables |
| 6 | + |
| 7 | +#### Cache Configuration |
| 8 | + |
| 9 | +##### INFERA_CACHE_DIR |
| 10 | + |
| 11 | +- **Description**: Directory path for caching remote models |
| 12 | +- **Type**: String (path) |
| 13 | +- **Default**: `$TMPDIR/infera_cache` (system temp directory) |
| 14 | +- **Example**: |
| 15 | + ```bash |
| 16 | + export INFERA_CACHE_DIR="/var/cache/infera" |
| 17 | + ``` |
| 18 | + |
| 19 | +##### INFERA_CACHE_SIZE_LIMIT |
| 20 | + |
| 21 | +- **Description**: Maximum cache size in bytes |
| 22 | +- **Type**: Integer (bytes) |
| 23 | +- **Default**: `1073741824` (1GB) |
| 24 | +- **Example**: |
| 25 | + ```bash |
| 26 | + ## Set to 5GB |
| 27 | + export INFERA_CACHE_SIZE_LIMIT=5368709120 |
| 28 | + |
| 29 | + ## Set to 500MB |
| 30 | + export INFERA_CACHE_SIZE_LIMIT=524288000 |
| 31 | + ``` |
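
Byte counts like the ones above are easy to mistype, so one option is to let the shell compute them. This is a minimal sketch, assuming a shell with 64-bit arithmetic such as bash; the `gib` helper is a hypothetical convenience function and not part of Infera.

```bash
## Hypothetical helper (not part of Infera): convert whole GiB to bytes.
gib() {
  echo $(( $1 * 1024 * 1024 * 1024 ))
}

## Same value as the 5GB example above (5368709120 bytes)
export INFERA_CACHE_SIZE_LIMIT="$(gib 5)"
echo "INFERA_CACHE_SIZE_LIMIT=$INFERA_CACHE_SIZE_LIMIT"
```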
| 32 | + |
| 33 | +##### INFERA_CACHE_EVICTION |
| 34 | + |
| 35 | +- **Description**: Cache eviction strategy to use when cache is full |
| 36 | +- **Type**: String (`LRU`, `LFU`, `FIFO`) |
| 37 | +- **Default**: `LRU` (Least Recently Used) |
| 38 | +- **Example**: |
| 39 | + ```bash |
| 40 | + export INFERA_CACHE_EVICTION=LRU |
| 41 | + ## Note: Currently only LRU is implemented; LFU and FIFO are planned |
| 42 | + ``` |
| 43 | + |
| 44 | +#### HTTP Configuration |
| 45 | + |
| 46 | +##### INFERA_HTTP_TIMEOUT |
| 47 | + |
| 48 | +- **Description**: HTTP request timeout in seconds for downloading remote models |
| 49 | +- **Type**: Integer (seconds) |
| 50 | +- **Default**: `30` |
| 51 | +- **Example**: |
| 52 | + ```bash |
| 53 | + export INFERA_HTTP_TIMEOUT=60 |
| 54 | + ``` |
| 55 | + |
| 56 | +##### INFERA_HTTP_RETRY_ATTEMPTS |
| 57 | + |
| 58 | +- **Description**: Number of retry attempts for failed downloads |
| 59 | +- **Type**: Integer |
| 60 | +- **Default**: `3` |
| 61 | +- **Example**: |
| 62 | + ```bash |
| 63 | + ## Retry up to 5 times on failure |
| 64 | + export INFERA_HTTP_RETRY_ATTEMPTS=5 |
| 65 | + ``` |
| 66 | + |
| 67 | +##### INFERA_HTTP_RETRY_DELAY |
| 68 | + |
| 69 | +- **Description**: Initial delay between retry attempts in milliseconds (uses exponential backoff) |
| 70 | +- **Type**: Integer (milliseconds) |
| 71 | +- **Default**: `1000` (1 second) |
| 72 | +- **Example**: |
| 73 | + ```bash |
| 74 | + ## Wait 2 seconds before the first retry (later retries back off exponentially) |
| 75 | + export INFERA_HTTP_RETRY_DELAY=2000 |
| 76 | + ``` |
| 77 | + |
| 78 | +#### Logging Configuration |
| 79 | + |
| 80 | +##### INFERA_VERBOSE |
| 81 | + |
| 82 | +- **Description**: Enable verbose logging (deprecated; use `INFERA_LOG_LEVEL` instead) |
| 83 | +- **Type**: Boolean (`1`, `true`, or `0`, `false`) |
| 84 | +- **Default**: `false` |
| 85 | +- **Example**: |
| 86 | + ```bash |
| 87 | + export INFERA_VERBOSE=1 |
| 88 | + ``` |
| 89 | + |
| 90 | +##### INFERA_LOG_LEVEL |
| 91 | + |
| 92 | +- **Description**: Minimum severity of messages to log; each level also shows all more severe levels |
| 93 | +- **Type**: String (`ERROR`, `WARN`, `INFO`, `DEBUG`) |
| 94 | +- **Default**: `WARN` |
| 95 | +- **Example**: |
| 96 | + ```bash |
| 97 | + ## Show all messages including debug |
| 98 | + export INFERA_LOG_LEVEL=DEBUG |
| 99 | + |
| 100 | + ## Show only errors |
| 101 | + export INFERA_LOG_LEVEL=ERROR |
| 102 | + |
| 103 | + ## Show informational messages and above |
| 104 | + export INFERA_LOG_LEVEL=INFO |
| 105 | + ``` |
| 106 | + |
| 107 | +### Usage Examples |
| 108 | + |
| 109 | +#### Example 1: Custom Cache Directory |
| 110 | + |
| 111 | +```bash |
| 112 | +## Set custom cache directory |
| 113 | +export INFERA_CACHE_DIR="/mnt/fast-ssd/ml-cache" |
| 114 | + |
| 115 | +## Start DuckDB |
| 116 | +./build/release/duckdb |
| 117 | + |
| 118 | +## Check configuration (run these statements at the DuckDB prompt, not in the shell) |
| 119 | +SELECT infera_get_version(); |
| 120 | +SELECT infera_get_cache_info(); |
| 121 | +``` |
| 122 | + |
| 123 | +#### Example 2: Larger Cache for Big Models |
| 124 | + |
| 125 | +```bash |
| 126 | +## Set cache to 10GB for large models |
| 127 | +export INFERA_CACHE_SIZE_LIMIT=10737418240 |
| 128 | + |
| 129 | +## Start DuckDB, then load large models from remote URLs |
| 130 | +./build/release/duckdb |
| 131 | +``` |
| 132 | + |
| 133 | +#### Example 3: Production Configuration |
| 134 | + |
| 135 | +```bash |
| 136 | +## Complete production configuration |
| 137 | +export INFERA_CACHE_DIR="/var/lib/infera/cache" |
| 138 | +export INFERA_CACHE_SIZE_LIMIT=5368709120 ## 5GB |
| 139 | +export INFERA_HTTP_TIMEOUT=120 ## 2 minutes |
| 140 | +export INFERA_HTTP_RETRY_ATTEMPTS=5 ## Retry up to 5 times |
| 141 | +export INFERA_HTTP_RETRY_DELAY=2000 ## 2 second initial delay |
| 142 | +export INFERA_LOG_LEVEL=WARN ## Production logging |
| 143 | +export INFERA_CACHE_EVICTION=LRU ## LRU cache strategy |
| 144 | + |
| 145 | +## Run DuckDB with Infera |
| 146 | +./build/release/duckdb |
| 147 | +``` |
| 148 | + |
| 149 | +#### Example 4: Development/Debug Configuration |
| 150 | + |
| 151 | +```bash |
| 152 | +## Development setup with verbose logging |
| 153 | +export INFERA_CACHE_DIR="./dev-cache" |
| 154 | +export INFERA_LOG_LEVEL=DEBUG ## Detailed debug logs |
| 155 | +export INFERA_HTTP_TIMEOUT=10 ## Shorter timeout for dev |
| 156 | +export INFERA_HTTP_RETRY_ATTEMPTS=1 ## Fail fast in development |
| 157 | + |
| 158 | +## Run DuckDB |
| 159 | +./build/release/duckdb |
| 160 | +``` |
| 161 | + |
| 162 | +#### Example 5: Slow Network Configuration |
| 163 | + |
| 164 | +```bash |
| 165 | +## Configuration for slow or unreliable networks |
| 166 | +export INFERA_HTTP_TIMEOUT=300 ## 5 minute timeout |
| 167 | +export INFERA_HTTP_RETRY_ATTEMPTS=10 ## Many retries |
| 168 | +export INFERA_HTTP_RETRY_DELAY=5000 ## 5 second initial delay |
| 169 | +export INFERA_LOG_LEVEL=INFO ## Track download progress |
| 170 | + |
| 171 | +./build/release/duckdb |
| 172 | +``` |
| 173 | + |
| 174 | +### Configuration Verification |
| 175 | + |
| 176 | +You can verify your configuration at runtime: |
| 177 | + |
| 178 | +```sql |
| 179 | +-- Check version and cache directory |
| 180 | +SELECT infera_get_version(); |
| 181 | + |
| 182 | +-- Check cache statistics |
| 183 | +SELECT infera_get_cache_info(); |
| 184 | +``` |
| 185 | + |
| 186 | +Example output: |
| 187 | + |
| 188 | +```json |
| 189 | +{ |
| 190 | + "cache_dir": "/var/cache/infera", |
| 191 | + "total_size_bytes": 204800, |
| 192 | + "file_count": 3, |
| 193 | + "size_limit_bytes": 5368709120 |
| 194 | +} |
| 195 | +``` |
| 196 | + |
| 197 | +### Retry Policy Details |
| 198 | + |
| 199 | +When downloading remote models, Infera automatically retries failed downloads with exponential backoff: |
| 200 | + |
| 201 | +1. **Attempt 1**: Download immediately |
| 202 | +2. **Attempt 2**: Wait `INFERA_HTTP_RETRY_DELAY` milliseconds (e.g., 1 second) |
| 203 | +3. **Attempt 3**: Wait `INFERA_HTTP_RETRY_DELAY * 2` milliseconds (e.g., 2 seconds) |
| 204 | +4. **Attempt N**: Wait `INFERA_HTTP_RETRY_DELAY * 2^(N-2)` milliseconds |
| 205 | + |
| 206 | +This helps handle temporary network issues, server rate limiting, and transient failures. |
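
To get a feel for how long a full retry sequence can take, the schedule can be tabulated in the shell. This is only a sketch, assuming the delay doubles on every retry (which is what exponential backoff implies); `backoff_delay_ms` is a hypothetical helper, not an Infera function, and the actual implementation may cap or jitter the delay.

```bash
## Hypothetical helper (not part of Infera): print the wait in milliseconds
## before a given attempt, assuming the delay doubles on every retry.
backoff_delay_ms() {
  local base_ms="$1"
  local attempt="$2"
  if [ "$attempt" -le 1 ]; then
    echo 0    ## the first attempt runs immediately
  else
    echo $(( base_ms * (1 << (attempt - 2)) ))
  fi
}

base_ms="${INFERA_HTTP_RETRY_DELAY:-1000}"    ## documented default: 1000 ms
for attempt in 1 2 3 4 5; do
  echo "attempt $attempt: wait $(backoff_delay_ms "$base_ms" "$attempt") ms"
done
```

Under this assumption the waits grow quickly, so a large `INFERA_HTTP_RETRY_ATTEMPTS` combined with a large `INFERA_HTTP_RETRY_DELAY` can keep a failing download retrying for a long time.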
| 207 | + |
| 208 | +### Logging Levels |
| 209 | + |
| 210 | +Logging levels control the verbosity of output to stderr: |
| 211 | + |
| 212 | +- **ERROR**: Only critical errors that prevent operations |
| 213 | +- **WARN**: Warnings about potential issues (default) |
| 214 | +- **INFO**: Informational messages about operations (cache hits/misses, downloads) |
| 215 | +- **DEBUG**: Detailed debugging information (retry attempts, file sizes, etc.) |
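
The ordering of the levels above can be illustrated with a small shell sketch. It is a conceptual model only, not Infera's logging code; `level_rank` and `is_shown` are hypothetical helpers, and treating an unrecognized value like `WARN` is an assumption based on the documented fallback to defaults.

```bash
## Hypothetical helpers (not part of Infera) illustrating severity ordering.
level_rank() {
  case "$1" in
    ERROR) echo 0 ;;
    WARN)  echo 1 ;;
    INFO)  echo 2 ;;
    DEBUG) echo 3 ;;
    *)     echo 1 ;;   ## assumption: unknown values behave like the WARN default
  esac
}

## Succeeds when a message at level $1 is shown under configured level $2.
is_shown() {
  [ "$(level_rank "$1")" -le "$(level_rank "$2")" ]
}

configured="${INFERA_LOG_LEVEL:-WARN}"
for level in ERROR WARN INFO DEBUG; do
  if is_shown "$level" "$configured"; then
    echo "$level: shown"
  else
    echo "$level: hidden"
  fi
done
```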
| 216 | + |
| 217 | +Example log output with `INFERA_LOG_LEVEL=INFO`: |
| 218 | + |
| 219 | +``` |
| 220 | +[INFO] Cache miss for URL: https://example.com/model.onnx, downloading... |
| 221 | +[INFO] Successfully downloaded: https://example.com/model.onnx |
| 222 | +[INFO] Cache hit for URL: https://example.com/model.onnx |
| 223 | +``` |
| 224 | + |
| 225 | +Example log output with `INFERA_LOG_LEVEL=DEBUG`: |
| 226 | + |
| 227 | +``` |
| 228 | +[DEBUG] Download attempt 1/3 for https://example.com/model.onnx |
| 229 | +[INFO] Successfully downloaded: https://example.com/model.onnx |
| 230 | +[DEBUG] Downloaded file size: 15728640 bytes |
| 231 | +``` |
| 232 | + |
| 233 | +### Cache Eviction Strategies |
| 234 | + |
| 235 | +Currently implemented: |
| 236 | + |
| 237 | +- **LRU (Least Recently Used)**: Evicts files that haven't been accessed in the longest time |
| 238 | + |
| 239 | +Planned for future releases: |
| 240 | + |
| 241 | +- **LFU (Least Frequently Used)**: Evicts files with the lowest access count |
| 242 | +- **FIFO (First In First Out)**: Evicts oldest downloaded files first |
| 243 | + |
| 244 | +### Notes |
| 245 | + |
| 246 | +- Environment variables are read once when Infera initializes |
| 247 | +- Changes to environment variables require restarting DuckDB |
| 248 | +- Invalid values fall back to defaults (no errors thrown) |
| 249 | +- Cache directory is created automatically if it doesn't exist |
| 250 | +- LRU eviction happens automatically when cache limit is reached |
| 251 | +- Logging output goes to stderr and doesn't interfere with SQL query results |
| 252 | +- Retry delays use exponential backoff to handle rate limiting gracefully |
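
Because invalid values fall back to defaults without an error, a typo can go unnoticed. One way to catch it is a pre-flight check before starting DuckDB. The sketch below is an assumption-laden example, not part of Infera: `check_uint` is a hypothetical helper that only verifies the integer-typed variables contain digits, and it relies on bash indirect expansion.

```bash
## Hypothetical pre-flight check (not part of Infera): warn about
## integer-typed variables whose values are not non-negative integers.
check_uint() {
  local name="$1"
  local value="${!name-}"    ## bash indirect expansion; empty if unset
  if [ -z "$value" ]; then
    return 0                 ## unset or empty: the default applies
  fi
  case "$value" in
    *[!0-9]*)
      echo "warning: $name='$value' is not an integer" >&2
      return 1
      ;;
  esac
  return 0
}

status=0
for var in INFERA_CACHE_SIZE_LIMIT INFERA_HTTP_TIMEOUT \
           INFERA_HTTP_RETRY_ATTEMPTS INFERA_HTTP_RETRY_DELAY; do
  check_uint "$var" || status=1
done
echo "pre-flight status: $status"
```

A non-zero status means at least one variable would silently fall back to its default.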