Commit f68854b

The first revision (#11)
1 parent 59201cb commit f68854b

29 files changed, with 1817 additions and 110 deletions
Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
+---
+name: Bug report
+about: Create a report to help us improve
+title: ''
+labels: bug
+assignees: ''
+
+---
+
+**Describe the bug**
+A clear and concise description of what the bug is.
+
+**To Reproduce**
+Steps to reproduce the behavior:
+1.
+
+**Expected behavior**
+A clear and concise description of what you expected to happen.
+
+**Logs**
+If applicable, add logs to help explain your problem.
+
+**Additional context**
+Add any other context about the problem here.

.github/ISSUE_TEMPLATE/config.yml

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+blank_issues_enabled: true
Lines changed: 20 additions & 0 deletions
@@ -0,0 +1,20 @@
+---
+name: Feature request
+about: Suggest an idea for this project
+title: ''
+labels: enhancement
+assignees: ''
+
+---
+
+**Is your feature request related to a problem? Please describe.**
+A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]
+
+**Describe the solution you'd like**
+A clear and concise description of what you want to happen.
+
+**Describe alternatives you've considered**
+A clear and concise description of any alternative solutions or features you've considered.
+
+**Additional context**
+Add any other context or screenshots about the feature request here.

.github/workflows/dist_pipeline.yml

Lines changed: 36 additions & 5 deletions
@@ -20,13 +20,44 @@ jobs:
       ci_tools_version: main
       extension_name: infera
       enable_rust: true
-      exclude_archs: "linux_amd64_musl;windows_amd64_mingw;osx_amd64;wasm_mvp;wasm_eh;wasm_threads"
+      exclude_archs: "windows_amd64_mingw;osx_amd64;wasm_mvp;wasm_eh;wasm_threads"

   duckdb-stable-build:
-    uses: duckdb/extension-ci-tools/.github/workflows/_extension_distribution.yml@v1.4.0
+    uses: duckdb/extension-ci-tools/.github/workflows/_extension_distribution.yml@v1.4.1
     with:
-      duckdb_version: v1.4.0
-      ci_tools_version: v1.4.0
+      duckdb_version: v1.4.1
+      ci_tools_version: v1.4.1
       extension_name: infera
       enable_rust: true
-      exclude_archs: "linux_amd64_musl;windows_amd64_mingw;osx_amd64;wasm_mvp;wasm_eh;wasm_threads"
+      exclude_archs: "windows_amd64_mingw;osx_amd64;wasm_mvp;wasm_eh;wasm_threads"
+
+  create-release-draft:
+    name: Create Draft Release with Built Binaries
+    needs:
+      - duckdb-stable-build
+    if: startsWith(github.ref, 'refs/tags/')
+    runs-on: ubuntu-latest
+    permissions:
+      contents: write
+    steps:
+      - name: Download All Build Artifacts
+        uses: actions/download-artifact@v4
+        with:
+          path: dist
+          merge-multiple: true
+      - name: List Artifacts
+        run: |
+          echo "Downloaded artifacts to: $(pwd)/dist"
+          ls -la dist || true
+          find dist -type f -maxdepth 2 -print || true
+      - name: Create Draft Release and Upload Assets
+        uses: softprops/action-gh-release@v2
+        with:
+          draft: true
+          name: Infera ${{ github.ref_name }}
+          tag_name: ${{ github.ref_name }}
+          generate_release_notes: true
+          files: |
+            dist/**
+        env:
+          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

.pre-commit-config.yaml

Lines changed: 1 addition & 0 deletions
@@ -1,5 +1,6 @@
 default_stages: [ pre-push ]
 fail_fast: false
+exclude: '(^external/)'

 repos:
 - repo: https://github.com/pre-commit/pre-commit-hooks

Makefile

Lines changed: 0 additions & 1 deletion
@@ -6,7 +6,6 @@ EXT_NAME := infera
 RUST_LIB := infera/target/release/$(EXT_NAME).a
 DUCKDB_SRCDIR := ./external/duckdb/
 EXT_CONFIG := ${PROJ_DIR}extension_config.cmake
-TESTS_DIR := tests
 EXAMPLES_DIR := docs/examples
 SHELL := /bin/bash
 PYTHON := python3

README.md

Lines changed: 1 addition & 0 deletions
@@ -61,6 +61,7 @@ See the [ROADMAP.md](ROADMAP.md) for the list of implemented and planned feature
 git clone --recursive https://github.com/CogitatorTech/infera.git
 cd infera

+# This might take a while to run
 make release
 ```

ROADMAP.md

Lines changed: 3 additions & 2 deletions
@@ -11,13 +11,13 @@ It outlines features to be implemented and their current status.
 * **Input Data Types**
 * [x] `FLOAT` features from table columns.
 * [x] Type casting from `INTEGER`, `BIGINT`, and `DOUBLE` columns.
+* [x] Type casting from `DECIMAL` columns.
 * [x] `BLOB` input for tensor data.
 * [ ] `STRUCT` or `MAP` input for named features.
 * **Output Data Types**
 * [x] Single `FLOAT` scalar output.
 * [x] Multiple `FLOAT` outputs as a `VARCHAR` containing JSON.
 * [x] Multiple `FLOAT` outputs as a `LIST[FLOAT]`.
-* [ ] Return multiple outputs as a `STRUCT`.
 * **Batch Processing**
 * [x] Inference on batches for models with dynamic dimensions.
 * [ ] Automatic batch splitting for models with a fixed batch size.
@@ -32,7 +32,8 @@ It outlines features to be implemented and their current status.
 * [x] Unload models from memory.
 * [x] List loaded models.
 * [x] Get model metadata as a JSON object.
-* [ ] Cache eviction policies for remote models.
+* [x] Check if a model is currently loaded.
+* [x] Cache eviction policies for remote models.

 ### 3. Performance and Concurrency

docs/CONFIGURATION.md

Lines changed: 252 additions & 0 deletions
@@ -0,0 +1,252 @@
+## Infera's Configuration Guide
+
+Infera supports configuration via environment variables to customize its behavior without code changes.
+
+### Environment Variables
+
+#### Cache Configuration
+
+##### INFERA_CACHE_DIR
+
+- **Description**: Directory path for caching remote models
+- **Type**: String (path)
+- **Default**: `$TMPDIR/infera_cache` (system temp directory)
+- **Example**:
+```bash
+export INFERA_CACHE_DIR="/var/cache/infera"
+```
+
+##### INFERA_CACHE_SIZE_LIMIT
+
+- **Description**: Maximum cache size in bytes
+- **Type**: Integer (bytes)
+- **Default**: `1073741824` (1GB)
+- **Example**:
+```bash
+## Set to 5GB
+export INFERA_CACHE_SIZE_LIMIT=5368709120
+
+## Set to 500MB
+export INFERA_CACHE_SIZE_LIMIT=524288000
+```
+
+##### INFERA_CACHE_EVICTION
+
+- **Description**: Cache eviction strategy to use when the cache is full
+- **Type**: String (`LRU`, `LFU`, `FIFO`)
+- **Default**: `LRU` (Least Recently Used)
+- **Example**:
+```bash
+export INFERA_CACHE_EVICTION=LRU
+## Note: Currently only LRU is implemented; LFU and FIFO are planned
+```
+
+#### HTTP Configuration
+
+##### INFERA_HTTP_TIMEOUT
+
+- **Description**: HTTP request timeout in seconds for downloading remote models
+- **Type**: Integer (seconds)
+- **Default**: `30`
+- **Example**:
+```bash
+export INFERA_HTTP_TIMEOUT=60
+```
+
+##### INFERA_HTTP_RETRY_ATTEMPTS
+
+- **Description**: Number of retry attempts for failed downloads
+- **Type**: Integer
+- **Default**: `3`
+- **Example**:
+```bash
+## Retry up to 5 times on failure
+export INFERA_HTTP_RETRY_ATTEMPTS=5
+```
+
+##### INFERA_HTTP_RETRY_DELAY
+
+- **Description**: Initial delay between retry attempts in milliseconds (uses exponential backoff)
+- **Type**: Integer (milliseconds)
+- **Default**: `1000` (1 second)
+- **Example**:
+```bash
+## Wait 2 seconds before the first retry
+export INFERA_HTTP_RETRY_DELAY=2000
+```
+
+#### Logging Configuration
+
+##### INFERA_VERBOSE
+
+- **Description**: Enable verbose logging (deprecated, use INFERA_LOG_LEVEL instead)
+- **Type**: Boolean (`1`, `true`, or `0`, `false`)
+- **Default**: `false`
+- **Example**:
+```bash
+export INFERA_VERBOSE=1
+```
+
+##### INFERA_LOG_LEVEL
+
+- **Description**: Set logging level for detailed output
+- **Type**: String (`ERROR`, `WARN`, `INFO`, `DEBUG`)
+- **Default**: `WARN`
+- **Example**:
+```bash
+## Show all messages including debug
+export INFERA_LOG_LEVEL=DEBUG
+
+## Show only errors
+export INFERA_LOG_LEVEL=ERROR
+
+## Show informational messages and above
+export INFERA_LOG_LEVEL=INFO
+```
+
+### Usage Examples
+
+#### Example 1: Custom Cache Directory
+
+```bash
+## Set custom cache directory
+export INFERA_CACHE_DIR="/mnt/fast-ssd/ml-cache"
+
+## Start DuckDB
+./build/release/duckdb
+
+## Check configuration (run these statements inside the DuckDB shell, not in bash)
+SELECT infera_get_version();
+SELECT infera_get_cache_info();
+```
+
+#### Example 2: Larger Cache for Big Models
+
+```bash
+## Set cache to 10GB for large models
+export INFERA_CACHE_SIZE_LIMIT=10737418240
+
+## Load large models from remote URLs
+./build/release/duckdb
+```
+
+#### Example 3: Production Configuration
+
+```bash
+## Complete production configuration
+export INFERA_CACHE_DIR="/var/lib/infera/cache"
+export INFERA_CACHE_SIZE_LIMIT=5368709120 ## 5GB
+export INFERA_HTTP_TIMEOUT=120 ## 2 minutes
+export INFERA_HTTP_RETRY_ATTEMPTS=5 ## Retry up to 5 times
+export INFERA_HTTP_RETRY_DELAY=2000 ## 2 second initial delay
+export INFERA_LOG_LEVEL=WARN ## Production logging
+export INFERA_CACHE_EVICTION=LRU ## LRU cache strategy
+
+## Run DuckDB with Infera
+./build/release/duckdb
+```
+
+#### Example 4: Development/Debug Configuration
+
+```bash
+## Development setup with verbose logging
+export INFERA_CACHE_DIR="./dev-cache"
+export INFERA_LOG_LEVEL=DEBUG ## Detailed debug logs
+export INFERA_HTTP_TIMEOUT=10 ## Shorter timeout for dev
+export INFERA_HTTP_RETRY_ATTEMPTS=1 ## Fail fast in development
+
+## Run DuckDB
+./build/release/duckdb
+```
+
+#### Example 5: Slow Network Configuration
+
+```bash
+## Configuration for slow or unreliable networks
+export INFERA_HTTP_TIMEOUT=300 ## 5 minute timeout
+export INFERA_HTTP_RETRY_ATTEMPTS=10 ## Many retries
+export INFERA_HTTP_RETRY_DELAY=5000 ## 5 second initial delay
+export INFERA_LOG_LEVEL=INFO ## Track download progress
+
+./build/release/duckdb
+```
+
+### Configuration Verification
+
+You can verify your configuration at runtime:
+
+```sql
+-- Check version and cache directory
+SELECT infera_get_version();
+
+-- Check cache statistics
+SELECT infera_get_cache_info();
+```
+
+Example output:
+
+```json
+{
+"cache_dir": "/var/cache/infera",
+"total_size_bytes": 204800,
+"file_count": 3,
+"size_limit_bytes": 5368709120
+}
+```
+
+### Retry Policy Details
+
+When downloading remote models, Infera automatically retries failed downloads with exponential backoff:
+
+1. **Attempt 1**: Download immediately
+2. **Attempt 2**: Wait `INFERA_HTTP_RETRY_DELAY` milliseconds (e.g., 1 second)
+3. **Attempt 3**: Wait `INFERA_HTTP_RETRY_DELAY * 2` milliseconds (e.g., 2 seconds)
+4. **Attempt N**: Wait `INFERA_HTTP_RETRY_DELAY * 2^(N-2)` milliseconds
+
+This helps handle temporary network issues, server rate limiting, and transient failures.
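
Reviewer sketch (not part of the commit): the retry schedule described above can be worked through in a few lines of Python. This assumes the delay doubles after each failed attempt, which is one reading of "exponential backoff"; the extension's actual schedule may differ. The function name `retry_delays_ms` is hypothetical and exists only for illustration.

```python
def retry_delays_ms(initial_delay_ms: int, attempts: int) -> list[int]:
    """Return the wait (in ms) before each attempt, assuming the delay doubles.

    Hypothetical helper: the first attempt runs immediately, the second
    waits `initial_delay_ms`, and each later attempt waits twice as long
    as the one before it.
    """
    delays = [0]  # attempt 1: no wait
    for retry in range(attempts - 1):
        delays.append(initial_delay_ms * 2 ** retry)
    return delays[:attempts]


# With INFERA_HTTP_RETRY_DELAY=1000 and four attempts in total:
print(retry_delays_ms(1000, 4))  # [0, 1000, 2000, 4000]
```

Under this assumption the total worst-case wait grows quickly, so a large `INFERA_HTTP_RETRY_ATTEMPTS` combined with a large initial delay can stall a query for a long time.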
+
+### Logging Levels
+
+Logging levels control the verbosity of output to stderr:
+
+- **ERROR**: Only critical errors that prevent operations
+- **WARN**: Warnings about potential issues (default)
+- **INFO**: Informational messages about operations (cache hits/misses, downloads)
+- **DEBUG**: Detailed debugging information (retry attempts, file sizes, etc.)
+
+Example log output with `INFERA_LOG_LEVEL=INFO`:
+
+```
+[INFO] Cache miss for URL: https://example.com/model.onnx, downloading...
+[INFO] Successfully downloaded: https://example.com/model.onnx
+[INFO] Cache hit for URL: https://example.com/model.onnx
+```
+
+Example log output with `INFERA_LOG_LEVEL=DEBUG`:
+
+```
+[DEBUG] Download attempt 1/3 for https://example.com/model.onnx
+[INFO] Successfully downloaded: https://example.com/model.onnx
+[DEBUG] Downloaded file size: 15728640 bytes
+```
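
Reviewer sketch (not part of the commit): the level list above implies a threshold filter, where a message is printed only if its level is no more verbose than the configured one. The Python below illustrates that rule; `LEVELS` and `should_log` are hypothetical names, and the real extension may implement filtering differently.

```python
# Levels ordered from least to most verbose, matching the list in the guide.
LEVELS = ["ERROR", "WARN", "INFO", "DEBUG"]


def should_log(message_level: str, configured_level: str = "WARN") -> bool:
    """Return True if a message at `message_level` passes the configured threshold."""
    return LEVELS.index(message_level) <= LEVELS.index(configured_level)


# With INFERA_LOG_LEVEL=DEBUG, INFO messages still appear (as in the sample output).
print(should_log("INFO", "DEBUG"))   # True
# With INFERA_LOG_LEVEL=INFO, DEBUG messages are suppressed.
print(should_log("DEBUG", "INFO"))   # False
```

This is consistent with the sample `DEBUG` output, where `[INFO]` lines appear alongside `[DEBUG]` lines.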
+
+### Cache Eviction Strategies
+
+Currently implemented:
+
+- **LRU (Least Recently Used)**: Evicts files that haven't been accessed in the longest time
+
+Planned for future releases:
+
+- **LFU (Least Frequently Used)**: Evicts files with the lowest access count
+- **FIFO (First In First Out)**: Evicts oldest downloaded files first
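
Reviewer sketch (not part of the commit): a minimal in-memory model of size-bounded LRU eviction, to make the strategy above concrete. It tracks only file names and sizes; the class name `LruFileCache` and its methods are hypothetical and do not reflect how the extension tracks access times on disk.

```python
from collections import OrderedDict


class LruFileCache:
    """Toy model of a size-limited cache that evicts least recently used files."""

    def __init__(self, size_limit_bytes: int) -> None:
        self.size_limit_bytes = size_limit_bytes
        self.total_size_bytes = 0
        self._files = OrderedDict()  # file name -> size, least recently used first

    def touch(self, name: str) -> bool:
        """Mark a cached file as just used; return False on a cache miss."""
        if name not in self._files:
            return False
        self._files.move_to_end(name)
        return True

    def add(self, name: str, size_bytes: int) -> list[str]:
        """Record a downloaded file and return the names evicted to make room."""
        evicted = []
        if name in self._files:  # replacing an existing entry
            self.total_size_bytes -= self._files.pop(name)
        while self._files and self.total_size_bytes + size_bytes > self.size_limit_bytes:
            old_name, old_size = self._files.popitem(last=False)
            self.total_size_bytes -= old_size
            evicted.append(old_name)
        self._files[name] = size_bytes
        self.total_size_bytes += size_bytes
        return evicted


cache = LruFileCache(size_limit_bytes=100)
cache.add("a.onnx", 40)
cache.add("b.onnx", 40)
cache.touch("a.onnx")            # "a.onnx" is now the most recently used
print(cache.add("c.onnx", 40))   # ['b.onnx']
```

An LFU variant would keep an access counter per file instead of an ordering, and FIFO would simply never reorder on access.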
+
+### Notes
+
+- Environment variables are read once when Infera initializes
+- Changes to environment variables require restarting DuckDB
+- Invalid values fall back to defaults (no errors thrown)
+- Cache directory is created automatically if it doesn't exist
+- LRU eviction happens automatically when cache limit is reached
+- Logging output goes to stderr and doesn't interfere with SQL query results
+- Retry delays use exponential backoff to handle rate limiting gracefully
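
Reviewer sketch (not part of the commit): the note that invalid values fall back to defaults without raising can be illustrated as below. `env_int` is a hypothetical helper; the variable name and default come from the guide, but the extension's own parsing rules (for example, how it treats negative numbers) are not shown in this diff and may differ.

```python
import os


def env_int(name: str, default: int, environ=None) -> int:
    """Read an integer setting, silently falling back to `default` if unset or invalid."""
    env = os.environ if environ is None else environ
    raw = env.get(name)
    if raw is None:
        return default
    try:
        return int(raw)
    except ValueError:
        return default  # invalid value: use the default, raise nothing


# A dict stands in for the process environment so the example is reproducible.
print(env_int("INFERA_HTTP_TIMEOUT", 30, {"INFERA_HTTP_TIMEOUT": "60"}))   # 60
print(env_int("INFERA_HTTP_TIMEOUT", 30, {"INFERA_HTTP_TIMEOUT": "fast"})) # 30
```

Silent fallback keeps startup robust, but it also means a typo in a value goes unnoticed, so checking `infera_get_cache_info()` after setting variables is worthwhile.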
