Commit 412bf71

feat: add embedding model continuous batching scheduler (#564)
* feat: add embedding benchmark and implement continuous batching
  Signed-off-by: Huamin Chen <hchen@redhat.com>
* add embedding examples
  Signed-off-by: Huamin Chen <hchen@redhat.com>
* expose the batch api
  Signed-off-by: Huamin Chen <hchen@redhat.com>
* review feedback
  Signed-off-by: Huamin Chen <hchen@redhat.com>
* review feedback
  Signed-off-by: Huamin Chen <hchen@redhat.com>
* review feedback
  Signed-off-by: Huamin Chen <hchen@redhat.com>
* fix precommit
  Signed-off-by: Huamin Chen <hchen@redhat.com>
* review feedback
  Signed-off-by: Huamin Chen <hchen@redhat.com>

---------

Signed-off-by: Huamin Chen <hchen@redhat.com>
Signed-off-by: Huamin Chen <rootfs@users.noreply.github.com>
1 parent fcc4273 commit 412bf71

17 files changed: +4853 -48 lines changed

candle-binding/Cargo.lock

Lines changed: 10 additions & 0 deletions
Some generated files are not rendered by default.

candle-binding/Cargo.toml

Lines changed: 3 additions & 1 deletion
@@ -7,7 +7,8 @@ license = "MIT OR Apache-2.0"
 
 [lib]
 name = "candle_semantic_router"
-crate-type = ["staticlib", "cdylib", "rlib"]
+# Order: rlib for Rust-to-Rust linking, staticlib for C/Go FFI (static), cdylib for dynamic linking
+crate-type = ["rlib", "staticlib", "cdylib"]
 
 [features]
 default = ["cuda"]
@@ -38,6 +39,7 @@ rand = "0.8.5"
 rayon = "1.8"
 once_cell = "1.19" # Used for lazy initialization of global state
 parking_lot = "0.12"
+crossbeam-channel = "0.5" # Efficient multi-channel select for scheduler wakeup
 
 [dev-dependencies]
 rstest = "0.18"
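
For orientation, the kind of wakeup loop a batching scheduler needs can be sketched as follows. This is not the code from this commit: the scheduler sources are not shown here, and the names `EmbedRequest`, `collect_batch`, and `batch_sizes` are hypothetical. To stay dependency-free the sketch uses `std::sync::mpsc` with `recv_timeout` in place of the `crossbeam-channel` select added above, so it waits on one queue only. It collects requests until the batch is full or a deadline passes.

```rust
use std::sync::mpsc::{self, Receiver, RecvTimeoutError};
use std::time::{Duration, Instant};

/// Hypothetical request type: one text to embed.
#[derive(Debug, Clone, PartialEq)]
pub struct EmbedRequest {
    pub id: u64,
    pub text: String,
}

/// Block for the first request, then keep filling the batch until it holds
/// `max_batch` items or `max_wait` has elapsed since the first arrival.
/// Returns `None` once every sender is dropped and the queue is empty.
/// A `max_batch` of 0 behaves like 1, since the first request is always taken.
pub fn collect_batch(
    rx: &Receiver<EmbedRequest>,
    max_batch: usize,
    max_wait: Duration,
) -> Option<Vec<EmbedRequest>> {
    // Wait indefinitely for the first request; Err means all senders are gone.
    let first = rx.recv().ok()?;
    let deadline = Instant::now() + max_wait;
    let mut batch = vec![first];

    while batch.len() < max_batch {
        let remaining = deadline.saturating_duration_since(Instant::now());
        match rx.recv_timeout(remaining) {
            Ok(req) => batch.push(req),
            // Timeout or closed channel: ship the partial batch we have.
            Err(RecvTimeoutError::Timeout) | Err(RecvTimeoutError::Disconnected) => break,
        }
    }
    Some(batch)
}

/// Demo helper (hypothetical): queue `n` requests, close the channel, and
/// report the size of each batch the collector produces.
pub fn batch_sizes(n: u64, max_batch: usize) -> Vec<usize> {
    let (tx, rx) = mpsc::channel::<EmbedRequest>();
    for id in 0..n {
        tx.send(EmbedRequest { id, text: format!("doc {id}") }).unwrap();
    }
    drop(tx); // closing the channel lets the loop below terminate

    let mut sizes = Vec::new();
    while let Some(batch) = collect_batch(&rx, max_batch, Duration::from_millis(50)) {
        sizes.push(batch.len());
    }
    sizes
}
```

A multi-channel select would let the same loop also wake on a second source, such as a shutdown signal, without polling. That is presumably what the dependency comment refers to, though the diff shown here does not confirm how it is used.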
