> **Note:** If you try to use image generation or editing without `mflux` installed, you'll receive a clear error message directing you to install it manually.
#### Enhanced Caching Support

For enhanced caching and performance when working with complex ML models and objects, install with the enhanced-caching extra:

```bash
# Install with enhanced caching support
pip install mlx-openai-server[enhanced-caching]
```

This enables better serialization and caching of objects from:
- spaCy (NLP processing)
- regex (regular expressions)
- tiktoken (tokenization)
- torch (PyTorch tensors and models)
- transformers (Hugging Face models)
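The reason these libraries need special support is that many of their objects do not serialize cleanly by default. As a rough illustration of the underlying idea only (a sketch, not the package's actual implementation; `cache_key` is a hypothetical helper), a content-addressed cache key can be derived by serializing an object and hashing the resulting bytes:

```python
import hashlib
import pickle
import re


def cache_key(obj) -> str:
    """Build a stable cache key by serializing an object and hashing the bytes.

    This only works for objects that pickle deterministically; the
    enhanced-caching extra's job is to make more ML objects behave that way.
    """
    return hashlib.sha256(pickle.dumps(obj)).hexdigest()


# Compiled regex patterns are picklable in the standard library,
# so the same pattern always yields the same key.
key = cache_key(re.compile(r"\d+"))
```

Objects that lack pickle support (some tokenizers, model handles, and tensors) would raise here, which is why library-specific serialization hooks are needed for caching to cover them.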
#### Whisper Models Support

For whisper models to work properly, you need to install ffmpeg:
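For example, via the standard package managers on common platforms:

```shell
# macOS (Homebrew)
brew install ffmpeg

# Debian/Ubuntu
sudo apt-get install ffmpeg
```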
    """Top-level Click command group for the MLX server CLI.

    Subcommands (such as ``launch``) are registered on this group and
    invoked by the console entry point.
    """
    pass


@cli.command(help="Start the MLX OpenAI Server with the supplied flags")
@click.option(
    "--model-path",
    required=True,
    help="Path to the model (required for lm, multimodal, embeddings, image-generation, image-edit, whisper model types). Can be a local path or Hugging Face repository ID (e.g., 'blackforestlabs/FLUX.1-dev').",
)
@click.option(
    "--model-type",
    # ... additional @click.option declarations elided ...
    help="Path to a custom chat template file. Only works with language models (lm) and multimodal models.",
)
def launch(
    model_path: str,
    model_type: str,
    context_length: int,
    port: int,
    host: str,
    max_concurrency: int,
    queue_timeout: int,
    queue_size: int,
    quantize: int,
    config_name: str | None,
    lora_paths: str | None,
    lora_scales: str | None,
    disable_auto_resize: bool,
    log_file: str | None,
    no_log_file: bool,
    log_level: str,
    enable_auto_tool_choice: bool,
    tool_call_parser: str | None,
    reasoning_parser: str | None,
    trust_remote_code: bool,
    chat_template_file: str | None,
) -> None:
    """Start the FastAPI/Uvicorn server with the supplied flags.

    The command builds a server configuration object using
    ``MLXServerConfig`` and then calls the async ``start`` routine
    which handles the event loop and server lifecycle.

    Parameters
    ----------
    model_path : str
        Path to the model (required for lm, multimodal, embeddings, image-generation, image-edit, whisper model types).
    model_type : str
        Type of model to run (lm, multimodal, image-generation, image-edit, embeddings, whisper).