Skip to content

Conversation

@yuluo-yx
Copy link
Contributor

Signed-off-by: yuluo-yx <yuluo08290126@gmail.com>
Signed-off-by: yuluo-yx <yuluo08290126@gmail.com>
Signed-off-by: yuluo-yx <yuluo08290126@gmail.com>
@netlify
Copy link

netlify bot commented Oct 24, 2025

Deploy Preview for vllm-semantic-router ready!

Name Link
🔨 Latest commit da52893
🔍 Latest deploy log https://app.netlify.com/projects/vllm-semantic-router/deploys/690f52e6b6d631000870529a
😎 Deploy Preview https://deploy-preview-532--vllm-semantic-router.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify project configuration.

@github-actions
Copy link

github-actions bot commented Oct 24, 2025

👥 vLLM Semantic Team Notification

The following members have been identified for the changed files in this PR and have been automatically assigned:

📁 Root Directory

Owners: @rootfs, @Xunzhuo
Files changed:

  • .github/workflows/helm-ci.yml
  • Makefile

📁 deploy

Owners: @rootfs, @Xunzhuo
Files changed:

  • deploy/helm/README.md
  • deploy/helm/semantic-router/.helmignore
  • deploy/helm/semantic-router/Chart.yaml
  • deploy/helm/semantic-router/README.md
  • deploy/helm/semantic-router/templates/NOTES.txt
  • deploy/helm/semantic-router/templates/_helpers.tpl
  • deploy/helm/semantic-router/templates/configmap.yaml
  • deploy/helm/semantic-router/templates/deployment.yaml
  • deploy/helm/semantic-router/templates/hpa.yaml
  • deploy/helm/semantic-router/templates/ingress.yaml
  • deploy/helm/semantic-router/templates/namespace.yaml
  • deploy/helm/semantic-router/templates/pvc.yaml
  • deploy/helm/semantic-router/templates/service.yaml
  • deploy/helm/semantic-router/templates/serviceaccount.yaml
  • deploy/helm/semantic-router/values-dev.yaml
  • deploy/helm/semantic-router/values-example.yaml
  • deploy/helm/semantic-router/values-prod.yaml
  • deploy/helm/semantic-router/values.yaml
  • deploy/helm/validate-chart.sh

📁 tools

Owners: @yuluo-yx, @rootfs, @Xunzhuo
Files changed:

  • tools/make/helm.mk
  • tools/kind/generate-kind-config.sh
  • tools/linter/yaml/.yamllint

vLLM

🎉 Thanks for your contributions!

This comment was automatically generated based on the OWNER files in the repository.

@rootfs rootfs requested a review from Copilot October 24, 2025 16:35
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds comprehensive Helm chart support for deploying Semantic Router on Kubernetes, providing an alternative to the existing Kustomize deployment method. The implementation includes production and development configurations, validation tooling, and extensive Make target automation.

Key Changes

  • Added complete Helm chart structure with templates for all Kubernetes resources (Deployment, Service, ConfigMap, PVC, Ingress, HPA, etc.)
  • Introduced environment-specific values files (dev, prod, example) with optimized configurations for different deployment scenarios
  • Integrated Helm deployment automation through Make targets and validation scripts

Reviewed Changes

Copilot reviewed 22 out of 22 changed files in this pull request and generated 22 comments.

Show a summary per file
File Description
tools/make/helm.mk Comprehensive Make targets for Helm operations including install, upgrade, testing, and port-forwarding
tools/make/linter.mk Removed documentation linting targets (likely relocated or obsolete)
deploy/helm/semantic-router/Chart.yaml Helm chart metadata and project information
deploy/helm/semantic-router/values.yaml Default configuration values for the Helm deployment
deploy/helm/semantic-router/values-dev.yaml Development environment optimized values
deploy/helm/semantic-router/values-prod.yaml Production environment optimized values with HA setup
deploy/helm/semantic-router/values-example.yaml Example configuration demonstrating customization options
deploy/helm/semantic-router/templates/*.yaml Kubernetes resource templates for deployment infrastructure
deploy/helm/semantic-router/templates/_helpers.tpl Helm template helper functions
deploy/helm/validate-chart.sh Automated validation script for chart testing
deploy/helm/README.md Comprehensive deployment guide with examples
deploy/helm/semantic-router/README.md Detailed chart documentation
Makefile Integration of helm.mk into main build system
Comments suppressed due to low confidence (1)

deploy/helm/semantic-router/values.yaml:1

  • Inconsistent model path format. Line 218 uses 'models/all-MiniLM-L12-v2' (with 'models/' prefix) while line 57 uses 'sentence-transformers/all-MiniLM-L12-v2' (repository format). These should be consistent - line 218 should match the repository format used for downloading.
# Default values for semantic-router.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@yuluo-yx yuluo-yx marked this pull request as draft October 24, 2025 16:55
@yuluo-yx yuluo-yx requested a review from Copilot October 25, 2025 09:44
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 22 out of 22 changed files in this pull request and generated 11 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@nithin8702
Copy link

Hi @yuluo-yx We would like to use semantic router with nginx ingress. Is that possible with your PR?
#557

@yuluo-yx
Copy link
Contributor Author

yuluo-yx commented Nov 1, 2025

If all goes well, it should be possible to do this. I've been quite busy lately, but I'll test it locally this weekend and prepare for the merge. 👀

when I'm ready to merge, pls review the code if you have time. thx @nithin8702

@nithin8702
Copy link

@yuluo-yx Could you please confirm your PR works with nginx ingress or envoy ai gateway?

Is there a way i can test your helm chart now?

Also there is a chat going on in Slack. Please check
https://vllm-dev.slack.com/archives/C09CTGF8KCN/p1761751362685129

@yuluo-yx
Copy link
Contributor Author

yuluo-yx commented Nov 1, 2025

@yuluo-yx Could you please confirm your PR works with nginx ingress or envoy ai gateway?

Is there a way i can test your helm chart now?

Also there is a chat going on in Slack. Please check https://vllm-dev.slack.com/archives/C09CTGF8KCN/p1761751362685129

you can use tools/make/helm.mk related make target test. I wrote some comment

Signed-off-by: jishiwen.jsw <jishiwen.jsw@digital-engine.com>
@yuluo-yx yuluo-yx marked this pull request as ready for review November 1, 2025 05:41
@yuluo-yx yuluo-yx changed the title [WIP] feat: add helm support deploy support feat: add helm support deploy support Nov 3, 2025
@rootfs rootfs requested a review from Copilot November 3, 2025 14:53
@rootfs
Copy link
Collaborator

rootfs commented Nov 3, 2025

@yuluo-yx this is great! would you please add a CI test too, either in this PR or in a followup PR. Thanks

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 21 out of 21 changed files in this pull request and generated 5 comments.

Comments suppressed due to low confidence (1)

deploy/helm/semantic-router/values.yaml:1

  • Setting runAsNonRoot: false in default values is a security concern. This allows the container to run as root user, which contradicts the best practice shown in production values. The default should be true to enforce running as non-root unless explicitly overridden for specific use cases.
# Default values for semantic-router.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Signed-off-by: yuluo-yx <yuluo08290126@gmail.com>
Signed-off-by: yuluo-yx <yuluo08290126@gmail.com>
Copy link
Member

@Xunzhuo Xunzhuo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you make sure CI passed, thanks!

Signed-off-by: yuluo-yx <yuluo08290126@gmail.com>
Signed-off-by: yuluo-yx <yuluo08290126@gmail.com>
Signed-off-by: yuluo-yx <yuluo08290126@gmail.com>
Signed-off-by: yuluo-yx <yuluo08290126@gmail.com>
Signed-off-by: yuluo-yx <yuluo08290126@gmail.com>
@yuluo-yx
Copy link
Contributor Author

yuluo-yx commented Nov 5, 2025

I'm encountering some issues while testing Helm in a CI environment. Specifically, model downloads are failing or timed out, and I've tried using CI caching, but it doesn't seem to be working. Therefore, I'm putting this as a draft for now. Once I resolve the CI issues, I'll move this PR to approval status, and I'll complete it by the weekend.

@yuluo-yx yuluo-yx marked this pull request as draft November 5, 2025 16:05
@rootfs
Copy link
Collaborator

rootfs commented Nov 5, 2025

looks the model downloading is not finished

  Downloading all-MiniLM-L12-v2 from sentence-transformers/all-MiniLM-L12-v2...
  /usr/local/lib/python3.11/site-packages/huggingface_hub/utils/_validators.py:202: UserWarning: The `local_dir_use_symlinks` argument is deprecated and ignored in `snapshot_download`. Downloading to a local directory does not use symlinks anymore.
    warnings.warn(
  
  Fetching 20 files:   0%|          | 0/20 [00:00<?, ?it/s]
  Fetching 20 files:   5%|▌         | 1/20 [00:00<00:07,  2.55it/s]
  Fetching 20 files:  10%|█         | 2/20 [00:00<00:05,  3.56it/s]
  Fetching 20 files:  35%|███▌      | 7/20 [00:03<00:06,  1.97it/s]
  Fetching 20 files: 100%|██████████| 20/20 [00:03<00:00,  5.73it/s]
  Checking all-MiniLM-L12-v2 for required files:
  ✓ Found PyTorch model weights in all-MiniLM-L12-v2
  Downloading category_classifier_modernbert-base_model from LLM-Semantic-Router/category_classifier_modernbert-base_model...
  /usr/local/lib/python3.11/site-packages/huggingface_hub/utils/_validators.py:202: UserWarning: The `local_dir_use_symlinks` argument is deprecated and ignored in `snapshot_download`. Downloading to a local directory does not use symlinks anymore.
    warnings.warn(
  
  Fetching 9 files:   0%|          | 0/9 [00:00<?, ?it/s]
  Fetching 9 files:  11%|█         | 1/9 [00:00<00:02,  3.26it/s]/bin/bash: line 27:    43 Killed                  python -c "from huggingface_hub import snapshot_download; snapshot_download(repo_id='LLM-Semantic-Router/category_classifier_modernbert-base_model', local_dir='category_classifier_modernbert-base_model', local_dir_use_symlinks=False, ignore_patterns=['*.onnx', '*.msgpack', '*.h5', '*.tflite'] if 'category_classifier_modernbert-base_model' == 'all-MiniLM-L12-v2' else None)"
  Error from server (BadRequest): container "semantic-router" in pod "semantic-router-7fcc4cf984-88chb" is waiting to start: PodInitializing

@yuluo-yx would it be ok to create a base container image to have all the models and build extproc on that base container, so we don't have to wait for model download

Signed-off-by: yuluo-yx <yuluo08290126@gmail.com>
@yuluo-yx
Copy link
Contributor Author

yuluo-yx commented Nov 6, 2025

This fine, but I don't think the network speed in the GitHub container should be this slow. I want to find out why. Once I find the reason, I can back up a copy containing the basic model as an optimization.

@rootfs
Copy link
Collaborator

rootfs commented Nov 6, 2025

This fine, but I don't think the network speed in the GitHub container should be this slow. I want to find out why. Once I find the reason, I can back up a copy containing the basic model as an optimization.

sounds good! let's find our options based on the CI results.

Signed-off-by: yuluo-yx <yuluo08290126@gmail.com>
@yuluo-yx
Copy link
Contributor Author

yuluo-yx commented Nov 6, 2025

logs:

Fetching 9 files:   0%|          | 0/9 [00:00<?, ?it/s]
  Fetching 9 files:  11%|█         | 1/9 [00:00<00:02,  3.26it/s]/bin/bash: line 27:    43 Killed                  python -c "from huggingface_hub import snapshot_download; snapshot_download(repo_id='LLM-Semantic-Router/category_classifier_modernbert-base_model', local_dir='category_classifier_modernbert-base_model', local_dir_use_symlinks=False, ignore_patterns=['*.onnx', '*.msgpack', '*.h5', '*.tflite'] if 'category_classifier_modernbert-base_model' == 'all-MiniLM-L12-v2' else None)"
  Error from server (BadRequest): container "semantic-router" in pod "semantic-router-7fcc4cf984-88chb" is waiting to start: PodInitializing

refer:

I searched for information related to GitHub Actions and found the problem. It seems the memory was full, so the process was killed.

@rootfs
Copy link
Collaborator

rootfs commented Nov 6, 2025

@yuluo-yx can you adjust the readiness and health probe? I suspect the readiness takes longer time on CI so the init pod is killed.

Signed-off-by: yuluo-yx <yuluo08290126@gmail.com>
Signed-off-by: yuluo-yx <yuluo08290126@gmail.com>
Signed-off-by: shown <yuluo08290126@gmail.com>
@yuluo-yx
Copy link
Contributor Author

yuluo-yx commented Nov 8, 2025

hi @rootfs

After I tried adjusting the timeout and resource limits for initializing the container, it worked. pls take a look.

@yuluo-yx yuluo-yx marked this pull request as ready for review November 8, 2025 10:20
@yuluo-yx
Copy link
Contributor Author

yuluo-yx commented Nov 8, 2025

btw @rootfs

What are the requirements to join the vllm GitHub organization? I'd like to give it a try. 😬

@rootfs
Copy link
Collaborator

rootfs commented Nov 8, 2025

btw @rootfs

What are the requirements to join the vllm GitHub organization? I'd like to give it a try. 😬

I don't know, I am not a member either

@rootfs rootfs requested a review from Copilot November 8, 2025 14:25
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 24 out of 24 changed files in this pull request and generated 2 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

done; \
if kubectl get namespace $(HELM_NAMESPACE) &>/dev/null; then \
echo "$(YELLOW)[WARNING]$(NC) Namespace still exists after $$timeout seconds, forcing cleanup..."; \
kubectl get namespace $(HELM_NAMESPACE) -o json 2>/dev/null | jq '.spec.finalizers = []' | kubectl replace --raw /api/v1/namespaces/$(HELM_NAMESPACE)/finalize -f - 2>/dev/null || true; \
Copy link

Copilot AI Nov 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing dependency documentation. This command requires 'jq' to be installed, but there's no validation or error message if it's missing. Consider adding a comment explaining the jq dependency or adding a check for its availability before use.

Copilot uses AI. Check for mistakes.
-f ${{ env.CHART_PATH }}/values-dev.yaml \
--set initContainer.resources.limits.memory=2Gi \
--set initContainer.resources.requests.memory=1Gi \
--set-json 'initContainer.models=[{"name":"all-MiniLM-L12-v2","repo":"sentence-transformers/all-MiniLM-L12-v2"}]' \
Copy link

Copilot AI Nov 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duplicated model configuration. The minimal model configuration for CI testing is defined both in line 188-200 (install) and line 304-308 (upgrade test). Consider extracting this as a workflow variable or referencing the same values file to avoid maintenance issues if the test configuration needs to change.

Copilot uses AI. Check for mistakes.
@rootfs rootfs merged commit 183e8f4 into vllm-project:main Nov 8, 2025
28 checks passed
@rootfs
Copy link
Collaborator

rootfs commented Nov 8, 2025

@yuluo-yx this is great! thank you!

@yuluo-yx yuluo-yx deleted the 1022-yuluo/helm branch November 8, 2025 15:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Semantic Router Helm Chart Support

4 participants