feat: add helm support deploy support #532
yuluo-yx
commented
Oct 24, 2025
- Fixed Semantic Router Helm Chart Support #277
Signed-off-by: yuluo-yx <yuluo08290126@gmail.com>
✅ Deploy Preview for vllm-semantic-router ready!
Pull Request Overview
This PR adds comprehensive Helm chart support for deploying Semantic Router on Kubernetes, providing an alternative to the existing Kustomize deployment method. The implementation includes production and development configurations, validation tooling, and extensive Make target automation.
Key Changes
- Added complete Helm chart structure with templates for all Kubernetes resources (Deployment, Service, ConfigMap, PVC, Ingress, HPA, etc.)
- Introduced environment-specific values files (dev, prod, example) with optimized configurations for different deployment scenarios
- Integrated Helm deployment automation through Make targets and validation scripts
Reviewed Changes
Copilot reviewed 22 out of 22 changed files in this pull request and generated 22 comments.
| File | Description |
|---|---|
| tools/make/helm.mk | Comprehensive Make targets for Helm operations including install, upgrade, testing, and port-forwarding |
| tools/make/linter.mk | Removed documentation linting targets (likely relocated or obsolete) |
| deploy/helm/semantic-router/Chart.yaml | Helm chart metadata and project information |
| deploy/helm/semantic-router/values.yaml | Default configuration values for the Helm deployment |
| deploy/helm/semantic-router/values-dev.yaml | Development environment optimized values |
| deploy/helm/semantic-router/values-prod.yaml | Production environment optimized values with HA setup |
| deploy/helm/semantic-router/values-example.yaml | Example configuration demonstrating customization options |
| deploy/helm/semantic-router/templates/*.yaml | Kubernetes resource templates for deployment infrastructure |
| deploy/helm/semantic-router/templates/_helpers.tpl | Helm template helper functions |
| deploy/helm/validate-chart.sh | Automated validation script for chart testing |
| deploy/helm/README.md | Comprehensive deployment guide with examples |
| deploy/helm/semantic-router/README.md | Detailed chart documentation |
| Makefile | Integration of helm.mk into main build system |
Comments suppressed due to low confidence (1)
deploy/helm/semantic-router/values.yaml:1
- Inconsistent model path format. Line 218 uses 'models/all-MiniLM-L12-v2' (with 'models/' prefix) while line 57 uses 'sentence-transformers/all-MiniLM-L12-v2' (repository format). These should be consistent - line 218 should match the repository format used for downloading.
# Default values for semantic-router.
Pull Request Overview
Copilot reviewed 22 out of 22 changed files in this pull request and generated 11 comments.
If all goes well, it should be possible to do this. I've been quite busy lately, but I'll test it locally this weekend and prepare for the merge. 👀 When I'm ready to merge, pls review the code if you have time. thx @nithin8702
@yuluo-yx Could you please confirm that your PR works with nginx ingress or Envoy AI Gateway? Is there a way I can test your Helm chart now? There is also a chat going on in Slack; please check it.
you can use
Signed-off-by: jishiwen.jsw <jishiwen.jsw@digital-engine.com>
@yuluo-yx this is great! Would you please add a CI test too, either in this PR or in a follow-up PR? Thanks
Pull Request Overview
Copilot reviewed 21 out of 21 changed files in this pull request and generated 5 comments.
Comments suppressed due to low confidence (1)
deploy/helm/semantic-router/values.yaml:1
- Setting `runAsNonRoot: false` in default values is a security concern. This allows the container to run as the root user, which contradicts the best practice shown in the production values. The default should be `true` to enforce running as non-root unless explicitly overridden for specific use cases.
# Default values for semantic-router.
Signed-off-by: yuluo-yx <yuluo08290126@gmail.com>
Xunzhuo
left a comment
Can you make sure CI passes? Thanks!
Signed-off-by: yuluo-yx <yuluo08290126@gmail.com>
I'm encountering some issues while testing Helm in a CI environment. Specifically, model downloads are failing or timing out. I've tried using CI caching, but it doesn't seem to be working, so I'm marking this as a draft for now. Once I resolve the CI issues, I'll move this PR back to ready for review, and I'll complete it by the weekend.
It looks like the model download is not finishing. @yuluo-yx would it be OK to create a base container image that contains all the models and build extproc on top of that base image, so we don't have to wait for the model download?
Signed-off-by: yuluo-yx <yuluo08290126@gmail.com>
This is fine, but I don't think the network speed in the GitHub container should be this slow. I want to find out why. Once I find the reason, I can prepare a base image containing the basic model as an optimization.
Sounds good! Let's decide on our options based on the CI results.
logs: refer:
I searched for information related to GitHub Actions and found the problem. It seems the runner ran out of memory, so the process was killed.
@yuluo-yx can you adjust the readiness and health probes? I suspect readiness takes longer on CI, so the init pod is killed.
Signed-off-by: yuluo-yx <yuluo08290126@gmail.com>
Signed-off-by: shown <yuluo08290126@gmail.com>
hi @rootfs After I adjusted the timeout and resource limits for the init container, it worked. pls take a look.
btw @rootfs What are the requirements to join the vllm GitHub organization? I'd like to give it a try. 😬
I don't know, I am not a member either
Pull Request Overview
Copilot reviewed 24 out of 24 changed files in this pull request and generated 2 comments.
done; \
if kubectl get namespace $(HELM_NAMESPACE) &>/dev/null; then \
  echo "$(YELLOW)[WARNING]$(NC) Namespace still exists after $$timeout seconds, forcing cleanup..."; \
  kubectl get namespace $(HELM_NAMESPACE) -o json 2>/dev/null | jq '.spec.finalizers = []' | kubectl replace --raw /api/v1/namespaces/$(HELM_NAMESPACE)/finalize -f - 2>/dev/null || true; \
Copilot
AI
Nov 8, 2025
Missing dependency documentation. This command requires 'jq' to be installed, but there's no validation or error message if it's missing. Consider adding a comment explaining the jq dependency or adding a check for its availability before use.
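One way to address the `jq` dependency noted in this comment is a small availability check run before the cleanup step. The sketch below is only an illustration: `require_cmd` is a hypothetical helper name and is not part of this PR, and it relies only on the POSIX `command -v` builtin.

```shell
#!/bin/sh
# Hypothetical helper (not from the PR): report a missing tool instead of
# letting a later pipeline fail silently behind "2>/dev/null || true".
require_cmd() {
  # "command -v" is the POSIX way to test whether a command can be resolved.
  if ! command -v "$1" >/dev/null 2>&1; then
    echo "[ERROR] '$1' is required but was not found in PATH" >&2
    return 1
  fi
}

# Example use before the forced namespace cleanup (assumed call site):
#   require_cmd jq || exit 1
```

Inside a Make recipe the same check could be written inline, for example `command -v jq >/dev/null 2>&1 || { echo "jq is required"; exit 1; }`, so the target stops with a clear message rather than skipping the finalizer cleanup.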
-f ${{ env.CHART_PATH }}/values-dev.yaml \
--set initContainer.resources.limits.memory=2Gi \
--set initContainer.resources.requests.memory=1Gi \
--set-json 'initContainer.models=[{"name":"all-MiniLM-L12-v2","repo":"sentence-transformers/all-MiniLM-L12-v2"}]' \
Copilot
AI
Nov 8, 2025
Duplicated model configuration. The minimal model configuration for CI testing is defined both in lines 188-200 (install) and in lines 304-308 (upgrade test). Consider extracting it into a workflow variable or referencing the same values file, to avoid maintenance issues if the test configuration needs to change.
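A minimal sketch of the deduplication suggested in this comment, assuming the install and upgrade steps can share a shell snippet. `CI_MODELS_JSON`, `helm_with_ci_values`, and the `HELM` override are hypothetical names introduced for illustration; the `--set` and `--set-json` flags and their values are the ones shown in the diff above.

```shell
#!/bin/sh
# Sketch only: keep the minimal CI model list in one place.
# CI_MODELS_JSON and helm_with_ci_values are hypothetical, not from the PR.
CI_MODELS_JSON='[{"name":"all-MiniLM-L12-v2","repo":"sentence-transformers/all-MiniLM-L12-v2"}]'

helm_with_ci_values() {
  # $1 is the helm subcommand (install or upgrade); the remaining
  # arguments (release name, chart path, extra flags) are passed through.
  subcommand="$1"
  shift
  # HELM can be overridden (for example with "echo") to do a dry print.
  ${HELM:-helm} "$subcommand" "$@" \
    --set initContainer.resources.limits.memory=2Gi \
    --set initContainer.resources.requests.memory=1Gi \
    --set-json "initContainer.models=${CI_MODELS_JSON}"
}
```

Both workflow steps could then call `helm_with_ci_values install ...` and `helm_with_ci_values upgrade ...`, so a change to the test models is made once. A dedicated CI values file passed with `-f` would achieve the same goal without any shell helper.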
@yuluo-yx this is great! thank you!
