diff --git a/config/manifests/gateway/nginxgatewayfabric/gateway.yaml b/config/manifests/gateway/nginxgatewayfabric/gateway.yaml
new file mode 100644
index 000000000..cdc1308ea
--- /dev/null
+++ b/config/manifests/gateway/nginxgatewayfabric/gateway.yaml
@@ -0,0 +1,10 @@
+apiVersion: gateway.networking.k8s.io/v1
+kind: Gateway
+metadata:
+  name: inference-gateway
+spec:
+  gatewayClassName: nginx
+  listeners:
+  - name: http
+    port: 80
+    protocol: HTTP
diff --git a/config/manifests/gateway/nginxgatewayfabric/httproute.yaml b/config/manifests/gateway/nginxgatewayfabric/httproute.yaml
new file mode 100644
index 000000000..70c38873f
--- /dev/null
+++ b/config/manifests/gateway/nginxgatewayfabric/httproute.yaml
@@ -0,0 +1,18 @@
+apiVersion: gateway.networking.k8s.io/v1
+kind: HTTPRoute
+metadata:
+  name: llm-route
+  namespace: default
+spec:
+  parentRefs:
+  - name: inference-gateway
+  rules:
+  - matches:
+    - path:
+        type: PathPrefix
+        value: /
+    backendRefs:
+    - group: inference.networking.k8s.io
+      kind: InferencePool
+      name: vllm-llama3-8b-instruct
+
diff --git a/site-src/_includes/epp-latest.md b/site-src/_includes/epp-latest.md
index ef08a61be..e090778f0 100644
--- a/site-src/_includes/epp-latest.md
+++ b/site-src/_includes/epp-latest.md
@@ -30,3 +30,14 @@
     --version $IGW_CHART_VERSION \
     oci://us-central1-docker.pkg.dev/k8s-staging-images/gateway-api-inference-extension/charts/inferencepool
     ```
+
+=== "NGINX Gateway Fabric"
+
+    ```bash
+    export GATEWAY_PROVIDER=none
+    helm install vllm-llama3-8b-instruct \
+    --set inferencePool.modelServers.matchLabels.app=vllm-llama3-8b-instruct \
+    --set provider.name=$GATEWAY_PROVIDER \
+    --version $IGW_CHART_VERSION \
+    oci://us-central1-docker.pkg.dev/k8s-staging-images/gateway-api-inference-extension/charts/inferencepool
+    ```
\ No newline at end of file
diff --git a/site-src/_includes/epp.md b/site-src/_includes/epp.md
index 73e24786f..a82ca68d3 100644
--- a/site-src/_includes/epp.md
+++ b/site-src/_includes/epp.md
@@ -30,3 +30,14 @@
     --version $IGW_CHART_VERSION \
     oci://registry.k8s.io/gateway-api-inference-extension/charts/inferencepool
     ```
+
+=== "NGINX Gateway Fabric"
+
+    ```bash
+    export GATEWAY_PROVIDER=none
+    helm install vllm-llama3-8b-instruct \
+    --set inferencePool.modelServers.matchLabels.app=vllm-llama3-8b-instruct \
+    --set provider.name=$GATEWAY_PROVIDER \
+    --version $IGW_CHART_VERSION \
+    oci://registry.k8s.io/gateway-api-inference-extension/charts/inferencepool
+    ```
diff --git a/site-src/guides/getting-started-latest.md b/site-src/guides/getting-started-latest.md
index 13fb830b8..31cd12d6b 100644
--- a/site-src/guides/getting-started-latest.md
+++ b/site-src/guides/getting-started-latest.md
@@ -193,6 +193,69 @@ kubectl apply -k https://github.com/kubernetes-sigs/gateway-api-inference-extens
     kubectl get httproute llm-route -o yaml
     ```
 
+=== "NGINX Gateway Fabric"
+
+    NGINX Gateway Fabric is an implementation of the Gateway API that supports the Inference Extension. Follow these steps to deploy an Inference Gateway using NGINX Gateway Fabric.
+
+    1. Requirements
+
+        - Gateway API [CRDs](https://gateway-api.sigs.k8s.io/guides/#installing-gateway-api) installed (Standard or Experimental channel); a sample install command follows this list.
+        - [Helm](https://helm.sh/docs/intro/install/) installed.
+        - A Kubernetes cluster with LoadBalancer or NodePort access.
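+
+        If the Gateway API CRDs are not already present, they can be applied from a release manifest. This is a minimal sketch rather than part of the official guide; the release version shown is illustrative, so substitute the version you intend to run:
+
+        ```bash
+        # Install the standard-channel Gateway API CRDs (adjust the version as needed).
+        kubectl apply -f https://github.com/kubernetes-sigs/gateway-api/releases/download/v1.3.0/standard-install.yaml
+        ```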
+
+    2. Install NGINX Gateway Fabric with the Inference Extension enabled by setting the `nginxGateway.gwAPIInferenceExtension.enable=true` Helm value:
+
+        ```bash
+        helm install ngf oci://ghcr.io/nginx/charts/nginx-gateway-fabric --create-namespace -n nginx-gateway --set nginxGateway.gwAPIInferenceExtension.enable=true
+        ```
+
+        This enables NGINX Gateway Fabric to watch and manage Inference Extension resources such as InferencePool and InferenceObjective.
+
+    3. Deploy the Gateway
+
+        ```bash
+        kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/nginxgatewayfabric/gateway.yaml
+        ```
+
+    4. Verify the Gateway status
+
+        Ensure that the Gateway is running and has been assigned an address:
+
+        ```bash
+        kubectl get gateway inference-gateway
+        ```
+
+        Confirm that the Gateway has been provisioned and that its status shows Programmed=True.
+
+    5. Deploy the HTTPRoute
+
+        Create the HTTPRoute resource to route traffic to your InferencePool:
+
+        ```bash
+        kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/nginxgatewayfabric/httproute.yaml
+        ```
+
+    6. Verify the route status
+
+        Check that the HTTPRoute was accepted and that its references were resolved:
+
+        ```bash
+        kubectl get httproute llm-route -o yaml
+        ```
+
+        The route status should include Accepted=True and ResolvedRefs=True.
+
+    7. Verify the InferencePool status
+
+        Make sure the InferencePool is ready before sending traffic:
+
+        ```bash
+        kubectl describe inferencepools.inference.networking.k8s.io vllm-llama3-8b-instruct
+        ```
+
+        Check that the status shows Accepted=True and ResolvedRefs=True. This confirms the InferencePool is ready to handle traffic (an optional smoke test follows these steps).
+
+    For more information, see the [NGINX Gateway Fabric - Inference Gateway Setup guide](https://docs.nginx.com/nginx-gateway-fabric/how-to/gateway-api-inference-extension/#overview).
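+
+    As a quick end-to-end check, you can send a test request through the Gateway. This is a minimal sketch, not part of the official guide: it assumes the Gateway address is reachable from your machine and that the backend serves an OpenAI-compatible completions API; the model name is a placeholder, so substitute the model your InferencePool actually serves.
+
+    ```bash
+    # Look up the address assigned to the Gateway.
+    GW_ADDR=$(kubectl get gateway inference-gateway -o jsonpath='{.status.addresses[0].value}')
+
+    # Send a completion request through the Inference Gateway (placeholder model name).
+    curl -i http://${GW_ADDR}/v1/completions \
+      -H "Content-Type: application/json" \
+      -d '{"model": "meta-llama/Llama-3.1-8B-Instruct", "prompt": "Hello", "max_tokens": 16}'
+    ```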
+
 ### Deploy InferenceObjective (Optional)
 
 Deploy the sample InferenceObjective which allows you to specify priority of requests.
@@ -285,3 +348,27 @@
     ```bash
     kubectl delete ns kgateway-system
     ```
+
+=== "NGINX Gateway Fabric"
+
+    Follow these steps to remove the NGINX Gateway Fabric Inference Gateway and all related resources.
+
+    1. Remove the Inference Gateway and HTTPRoute:
+
+        ```bash
+        kubectl delete -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/nginxgatewayfabric/gateway.yaml --ignore-not-found
+        kubectl delete -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/nginxgatewayfabric/httproute.yaml --ignore-not-found
+        ```
+
+    2. Uninstall NGINX Gateway Fabric:
+
+        ```bash
+        helm uninstall ngf -n nginx-gateway
+        ```
+
+    3. Clean up the namespace:
+
+        ```bash
+        kubectl delete ns nginx-gateway
+        ```
\ No newline at end of file
diff --git a/site-src/guides/index.md b/site-src/guides/index.md
index 0dab6ba76..cf4980e9e 100644
--- a/site-src/guides/index.md
+++ b/site-src/guides/index.md
@@ -87,6 +87,22 @@ kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extens
     helm upgrade -i --namespace kgateway-system --version $KGTW_VERSION kgateway oci://cr.kgateway.dev/kgateway-dev/charts/kgateway --set inferenceExtension.enabled=true
     ```
 
+=== "NGINX Gateway Fabric"
+
+    1. Requirements
+
+        - Gateway API [CRDs](https://gateway-api.sigs.k8s.io/guides/#installing-gateway-api) installed (Standard or Experimental channel).
+        - [Helm](https://helm.sh/docs/intro/install/) installed.
+        - A Kubernetes cluster with LoadBalancer or NodePort access.
+
+    2. Install NGINX Gateway Fabric with the Inference Extension enabled by setting the `nginxGateway.gwAPIInferenceExtension.enable=true` Helm value:
+
+        ```bash
+        helm install ngf oci://ghcr.io/nginx/charts/nginx-gateway-fabric --create-namespace -n nginx-gateway --set nginxGateway.gwAPIInferenceExtension.enable=true
+        ```
+
+        This enables NGINX Gateway Fabric to watch and manage Inference Extension resources such as InferencePool and InferenceObjective. You can optionally confirm that the control plane is ready before continuing, as shown below.
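+
+        A minimal readiness check, assuming the `nginx-gateway` namespace from the install command above:
+
+        ```bash
+        # Confirm the NGINX Gateway Fabric control plane pods are ready.
+        kubectl -n nginx-gateway get pods
+        kubectl -n nginx-gateway wait --for=condition=Ready pod --all --timeout=120s
+        ```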
+
 ### Deploy the InferencePool and Endpoint Picker Extension
 
 Install an InferencePool named `vllm-llama3-8b-instruct` that selects from endpoints with label `app: vllm-llama3-8b-instruct` and listening on port 8000. The Helm install command automatically installs the endpoint-picker, InferencePool along with provider specific resources.
@@ -200,6 +216,57 @@ kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extens
     kubectl get httproute llm-route -o yaml
     ```
 
+=== "NGINX Gateway Fabric"
+
+    NGINX Gateway Fabric is an implementation of the Gateway API that supports the Inference Extension. Follow these steps to deploy an Inference Gateway using NGINX Gateway Fabric.
+
+    1. Deploy the Gateway
+
+        ```bash
+        kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/nginxgatewayfabric/gateway.yaml
+        ```
+
+    2. Verify the Gateway status
+
+        Ensure that the Gateway is running and has been assigned an address:
+
+        ```bash
+        kubectl get gateway inference-gateway
+        ```
+
+        Confirm that the Gateway has been provisioned and that its status shows Programmed=True.
+
+    3. Deploy the HTTPRoute
+
+        Create the HTTPRoute resource to route traffic to your InferencePool:
+
+        ```bash
+        kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/nginxgatewayfabric/httproute.yaml
+        ```
+
+    4. Verify the route status
+
+        Check that the HTTPRoute was accepted and that its references were resolved:
+
+        ```bash
+        kubectl get httproute llm-route -o yaml
+        ```
+
+        The route status should include Accepted=True and ResolvedRefs=True.
+
+    5. Verify the InferencePool status
+
+        Make sure the InferencePool is ready before sending traffic:
+
+        ```bash
+        kubectl describe inferencepools.inference.networking.k8s.io vllm-llama3-8b-instruct
+        ```
+
+        Check that the status shows Accepted=True and ResolvedRefs=True. This confirms the InferencePool is ready to handle traffic.
+
+    For more information, see the [NGINX Gateway Fabric - Inference Gateway Setup guide](https://docs.nginx.com/nginx-gateway-fabric/how-to/gateway-api-inference-extension/#overview).
+
 ### Deploy InferenceObjective (Optional)
 
 Deploy the sample InferenceObjective which allows you to specify priority of requests.
@@ -293,3 +360,27 @@
     ```bash
     kubectl delete ns kgateway-system
     ```
+
+=== "NGINX Gateway Fabric"
+
+    Follow these steps to remove the NGINX Gateway Fabric Inference Gateway and all related resources.
+
+    1. Remove the Inference Gateway and HTTPRoute:
+
+        ```bash
+        kubectl delete -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/nginxgatewayfabric/gateway.yaml --ignore-not-found
+        kubectl delete -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/nginxgatewayfabric/httproute.yaml --ignore-not-found
+        ```
+
+    2. Uninstall NGINX Gateway Fabric:
+
+        ```bash
+        helm uninstall ngf -n nginx-gateway
+        ```
+
+    3. Clean up the namespace:
+
+        ```bash
+        kubectl delete ns nginx-gateway
+        ```
diff --git a/site-src/implementations/gateways.md b/site-src/implementations/gateways.md
index 8c7ee8dea..7ea67f1a3 100644
--- a/site-src/implementations/gateways.md
+++ b/site-src/implementations/gateways.md
@@ -9,6 +9,7 @@ This project has several implementations that are planned or in progress:
 - [Istio](#istio)
 - [Kgateway](#kgateway)
 - [Kubvernor](#kubvernor)
+- [NGINX Gateway Fabric](#nginx-gateway-fabric)
 
 [1]:#alibaba-cloud-container-service-for-kubernetes
 [2]:#envoy-ai-gateway
@@ -16,6 +17,7 @@
 [3]:#gke-gateway
 [4]:#istio
 [5]:#kgateway
 [6]:#kubvernor
+[7]:#nginx-gateway-fabric
 
 Agentgateway can run independently or can be managed by [Kgateway](https://kgateway.dev/).
@@ -98,3 +100,10 @@ Kgateway supports Inference Gateway with the [agentgateway](https://agentgateway
 
 [krg]:https://github.com/kubvernor/kubvernor
 [krgu]: https://github.com/kubvernor/kubvernor/blob/main/README.md
+## NGINX Gateway Fabric
+
+[NGINX Gateway Fabric][nginx-gateway-fabric] is an open-source project that provides an implementation of the Gateway API using [NGINX][nginx] as the data plane. The goal of this project is to implement the core Gateway API to configure an HTTP or TCP/UDP load balancer, reverse proxy, or API gateway for applications running on Kubernetes. You can find the comprehensive NGINX Gateway Fabric user documentation on the [NGINX Documentation][nginx-docs] website.
+
+[nginx-gateway-fabric]: https://github.com/nginx/nginx-gateway-fabric
+[nginx]: https://nginx.org/
+[nginx-docs]: https://docs.nginx.com/nginx-gateway-fabric/
\ No newline at end of file