10 changes: 10 additions & 0 deletions config/manifests/gateway/nginxgatewayfabric/gateway.yaml
@@ -0,0 +1,10 @@
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: inference-gateway
spec:
  gatewayClassName: nginx
  listeners:
  - name: http
    port: 80
    protocol: HTTP
17 changes: 17 additions & 0 deletions config/manifests/gateway/nginxgatewayfabric/httproute.yaml
@@ -0,0 +1,17 @@
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: llm-route
  namespace: default
spec:
  parentRefs:
  - name: inference-gateway
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /
    backendRefs:
    - group: inference.networking.k8s.io
      kind: InferencePool
      name: vllm-llama3-8b-instruct
10 changes: 10 additions & 0 deletions site-src/_includes/epp-latest.md
@@ -30,3 +30,13 @@
--version $IGW_CHART_VERSION \
oci://us-central1-docker.pkg.dev/k8s-staging-images/gateway-api-inference-extension/charts/inferencepool
```

=== "Nginx Gateway Fabric"

```bash
export IGW_CHART_VERSION=v1.0.2
helm install vllm-llama3-8b-instruct \
--set inferencePool.modelServers.matchLabels.app=vllm-llama3-8b-instruct \
--version $IGW_CHART_VERSION \
oci://registry.k8s.io/gateway-api-inference-extension/charts/inferencepool
```
106 changes: 106 additions & 0 deletions site-src/guides/getting-started-latest.md
@@ -193,6 +193,72 @@ kubectl apply -k https://github.com/kubernetes-sigs/gateway-api-inference-extens
kubectl get httproute llm-route -o yaml
```

=== "Nginx Gateway Fabric"

NGINX Gateway Fabric (NGF) is an implementation of the Gateway API that supports the Inference Extension. Follow these steps to deploy an Inference Gateway using NGF.

1. Requirements

- Gateway API [CRDs](https://gateway-api.sigs.k8s.io/guides/#installing-gateway-api) installed (Standard or Experimental channel).
- [Helm](https://helm.sh/docs/intro/install/) installed.
- A Kubernetes cluster with LoadBalancer or NodePort access.

2. Install the Inference Extension CRDs

```bash
kubectl kustomize "https://github.com/nginx/nginx-gateway-fabric/config/crd/inference-extension/?ref=v2.2.0" | kubectl apply -f -
```

3. Install NGINX Gateway Fabric with the Inference Extension enabled by setting the `nginxGateway.gwAPIInferenceExtension.enable=true` Helm value:

```bash
helm repo add nginx-stable https://helm.nginx.com/stable
helm upgrade -i nginx-gateway-fabric nginx-stable/nginx-gateway-fabric \
--namespace nginx-gateway --create-namespace \
--set nginxGateway.gwAPIInferenceExtension.enable=true
```
This enables NGF to recognize and manage Inference Extension resources such as InferencePool and InferenceObjective.

4. Deploy the Gateway

```bash
kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/nginxgatewayfabric/gateway.yaml
```

Verify that the Gateway was provisioned and reports the `Programmed=True` condition:

```bash
kubectl describe gateway inference-gateway
```

5. Verify the Gateway status

Confirm that the Gateway is running and has been assigned an address:

```bash
kubectl get gateway inference-gateway
```

6. Deploy the HTTPRoute

Create the HTTPRoute resource to route traffic to your InferencePool:

```bash
kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/nginxgatewayfabric/httproute.yaml
```

7. Verify the route status

Check that the HTTPRoute was accepted and that its references were resolved:

```bash
kubectl get httproute llm-route -o yaml
```

The route status should include `Accepted=True` and `ResolvedRefs=True`.

For more information, see the [NGF - Inference Gateway Setup guide](https://docs.nginx.com/nginx-gateway-fabric/how-to/gateway-api-inference-extension/#overview).
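
The status checks in steps 4 to 7 can also be scripted. The following is a minimal sketch, assuming `kubectl` is configured for the target cluster and the Gateway and HTTPRoute live in the current namespace. The function names (`wait_for_gateway`, `gateway_address`, `route_conditions`) are hypothetical helpers introduced for illustration; they are not part of NGF or the Inference Extension. The route check reads only the first parent entry in the HTTPRoute status, which is an assumption that holds for the single-Gateway setup in this guide.

```bash
#!/usr/bin/env bash
# Hypothetical helpers that wrap the manual status checks in this guide.

# Block until the Gateway reports Programmed=True, or fail after the timeout.
wait_for_gateway() {
  local name="${1:-inference-gateway}"
  local timeout="${2:-120s}"
  kubectl wait --for=condition=Programmed "gateway/${name}" --timeout="${timeout}"
}

# Print the first address assigned to the Gateway.
gateway_address() {
  local name="${1:-inference-gateway}"
  kubectl get gateway "${name}" -o jsonpath='{.status.addresses[0].value}'
}

# Print the HTTPRoute conditions for its first parent as "Type=Status" lines.
# Assumption: the route is attached to a single Gateway.
route_conditions() {
  local name="${1:-llm-route}"
  kubectl get httproute "${name}" \
    -o jsonpath='{range .status.parents[0].conditions[*]}{.type}={.status}{"\n"}{end}'
}
```

After sourcing the file, running `wait_for_gateway inference-gateway 180s && gateway_address` waits for the Gateway and then prints its address, and `route_conditions llm-route` lists the conditions to compare against the expected `Accepted=True` and `ResolvedRefs=True`.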

### Deploy InferenceObjective (Optional)

Deploy the sample InferenceObjective, which lets you specify the priority of requests.
@@ -285,3 +351,43 @@ Deploy the sample InferenceObjective which allows you to specify priority of req
```bash
kubectl delete ns kgateway-system
```

=== "Nginx Gateway Fabric"

Follow these steps to remove the NGINX Gateway Fabric (NGF) Inference Gateway and all related resources.

1. Remove the InferencePool, InferenceObjective, and model server resources:

```bash
helm uninstall vllm-llama3-8b-instruct
kubectl delete -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/inferenceobjective.yaml --ignore-not-found
kubectl delete -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/vllm/cpu-deployment.yaml --ignore-not-found
kubectl delete -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/vllm/gpu-deployment.yaml --ignore-not-found
kubectl delete -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/vllm/sim-deployment.yaml --ignore-not-found
```

2. Delete Gateway API Inference Extension CRDs:

```bash
kubectl delete -k https://github.com/kubernetes-sigs/gateway-api-inference-extension/config/crd --ignore-not-found
```

3. Remove Inference Gateway and HTTPRoute:

```bash
kubectl delete -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/nginxgatewayfabric/gateway.yaml --ignore-not-found
kubectl delete -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/nginxgatewayfabric/httproute.yaml --ignore-not-found
```

4. Uninstall NGINX Gateway Fabric:

```bash
helm uninstall nginx-gateway-fabric -n nginx-gateway
```

5. Clean up namespace and CRDs:

```bash
kubectl delete ns nginx-gateway
kubectl delete -f https://raw.githubusercontent.com/nginx/nginx-gateway-fabric/v2.2.0/deploy/crds.yaml
```
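
To confirm that the cleanup removed the CRDs installed by this guide, the remaining CRDs can be filtered by API group. This is a minimal sketch: `leftover_crds` is a hypothetical helper, and the group patterns (`inference.networking.` for the Inference Extension and `gateway.nginx.org` for NGF) are assumptions about how those CRDs are named. The core Gateway API CRDs are not matched because the steps above do not remove them.

```bash
#!/usr/bin/env bash
# Hypothetical helper: list CRDs that look like leftovers from this guide.
# Assumption: Inference Extension CRDs use an "inference.networking." group
# and NGF CRDs use the "gateway.nginx.org" group.
leftover_crds() {
  # "|| true" keeps the exit status zero when grep finds nothing.
  kubectl get crd -o name | grep -E 'inference\.networking\.|gateway\.nginx\.org' || true
}
```

An empty result from `leftover_crds` suggests the CRDs were removed; any names it prints can be deleted individually with `kubectl delete`.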
9 changes: 9 additions & 0 deletions site-src/implementations/gateways.md
@@ -9,13 +9,15 @@ This project has several implementations that are planned or in progress:
- [Istio](#istio)
- [Kgateway](#kgateway)
- [Kubvernor](#kubvernor)
- [Nginx Gateway Fabric](#nginx-gateway-fabric)

[1]:#alibaba-cloud-container-service-for-kubernetes
[2]:#envoy-ai-gateway
[3]:#google-kubernetes-engine
[4]:#istio
[5]:#kgateway
[6]:#kubvernor
[7]:#nginx-gateway-fabric

Agentgateway can run independently or can be managed by [Kgateway](https://kgateway.dev/).

@@ -98,3 +100,10 @@ Kgateway supports Inference Gateway with the [agentgateway](https://agentgateway
[krg]:https://github.com/kubvernor/kubvernor
[krgu]: https://github.com/kubvernor/kubvernor/blob/main/README.md

## Nginx Gateway Fabric

[NGINX Gateway Fabric][nginx-gateway-fabric] is an open-source implementation of the Gateway API that uses [NGINX][nginx] as the data plane. The project aims to implement the core Gateway API so that you can configure an HTTP or TCP/UDP load balancer, reverse proxy, or API gateway for applications running on Kubernetes. The full user documentation is available on the [NGINX Documentation][nginx-docs] website.

[nginx-gateway-fabric]: https://github.com/nginx/nginx-gateway-fabric
[nginx]:https://nginx.org/
[nginx-docs]:https://docs.nginx.com/nginx-gateway-fabric/