10 changes: 10 additions & 0 deletions config/manifests/gateway/nginxgatewayfabric/gateway.yaml
@@ -0,0 +1,10 @@
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
name: inference-gateway
spec:
gatewayClassName: nginx
listeners:
- name: http
port: 80
protocol: HTTP
18 changes: 18 additions & 0 deletions config/manifests/gateway/nginxgatewayfabric/httproute.yaml
@@ -0,0 +1,18 @@
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
name: llm-route
namespace: default
spec:
parentRefs:
- name: inference-gateway
rules:
- matches:
- path:
type: PathPrefix
value: /
backendRefs:
- group: inference.networking.k8s.io
kind: InferencePool
name: vllm-llama3-8b-instruct

11 changes: 11 additions & 0 deletions site-src/_includes/epp-latest.md
@@ -30,3 +30,14 @@
--version $IGW_CHART_VERSION \
oci://us-central1-docker.pkg.dev/k8s-staging-images/gateway-api-inference-extension/charts/inferencepool
```

=== "NGINX Gateway Fabric"

```bash
export GATEWAY_PROVIDER=none
helm install vllm-llama3-8b-instruct \
--set inferencePool.modelServers.matchLabels.app=vllm-llama3-8b-instruct \
--set provider.name=$GATEWAY_PROVIDER \
--version $IGW_CHART_VERSION \
oci://us-central1-docker.pkg.dev/k8s-staging-images/gateway-api-inference-extension/charts/inferencepool
```
11 changes: 11 additions & 0 deletions site-src/_includes/epp.md
@@ -30,3 +30,14 @@
--version $IGW_CHART_VERSION \
oci://registry.k8s.io/gateway-api-inference-extension/charts/inferencepool
```

=== "NGINX Gateway Fabric"

```bash
export GATEWAY_PROVIDER=none
helm install vllm-llama3-8b-instruct \
--set inferencePool.modelServers.matchLabels.app=vllm-llama3-8b-instruct \
--set provider.name=$GATEWAY_PROVIDER \
--version $IGW_CHART_VERSION \
oci://registry.k8s.io/gateway-api-inference-extension/charts/inferencepool
```
87 changes: 87 additions & 0 deletions site-src/guides/getting-started-latest.md
@@ -193,6 +193,69 @@ kubectl apply -k https://github.com/kubernetes-sigs/gateway-api-inference-extens
kubectl get httproute llm-route -o yaml
```

=== "NGINX Gateway Fabric"

NGINX Gateway Fabric is an implementation of the Gateway API that supports the Inference Extension. Follow these steps to deploy an Inference Gateway using NGINX Gateway Fabric.

1. Requirements

- Gateway API [CRDs](https://gateway-api.sigs.k8s.io/guides/#installing-gateway-api) installed (Standard or Experimental channel).
- [Helm](https://helm.sh/docs/intro/install/) installed.
- A Kubernetes cluster with LoadBalancer or NodePort access.

2. Install NGINX Gateway Fabric with the Inference Extension enabled by setting the `nginxGateway.gwAPIInferenceExtension.enable=true` Helm value:

```bash
helm install ngf oci://ghcr.io/nginx/charts/nginx-gateway-fabric \
--create-namespace -n nginx-gateway \
--set nginxGateway.gwAPIInferenceExtension.enable=true
```
This enables NGINX Gateway Fabric to watch and manage Inference Extension resources such as InferencePool and InferenceObjective.
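Before moving on, you can optionally confirm that the control plane came up. This is a quick sanity check using the release name `ngf` and the `nginx-gateway` namespace from the install command above:

```bash
# List the NGINX Gateway Fabric pods; they should reach Running/Ready.
kubectl get pods -n nginx-gateway
```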

3. Deploy the Gateway

```bash
kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/nginxgatewayfabric/gateway.yaml
```

4. Verify the Gateway status

Ensure that the Gateway is running and has been assigned an address:

```bash
kubectl get gateway inference-gateway
```

Check that the Gateway has been successfully provisioned and that its status shows `Programmed=True`.
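One way to read that condition directly is a standard `kubectl` JSONPath query (a sketch; `Programmed` is a standard Gateway API condition type):

```bash
# Print the status of the Gateway's Programmed condition; expect "True".
kubectl get gateway inference-gateway \
  -o jsonpath='{.status.conditions[?(@.type=="Programmed")].status}'
```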

5. Deploy the HTTPRoute

Create the HTTPRoute resource to route traffic to your InferencePool:

```bash
kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/nginxgatewayfabric/httproute.yaml
```

6. Verify the route status

Check that the HTTPRoute was successfully configured and references were resolved:

```bash
kubectl get httproute llm-route -o yaml
```

The route status should include `Accepted=True` and `ResolvedRefs=True`.
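If you'd rather not scan the full YAML, the per-parent conditions can be extracted with JSONPath (a sketch; HTTPRoute reports conditions separately for each parent Gateway):

```bash
# List the condition type/status pairs reported for each parent Gateway;
# Accepted=True and ResolvedRefs=True should appear among them.
kubectl get httproute llm-route \
  -o jsonpath='{range .status.parents[*].conditions[*]}{.type}={.status}{"\n"}{end}'
```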

7. Verify the InferencePool status

Make sure the InferencePool is active before sending traffic.

```bash
kubectl describe inferencepools.inference.networking.k8s.io vllm-llama3-8b-instruct
```

Check that the status shows `Accepted=True` and `ResolvedRefs=True`. This confirms the InferencePool is ready to handle traffic.
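Once the Gateway, HTTPRoute, and InferencePool all report ready, you can send a quick smoke test through the Gateway. This is a sketch, not part of the official guide: it assumes the Gateway was assigned an address, that the `http` listener is on port 80 as in the manifest above, and that your model server exposes the OpenAI-compatible `/v1/completions` endpoint. Replace `<your-model>` with the model your vLLM deployment serves.

```bash
# Fetch the address assigned to the Gateway, then send a test request.
IP=$(kubectl get gateway inference-gateway -o jsonpath='{.status.addresses[0].value}')
curl -i "http://${IP}/v1/completions" \
  -H 'Content-Type: application/json' \
  -d '{"model": "<your-model>", "prompt": "Hello", "max_tokens": 10}'
```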

For more information, see the [NGINX Gateway Fabric - Inference Gateway Setup guide](https://docs.nginx.com/nginx-gateway-fabric/how-to/gateway-api-inference-extension/#overview).

### Deploy InferenceObjective (Optional)

Deploy the sample InferenceObjective, which allows you to specify the priority of requests.
@@ -285,3 +348,27 @@ Deploy the sample InferenceObjective which allows you to specify priority of req
```bash
kubectl delete ns kgateway-system
```

=== "NGINX Gateway Fabric"

Follow these steps to remove the NGINX Gateway Fabric Inference Gateway and all related resources.


1. Remove the Inference Gateway and HTTPRoute:

```bash
kubectl delete -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/nginxgatewayfabric/gateway.yaml --ignore-not-found
kubectl delete -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/nginxgatewayfabric/httproute.yaml --ignore-not-found
```

2. Uninstall NGINX Gateway Fabric:

```bash
helm uninstall ngf -n nginx-gateway
```

3. Clean up the namespace:

```bash
kubectl delete ns nginx-gateway
```
91 changes: 91 additions & 0 deletions site-src/guides/index.md
@@ -87,6 +87,22 @@ kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extens
helm upgrade -i --namespace kgateway-system --version $KGTW_VERSION kgateway oci://cr.kgateway.dev/kgateway-dev/charts/kgateway --set inferenceExtension.enabled=true
```

=== "NGINX Gateway Fabric"

1. Requirements

- Gateway API [CRDs](https://gateway-api.sigs.k8s.io/guides/#installing-gateway-api) installed (Standard or Experimental channel).
- [Helm](https://helm.sh/docs/intro/install/) installed.
- A Kubernetes cluster with LoadBalancer or NodePort access.

2. Install NGINX Gateway Fabric with the Inference Extension enabled by setting the `nginxGateway.gwAPIInferenceExtension.enable=true` Helm value:

```bash
helm install ngf oci://ghcr.io/nginx/charts/nginx-gateway-fabric \
--create-namespace -n nginx-gateway \
--set nginxGateway.gwAPIInferenceExtension.enable=true
```
This enables NGINX Gateway Fabric to watch and manage Inference Extension resources such as InferencePool and InferenceObjective.
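If you want the install step to block until the control plane is ready, `kubectl wait` works as a sketch; the label below assumes the conventional Helm-generated `app.kubernetes.io/instance` label for the `ngf` release:

```bash
# Wait up to two minutes for the NGINX Gateway Fabric pod(s) to become Ready.
kubectl wait --for=condition=Ready pod \
  -l app.kubernetes.io/instance=ngf \
  -n nginx-gateway --timeout=2m
```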


### Deploy the InferencePool and Endpoint Picker Extension

Install an InferencePool named `vllm-llama3-8b-instruct` that selects endpoints with the label `app: vllm-llama3-8b-instruct` listening on port 8000. The Helm install command automatically installs the endpoint picker and the InferencePool, along with provider-specific resources.
@@ -200,6 +216,57 @@ kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extens
kubectl get httproute llm-route -o yaml
```

=== "NGINX Gateway Fabric"

NGINX Gateway Fabric is an implementation of the Gateway API that supports the Inference Extension. Follow these steps to deploy an Inference Gateway using NGINX Gateway Fabric.

1. Deploy the Gateway

```bash
kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/nginxgatewayfabric/gateway.yaml
```

2. Verify the Gateway status

Ensure that the Gateway is running and has been assigned an address:

```bash
kubectl get gateway inference-gateway
```

Check that the Gateway has been successfully provisioned and that its status shows `Programmed=True`.
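The assigned address can also be pulled out directly with JSONPath (a sketch; empty output means the Gateway has not been programmed yet):

```bash
# Print the address assigned to the Gateway.
kubectl get gateway inference-gateway -o jsonpath='{.status.addresses[0].value}'
```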

3. Deploy the HTTPRoute

Create the HTTPRoute resource to route traffic to your InferencePool:

```bash
kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/nginxgatewayfabric/httproute.yaml
```

4. Verify the route status

Check that the HTTPRoute was successfully configured and references were resolved:

```bash
kubectl get httproute llm-route -o yaml
```

The route status should include `Accepted=True` and `ResolvedRefs=True`.

5. Verify the InferencePool status

Make sure the InferencePool is active before sending traffic.

```bash
kubectl describe inferencepools.inference.networking.k8s.io vllm-llama3-8b-instruct
```

Check that the status shows `Accepted=True` and `ResolvedRefs=True`. This confirms the InferencePool is ready to handle traffic.
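For a targeted check, the conditions can be extracted with JSONPath. This is a sketch that assumes the InferencePool, like HTTPRoute, reports its conditions per parent; adjust the path if your API version lays out status differently:

```bash
# List condition type/status pairs on the InferencePool;
# Accepted=True and ResolvedRefs=True should appear among them.
kubectl get inferencepools.inference.networking.k8s.io vllm-llama3-8b-instruct \
  -o jsonpath='{range .status.parents[*].conditions[*]}{.type}={.status}{"\n"}{end}'
```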

For more information, see the [NGINX Gateway Fabric - Inference Gateway Setup guide](https://docs.nginx.com/nginx-gateway-fabric/how-to/gateway-api-inference-extension/#overview).


### Deploy InferenceObjective (Optional)

Deploy the sample InferenceObjective, which allows you to specify the priority of requests.
@@ -293,3 +360,27 @@ Deploy the sample InferenceObjective which allows you to specify priority of req
```bash
kubectl delete ns kgateway-system
```

=== "NGINX Gateway Fabric"

Follow these steps to remove the NGINX Gateway Fabric Inference Gateway and all related resources.


1. Remove the Inference Gateway and HTTPRoute:

```bash
kubectl delete -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/nginxgatewayfabric/gateway.yaml --ignore-not-found
kubectl delete -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/nginxgatewayfabric/httproute.yaml --ignore-not-found
```

2. Uninstall NGINX Gateway Fabric:

```bash
helm uninstall ngf -n nginx-gateway
```

3. Clean up the namespace:

```bash
kubectl delete ns nginx-gateway
```
9 changes: 9 additions & 0 deletions site-src/implementations/gateways.md
@@ -9,13 +9,15 @@ This project has several implementations that are planned or in progress:
- [Istio](#istio)
- [Kgateway](#kgateway)
- [Kubvernor](#kubvernor)
- [NGINX Gateway Fabric](#nginx-gateway-fabric)

[1]:#alibaba-cloud-container-service-for-kubernetes
[2]:#envoy-ai-gateway
[3]:#google-kubernetes-engine
[4]:#istio
[5]:#kgateway
[6]:#kubvernor
[7]:#nginx-gateway-fabric

Agentgateway can run independently or can be managed by [Kgateway](https://kgateway.dev/).

@@ -98,3 +100,10 @@ Kgateway supports Inference Gateway with the [agentgateway](https://agentgateway
[krg]:https://github.com/kubvernor/kubvernor
[krgu]: https://github.com/kubvernor/kubvernor/blob/main/README.md

## NGINX Gateway Fabric

[NGINX Gateway Fabric][nginx-gateway-fabric] is an open-source project that provides an implementation of the Gateway API using [NGINX][nginx] as the data plane. The goal of this project is to implement the core Gateway API to configure an HTTP or TCP/UDP load balancer, reverse-proxy, or API gateway for applications running on Kubernetes. You can find the comprehensive NGINX Gateway Fabric user documentation on the [NGINX Documentation][nginx-docs] website.

[nginx-gateway-fabric]: https://github.com/nginx/nginx-gateway-fabric
[nginx]:https://nginx.org/
[nginx-docs]:https://docs.nginx.com/nginx-gateway-fabric/