istio · wilsonwu · Nov 5, 2025
@@ -31,36 +31,43 @@ Istio {{< gloss "ambient" >}}Ambient service mesh{{< /gloss >}}.
 #### Network topology restrictions {#network-topology-restrictions}
 
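 **Multicluster single-network configurations are untested and may be broken**
-- Use caution when deploying ambient across clusters that share the same network
-- Only multi-network configurations are supported
+  - Use caution when deploying ambient across clusters that share the same network
+  - Only multi-network configurations are supported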
 
 #### Control plane limitations {#control-plane-limitations}
 
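 **Primary-remote configurations are not currently supported**
-- You can only have multiple primary clusters
-- Configurations with one or more remote clusters will not work correctly
+  - You can only have multiple primary clusters
+  - Configurations with one or more remote clusters will not work correctly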
 
 #### Waypoint requirements {#waypoint-requirements}
 
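 **Universal waypoint deployments are assumed across clusters**
-- All clusters must have identically named waypoint deployments
-- Waypoint configurations must be synchronized manually across clusters (for example, with Flux, ArgoCD, or similar tools)
-- Traffic routing relies on consistent waypoint naming conventions
+  - All clusters must have identically named waypoint deployments
+  - Waypoint configurations must be synchronized manually across clusters (for example, with Flux, ArgoCD, or similar tools)
+  - Traffic routing relies on consistent waypoint naming conventions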
 
 #### Service visibility and scoping {#service-visibility-and-scoping}
 
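 **Service scope configurations are not read across clusters**
-- Only the local cluster's service scope configuration is used as the source of truth
-- Remote cluster service scopes are not respected, which can lead to unexpected traffic behavior
-- Cross-cluster service discovery may not respect intended service boundaries
+  - Only the local cluster's service scope configuration is used as the source of truth
+  - Remote cluster service scopes are not respected, which can lead to unexpected traffic behavior
+  - Cross-cluster service discovery may not respect intended service boundaries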
 
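 **If a service's waypoint is marked as global, that service will also be global**
-- This can lead to unintended cross-cluster traffic if not managed carefully
+  - This can lead to unintended cross-cluster traffic if not managed carefully
+  - The fix for this issue is tracked [here](https://github.com/istio/istio/issues/57710)
+
+#### Load distribution on remote network {#load-distribution-on-remote-network}
+
+**Traffic going to a remote network is not distributed evenly across endpoints**
+  - When failing over to a remote network, a single endpoint on the remote network may receive a disproportionate number of requests because of HTTP request multiplexing and connection pooling.
+  - The fix for this issue is tracked [here](https://github.com/istio/istio/issues/58039)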
 
 #### Gateway limitations {#gateway-limitations}
 
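 **Ambient east-west gateways currently support only in-mesh mTLS traffic**
-- It is currently not possible to expose `istiod` across networks using an ambient east-west gateway. You can still use a classic east-west gateway for this.
+  - It is currently not possible to expose `istiod` across networks using an ambient east-west gateway. You can still use a classic east-west gateway for this.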
 
 {{< tip >}}
 As ambient multicluster matures, many of these limitations will be addressed.
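
The waypoint requirements in the hunk above depend on every cluster having identically named waypoints. As a rough sketch of how that could be checked (a note outside the patch itself), the helper below prints names that appear in one list but not in another. The function name `names_missing_from` is hypothetical and is not part of Istio or of this change.

```shell
# names_missing_from LIST_A LIST_B
# Hypothetical helper: print every name in LIST_A (one per line) that does
# not appear as a whole line in LIST_B. Empty output means nothing is missing.
names_missing_from() {
  printf '%s\n' "$1" | while IFS= read -r name; do
    # Skip blank lines in the first list.
    [ -n "$name" ] || continue
    # -F: fixed string, -x: match the whole line, -q: no output.
    printf '%s\n' "$2" | grep -Fxq -- "$name" || printf '%s\n' "$name"
  done
}
```

One possible use, assuming the waypoints are visible as Kubernetes Gateway resources in the `sample` namespace and that `CTX_CLUSTER1` and `CTX_CLUSTER2` hold the cluster contexts: `names_missing_from "$(kubectl --context "${CTX_CLUSTER1}" get gateways.gateway.networking.k8s.io -n sample -o name)" "$(kubectl --context "${CTX_CLUSTER2}" get gateways.gateway.networking.k8s.io -n sample -o name)"`. Swapping the two arguments checks the other direction.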

@@ -0,0 +1,191 @@
+---
+title: Configure failover behavior in a multicluster ambient installation
+description: Configure outlier detection and failover behavior in an ambient multicluster mesh using waypoints.
+weight: 70
+keywords: [kubernetes,multicluster,ambient]
+test: yes
+owner: istio/wg-environments-maintainers
+prev: /zh/docs/ambient/install/multicluster/verify
+---
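+Follow this guide to customize failover behavior in your ambient multicluster Istio installation using waypoint proxies.
+
+Before proceeding, be sure to complete an ambient multicluster Istio installation by following one of the
+[multicluster installation guides](/zh/docs/ambient/install/multicluster) and verify that the installation works.
+
+In this guide, we build on the `HelloWorld` application used to verify the multicluster installation.
+We configure locality failover for the `HelloWorld` service with a `DestinationRule`, so that it prefers endpoints
+in the client's own cluster, and deploy a waypoint proxy to enforce this configuration.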
+
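+## Deploy waypoint proxies {#deploy-waypoint-proxy}
+
+To configure outlier detection and customize failover behavior for a service, we need a waypoint proxy.
+First, deploy a waypoint proxy to every cluster in the mesh: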
+
+{{< text bash >}}
+$ istioctl --context "${CTX_CLUSTER1}" waypoint apply --name waypoint --for service -n sample --wait
+$ istioctl --context "${CTX_CLUSTER2}" waypoint apply --name waypoint --for service -n sample --wait
+{{< /text >}}
+
+Confirm the status of the waypoint proxy deployment on `cluster1`:
+
+{{< text bash >}}
+$ kubectl --context "${CTX_CLUSTER1}" get deployment waypoint --namespace sample
+NAME       READY   UP-TO-DATE   AVAILABLE   AGE
+waypoint   1/1     1            1           137m
+{{< /text >}}
+
+Confirm the status of the waypoint proxy deployment on `cluster2`:
+
+{{< text bash >}}
+$ kubectl --context "${CTX_CLUSTER2}" get deployment waypoint --namespace sample
+NAME       READY   UP-TO-DATE   AVAILABLE   AGE
+waypoint   1/1     1            1           138m
+{{< /text >}}
+
+Wait until all waypoint proxies are ready.
+
+Configure the `HelloWorld` service in each cluster to use the waypoint proxy:
+
+{{< text bash >}}
+$ kubectl --context "${CTX_CLUSTER1}" label svc helloworld -n sample istio.io/use-waypoint=waypoint
+$ kubectl --context "${CTX_CLUSTER2}" label svc helloworld -n sample istio.io/use-waypoint=waypoint
+{{< /text >}}
+
+Finally, in a step specific to multicluster waypoint deployments, mark the waypoint proxy service
+in each cluster as global, as was done earlier for the `HelloWorld` service:
+
+{{< text bash >}}
+$ kubectl --context "${CTX_CLUSTER1}" label svc waypoint -n sample istio.io/global=true
+$ kubectl --context "${CTX_CLUSTER2}" label svc waypoint -n sample istio.io/global=true
+{{< /text >}}
+
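+The `HelloWorld` service in both clusters is now configured to use the waypoint proxies,
+but the waypoint proxies do not yet do anything useful.
+
+## Configure locality failover {#configure-locality-failover}
+
+To configure locality failover, create and apply a `DestinationRule` in `cluster1`: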
+
+{{< text bash >}}
+$ kubectl --context "${CTX_CLUSTER1}" apply -n sample -f - <<EOF
+apiVersion: networking.istio.io/v1
+kind: DestinationRule
+metadata:
+  name: helloworld
+spec:
+  host: helloworld.sample.svc.cluster.local
+  trafficPolicy:
+    outlierDetection:
+      consecutive5xxErrors: 1
+      interval: 1s
+      baseEjectionTime: 1m
+    loadBalancer:
+      simple: ROUND_ROBIN
+      localityLbSetting:
+        enabled: true
+        failoverPriority:
+          - topology.istio.io/cluster
+EOF
+{{< /text >}}
+
+Apply the same `DestinationRule` in `cluster2` as well:
+
+{{< text bash >}}
+$ kubectl --context "${CTX_CLUSTER2}" apply -n sample -f - <<EOF
+apiVersion: networking.istio.io/v1
+kind: DestinationRule
+metadata:
+  name: helloworld
+spec:
+  host: helloworld.sample.svc.cluster.local
+  trafficPolicy:
+    outlierDetection:
+      consecutive5xxErrors: 1
+      interval: 1s
+      baseEjectionTime: 1m
+    loadBalancer:
+      simple: ROUND_ROBIN
+      localityLbSetting:
+        enabled: true
+        failoverPriority:
+          - topology.istio.io/cluster
+EOF
+{{< /text >}}
+
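+This `DestinationRule` configures the following:
+
+- [Outlier detection](/zh/docs/reference/config/networking/destination-rule/#OutlierDetection) for the `HelloWorld` service.
+  This tells the waypoint proxies how to identify when a service's endpoints are unhealthy. It is required for failover to function properly.
+
+- [Failover priority](/zh/docs/reference/config/networking/destination-rule/#LocalityLoadBalancerSetting), which tells
+  the waypoint proxy how to prioritize endpoints when routing requests. In this example,
+  the waypoint proxy prefers endpoints in the same cluster over endpoints in other clusters.
+
+With these policies in place, the waypoint proxies prefer endpoints in their own cluster
+as long as outlier detection considers those endpoints healthy.
+
+## Verify traffic stays in the local cluster {#verify-traffic-stays-in-local-cluster}
+
+Send a request from the `curl` pod on `cluster1` to the `HelloWorld` service: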
+
+{{< text bash >}}
+$ kubectl exec --context "${CTX_CLUSTER1}" -n sample -c curl \
+    "$(kubectl get pod --context "${CTX_CLUSTER1}" -n sample -l \
+    app=curl -o jsonpath='{.items[0].metadata.name}')" \
+    -- curl -sS helloworld.sample:5000/hello
+{{< /text >}}
+
+Repeat this request several times and verify that the `HelloWorld` version is always `v1`,
+because traffic stays in `cluster1`:
+
+{{< text plain >}}
+Hello version: v1, instance: helloworld-v1-954745fd-z6qcn
+Hello version: v1, instance: helloworld-v1-954745fd-z6qcn
+...
+{{< /text >}}
+
+Similarly, send the request several times from the `curl` pod on `cluster2`:
+
+{{< text bash >}}
+$ kubectl exec --context "${CTX_CLUSTER2}" -n sample -c curl \
+    "$(kubectl get pod --context "${CTX_CLUSTER2}" -n sample -l \
+    app=curl -o jsonpath='{.items[0].metadata.name}')" \
+    -- curl -sS helloworld.sample:5000/hello
+{{< /text >}}
+
+The version in the responses should show that all requests are processed in `cluster2`:
+
+{{< text plain >}}
+Hello version: v2, instance: helloworld-v2-7b768b9bbd-7zftm
+Hello version: v2, instance: helloworld-v2-7b768b9bbd-7zftm
+...
+{{< /text >}}
+
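+## Verify failover to another cluster {#verify-failover-to-another-cluster}
+
+To verify that failover to the other cluster works, simulate an outage of the `HelloWorld` service
+in `cluster1` by scaling its deployment down to zero replicas: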
+
+{{< text bash >}}
+$ kubectl --context "${CTX_CLUSTER1}" scale --replicas=0 deployment/helloworld-v1 -n sample
+{{< /text >}}
+
+Send requests from the `curl` pod on `cluster1` to the `HelloWorld` service again:
+
+{{< text bash >}}
+$ kubectl exec --context "${CTX_CLUSTER1}" -n sample -c curl \
+    "$(kubectl get pod --context "${CTX_CLUSTER1}" -n sample -l \
+    app=curl -o jsonpath='{.items[0].metadata.name}')" \
+    -- curl -sS helloworld.sample:5000/hello
+{{< /text >}}
+
+This time the requests should be handled by the `HelloWorld` service in `cluster2`,
+because no endpoints are available in `cluster1`:
+
+{{< text plain >}}
+Hello version: v2, instance: helloworld-v2-7b768b9bbd-7zftm
+Hello version: v2, instance: helloworld-v2-7b768b9bbd-7zftm
+...
+{{< /text >}}
+
+**Congratulations!** You have successfully configured locality failover in an Istio ambient multicluster deployment!
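
The verification steps in the new guide ask the reader to repeat the same request several times and check which version answers. A minimal sketch of how that could be automated is shown here as a note outside the patch itself. The function names `repeat_hello` and `tally_versions` are hypothetical, and the sketch assumes the `sample` namespace, the `curl` pod, and the `Hello version: ...` response format used in the guide.

```shell
# repeat_hello CONTEXT [COUNT]
# Hypothetical helper: send COUNT requests (default 10) to the HelloWorld
# service from the curl pod in the cluster identified by CONTEXT.
repeat_hello() {
  ctx="$1"
  count="${2:-10}"
  pod="$(kubectl get pod --context "$ctx" -n sample -l app=curl \
    -o jsonpath='{.items[0].metadata.name}')"
  i=0
  while [ "$i" -lt "$count" ]; do
    kubectl exec --context "$ctx" -n sample -c curl "$pod" \
      -- curl -sS helloworld.sample:5000/hello
    i=$((i + 1))
  done
}

# tally_versions
# Hypothetical helper: read HelloWorld responses on stdin and print one
# "<count> <version>" line per version seen.
tally_versions() {
  sed -n 's/^Hello version: \([^,]*\),.*/\1/p' | sort | uniq -c | awk '{print $1, $2}'
}
```

For example, `repeat_hello "${CTX_CLUSTER1}" 20 | tally_versions` should list only `v1` while `cluster1` has healthy endpoints, and only `v2` after the `helloworld-v1` deployment is scaled to zero.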
@@ -204,4 +204,6 @@ Hello version: v1, instance: helloworld-v1-86f77cd7bd-cpxhv
 
 **Congratulations!** You have successfully installed and verified Istio on multiple clusters!
 
-<!-- TODO: Link to guide for locality load balancing once we add waypoint instructions -->
+## Next steps {#next-steps}
+
+Configure [locality failover](/zh/docs/ambient/install/multicluster/failover) for your multicluster deployment.