[doc]fix readme for kv pool user guide (#4271)

Pz1116 · web-flow · commit d43022f3ed28 · 2025-11-19T15:57:50.000+08:00
### What this PR does / why we need it?

Add the `register_buffer` parameter to the example for the PD aggregated scenario.

- vLLM version: v0.11.0
- vLLM main: vllm-project/vllm@2918c1b

Signed-off-by: Pz1116 <zpbzpb123123@gmail.com>
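
For reviewers who generate the connector config programmatically instead of hand-writing the JSON, here is a minimal sketch of the resulting structure. The keys and values are copied from the example changed in this PR; the helper name `mixed_kv_config` is hypothetical and is not part of vLLM or Mooncake.

```python
import json


def mixed_kv_config() -> dict:
    """Hypothetical helper: build the connector config for the PD-mixed example.

    Keys and values mirror the JSON in the user guide; nothing here calls
    into vLLM or Mooncake.
    """
    return {
        "kv_connector": "MooncakeConnectorStoreV1",
        "kv_role": "kv_both",
        "kv_connector_extra_config": {
            # Defaults to false; the guide sets it to true only for PD-mixed.
            "register_buffer": True,
            "use_layerwise": False,
            "mooncake_rpc_port": "0",
        },
    }


# Serialize to the JSON string form used in the launch command.
config_json = json.dumps(mixed_kv_config(), indent=4)

if __name__ == "__main__":
    print(config_json)
```

Python's `True`/`False` serialize to JSON `true`/`false`, so the output matches the literal used in the guide.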
diff --git a/docs/source/user_guide/feature_guide/kv_pool_mooncake.md b/docs/source/user_guide/feature_guide/kv_pool_mooncake.md
@@ -266,12 +266,15 @@ python3 -m vllm.entrypoints.openai.api_server \
     "kv_connector": "MooncakeConnectorStoreV1",
     "kv_role": "kv_both",
     "kv_connector_extra_config": {
+        "register_buffer": true,
         "use_layerwise": false,
         "mooncake_rpc_port":"0"
     }
 }' > mix.log 2>&1
 ```
 
+`register_buffer` defaults to `false` and needs to be set to `true` only in the PD-mixed scenario.
+
 ### 2. Run Inference
 
 Configure the localhost, port, and model weight path in the command to your own settings. The requests sent will only go to the port where the mixed deployment script is located, and there is no need to start a separate proxy.