diff --git a/basic_demo/README_en.md b/basic_demo/README_en.md
index 07086d1..412ee90 100644
--- a/basic_demo/README_en.md
+++ b/basic_demo/README_en.md
@@ -117,6 +117,32 @@ python trans_batch_demo.py
 python vllm_cli_demo.py
 ```
 
++ Use LoRA adapters with vLLM on the GLM-4-9B-Chat model.
+
+```python
+# vllm_cli_demo.py
+# Set LORA_PATH to the local path of your LoRA adapter:
+LORA_PATH = ''
+```
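+
+Under the hood, this corresponds to enabling LoRA on the vLLM engine and passing a `LoRARequest` at generation time. A minimal sketch, assuming a local adapter directory (the adapter name and prompt are illustrative, not part of this repo):
+
+```python
+from vllm import LLM, SamplingParams
+from vllm.lora.request import LoRARequest
+
+LORA_PATH = '/path/to/your/lora_adapter'  # assumption: your adapter directory
+
+# LoRA support must be enabled when the engine is created
+llm = LLM(model="THUDM/glm-4-9b-chat", enable_lora=True, trust_remote_code=True)
+outputs = llm.generate(
+    ["Hello, who are you?"],
+    SamplingParams(temperature=0.8, max_tokens=128),
+    lora_request=LoRARequest("glm4-lora", 1, LORA_PATH),
+)
+print(outputs[0].outputs[0].text)
+```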
+
 + Build the server by yourself and use the request format of `OpenAI API` to communicate with the glm-4-9b model. This
   demo supports Function Call and All Tools functions.
 
@@ -132,17 +158,24 @@ Client request:
 python openai_api_request.py
 ```
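+
+A request in the `OpenAI API` format might look like the sketch below (the port, base path, and model name are assumptions based on common defaults; check the server script for the actual values):
+
+```python
+from openai import OpenAI
+
+# assumption: the local server listens on http://127.0.0.1:8000/v1
+client = OpenAI(base_url="http://127.0.0.1:8000/v1", api_key="EMPTY")
+response = client.chat.completions.create(
+    model="glm-4",  # assumption: the model name registered by the server
+    messages=[{"role": "user", "content": "Hello, who are you?"}],
+)
+print(response.choices[0].message.content)
+```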
 
-### LoRA adapters with vLLM
-+ use LoRA adapters with vLLM on GLM-4-9B-Chat model.
-
-```shell
-python vllm_cli_lora_demo.py
-```
-
 ## Stress test
 
 Users can use this code to test the generation speed of the model on the transformers backend on their own devices:
 
 ```shell
 python trans_stress_test.py
-```
\ No newline at end of file
+```