Update README_en.md
parent 7422d118e8
commit cb038cd2d3
@@ -117,6 +117,13 @@ python trans_batch_demo.py
python vllm_cli_demo.py
```

+ Use LoRA adapters with vLLM on the GLM-4-9B-Chat model.

```python
# vllm_cli_demo.py
# add LORA_PATH = ''
```

+ Build the server yourself and use the `OpenAI API` request format to communicate with the glm-4-9b model. This demo supports Function Call and All Tools functions.
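To illustrate the `OpenAI API` request format mentioned above, the sketch below builds a chat-completion payload. The endpoint URL and the model name `"glm-4"` are assumptions, not taken from this repo; adjust them to match the server that `openai_api_request.py` talks to.

```python
import json

# Minimal OpenAI-style chat completion payload (a sketch; the model name
# "glm-4" and the endpoint below are assumptions, adjust to your server).
payload = {
    "model": "glm-4",
    "messages": [
        {"role": "user", "content": "Hello, GLM-4!"}
    ],
    "temperature": 0.8,
    "stream": False,
}

# Send it with any HTTP client, e.g.:
#   requests.post("http://127.0.0.1:8000/v1/chat/completions", json=payload)
print(json.dumps(payload, indent=2))
```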
@@ -132,17 +139,10 @@ Client request:
python openai_api_request.py
```

### LoRA adapters with vLLM

+ Use LoRA adapters with vLLM on the GLM-4-9B-Chat model.

```shell
python vllm_cli_lora_demo.py
```
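A hedged sketch of what `vllm_cli_lora_demo.py` does with the `LORA_PATH` variable: load the base model with LoRA enabled and pass a `LoRARequest` at generation time. The base-model id, adapter name, and path here are placeholders, not values from this repo; the `LoRARequest` API itself is vLLM's.

```python
# Placeholder adapter path; vllm_cli_lora_demo.py expects you to set LORA_PATH.
LORA_PATH = "/path/to/your/lora/adapter"

def generate_with_lora(prompt: str) -> str:
    # Imported lazily so the sketch can be read without vLLM installed.
    from vllm import LLM, SamplingParams
    from vllm.lora.request import LoRARequest

    llm = LLM(
        model="THUDM/glm-4-9b-chat",  # assumed base model id
        trust_remote_code=True,
        enable_lora=True,             # must be set before lora_request is accepted
    )
    outputs = llm.generate(
        [prompt],
        SamplingParams(temperature=0.8, max_tokens=256),
        lora_request=LoRARequest("glm4-lora", 1, LORA_PATH),
    )
    return outputs[0].outputs[0].text

# Usage (requires vLLM and a GPU):
#   print(generate_with_lora("Hello, GLM-4!"))
```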
## Stress test

Users can use this code to test the generation speed of the model on the transformers backend on their own devices:

```shell
python trans_stress_test.py
```