Update README_en.md

2024-09-04 21:20:45 +08:00 · cb038cd2d3
parent 7422d118e8
commit cb038cd2d3
1 changed file with 8 additions and 8 deletions
--- a/basic_demo/README_en.md
+++ b/basic_demo/README_en.md
@@ -117,6 +117,13 @@ python trans_batch_demo.py
 python vllm_cli_demo.py
 ```
 + Use LoRA adapters with vLLM on the GLM-4-9B-Chat model.
 ```python
 # vllm_cli_demo.py
 # add LORA_PATH = ''
 ```
 + Build your own server and communicate with the glm-4-9b model using the `OpenAI API` request format. This
  demo supports the Function Call and All Tools features.
@@ -132,13 +139,6 @@ Client request:
 python openai_api_request.py
 ```
 ### LoRA adapters with vLLM
 + Use LoRA adapters with vLLM on the GLM-4-9B-Chat model.
 ```shell
 python vllm_cli_lora_demo.py
 ```
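For readers who want to see roughly what enabling a LoRA adapter in vLLM involves, the following is a minimal sketch, not the contents of the demo scripts. `LORA_PATH` mirrors the variable named in this README; the model id in `MODEL_PATH`, the helper names `build_engine_kwargs` and `generate`, and the adapter name `glm-4-lora` are assumptions made for illustration. Chat templating is omitted.

```python
# Sketch only: not the code of vllm_cli_demo.py.
# MODEL_PATH, build_engine_kwargs, generate and "glm-4-lora" are hypothetical names.
MODEL_PATH = "THUDM/glm-4-9b-chat"  # assumed model id
LORA_PATH = ""  # empty string means "no adapter"


def build_engine_kwargs(model_path: str, lora_path: str) -> dict:
    """Keyword arguments for vllm.LLM; LoRA is enabled only when a path is given."""
    return {
        "model": model_path,
        "trust_remote_code": True,
        "enable_lora": bool(lora_path),
    }


def generate(prompt: str, model_path: str = MODEL_PATH, lora_path: str = LORA_PATH) -> str:
    """Generate one completion, applying the adapter when lora_path is non-empty."""
    # Imported lazily so build_engine_kwargs can be used without vLLM installed.
    from vllm import LLM, SamplingParams
    from vllm.lora.request import LoRARequest

    llm = LLM(**build_engine_kwargs(model_path, lora_path))
    # LoRARequest takes an adapter name, a positive integer id, and the adapter path.
    lora_request = LoRARequest("glm-4-lora", 1, lora_path) if lora_path else None
    outputs = llm.generate(
        [prompt],
        SamplingParams(max_tokens=128),
        lora_request=lora_request,
    )
    return outputs[0].outputs[0].text
```

Calling `generate("Hello", lora_path="/path/to/adapter")` would load the base model with LoRA support and route the request through the adapter; leaving `lora_path` empty runs the base model unchanged.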
 ## Stress test
 Users can use this code to test the generation speed of the model on the transformers backend on their own devices: