Update README_en.md

sixgod 2024-09-04 21:20:45 +08:00 committed by GitHub
parent 7422d118e8
commit cb038cd2d3
GPG Key ID: B5690EEEBB952194
1 changed file with 8 additions and 8 deletions


@@ -117,6 +117,13 @@ python trans_batch_demo.py
python vllm_cli_demo.py
```
+ use LoRA adapters with vLLM on GLM-4-9B-Chat model.
```python
# vllm_cli_demo.py
# add LORA_PATH = ''
```
+ Build the server by yourself and use the request format of `OpenAI API` to communicate with the glm-4-9b model. This
demo supports Function Call and All Tools functions.
@@ -132,13 +139,6 @@ Client request:
python openai_api_request.py
```
### LoRA adapters with vLLM
+ use LoRA adapters with vLLM on GLM-4-9B-Chat model.
```shell
python vllm_cli_lora_demo.py
```
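
The README lines in this diff only say to set `LORA_PATH`, so the following is a minimal sketch of how such a path could be turned into vLLM LoRA settings. It is not taken from `vllm_cli_demo.py`: the helper names, the adapter name `glm-4-lora`, and the adapter id `1` are assumptions made for illustration. The only vLLM API it relies on is `LoRARequest(name, id, path)`, which is imported lazily so the sketch also loads on a machine without vLLM.

```python
from typing import Optional, Tuple

# Assumption: an empty string means "no adapter", mirroring LORA_PATH = ''.
LORA_PATH = ""


def lora_settings(lora_path: str) -> dict:
    """Derive LoRA-related settings from a path (hypothetical helper)."""
    enabled = bool(lora_path.strip())
    request_args: Optional[Tuple[str, int, str]] = (
        ("glm-4-lora", 1, lora_path) if enabled else None  # illustrative name and id
    )
    return {"enable_lora": enabled, "lora_request_args": request_args}


def make_lora_request(lora_path: str):
    """Return a vLLM LoRARequest, or None when no adapter path is set."""
    args = lora_settings(lora_path)["lora_request_args"]
    if args is None:
        return None
    # Imported only when an adapter is requested, so vLLM stays optional here.
    from vllm.lora.request import LoRARequest

    name, adapter_id, path = args
    return LoRARequest(name, adapter_id, path)


if __name__ == "__main__":
    print(lora_settings(LORA_PATH))
```

In a real script, `enable_lora` would typically be passed to the engine arguments (for example `LLM(model=..., enable_lora=True)`), and the object returned by `make_lora_request` would be passed as `lora_request=` when generating.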
## Stress test
Users can use this code to test the generation speed of the model on the transformers backend on their own devices: