Update README_en.md
commit cb038cd2d3 (parent 7422d118e8)

@@ -117,6 +117,13 @@ python trans_batch_demo.py
python vllm_cli_demo.py
```

+ Use LoRA adapters with vLLM on the GLM-4-9B-Chat model.
+
+ ```python
+ # vllm_cli_demo.py
+ # add LORA_PATH = ''
+ ```
+
Build the server yourself and use the `OpenAI API` request format to communicate with the glm-4-9b model. This
demo supports Function Call and All Tools functions.
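The hunk above only notes that `vllm_cli_demo.py` gains a `LORA_PATH` constant. For readers unfamiliar with how an adapter is wired into vLLM, here is a minimal, hypothetical sketch of LoRA loading with vLLM's offline `LLM` API; the model path, adapter name, and prompt are illustrative assumptions, not code from the repository:

```python
# Minimal sketch, not the repository's vllm_cli_demo.py.
from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

MODEL_PATH = "THUDM/glm-4-9b-chat"  # assumed base-model path
LORA_PATH = "/path/to/your/lora"    # fine-tuned adapter directory (placeholder)

# LoRA support must be enabled when the engine is created.
llm = LLM(model=MODEL_PATH, trust_remote_code=True, enable_lora=True)
params = SamplingParams(temperature=0.95, max_tokens=256)

outputs = llm.generate(
    ["Hello, who are you?"],
    params,
    # Name, integer id, and local path of the adapter applied to this request.
    lora_request=LoRARequest("glm4-chat-lora", 1, LORA_PATH),
)
print(outputs[0].outputs[0].text)
```

Leaving `LORA_PATH` empty presumably falls back to the base model, which would match the empty default shown in the diff.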
@@ -132,17 +139,10 @@ Client request:
python openai_api_request.py
```

- ### LoRA adapters with vLLM
- Use LoRA adapters with vLLM on the GLM-4-9B-Chat model.
-
- ```shell
- python vllm_cli_lora_demo.py
- ```
-
## Stress test

Users can use this code to test the generation speed of the model on the transformers backend on their own devices:

```shell
python trans_stress_test.py
```
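For the `OpenAI API`-compatible server referenced above (with `openai_api_request.py` as the client), a request in the OpenAI format looks roughly like the sketch below; the base URL, API key, and registered model name are assumptions about a local deployment, not values taken from the repository:

```python
# Hypothetical client sketch; base_url, api_key, and model name are assumptions
# about how the self-hosted server is configured.
from openai import OpenAI

client = OpenAI(
    base_url="http://127.0.0.1:8000/v1",  # address of the self-hosted server
    api_key="EMPTY",                      # local servers typically ignore the key
)

response = client.chat.completions.create(
    model="glm-4",  # model name the server registers (assumption)
    messages=[{"role": "user", "content": "Tell me about the GLM-4-9B model."}],
    temperature=0.8,
    max_tokens=256,
)
print(response.choices[0].message.content)
```

Because the server speaks the OpenAI request format, the Function Call and All Tools features mentioned in the README would map onto the standard `tools` field of this same request.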
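The stress-test section only names `trans_stress_test.py`. As a rough illustration of what a transformers-backend speed test measures, the sketch below times a single `generate()` call and reports tokens per second; the model path, prompt, and generation length are placeholders, and the repository script may measure things differently:

```python
# Rough throughput sketch, not the repository's trans_stress_test.py.
import time
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_PATH = "THUDM/glm-4-9b-chat"  # assumed model path

tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_PATH,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
).eval()

inputs = tokenizer("Write a short story about a robot.", return_tensors="pt").to(model.device)

start = time.perf_counter()
with torch.inference_mode():
    output = model.generate(**inputs, max_new_tokens=256, do_sample=False)
elapsed = time.perf_counter() - start

new_tokens = output.shape[-1] - inputs["input_ids"].shape[-1]
print(f"{new_tokens} new tokens in {elapsed:.2f}s -> {new_tokens / elapsed:.1f} tokens/s")
```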