comment with trust_remote_code=True
This commit is contained in:
parent bca86f8c8e
commit d71b8c2284

README.md: 42 changed lines

@@ -11,38 +11,16 @@ Read this in [English](README_en.md)

 ## Project Updates

-- 🔥🔥 **News**: ```2024/11/01```: Added support for running the GLM-4-9B-Chat-hf and GLM-4v-9B models on vLLM >= 0.6.3 and transformers >= 4.46.0.
-- 🔥🔥 **News**: ```2024/10/25```: We open-sourced [GLM-4-Voice](https://github.com/THUDM/GLM-4-Voice), an end-to-end Chinese-English voice dialogue model.
-- 🔥 **News**: ```2024/10/12```: Added support for the vLLM framework to the GLM-4v-9B model.
-- 🔥 **News**: ```2024/09/06```: Added an OpenAI-API-compatible server built on the GLM-4v-9B model.
-- 🔥 **News**: ```2024/09/05```: We open-sourced [longcite-glm4-9b](https://huggingface.co/THUDM/LongCite-glm4-9b), a model that enables LLMs to generate fine-grained citations in long-context Q&A, along with the dataset [LongCite-45k](https://huggingface.co/datasets/THUDM/LongCite-45k). Try it online at [Huggingface Space](https://huggingface.co/spaces/THUDM/LongCite).
-- 🔥**News**: ```2024/09/04```: Added vLLM demo code that uses a LoRA adapter with the GLM-4-9B-Chat model.
-- 🔥**News**: ```2024/08/15```: We open-sourced [longwriter-glm4-9b](https://huggingface.co/THUDM/LongWriter-glm4-9b), a model capable of long-form output (over 10K tokens in a single turn), along with the dataset [LongWriter-6k](https://huggingface.co/datasets/THUDM/LongWriter-6k). Try it online at [Huggingface Space](https://huggingface.co/spaces/THUDM/LongWriter) or the [ModelScope Community Space](https://modelscope.cn/studios/ZhipuAI/LongWriter-glm4-9b-demo).
-- 🔥 **News**: ```2024/08/12```: The `transformers` version required by the GLM-4-9B-Chat model has been upgraded to `4.44.0`. Please pull all files again except the model weights (`*.safetensor` files and `tokenizer.model`) and strictly update the dependencies per `basic_demo/requirements.txt`.
-- 🔥 **News**: ```2024/07/24```: We released our latest technical deep dive on long context. See [here](https://medium.com/@ChatGLM/glm-long-scaling-pre-trained-model-contexts-to-millions-caa3c48dea85) for the technical report on the long-context techniques used in training the open-source GLM-4-9B model.
-- 🔥 **News**: ``2024/7/16``: The `transformers` version required by the GLM-4-9B-Chat model has been upgraded to `4.42.4`. Please update the model configuration files and update the dependencies per `basic_demo/requirements.txt`.
-- 🔥 **News**: ``2024/7/9``: The GLM-4-9B-Chat model has been adapted to [Ollama](https://github.com/ollama/ollama) and [Llama.cpp](https://github.com/ggerganov/llama.cpp); see this [PR](https://github.com/ggerganov/llama.cpp/pull/8031) for details.
-- 🔥 **News**: ``2024/7/1``: We updated fine-tuning for GLM-4V-9B. You need to update the run and configuration files in our model repository to support it. For more fine-tuning details (e.g. dataset format, GPU memory requirements), please see [finetune_demo](finetune_demo).
-- 🔥 **News**: ``2024/6/28``: Together with the Intel technical team, we improved the ITREX and OpenVINO deployment tutorials for GLM-4-9B-Chat, so you can deploy the open-source GLM-4-9B model efficiently on Intel CPU/GPU devices. See [intel_device_demo](intel_device_demo).
-- 🔥 **News**: ``2024/6/24``: We updated the run and configuration files in the model repository to support Flash Attention 2. Please update the model configuration files and refer to the example code in `basic_demo/trans_cli_demo.py`.
-- 🔥 **News**: ``2024/6/19``: We updated the run and configuration files in the model repository and fixed some known inference issues. Please clone the latest model repository.
-- 🔥 **News**: ``2024/6/18``: We released a [technical report](https://arxiv.org/pdf/2406.12793); welcome to check it out.
-- 🔥 **News**: ``2024/6/05``: We released the GLM-4-9B series of open-source models.
+- 🔥🔥 **News**: ```2024/11/01```: The dependencies of this repository have been upgraded; please update the dependencies in `requirements.txt` to keep the models running correctly. [glm-4-9b-chat-hf](https://huggingface.co/THUDM/glm-4-9b-chat-hf) provides model weights adapted to `transformers>=4.46`, implemented with the `GlmModel` class in the `transformers` library. Meanwhile, `tokenizer_chatglm.py` in [glm-4-9b-chat](https://huggingface.co/THUDM/glm-4-9b-chat) and [glm-4v-9b](https://huggingface.co/THUDM/glm-4v-9b) has been updated for the latest `transformers`; please fetch the updated files from HuggingFace.
+- 🔥 **News**: ```2024/10/27```: We open-sourced [LongReward](https://github.com/THUDM/LongReward), which uses AI feedback to improve long-context large language models.
+- 🔥 **News**: ```2024/10/25```: We open-sourced [GLM-4-Voice](https://github.com/THUDM/GLM-4-Voice), an end-to-end Chinese-English voice dialogue model.
+- 🔥 **News**: ```2024/09/05```: We open-sourced [longcite-glm4-9b](https://huggingface.co/THUDM/LongCite-glm4-9b), a model that enables LLMs to generate fine-grained citations in long-context Q&A, along with the dataset [LongCite-45k](https://huggingface.co/datasets/THUDM/LongCite-45k). Try it online at [Huggingface Space](https://huggingface.co/spaces/THUDM/LongCite).
+- 🔥**News**: ```2024/08/15```: We open-sourced [longwriter-glm4-9b](https://huggingface.co/THUDM/LongWriter-glm4-9b), a model capable of long-form output (over 10K tokens in a single turn), along with the dataset [LongWriter-6k](https://huggingface.co/datasets/THUDM/LongWriter-6k). Try it online at [Huggingface Space](https://huggingface.co/spaces/THUDM/LongWriter) or the [ModelScope Community Space](https://modelscope.cn/studios/ZhipuAI/LongWriter-glm4-9b-demo).
+- 🔥 **News**: ```2024/07/24```: We released our latest technical deep dive on long context. See [here](https://medium.com/@ChatGLM/glm-long-scaling-pre-trained-model-contexts-to-millions-caa3c48dea85) for the technical report on the long-context techniques used in training the open-source GLM-4-9B model.
+- 🔥 **News**: ``2024/07/09``: The GLM-4-9B-Chat model has been adapted to [Ollama](https://github.com/ollama/ollama) and [Llama.cpp](https://github.com/ggerganov/llama.cpp); see this [PR](https://github.com/ggerganov/llama.cpp/pull/8031) for details.
+- 🔥 **News**: ``2024/06/18``: We released a [technical report](https://arxiv.org/pdf/2406.12793); welcome to check it out.
+- 🔥 **News**: ``2024/06/05``: We released the GLM-4-9B series of open-source models.

 ## Model Introduction

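Context for the ```2024/11/01``` entry above, which is what this commit documents: the original checkpoints ship custom modeling and tokenizer code, while the `-hf` checkpoints are implemented by the `GlmModel` class built into `transformers>=4.46`. A minimal sketch of the difference, assuming both checkpoints are fetched from HuggingFace under the names shown:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# glm-4-9b-chat ships custom code in its repository, so remote code
# must be trusted for both the tokenizer and the model.
tokenizer = AutoTokenizer.from_pretrained("THUDM/glm-4-9b-chat", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    "THUDM/glm-4-9b-chat",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    device_map="auto",
).eval()

# glm-4-9b-chat-hf uses the GlmModel implementation inside
# transformers>=4.46, so trust_remote_code is not needed.
tokenizer_hf = AutoTokenizer.from_pretrained("THUDM/glm-4-9b-chat-hf")
model_hf = AutoModelForCausalLM.from_pretrained(
    "THUDM/glm-4-9b-chat-hf",
    torch_dtype=torch.bfloat16,
    device_map="auto",
).eval()
```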

README_en.md: 86 changed lines

@@ -8,47 +8,36 @@
 </p>

 ## Update

-- 🔥🔥 **News**: ```2024/11/01```: Support for GLM-4-9B-Chat-hf and GLM-4v-9B models on vLLM >= 0.6.3 and transformers >= 4.46.0.
-- 🔥🔥 **News**: ```2024/10/25```: We have open-sourced the end-to-end Chinese-English voice dialogue model [GLM-4-Voice](https://github.com/THUDM/GLM-4-Voice).
-- 🔥 **News**: ```2024/10/12```: Added GLM-4v-9B model support for the vLLM framework.
-- 🔥 **News**: ```2024/09/06```: Added support for an OpenAI-compatible API server on the GLM-4v-9B model.
-- 🔥 **News**: ```2024/09/05```: We open-sourced a model enabling LLMs to generate fine-grained citations in long-context Q&A: [longcite-glm4-9b](https://huggingface.co/THUDM/LongCite-glm4-9b), along with the dataset [LongCite-45k](https://huggingface.co/datasets/THUDM/LongCite-45k). You are welcome to experience it online.
+- 🔥🔥 **News**: ```2024/11/01```: Dependencies have been updated in this repository. Please update the dependencies in `requirements.txt` to ensure the model runs correctly. The model weights for [glm-4-9b-chat-hf](https://huggingface.co/THUDM/glm-4-9b-chat-hf) are compatible with `transformers>=4.46` and are implemented using the `GlmModel` class in the transformers library. Additionally, `tokenizer_chatglm.py` in [glm-4-9b-chat](https://huggingface.co/THUDM/glm-4-9b-chat) and [glm-4v-9b](https://huggingface.co/THUDM/glm-4v-9b) has been updated for the latest version of `transformers`; please fetch the updated files from HuggingFace.
+- 🔥 **News**: ```2024/10/27```: We have open-sourced [LongReward](https://github.com/THUDM/LongReward), which uses AI feedback to improve long-context large language models.
+- 🔥 **News**: ```2024/10/25```: We have open-sourced the end-to-end Mandarin-English voice dialogue model [GLM-4-Voice](https://github.com/THUDM/GLM-4-Voice).
+- 🔥 **News**: ```2024/09/05```: We have open-sourced [longcite-glm4-9b](https://huggingface.co/THUDM/LongCite-glm4-9b), a model enabling LLMs to produce fine-grained citations in long-context Q&A, along with the dataset [LongCite-45k](https://huggingface.co/datasets/THUDM/LongCite-45k). Try it out online at [Huggingface Space](https://huggingface.co/spaces/THUDM/LongCite).
-- 🔥 **News**: ```2024/09/04```: Added demo code for using vLLM with a LoRA adapter on the GLM-4-9B-Chat model.
-- 🔥 **News**: ```2024/08/15```: We have open-sourced a model with long-text output capability (a single-turn LLM output can exceed 10K tokens): [longwriter-glm4-9b](https://huggingface.co/THUDM/LongWriter-glm4-9b), along with the dataset [LongWriter-6k](https://huggingface.co/datasets/THUDM/LongWriter-6k). You're welcome to [try it online](https://huggingface.co/spaces/THUDM/LongWriter).
-- 🔥 **News**: ```2024/08/12```: The `transformers` version required for the GLM-4-9B-Chat model has been upgraded to `4.44.0`. Please pull all files again except for the model weights (`*.safetensor` files and `tokenizer.model`), and strictly update the dependencies as per `basic_demo/requirements.txt`.
-- 🔥 **News**: ```2024/07/24```: We released the latest technical interpretation related to long texts. Check [here](https://medium.com/@ChatGLM/glm-long-scaling-pre-trained-model-contexts-to-millions-caa3c48dea85) for our technical report on long-context technology in the training of the open-source GLM-4-9B model.
-- 🔥 **News**: ``2024/7/16``: The `transformers` version that the GLM-4-9B-Chat model depends on has been upgraded to `4.42.4`. Please update the model configuration file and refer to `basic_demo/requirements.txt` to update the dependencies.
-- 🔥 **News**: ``2024/7/9``: The GLM-4-9B-Chat model has been adapted to [Ollama](https://github.com/ollama/ollama) and [Llama.cpp](https://github.com/ggerganov/llama.cpp); you can check the specific details in this [PR](https://github.com/ggerganov/llama.cpp/pull/8031).
-- 🔥 **News**: ``2024/7/1``: We have updated the multimodal fine-tuning of GLM-4V-9B. You need to update the run file and configuration file of our model repository to support this feature. For more fine-tuning details (such as dataset format and GPU memory requirements), see [finetune_demo](finetune_demo).
-- 🔥 **News**: ``2024/6/28``: We worked with the Intel technical team to improve the ITREX and OpenVINO deployment tutorials for GLM-4-9B-Chat. You can use Intel CPU/GPU devices to efficiently deploy the GLM-4-9B open-source model. Welcome to [view](intel_device_demo).
-- 🔥 **News**: ``2024/6/24``: We have updated the running files and configuration files of the model repository to support Flash Attention 2. Please update the model configuration file and refer to the sample code in `basic_demo/trans_cli_demo.py`.
-- 🔥 **News**: ``2024/6/19``: We updated the running files and configuration files of the model repository and fixed some model inference issues. Welcome to clone the latest model repository.
-- 🔥 **News**: ``2024/6/18``: We released a [technical report](https://arxiv.org/pdf/2406.12793); welcome to check it out.
-- 🔥 **News**: ``2024/6/05``: We released the GLM-4-9B series of open-source models.
+- 🔥 **News**: ```2024/08/15```: We have open-sourced [longwriter-glm4-9b](https://huggingface.co/THUDM/LongWriter-glm4-9b), a model capable of generating over 10,000 tokens in single-turn dialogue, along with the dataset [LongWriter-6k](https://huggingface.co/datasets/THUDM/LongWriter-6k). Experience it online at [Huggingface Space](https://huggingface.co/spaces/THUDM/LongWriter) or the [ModelScope Community Space](https://modelscope.cn/studios/ZhipuAI/LongWriter-glm4-9b-demo).
+- 🔥 **News**: ```2024/07/24```: We published the latest technical insights on long-text processing. Check out our technical report on training the open-source GLM-4-9B model for long texts [here](https://medium.com/@ChatGLM/glm-long-scaling-pre-trained-model-contexts-to-millions-caa3c48dea85).
+- 🔥 **News**: ```2024/07/09```: The GLM-4-9B-Chat model is now compatible with [Ollama](https://github.com/ollama/ollama) and [Llama.cpp](https://github.com/ggerganov/llama.cpp). See detailed information in this [PR](https://github.com/ggerganov/llama.cpp/pull/8031).
+- 🔥 **News**: ```2024/06/18```: We have released a [technical report](https://arxiv.org/pdf/2406.12793), available for viewing.
+- 🔥 **News**: ```2024/06/05```: We released the GLM-4-9B series of open-source models.

 ## Model Introduction

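The ```2024/09/04``` entry above mentions a vLLM demo with a LoRA adapter, but that demo code is not part of this diff. The following is only a sketch of the generic vLLM LoRA pattern; the adapter path, adapter name, and sampling settings are placeholder assumptions:

```python
from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

# enable_lora must be set when the engine is created; the adapter
# directory below is a hypothetical path, not part of this repo.
llm = LLM(model="THUDM/glm-4-9b-chat", trust_remote_code=True, enable_lora=True)

outputs = llm.generate(
    ["Hello, who are you?"],
    SamplingParams(temperature=0.8, max_tokens=256),
    # LoRARequest(adapter_name, unique_int_id, local_path_to_adapter)
    lora_request=LoRARequest("demo_adapter", 1, "/path/to/lora_adapter"),
)
print(outputs[0].outputs[0].text)
```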
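Likewise, for the ```2024/09/06``` entry about the OpenAI-compatible server on GLM-4v-9B: once such a server is running it can be queried with the standard `openai` client. The base URL, port, and served model name below are assumptions, not values taken from this diff:

```python
import base64
from openai import OpenAI

# Point the standard client at the locally hosted server (assumed address).
client = OpenAI(base_url="http://127.0.0.1:8000/v1", api_key="EMPTY")

with open("demo.jpg", "rb") as f:  # placeholder image file
    image_b64 = base64.b64encode(f.read()).decode()

response = client.chat.completions.create(
    model="glm-4v",  # model name as registered by the server (assumed)
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe this image."},
            {"type": "image_url",
             "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
        ],
    }],
)
print(response.choices[0].message.content)
```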
@@ -67,15 +56,14 @@ GPT-4-turbo-2024-04-09, Gemini 1.0 Pro, Qwen-VL-Max, and Claude 3 Opus.

 ## Model List

-| Model | Type | Seq Length | Transformers | vLLM | Download | Online Demo |
-|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
-| GLM-4-9B | Base | 8K | <= 4.45 | <= 0.6.2 | [🤗 Huggingface](https://huggingface.co/THUDM/glm-4-9b)<br> [🤖 ModelScope](https://modelscope.cn/models/ZhipuAI/glm-4-9b)<br> [🟣 WiseModel](https://wisemodel.cn/models/ZhipuAI/glm-4-9b) | / |
-| GLM-4-9B-Chat | Chat | 128K | <= 4.45 | <= 0.6.2 | [🤗 Huggingface](https://huggingface.co/THUDM/glm-4-9b-chat)<br> [🤖 ModelScope](https://modelscope.cn/models/ZhipuAI/glm-4-9b-chat)<br> [🟣 WiseModel](https://wisemodel.cn/models/ZhipuAI/GLM-4-9B-Chat) | [🤖 ModelScope CPU](https://modelscope.cn/studios/dash-infer/GLM-4-Chat-DashInfer-Demo/summary)<br> [🤖 ModelScope vLLM](https://modelscope.cn/studios/ZhipuAI/glm-4-9b-chat-vllm/summary) |
-| GLM-4-9B-Chat-HF | Chat | 128K | >= 4.46 | <= 0.6.2 | [🤗 Huggingface](https://huggingface.co/THUDM/glm-4-9b-chat-hf)<br> [🤖 ModelScope](https://modelscope.cn/models/ZhipuAI/glm-4-9b-chat-hf) | [🤖 ModelScope CPU](https://modelscope.cn/studios/dash-infer/GLM-4-Chat-DashInfer-Demo/summary)<br> [🤖 ModelScope vLLM](https://modelscope.cn/studios/ZhipuAI/glm-4-9b-chat-vllm/summary) |
-| GLM-4-9B-Chat-1M | Chat | 1M | <= 4.45 | <= 0.6.2 | [🤗 Huggingface](https://huggingface.co/THUDM/glm-4-9b-chat-1m)<br> [🤖 ModelScope](https://modelscope.cn/models/ZhipuAI/glm-4-9b-chat-1m)<br> [🟣 WiseModel](https://wisemodel.cn/models/ZhipuAI/GLM-4-9B-Chat-1M) | / |
-| GLM-4-9B-Chat-1M-HF | Chat | 1M | >= 4.46 | <= 0.6.2 | [🤗 Huggingface](https://huggingface.co/THUDM/glm-4-9b-chat-1m-hf)<br> [🤖 ModelScope](https://modelscope.cn/models/ZhipuAI/glm-4-9b-chat-1m-hf) | / |
-| GLM-4V-9B | Chat | 8K | >= 4.46 | >= 0.6.3 | [🤗 Huggingface](https://huggingface.co/THUDM/glm-4v-9b)<br> [🤖 ModelScope](https://modelscope.cn/models/ZhipuAI/glm-4v-9b)<br> [🟣 WiseModel](https://wisemodel.cn/models/ZhipuAI/GLM-4V-9B) | [🤖 ModelScope](https://modelscope.cn/studios/ZhipuAI/glm-4v-9b-Demo/summary) |
+| Model | Type | Seq Length | Transformers | Download | Online Demo |
+|:---:|:---:|:---:|:---:|:---:|:---:|
+| GLM-4-9B | Base | 8K | `4.44 - 4.45` | [🤗 Huggingface](https://huggingface.co/THUDM/glm-4-9b)<br> [🤖 ModelScope](https://modelscope.cn/models/ZhipuAI/glm-4-9b)<br> [🟣 WiseModel](https://wisemodel.cn/models/ZhipuAI/glm-4-9b) | / |
+| GLM-4-9B-Chat | Chat | 128K | `4.44 - 4.45` | [🤗 Huggingface](https://huggingface.co/THUDM/glm-4-9b-chat)<br> [🤖 ModelScope](https://modelscope.cn/models/ZhipuAI/glm-4-9b-chat)<br> [🟣 WiseModel](https://wisemodel.cn/models/ZhipuAI/GLM-4-9B-Chat) | [🤖 ModelScope CPU](https://modelscope.cn/studios/dash-infer/GLM-4-Chat-DashInfer-Demo/summary)<br> [🤖 ModelScope vLLM](https://modelscope.cn/studios/ZhipuAI/glm-4-9b-chat-vllm/summary) |
+| GLM-4-9B-Chat-HF | Chat | 128K | `>= 4.46.0` | [🤗 Huggingface](https://huggingface.co/THUDM/glm-4-9b-chat-hf)<br> [🤖 ModelScope](https://modelscope.cn/models/ZhipuAI/glm-4-9b-chat-hf) | [🤖 ModelScope CPU](https://modelscope.cn/studios/dash-infer/GLM-4-Chat-DashInfer-Demo/summary)<br> [🤖 ModelScope vLLM](https://modelscope.cn/studios/ZhipuAI/glm-4-9b-chat-vllm/summary) |
+| GLM-4-9B-Chat-1M | Chat | 1M | `4.44 - 4.45` | [🤗 Huggingface](https://huggingface.co/THUDM/glm-4-9b-chat-1m)<br> [🤖 ModelScope](https://modelscope.cn/models/ZhipuAI/glm-4-9b-chat-1m)<br> [🟣 WiseModel](https://wisemodel.cn/models/ZhipuAI/GLM-4-9B-Chat-1M) | / |
+| GLM-4-9B-Chat-1M-HF | Chat | 1M | `>= 4.46.0` | [🤗 Huggingface](https://huggingface.co/THUDM/glm-4-9b-chat-1m-hf)<br> [🤖 ModelScope](https://modelscope.cn/models/ZhipuAI/glm-4-9b-chat-1m-hf) | / |
+| GLM-4V-9B | Chat | 8K | `>= 4.46.0` | [🤗 Huggingface](https://huggingface.co/THUDM/glm-4v-9b)<br> [🤖 ModelScope](https://modelscope.cn/models/ZhipuAI/glm-4v-9b)<br> [🟣 WiseModel](https://wisemodel.cn/models/ZhipuAI/GLM-4V-9B) | [🤖 ModelScope](https://modelscope.cn/studios/ZhipuAI/glm-4v-9b-Demo/summary) |

 ## BenchMark
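Since the two tables pin different `transformers` ranges per checkpoint, a small runtime guard can catch a mismatch early. The threshold below is simply read off the new table for the `-hf` and GLM-4V-9B rows:

```python
import transformers
from packaging import version  # packaging ships as a transformers dependency

# The -hf checkpoints and GLM-4V-9B require transformers >= 4.46.0 per the
# table above; the original checkpoints instead expect 4.44 - 4.45.
if version.parse(transformers.__version__) < version.parse("4.46.0"):
    raise RuntimeError(
        f"transformers {transformers.__version__} is too old for the -hf "
        "checkpoints; install transformers>=4.46.0 or use the original weights."
    )
```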

@@ -11,21 +11,24 @@ ensuring that the CLI interface displays formatted text correctly.

 If you use Flash Attention, install flash-attn and pass attn_implementation="flash_attention_2" when loading the model.

+Note:
+Using glm-4-9b-chat-hf requires `transformers>=4.46.0`.
 """

 import torch
 from threading import Thread
 from transformers import AutoTokenizer, AutoModelForCausalLM, StoppingCriteria, StoppingCriteriaList, TextIteratorStreamer

-MODEL_PATH = "THUDM/glm-4-9b-chat-hf"
+MODEL_PATH = "THUDM/glm-4-9b-chat"

-tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)
+# trust_remote_code=True is needed when using `glm-4-9b-chat`;
+# it is not needed when using `glm-4-9b-chat-hf`.
+# This applies to both the tokenizer and the model.
+tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH, trust_remote_code=True)

 model = AutoModelForCausalLM.from_pretrained(
     MODEL_PATH,
     # attn_implementation="flash_attention_2",  # uncomment to use Flash Attention
     torch_dtype=torch.bfloat16,  # flash-attn requires bfloat16 or float16
+    trust_remote_code=True,
     device_map="auto").eval()
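
The hunk above imports `TextIteratorStreamer` and `Thread`, the standard pattern for token-by-token CLI streaming: `generate()` runs in a background thread while the main thread consumes decoded text. A self-contained sketch of that pattern (the prompt is a placeholder):

```python
import torch
from threading import Thread
from transformers import AutoModelForCausalLM, AutoTokenizer, TextIteratorStreamer

MODEL_PATH = "THUDM/glm-4-9b-chat"
tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_PATH, torch_dtype=torch.bfloat16,
    trust_remote_code=True, device_map="auto",
).eval()

# Chat-format the prompt; the GLM-4 tokenizer provides apply_chat_template.
input_ids = tokenizer.apply_chat_template(
    [{"role": "user", "content": "Hello!"}],
    add_generation_prompt=True, return_tensors="pt",
).to(model.device)

# The streamer yields decoded text while generate() runs in a thread.
streamer = TextIteratorStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
thread = Thread(target=model.generate, kwargs={
    "input_ids": input_ids, "streamer": streamer, "max_new_tokens": 256,
})
thread.start()
for piece in streamer:
    print(piece, end="", flush=True)
thread.join()
```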

@@ -10,19 +10,20 @@ Note: The script includes a modification to handle markdown to plain text conversion,
 ensuring that the CLI interface displays formatted text correctly.
 """

 import os
 import torch
 from threading import Thread
 from transformers import (
     AutoTokenizer,
     StoppingCriteria,
     StoppingCriteriaList,
-    TextIteratorStreamer, AutoModel, BitsAndBytesConfig
+    TextIteratorStreamer,
+    AutoModel,
+    BitsAndBytesConfig
 )

 from PIL import Image

-MODEL_PATH = os.environ.get('MODEL_PATH', 'THUDM/glm-4v-9b')
+MODEL_PATH = "THUDM/glm-4v-9b"

 tokenizer = AutoTokenizer.from_pretrained(
     MODEL_PATH,
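
For GLM-4V-9B, the tokenizer's chat template also packs the image: on the glm-4v-9b model card, the message dict carries an `image` entry alongside `content`. A hedged sketch of that usage (the image path is a placeholder, and the loading pattern follows the model card rather than this truncated hunk):

```python
import torch
from PIL import Image
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("THUDM/glm-4v-9b", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    "THUDM/glm-4v-9b", torch_dtype=torch.bfloat16,
    trust_remote_code=True, device_map="auto",
).eval()

image = Image.open("demo.jpg").convert("RGB")
# Per the model card, the image rides along in the message dict and the
# template returns a dict of tensors (input_ids, attention_mask, images, ...).
inputs = tokenizer.apply_chat_template(
    [{"role": "user", "image": image, "content": "Describe this image."}],
    add_generation_prompt=True, tokenize=True,
    return_tensors="pt", return_dict=True,
).to(model.device)

with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```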