first commit

This commit is contained in:
xxl 2025-03-04 11:45:59 +08:00
parent 27d74e4a10
commit 09063ce8ab
15 changed files with 148756 additions and 1 deletion

187
README.md

@@ -1,3 +1,188 @@
---
license: apache-2.0
language:
- en
base_model:
- ibm-granite/granite-3.1-2b-instruct
library_name: transformers
---
# granite-vision-3.2-2b

**Model Summary:**
granite-vision-3.2-2b is a compact and efficient vision-language model, specifically designed for visual document understanding, enabling automated content extraction from tables, charts, infographics, plots, diagrams, and more. The model was trained on a meticulously curated instruction-following dataset, comprising diverse public datasets and synthetic datasets tailored to support a wide range of document understanding and general image tasks. It was trained by fine-tuning a Granite large language model with both image and text modalities.
**Evaluations:**
We evaluated Granite Vision 3.2 alongside other vision-language models (VLMs) in the 1B-4B parameter range using the standard lmms-eval benchmark. The evaluation spanned multiple public benchmarks, with particular emphasis on document understanding tasks while also including general visual question-answering benchmarks.

| | Molmo-E | InternVL2 | Phi3v | Phi3.5v | Granite Vision |
|-----------|--------------|----------------|-------------|------------|------------|
| **Document benchmarks** |
| DocVQA | 0.66 | 0.87 | 0.87 | 0.88 | **0.89** |
| ChartQA | 0.60 | 0.75 | 0.81 | 0.82 | **0.87** |
| TextVQA | 0.62 | 0.72 | 0.69 | 0.7 | **0.78** |
| AI2D | 0.63 | 0.74 | **0.79** | **0.79** | 0.76 |
| InfoVQA | 0.44 | 0.58 | 0.55 | 0.61 | **0.64** |
| OCRBench | 0.65 | 0.75 | 0.64 | 0.64 | **0.77** |
| LiveXiv VQA | 0.47 | 0.51 | **0.61** | - | **0.61** |
| LiveXiv TQA | 0.36 | 0.38 | 0.48 | - | **0.57** |
| **Other benchmarks** |
| MMMU | 0.32 | 0.35 | 0.42 | **0.44** | 0.37 |
| VQAv2 | 0.57 | 0.75 | 0.76 | 0.77 | **0.78** |
| RealWorldQA | 0.55 | 0.34 | 0.60 | 0.58 | **0.63** |
| VizWiz VQA | 0.49 | 0.46 | 0.57 | 0.57 | **0.63** |
| OK VQA | 0.40 | 0.44 | 0.51 | 0.53 | **0.56** |

- **Paper:** [Granite Vision: a lightweight, open-source multimodal model for enterprise Intelligence](https://arxiv.org/abs/2502.09927)
- **Release Date:** Feb 26th, 2025
- **License:** [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0)

**Supported Input Format:**
Currently, the model supports English instructions and images (PNG, JPEG, etc.) as input formats.
**Intended Use:**
The model is intended to be used in enterprise applications that involve processing visual and text data. In particular, the model is well-suited for a range of visual document understanding tasks, such as analyzing tables and charts, performing optical character recognition (OCR), and answering questions based on document content. Additionally, its capabilities extend to general image understanding, enabling it to be applied to a broader range of business applications. For tasks that exclusively involve text-based input, we suggest using our Granite large language models, which are optimized for text-only processing and offer superior performance compared to this model.
## Generation:
The Granite Vision model is natively supported in `transformers>=4.49`. Below is a simple example of how to use the `granite-vision-3.2-2b` model.
### Usage with `transformers`
First, make sure to install the latest version of `transformers`:
```shell
pip install "transformers>=4.49"
```
Then run the code:
```python
from transformers import AutoProcessor, AutoModelForVision2Seq
from huggingface_hub import hf_hub_download
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

model_path = "ibm-granite/granite-vision-3.2-2b"
processor = AutoProcessor.from_pretrained(model_path)
model = AutoModelForVision2Seq.from_pretrained(model_path).to(device)

# prepare image and text prompt, using the appropriate prompt template
img_path = hf_hub_download(repo_id=model_path, filename='example.png')
conversation = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": img_path},
            {"type": "text", "text": "What is the highest scoring model on ChartQA and what is its score?"},
        ],
    },
]
inputs = processor.apply_chat_template(
    conversation,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(device)

# autoregressively complete prompt
output = model.generate(**inputs, max_new_tokens=100)
print(processor.decode(output[0], skip_special_tokens=True))
```
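If GPU memory is tight, the same checkpoint can also be loaded in half precision. The snippet below is a minimal variation on the example above; using bfloat16 is our suggestion (it matches the dtype recorded in the released `config.json`), not an official recommendation.

```python
import torch
from transformers import AutoModelForVision2Seq

# Load the weights in bfloat16 (the dtype recorded in config.json) to roughly halve memory use.
model = AutoModelForVision2Seq.from_pretrained(
    "ibm-granite/granite-vision-3.2-2b",
    torch_dtype=torch.bfloat16,
).to("cuda" if torch.cuda.is_available() else "cpu")
```
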
### Usage with vLLM
The model can also be loaded with `vLLM`. First, make sure to install the following libraries:
```shell
pip install torch torchvision torchaudio
pip install vllm==0.6.6
```
Then, run the following code:
```python
from vllm import LLM, SamplingParams
from vllm.assets.image import ImageAsset
from huggingface_hub import hf_hub_download
from PIL import Image

model_path = "ibm-granite/granite-vision-3.2-2b"

model = LLM(
    model=model_path,
    limit_mm_per_prompt={"image": 1},
)

sampling_params = SamplingParams(
    temperature=0.2,
    max_tokens=64,
)

# Define the question we want to answer and format the prompt
image_token = "<image>"
system_prompt = "<|system|>\nA chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.\n"
question = "What is the highest scoring model on ChartQA and what is its score?"
prompt = f"{system_prompt}<|user|>\n{image_token}\n{question}\n<|assistant|>\n"

img_path = hf_hub_download(repo_id=model_path, filename='example.png')
image = Image.open(img_path).convert("RGB")
print(image)

# Build the inputs to vLLM; the image is passed as `multi_modal_data`.
inputs = {
    "prompt": prompt,
    "multi_modal_data": {
        "image": image,
    },
}

outputs = model.generate(inputs, sampling_params=sampling_params)
print(f"Generated text: {outputs[0].outputs[0].text}")
```
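As an alternative to hand-building the prompt string, the same prompt can be rendered from the chat template that ships with the model. This is a minimal sketch (it assumes `transformers` is installed alongside vLLM); with `tokenize=False` it should produce the same prompt string as the f-string above.

```python
from transformers import AutoProcessor

processor = AutoProcessor.from_pretrained("ibm-granite/granite-vision-3.2-2b")

conversation = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": "What is the highest scoring model on ChartQA and what is its score?"},
        ],
    },
]

# Render (but do not tokenize) the prompt using chat_template.json from this repository.
prompt = processor.apply_chat_template(conversation, add_generation_prompt=True, tokenize=False)
print(prompt)
```
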
### Fine-tuning
For an example of fine-tuning Granite Vision for new tasks, refer to [this notebook](https://huggingface.co/learn/cookbook/en/fine_tuning_granite_vision_sft_trl).
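As rough orientation, the sketch below shows one common approach: attaching LoRA adapters with PEFT before supervised fine-tuning. The rank, alpha, dropout, and target modules are illustrative assumptions rather than values from the notebook, which covers the full data-collation and training recipe.

```python
from transformers import AutoModelForVision2Seq, AutoProcessor
from peft import LoraConfig, get_peft_model

model_path = "ibm-granite/granite-vision-3.2-2b"
processor = AutoProcessor.from_pretrained(model_path)
model = AutoModelForVision2Seq.from_pretrained(model_path)

# Attach low-rank adapters to the q/v attention projections (matched by module-name suffix);
# only these adapter weights are updated during fine-tuning.
lora_config = LoraConfig(
    r=8,                                  # illustrative rank
    lora_alpha=16,                        # illustrative scaling
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```
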
### Use Granite Vision for MM RAG
For an example of multimodal retrieval-augmented generation (MM RAG) using Granite Vision, refer to [this notebook](https://github.com/ibm-granite-community/granite-snack-cookbook/blob/main/recipes/RAG/Granite_Multimodal_RAG.ipynb).
**Model Architecture:**
The architecture of granite-vision-3.2-2b consists of the following components:
(1) Vision encoder: SigLIP (https://huggingface.co/docs/transformers/en/model_doc/siglip).
(2) Vision-language connector: a two-layer MLP with GELU activation function.
(3) Large language model: granite-3.1-2b-instruct with a 128K context length (https://huggingface.co/ibm-granite/granite-3.1-2b-instruct).
We built upon LLaVA (https://llava-vl.github.io) to train our model. We use multi-layer encoder features and a denser grid resolution in AnyRes to enhance the model's ability to understand nuanced visual content, which is essential for accurately interpreting document images.
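These choices are reflected in the configuration shipped with this checkpoint (see `config.json` below). As a quick sanity check, here is a minimal sketch that reads it with `transformers` (it assumes access to the Hugging Face Hub):

```python
from transformers import AutoConfig

cfg = AutoConfig.from_pretrained("ibm-granite/granite-vision-3.2-2b")

print(cfg.model_type)                # llava_next
print(cfg.vision_config.model_type)  # siglip_vision_model -> the SigLIP vision encoder
print(cfg.text_config.model_type)    # granite -> the granite-3.1-2b-instruct backbone
print(cfg.vision_feature_layer)      # [-24, -20, -12, -1] -> multi-layer encoder features
print(cfg.image_grid_pinpoints[:3])  # first of the AnyRes grid resolutions, e.g. [384, 384]
```
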
**Training Data:**
Overall, our training data is largely drawn from two key sources: (1) publicly available datasets, and (2) internally created synthetic data targeting specific capabilities, including document understanding tasks. A detailed attribution of datasets can be found in the [technical report](https://arxiv.org/abs/2502.09927).
**Infrastructure:**
We train Granite Vision using IBM's supercomputing cluster, Blue Vela, which is outfitted with NVIDIA H100 GPUs. This cluster provides a scalable and efficient infrastructure for training our models over thousands of GPUs.
**Ethical Considerations and Limitations:**
The use of Large Vision and Language Models involves risks and ethical considerations that people must be aware of, including but not limited to: bias and fairness, misinformation, and autonomous decision-making. granite-vision-3.2-2b is no exception in this regard. Although our alignment processes include safety considerations, the model may in some cases produce inaccurate, biased, or unsafe responses to user prompts.
Additionally, it remains uncertain whether smaller models might exhibit increased susceptibility to hallucination in generation scenarios due to their reduced sizes, which could limit their ability to generate coherent and contextually accurate responses.
This aspect is currently an active area of research, and we anticipate more rigorous exploration, comprehension, and mitigations in this domain. Regarding ethics, a latent risk associated with all Large Language Models is their malicious utilization. We urge the community to use granite-vision-3.2-2b with ethical intentions and in a responsible way. We recommend using this model for document understanding tasks, and note that more general vision tasks may pose higher inherent risks of triggering biased or harmful output.
To enhance safety, we recommend using granite-vision-3.2-2b alongside Granite Guardian. Granite Guardian is a fine-tuned instruct model designed to detect and flag risks in prompts and responses across key dimensions outlined in the IBM AI Risk Atlas. Its training, which includes both human-annotated and synthetic data informed by internal red-teaming, enables it to outperform similar open-source models on standard benchmarks, providing an additional layer of safety.
**Resources**
- 📄 Read the full technical report [here](https://arxiv.org/abs/2502.09927)
- ⭐️ Learn about the latest updates with Granite: https://www.ibm.com/granite
- 🚀 Get started with tutorials, best practices, and prompt engineering advice: https://www.ibm.com/granite/docs/
- 💡 Learn about the latest Granite learning resources: https://ibm.biz/granite-learning-resources

6
added_tokens.json Normal file

@@ -0,0 +1,6 @@
{
"<|end_of_role|>": 49153,
"<|start_of_role|>": 49152,
"<|tool_call|>": 49154,
"<image>": "49155"
}

3
chat_template.json Normal file

@@ -0,0 +1,3 @@
{
"chat_template": "{%- if tools %}\n {{- '<|start_of_role|>available_tools<|end_of_role|>\n' }}\n {%- for tool in tools %}\n {{- tool | tojson(indent=4) }}\n {%- if not loop.last %}\n {{- '\n\n' }}\n {%- endif %}\n {%- endfor %}\n {{- '<|end_of_text|>\n' }}\n{%- endif %}\n{%- for message in messages if message['role'] == 'system'%}{% else %}<|system|>\nA chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.\n{% endfor %}{%- for message in messages %}\n {%- if message['role'] == 'system' %}\n {{- '<|system|>\n' + message['content'][0]['text'] + '\n' }}\n {%- elif message['role'] == 'user' %}<|user|>\n {# Render all images first #}{% for content in message['content'] | selectattr('type', 'equalto', 'image') %}{{ '<image>\n' }}{% endfor %}{# Render all text next #}{% for content in message['content'] | selectattr('type', 'equalto', 'text') %}{{ content['text'] + '\n' }}{% endfor %}\n{%- elif message['role'] == 'assistant' %}\n {{- '<|assistant|>\n' + message['content'][0]['text'] + '<|end_of_text|>' }}\n {%- elif message['role'] == 'assistant_tool_call' %}\n {{- '<|start_of_role|>assistant<|end_of_role|><|tool_call|>' + message['content'][0]['text'] + '<|end_of_text|>\n' }}\n {%- elif message['role'] == 'tool_response' %}\n {{- '<|start_of_role|>tool_response<|end_of_role|>' + message['content'][0]['text'] + '<|end_of_text|>\n' }}\n {%- endif %}\n {%- if loop.last and add_generation_prompt %}\n {{- '<|assistant|>\n' }}\n {%- endif %}\n{%- endfor %}"
}

168
config.json Normal file

@@ -0,0 +1,168 @@
{
"image_grid_pinpoints": [
[
384,
384
],
[
384,
768
],
[
384,
1152
],
[
384,
1536
],
[
384,
1920
],
[
384,
2304
],
[
384,
2688
],
[
384,
3072
],
[
384,
3456
],
[
384,
3840
],
[
768,
384
],
[
768,
768
],
[
768,
1152
],
[
768,
1536
],
[
768,
1920
],
[
1152,
384
],
[
1152,
768
],
[
1152,
1152
],
[
1536,
384
],
[
1536,
768
],
[
1920,
384
],
[
1920,
768
],
[
2304,
384
],
[
2688,
384
],
[
3072,
384
],
[
3456,
384
],
[
3840,
384
]
],
"tie_word_embeddings": true,
"transformers_version": "4.45.0.dev0",
"architectures": [
"LlavaNextForConditionalGeneration"
],
"model_type": "llava_next",
"use_image_newline_parameter": true,
"vision_feature_layer": [
-24,
-20,
-12,
-1
],
"vision_feature_select_strategy": "full",
"text_config": {
"architectures": [
"GraniteForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.1,
"attention_multiplier": 0.015625,
"bos_token_id": 0,
"embedding_multiplier": 12.0,
"eos_token_id": 0,
"hidden_act": "silu",
"hidden_size": 2048,
"initializer_range": 0.02,
"intermediate_size": 8192,
"logits_scaling": 8.0,
"max_position_embeddings": 16384,
"mlp_bias": false,
"model_type": "granite",
"num_attention_heads": 32,
"num_hidden_layers": 40,
"num_key_value_heads": 8,
"pad_token_id": 0,
"residual_multiplier": 0.22,
"rms_norm_eps": 1e-05,
"rope_scaling": null,
"rope_theta": 300000,
"tie_word_embeddings": true,
"torch_dtype": "bfloat16",
"transformers_version": "4.46.0.dev0",
"use_cache": true,
"vocab_size": 49156
},
"image_token_index": 49155,
"vision_config": {
"hidden_size": 1152,
"image_size": 384,
"intermediate_size": 4304,
"model_type": "siglip_vision_model",
"num_attention_heads": 16,
"num_hidden_layers": 27,
"patch_size": 14
}
}

BIN
example.png (Stored with Git LFS) Normal file

Binary file not shown.

7
generation_config.json Normal file

@@ -0,0 +1,7 @@
{
"_from_model_config": true,
"bos_token_id": 0,
"eos_token_id": 0,
"pad_token_id": 0,
"transformers_version": "4.45.0.dev0"
}

48892
merges.txt Normal file

File diff suppressed because it is too large

BIN
model-00002-of-00002.safetensors (Stored with Git LFS) Normal file

Binary file not shown.

822
model.safetensors.index.json Normal file

@@ -0,0 +1,822 @@
{
"metadata": {
"total_size": 5950789760
},
"weight_map": {
"language_model.model.embed_tokens.weight": "model-00001-of-00002.safetensors",
"image_newline": "model-00001-of-00002.safetensors",
"language_model.model.layers.0.input_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.0.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.0.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.0.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.0.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.0.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.0.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.0.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.0.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.1.input_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.1.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.1.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.1.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.1.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.1.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.1.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.1.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.1.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.10.input_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.10.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.10.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.10.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.10.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.10.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.10.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.10.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.10.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.11.input_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.11.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.11.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.11.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.11.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.11.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.11.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.11.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.11.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.12.input_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.12.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.12.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.12.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.12.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.12.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.12.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.12.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.12.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.13.input_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.13.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.13.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.13.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.13.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.13.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.13.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.13.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.13.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.14.input_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.14.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.14.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.14.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.14.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.14.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.14.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.14.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.14.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.15.input_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.15.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.15.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.15.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.15.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.15.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.15.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.15.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.15.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.16.input_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.16.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.16.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.16.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.16.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.16.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.16.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.16.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.16.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.17.input_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.17.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.17.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.17.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.17.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.17.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.17.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.17.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.17.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.18.input_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.18.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.18.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.18.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.18.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.18.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.18.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.18.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.18.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.19.input_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.19.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.19.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.19.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.19.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.19.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.19.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.19.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.19.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.2.input_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.2.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.2.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.2.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.2.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.2.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.2.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.2.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.2.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.20.input_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.20.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.20.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.20.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.20.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.20.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.20.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.20.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.20.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.21.input_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.21.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.21.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.21.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.21.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.21.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.21.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.21.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.21.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.22.input_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.22.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.22.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.22.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.22.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.22.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.22.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.22.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.22.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.23.input_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.23.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.23.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.23.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.23.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.23.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.23.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.23.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.23.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.24.input_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.24.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.24.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.24.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.24.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.24.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.24.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.24.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.24.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.25.input_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.25.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.25.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.25.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.25.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.25.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.25.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.25.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.25.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.26.input_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.26.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.26.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.26.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.26.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.26.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.26.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.26.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.26.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.27.input_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.27.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.27.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.27.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.27.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.27.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.27.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.27.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.27.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.28.input_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.28.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.28.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.28.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.28.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.28.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.28.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.28.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.28.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.29.input_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.29.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.29.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.29.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.29.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.29.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.29.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.29.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.29.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.3.input_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.3.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.3.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.3.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.3.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.3.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.3.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.3.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.3.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.30.input_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.30.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.30.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.30.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.30.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.30.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.30.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.30.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.30.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.31.input_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.31.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.31.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.31.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.31.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.31.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.31.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.31.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.31.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.32.input_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.32.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.32.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.32.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.32.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.32.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.32.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.32.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.32.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.33.input_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.33.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.33.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.33.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.33.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.33.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.33.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.33.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.33.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.34.input_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.34.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.34.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.34.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.34.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.34.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.34.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.34.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.34.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.35.input_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.35.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.35.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.35.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.35.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.35.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.35.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.35.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.35.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.36.input_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.36.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.36.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.36.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.36.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.36.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.36.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.36.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.36.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.37.input_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.37.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.37.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.37.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.37.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.37.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.37.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.37.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.37.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.38.input_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.38.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.38.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.38.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.38.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.38.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.38.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.38.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.38.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.39.input_layernorm.weight": "model-00002-of-00002.safetensors",
"language_model.model.layers.39.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
"language_model.model.layers.39.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.39.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
"language_model.model.layers.39.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
"language_model.model.layers.39.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.39.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.39.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.39.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.4.input_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.4.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.4.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.4.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.4.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.4.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.4.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.4.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.4.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.5.input_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.5.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.5.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.5.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.5.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.5.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.5.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.5.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.5.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.6.input_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.6.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.6.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.6.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.6.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.6.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.6.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.6.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.6.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.7.input_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.7.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.7.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.7.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.7.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.7.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.7.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.7.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.7.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.8.input_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.8.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.8.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.8.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.8.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.8.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.8.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.8.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.8.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.9.input_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.9.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.9.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.9.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.9.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.9.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.9.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.9.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
"language_model.model.layers.9.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
"multi_modal_projector.linear_1.bias": "model-00002-of-00002.safetensors",
"multi_modal_projector.linear_1.weight": "model-00002-of-00002.safetensors",
"multi_modal_projector.linear_2.bias": "model-00002-of-00002.safetensors",
"multi_modal_projector.linear_2.weight": "model-00002-of-00002.safetensors",
"language_model.model.norm.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.embeddings.patch_embedding.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.embeddings.patch_embedding.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.embeddings.position_embedding.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.0.layer_norm1.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.0.layer_norm1.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.0.layer_norm2.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.0.layer_norm2.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.0.mlp.fc1.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.0.mlp.fc1.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.0.mlp.fc2.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.0.mlp.fc2.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.0.self_attn.k_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.0.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.0.self_attn.out_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.0.self_attn.out_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.0.self_attn.q_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.0.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.0.self_attn.v_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.0.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.1.layer_norm1.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.1.layer_norm1.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.1.layer_norm2.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.1.layer_norm2.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.1.mlp.fc1.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.1.mlp.fc1.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.1.mlp.fc2.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.1.mlp.fc2.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.1.self_attn.k_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.1.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.1.self_attn.out_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.1.self_attn.out_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.1.self_attn.q_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.1.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.1.self_attn.v_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.1.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.10.layer_norm1.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.10.layer_norm1.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.10.layer_norm2.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.10.layer_norm2.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.10.mlp.fc1.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.10.mlp.fc1.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.10.mlp.fc2.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.10.mlp.fc2.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.10.self_attn.k_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.10.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.10.self_attn.out_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.10.self_attn.out_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.10.self_attn.q_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.10.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.10.self_attn.v_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.10.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.11.layer_norm1.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.11.layer_norm1.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.11.layer_norm2.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.11.layer_norm2.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.11.mlp.fc1.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.11.mlp.fc1.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.11.mlp.fc2.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.11.mlp.fc2.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.11.self_attn.k_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.11.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.11.self_attn.out_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.11.self_attn.out_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.11.self_attn.q_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.11.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.11.self_attn.v_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.11.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.12.layer_norm1.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.12.layer_norm1.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.12.layer_norm2.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.12.layer_norm2.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.12.mlp.fc1.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.12.mlp.fc1.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.12.mlp.fc2.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.12.mlp.fc2.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.12.self_attn.k_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.12.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.12.self_attn.out_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.12.self_attn.out_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.12.self_attn.q_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.12.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.12.self_attn.v_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.12.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.13.layer_norm1.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.13.layer_norm1.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.13.layer_norm2.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.13.layer_norm2.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.13.mlp.fc1.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.13.mlp.fc1.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.13.mlp.fc2.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.13.mlp.fc2.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.13.self_attn.k_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.13.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.13.self_attn.out_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.13.self_attn.out_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.13.self_attn.q_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.13.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.13.self_attn.v_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.13.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.14.layer_norm1.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.14.layer_norm1.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.14.layer_norm2.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.14.layer_norm2.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.14.mlp.fc1.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.14.mlp.fc1.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.14.mlp.fc2.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.14.mlp.fc2.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.14.self_attn.k_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.14.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.14.self_attn.out_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.14.self_attn.out_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.14.self_attn.q_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.14.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.14.self_attn.v_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.14.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.15.layer_norm1.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.15.layer_norm1.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.15.layer_norm2.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.15.layer_norm2.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.15.mlp.fc1.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.15.mlp.fc1.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.15.mlp.fc2.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.15.mlp.fc2.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.15.self_attn.k_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.15.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.15.self_attn.out_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.15.self_attn.out_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.15.self_attn.q_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.15.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.15.self_attn.v_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.15.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.16.layer_norm1.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.16.layer_norm1.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.16.layer_norm2.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.16.layer_norm2.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.16.mlp.fc1.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.16.mlp.fc1.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.16.mlp.fc2.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.16.mlp.fc2.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.16.self_attn.k_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.16.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.16.self_attn.out_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.16.self_attn.out_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.16.self_attn.q_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.16.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.16.self_attn.v_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.16.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.17.layer_norm1.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.17.layer_norm1.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.17.layer_norm2.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.17.layer_norm2.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.17.mlp.fc1.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.17.mlp.fc1.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.17.mlp.fc2.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.17.mlp.fc2.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.17.self_attn.k_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.17.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.17.self_attn.out_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.17.self_attn.out_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.17.self_attn.q_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.17.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.17.self_attn.v_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.17.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.18.layer_norm1.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.18.layer_norm1.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.18.layer_norm2.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.18.layer_norm2.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.18.mlp.fc1.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.18.mlp.fc1.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.18.mlp.fc2.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.18.mlp.fc2.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.18.self_attn.k_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.18.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.18.self_attn.out_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.18.self_attn.out_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.18.self_attn.q_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.18.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.18.self_attn.v_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.18.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.19.layer_norm1.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.19.layer_norm1.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.19.layer_norm2.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.19.layer_norm2.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.19.mlp.fc1.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.19.mlp.fc1.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.19.mlp.fc2.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.19.mlp.fc2.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.19.self_attn.k_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.19.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.19.self_attn.out_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.19.self_attn.out_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.19.self_attn.q_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.19.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.19.self_attn.v_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.19.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.2.layer_norm1.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.2.layer_norm1.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.2.layer_norm2.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.2.layer_norm2.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.2.mlp.fc1.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.2.mlp.fc1.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.2.mlp.fc2.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.2.mlp.fc2.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.2.self_attn.k_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.2.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.2.self_attn.out_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.2.self_attn.out_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.2.self_attn.q_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.2.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.2.self_attn.v_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.2.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.20.layer_norm1.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.20.layer_norm1.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.20.layer_norm2.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.20.layer_norm2.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.20.mlp.fc1.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.20.mlp.fc1.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.20.mlp.fc2.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.20.mlp.fc2.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.20.self_attn.k_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.20.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.20.self_attn.out_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.20.self_attn.out_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.20.self_attn.q_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.20.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.20.self_attn.v_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.20.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.21.layer_norm1.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.21.layer_norm1.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.21.layer_norm2.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.21.layer_norm2.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.21.mlp.fc1.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.21.mlp.fc1.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.21.mlp.fc2.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.21.mlp.fc2.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.21.self_attn.k_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.21.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.21.self_attn.out_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.21.self_attn.out_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.21.self_attn.q_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.21.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.21.self_attn.v_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.21.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.22.layer_norm1.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.22.layer_norm1.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.22.layer_norm2.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.22.layer_norm2.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.22.mlp.fc1.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.22.mlp.fc1.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.22.mlp.fc2.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.22.mlp.fc2.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.22.self_attn.k_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.22.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.22.self_attn.out_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.22.self_attn.out_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.22.self_attn.q_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.22.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.22.self_attn.v_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.22.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.23.layer_norm1.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.23.layer_norm1.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.23.layer_norm2.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.23.layer_norm2.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.23.mlp.fc1.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.23.mlp.fc1.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.23.mlp.fc2.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.23.mlp.fc2.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.23.self_attn.k_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.23.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.23.self_attn.out_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.23.self_attn.out_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.23.self_attn.q_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.23.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.23.self_attn.v_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.23.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.24.layer_norm1.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.24.layer_norm1.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.24.layer_norm2.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.24.layer_norm2.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.24.mlp.fc1.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.24.mlp.fc1.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.24.mlp.fc2.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.24.mlp.fc2.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.24.self_attn.k_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.24.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.24.self_attn.out_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.24.self_attn.out_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.24.self_attn.q_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.24.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.24.self_attn.v_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.24.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.25.layer_norm1.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.25.layer_norm1.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.25.layer_norm2.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.25.layer_norm2.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.25.mlp.fc1.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.25.mlp.fc1.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.25.mlp.fc2.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.25.mlp.fc2.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.25.self_attn.k_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.25.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.25.self_attn.out_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.25.self_attn.out_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.25.self_attn.q_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.25.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.25.self_attn.v_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.25.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.26.layer_norm1.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.26.layer_norm1.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.26.layer_norm2.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.26.layer_norm2.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.26.mlp.fc1.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.26.mlp.fc1.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.26.mlp.fc2.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.26.mlp.fc2.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.26.self_attn.k_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.26.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.26.self_attn.out_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.26.self_attn.out_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.26.self_attn.q_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.26.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.26.self_attn.v_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.26.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.3.layer_norm1.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.3.layer_norm1.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.3.layer_norm2.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.3.layer_norm2.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.3.mlp.fc1.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.3.mlp.fc1.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.3.mlp.fc2.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.3.mlp.fc2.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.3.self_attn.k_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.3.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.3.self_attn.out_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.3.self_attn.out_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.3.self_attn.q_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.3.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.3.self_attn.v_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.3.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.4.layer_norm1.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.4.layer_norm1.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.4.layer_norm2.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.4.layer_norm2.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.4.mlp.fc1.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.4.mlp.fc1.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.4.mlp.fc2.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.4.mlp.fc2.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.4.self_attn.k_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.4.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.4.self_attn.out_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.4.self_attn.out_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.4.self_attn.q_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.4.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.4.self_attn.v_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.4.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.5.layer_norm1.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.5.layer_norm1.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.5.layer_norm2.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.5.layer_norm2.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.5.mlp.fc1.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.5.mlp.fc1.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.5.mlp.fc2.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.5.mlp.fc2.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.5.self_attn.k_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.5.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.5.self_attn.out_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.5.self_attn.out_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.5.self_attn.q_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.5.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.5.self_attn.v_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.5.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.6.layer_norm1.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.6.layer_norm1.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.6.layer_norm2.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.6.layer_norm2.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.6.mlp.fc1.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.6.mlp.fc1.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.6.mlp.fc2.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.6.mlp.fc2.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.6.self_attn.k_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.6.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.6.self_attn.out_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.6.self_attn.out_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.6.self_attn.q_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.6.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.6.self_attn.v_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.6.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.7.layer_norm1.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.7.layer_norm1.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.7.layer_norm2.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.7.layer_norm2.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.7.mlp.fc1.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.7.mlp.fc1.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.7.mlp.fc2.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.7.mlp.fc2.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.7.self_attn.k_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.7.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.7.self_attn.out_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.7.self_attn.out_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.7.self_attn.q_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.7.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.7.self_attn.v_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.7.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.8.layer_norm1.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.8.layer_norm1.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.8.layer_norm2.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.8.layer_norm2.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.8.mlp.fc1.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.8.mlp.fc1.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.8.mlp.fc2.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.8.mlp.fc2.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.8.self_attn.k_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.8.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.8.self_attn.out_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.8.self_attn.out_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.8.self_attn.q_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.8.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.8.self_attn.v_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.8.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.9.layer_norm1.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.9.layer_norm1.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.9.layer_norm2.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.9.layer_norm2.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.9.mlp.fc1.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.9.mlp.fc1.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.9.mlp.fc2.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.9.mlp.fc2.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.9.self_attn.k_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.9.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.9.self_attn.out_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.9.self_attn.out_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.9.self_attn.q_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.9.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.9.self_attn.v_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.encoder.layers.9.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.head.attention.in_proj_bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.head.attention.in_proj_weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.head.attention.out_proj.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.head.attention.out_proj.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.head.layernorm.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.head.layernorm.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.head.mlp.fc1.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.head.mlp.fc1.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.head.mlp.fc2.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.head.mlp.fc2.weight": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.head.probe": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.post_layernorm.bias": "model-00002-of-00002.safetensors",
"vision_tower.vision_model.post_layernorm.weight": "model-00002-of-00002.safetensors"
}
}
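The `weight_map` above follows the standard Hugging Face sharded-checkpoint index layout: each key is a tensor name and each value is the shard file that stores it. A minimal sketch for inspecting the split from a local copy of this repository (file path assumed from this commit):

```python
import json
from collections import Counter

# Assumes a local download of this repository; the filename matches the index file in this commit.
with open("model.safetensors.index.json") as f:
    index = json.load(f)

weight_map = index["weight_map"]  # tensor name -> shard file
print(Counter(weight_map.values()))  # how many tensors live in each of the two shards
print(weight_map["vision_tower.vision_model.post_layernorm.weight"])  # "model-00002-of-00002.safetensors"
```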

138
preprocessor_config.json Normal file

@@ -0,0 +1,138 @@
{
"crop_size": {
"height": 384,
"width": 384
},
"do_convert_rgb": null,
"do_normalize": true,
"do_rescale": true,
"do_resize": true,
"image_mean": [
0.5,
0.5,
0.5
],
"image_processor_type": "LlavaNextImageProcessor",
"image_std": [
0.5,
0.5,
0.5
],
"processor_class": "LlavaNextProcessor",
"resample": 3,
"rescale_factor": 0.00392156862745098,
"size": {
"height": 384,
"width": 384
},
"image_grid_pinpoints": [
[
384,
384
],
[
384,
768
],
[
384,
1152
],
[
384,
1536
],
[
384,
1920
],
[
384,
2304
],
[
384,
2688
],
[
384,
3072
],
[
384,
3456
],
[
384,
3840
],
[
768,
384
],
[
768,
768
],
[
768,
1152
],
[
768,
1536
],
[
768,
1920
],
[
1152,
384
],
[
1152,
768
],
[
1152,
1152
],
[
1536,
384
],
[
1536,
768
],
[
1920,
384
],
[
1920,
768
],
[
2304,
384
],
[
2688,
384
],
[
3072,
384
],
[
3456,
384
],
[
3840,
384
]
]
}
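The `image_grid_pinpoints` above list the multi-tile canvases (all multiples of the 384×384 crop) that the LLaVA-NeXT-style processor can pick for a high-resolution document page. Below is a small, self-contained sketch of the usual "best resolution" heuristic (keep the most image detail, then waste the least padding); it illustrates how these pinpoints could be consumed and is not the exact code inside `LlavaNextImageProcessor`:

```python
# Simplified sketch of LLaVA-NeXT-style "anyres" resolution selection.
# Illustrative only; the real logic lives inside LlavaNextImageProcessor.

def select_best_resolution(original_size, grid_pinpoints):
    """Pick the (height, width) pinpoint that keeps the most image detail
    while wasting the least padded area."""
    orig_h, orig_w = original_size
    best, best_effective, best_waste = None, -1, float("inf")
    for height, width in grid_pinpoints:
        scale = min(width / orig_w, height / orig_h)
        scaled_w, scaled_h = int(orig_w * scale), int(orig_h * scale)
        effective = min(scaled_w * scaled_h, orig_w * orig_h)  # useful pixels after resizing
        waste = height * width - effective                     # padding pixels in the canvas
        if effective > best_effective or (effective == best_effective and waste < best_waste):
            best, best_effective, best_waste = (height, width), effective, waste
    return best

# A few pinpoints from the config above, as (height, width) pairs.
pinpoints = [[384, 384], [384, 768], [768, 384], [768, 768], [1152, 384], [384, 1152]]
print(select_best_resolution((700, 1000), pinpoints))  # -> (768, 768) for an image 1000 px wide, 700 px tall
```

The selected canvas is then typically resized, padded, and cut into 384×384 tiles, with one vision-tower pass per tile plus a low-resolution overview crop.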

5
processor_config.json Normal file

@@ -0,0 +1,5 @@
{
"patch_size": 14,
"processor_class": "LlavaNextProcessor",
"vision_feature_select_strategy": "full"
}
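With `patch_size` 14 and the 384×384 tiles configured above, each tile yields a 27×27 grid of patch features from a SigLIP-style vision tower (the `vision_tower.*` weight names in the index suggest SigLIP), and `vision_feature_select_strategy: "full"` keeps all of them rather than dropping a leading class token. A rough accounting under those assumptions:

```python
# Rough per-tile feature accounting; assumes a SigLIP-style tower with 14-pixel patches.
patch_size = 14        # processor_config.json
tile_size = 384        # preprocessor_config.json "size"
patches_per_side = tile_size // patch_size    # 27
features_per_tile = patches_per_side ** 2     # 729 visual features per 384x384 tile
print(patches_per_side, features_per_tile)    # 27 729
```

The number of tokens actually handed to the language model also depends on the multimodal projector, so treat 729 as the tower-side count only.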

35
special_tokens_map.json Normal file

@@ -0,0 +1,35 @@
{
"additional_special_tokens": [
"<|start_of_role|>",
"<|end_of_role|>",
"<|tool_call|>"
],
"bos_token": {
"content": "<|end_of_text|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"eos_token": {
"content": "<|end_of_text|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"pad_token": {
"content": "<|end_of_text|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"unk_token": {
"content": "<|end_of_text|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
}
}
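Note that `bos_token`, `eos_token`, `pad_token`, and `unk_token` all alias the same `<|end_of_text|>` string, with the role and tool markers registered as additional special tokens. A quick check (the model id is assumed to be the published `ibm-granite/granite-vision-3.2-2b` repository):

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("ibm-granite/granite-vision-3.2-2b")
print(tok.bos_token, tok.eos_token, tok.pad_token, tok.unk_token)  # all "<|end_of_text|>"
print(tok.additional_special_tokens)  # ['<|start_of_role|>', '<|end_of_role|>', '<|tool_call|>']
```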

98281
tokenizer.json Normal file

File diff suppressed because it is too large

206
tokenizer_config.json Normal file

@@ -0,0 +1,206 @@
{
"add_bos_token": false,
"add_prefix_space": false,
"added_tokens_decoder": {
"0": {
"content": "<|end_of_text|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"1": {
"content": "<fim_prefix>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"2": {
"content": "<fim_middle>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"3": {
"content": "<fim_suffix>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"4": {
"content": "<fim_pad>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"5": {
"content": "<filename>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"6": {
"content": "<gh_stars>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"7": {
"content": "<issue_start>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"8": {
"content": "<issue_comment>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"9": {
"content": "<issue_closed>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"10": {
"content": "<jupyter_start>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"11": {
"content": "<jupyter_text>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"12": {
"content": "<jupyter_code>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"13": {
"content": "<jupyter_output>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"14": {
"content": "<empty_output>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"15": {
"content": "<commit_before>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"16": {
"content": "<commit_msg>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"17": {
"content": "<commit_after>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"18": {
"content": "<reponame>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"49152": {
"content": "<|start_of_role|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"49153": {
"content": "<|end_of_role|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"49154": {
"content": "<|tool_call|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"49155": {
"content": "<image>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
}
},
"additional_special_tokens": [
"<|start_of_role|>",
"<|end_of_role|>",
"<|tool_call|>"
],
"bos_token": "<|end_of_text|>",
"chat_template": "{%- if tools %}\n {{- '<|start_of_role|>available_tools<|end_of_role|>\n' }}\n {%- for tool in tools %}\n {{- tool | tojson(indent=4) }}\n {%- if not loop.last %}\n {{- '\n\n' }}\n {%- endif %}\n {%- endfor %}\n {{- '<|end_of_text|>\n' }}\n{%- endif %}\n{%- for message in messages if message['role'] == 'system'%}{% else %}<|system|>\nA chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.\n{% endfor %}{%- for message in messages %}\n {%- if message['role'] == 'system' %}\n {{- '<|system|>\n' + message['content'] + '\n' }}\n {%- elif message['role'] == 'user' %}\n {{- '<|user|>\n' + message['content'] + '\n' }}\n {%- elif message['role'] == 'assistant' %}\n {{- '<|assistant|>\n' + message['content'] + '<|end_of_text|>' }}\n {%- elif message['role'] == 'assistant_tool_call' %}\n {{- '<|start_of_role|>assistant<|end_of_role|><|tool_call|>' + message['content'] + '<|end_of_text|>\n' }}\n {%- elif message['role'] == 'tool_response' %}\n {{- '<|start_of_role|>tool_response<|end_of_role|>' + message['content'] + '<|end_of_text|>\n' }}\n {%- endif %}\n {%- if loop.last and add_generation_prompt %}\n {{- '<|assistant|>\n' }}\n {%- endif %}\n{%- endfor %}",
"clean_up_tokenization_spaces": true,
"eos_token": "<|end_of_text|>",
"errors": "replace",
"model_max_length": 16384,
"pad_token": "<|end_of_text|>",
"padding_side": "right",
"tokenizer_class": "GPT2Tokenizer",
"unk_token": "<|end_of_text|>",
"vocab_size": 49152
}
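The `chat_template` above injects a default system prompt when none is supplied and wraps turns in `<|system|>` / `<|user|>` / `<|assistant|>` markers. A short sketch of rendering a single-image turn with it (again assuming the `ibm-granite/granite-vision-3.2-2b` model id):

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("ibm-granite/granite-vision-3.2-2b")

conversation = [{"role": "user", "content": "<image>\nWhat is the highest bar in the chart?"}]
prompt = tok.apply_chat_template(conversation, add_generation_prompt=True, tokenize=False)
print(prompt)
# Roughly:
# <|system|>
# A chat between a curious user and an artificial intelligence assistant. The assistant gives
# helpful, detailed, and polite answers to the user's questions.
# <|user|>
# <image>
# What is the highest bar in the chart?
# <|assistant|>
```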

1
vocab.json Normal file

File diff suppressed because one or more lines are too long