first commit

xxl 2024-11-13 10:52:08 +08:00
parent 52d53b932e
commit 6607382a85
6 changed files with 32095 additions and 2 deletions

@@ -1,3 +1,31 @@
# mengzi-t5-base_a13579517675040768972685
---
language:
- zh
license: apache-2.0
---
mengzi-t5-base
# Mengzi-T5 model (Chinese)
A T5 model pretrained on a 300 GB Chinese corpus.
[Mengzi: Towards Lightweight yet Ingenious Pre-trained Models for Chinese](https://arxiv.org/abs/2110.06696)
## Usage
```python
from transformers import T5Tokenizer, T5ForConditionalGeneration
tokenizer = T5Tokenizer.from_pretrained("Langboat/mengzi-t5-base")
model = T5ForConditionalGeneration.from_pretrained("Langboat/mengzi-t5-base")
```
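The snippet above only loads the tokenizer and weights. As a minimal sketch of running the model, the example below masks a span with a T5 sentinel token and asks the model to fill it. It assumes the checkpoint follows the standard T5 span-corruption setup, which this README does not state explicitly; `mask_span` and `fill_spans` are hypothetical helper names introduced for illustration, and the input sentence is made up.

```python
def mask_span(text: str, start: int, end: int) -> str:
    """Replace text[start:end] with the first T5 sentinel token."""
    return text[:start] + "<extra_id_0>" + text[end:]


def fill_spans(text: str, max_length: int = 20) -> str:
    """Generate the model's prediction for sentinel-marked spans in `text`."""
    # Imported lazily so mask_span can be used without transformers installed.
    from transformers import T5Tokenizer, T5ForConditionalGeneration

    tokenizer = T5Tokenizer.from_pretrained("Langboat/mengzi-t5-base")
    model = T5ForConditionalGeneration.from_pretrained("Langboat/mengzi-t5-base")

    inputs = tokenizer(text, return_tensors="pt")
    output_ids = model.generate(**inputs, max_length=max_length)
    # Keep special tokens so the sentinel markers stay visible in the output.
    return tokenizer.decode(output_ids[0], skip_special_tokens=False)


if __name__ == "__main__":
    # Illustrative input: mask the last two characters of the sentence.
    masked = mask_span("中国的首都是北京", 6, 8)
    print(masked)
    print(fill_spans(masked))
```

Because the model is only pretrained, downstream tasks such as summarization or question answering would likely need fine-tuning first.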
## Citation
If you find the technical report or this resource useful, please cite the following technical report in your paper.
```
@misc{zhang2021mengzi,
title={Mengzi: Towards Lightweight yet Ingenious Pre-trained Models for Chinese},
author={Zhuosheng Zhang and Hanqing Zhang and Keming Chen and Yuhang Guo and Jingyun Hua and Yulong Wang and Ming Zhou},
year={2021},
eprint={2110.06696},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
```

28
config.json Normal file

@@ -0,0 +1,28 @@
{
"architectures": [
"T5ForConditionalGeneration"
],
"d_ff": 2048,
"d_kv": 64,
"d_model": 768,
"decoder_start_token_id": 0,
"dropout_rate": 0.1,
"eos_token_id": 1,
"feed_forward_proj": "gated-gelu",
"gradient_checkpointing": false,
"initializer_factor": 1.0,
"is_encoder_decoder": true,
"layer_norm_epsilon": 1e-06,
"model_type": "t5",
"num_decoder_layers": 12,
"num_heads": 12,
"num_layers": 12,
"output_past": true,
"pad_token_id": 0,
"relative_attention_num_buckets": 32,
"tie_word_embeddings": false,
"torch_dtype": "float32",
"transformers_version": "4.9.2",
"use_cache": true,
"vocab_size": 32128
}
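To make the configuration values above more concrete, the following sketch estimates the parameter count they imply. It assumes the usual Hugging Face T5 layout: bias-free linear layers, weight-only layer norms, and a relative attention bias table only in the first self-attention layer of the encoder and of the decoder. `CONFIG` copies a subset of the values shown above, and `count_parameters` is a hypothetical helper, not part of any library.

```python
# Subset of the config.json values shown above.
CONFIG = {
    "d_model": 768,
    "d_kv": 64,
    "d_ff": 2048,
    "num_layers": 12,
    "num_decoder_layers": 12,
    "num_heads": 12,
    "relative_attention_num_buckets": 32,
    "vocab_size": 32128,
    "feed_forward_proj": "gated-gelu",
    "tie_word_embeddings": False,
}


def count_parameters(cfg: dict) -> int:
    """Estimate the parameter count under the assumed T5 layout."""
    d = cfg["d_model"]
    inner = cfg["num_heads"] * cfg["d_kv"]  # attention inner dimension
    attn = 4 * d * inner  # q, k, v, o projections, no biases
    # Gated feed-forward uses two input projections plus one output projection.
    n_in = 2 if cfg["feed_forward_proj"].startswith("gated") else 1
    ff = (n_in + 1) * d * cfg["d_ff"]
    rel_bias = cfg["relative_attention_num_buckets"] * cfg["num_heads"]
    # Encoder block: self-attention + feed-forward, each with one layer norm.
    enc = cfg["num_layers"] * (attn + ff + 2 * d) + rel_bias + d
    # Decoder block adds cross-attention and a third layer norm.
    dec = cfg["num_decoder_layers"] * (2 * attn + ff + 3 * d) + rel_bias + d
    emb = cfg["vocab_size"] * d
    head = 0 if cfg["tie_word_embeddings"] else emb  # untied output projection
    return emb + head + enc + dec


if __name__ == "__main__":
    print(f"{count_parameters(CONFIG):,}")
```

Under these assumptions the estimate is roughly 248M parameters, of which about 49M come from the input embedding and the untied output projection (`tie_word_embeddings` is `false`).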

BIN
model.safetensors (Stored with Git LFS) Normal file

Binary file not shown.

BIN
pytorch_model.bin (Stored with Git LFS) Normal file

Binary file not shown.

BIN
spiece.model (Stored with Git LFS) Normal file

Binary file not shown.

32028
spiece.vocab Normal file

File diff suppressed because it is too large