first commit

xxl 2025-01-13 11:55:54 +08:00
parent e84cfb1c1e
commit 23a167f414
4 changed files with 102 additions and 2 deletions

@ -1,3 +1,96 @@
---
language:
- en
- zh
- fr
- es
- de
- pt
- ru
- it
- ja
- ko
- vi
- ar
tags:
- pytorch
- text-generation
- causal-lm
- rwkv
license: apache-2.0
datasets:
- HuggingFaceFW/fineweb-edu
- mlfoundations/dclm-baseline-1.0
- cerebras/SlimPajama-627B
- EleutherAI/pile
- bigcode/starcoderdata
- oscar-corpus/OSCAR-2301
---
# RWKV-7 World
Use the rwkv pip package, version 0.8.28 or later, for RWKV-7 inference: https://pypi.org/project/rwkv/
Evals and more information: https://www.rwkv.com/
For developers: https://github.com/BlinkDL/RWKV-LM/tree/main/RWKV-v7
Chat demo: https://github.com/BlinkDL/ChatRWKV/blob/main/API_DEMO_CHAT.py
## Model Description
RWKV-7 is trained on 100+ world languages (80% English, 10% multilingual, 10% code).
World-v3 = 3.1T tokens
World-v2.9 = subsampled 2T tokens
World-v2.8 = subsampled 1T tokens
Recommended fine-tuning format (use \n for newlines):
```
User: xxxxxxxxxxxxxxx
Assistant: xxxxxxxxxxxxxxx
xxxxxxxxxxxxxxx
xxxxxxxxxxxxxxx
User: xxxxxxxxxxxxxxx
xxxxxxxxxxxxxxx
Assistant: xxxxxxxxxxxxxxx
xxxxxxxxxxxxxxx
xxxxxxxxxxxxxxx
xxxxxxxxxxxxxxx
```
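A minimal sketch of how a list of turns might be serialized into this format. The function name `format_turns` is hypothetical, and the blank line (`\n\n`) between turns is an assumption inferred from the advice about keeping `\n\n` out of message text; neither is taken from the RWKV codebase, so adjust the separator if your data uses a different one.

```python
import re

def format_turns(turns):
    """Serialize (role, text) pairs into 'User: ...' / 'Assistant: ...' blocks.

    Assumption: turns are separated by a blank line, so blank lines inside a
    message are collapsed to a single newline first.
    """
    blocks = []
    for role, text in turns:
        if role not in ("User", "Assistant"):
            raise ValueError(f"unexpected role: {role!r}")
        # Collapse runs of 2+ newlines so a message never contains "\n\n".
        body = re.sub(r"\n{2,}", "\n", text.strip())
        blocks.append(f"{role}: {body}")
    return "\n\n".join(blocks)

sample = format_turns([
    ("User", "What is RWKV?"),
    ("Assistant", "An RNN-style language model.\n\nIt runs in linear time."),
])
```

Multi-line answers keep their single newlines; only blank lines inside a message are removed.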
A good chat prompt (it is better to replace \n\n in xxx with \n, so that the response never contains an extra \n\n):
```
User: hi
Assistant: Hi. I am your assistant and I will provide expert full response in full details. Please feel free to ask any question and I will always answer it.
User: xxx
Assistant:
```
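The chat prompt above could be assembled in code roughly as follows. This is a sketch under assumptions: `build_chat_prompt` and `GREETING` are illustrative names, and the blank-line separator between turns is inferred rather than confirmed by the source. A regular expression is used instead of a single `str.replace` so that three or more consecutive newlines also collapse to one.

```python
import re

# Opening exchange copied verbatim from the prompt template above.
GREETING = (
    "User: hi\n\n"
    "Assistant: Hi. I am your assistant and I will provide expert full response "
    "in full details. Please feel free to ask any question and I will always "
    "answer it."
)

def build_chat_prompt(user_message):
    # Replace blank lines in the user text with single newlines.
    cleaned = re.sub(r"\n{2,}", "\n", user_message.strip())
    # End exactly at "Assistant:" with no trailing space or newline.
    return f"{GREETING}\n\nUser: {cleaned}\n\nAssistant:"

prompt = build_chat_prompt("line one\n\n\nline two")
```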
QA prompt (it is better to replace \n\n in xxx with \n, so that the response never contains an extra \n\n):
```
Question: xxx
Answer:
```
and
```
Instruction: xxx
Input: xxx
Response:
```
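Both templates can be produced by small helpers such as the ones below. The names `qa_prompt` and `instruction_prompt` are hypothetical, and the blank line between fields is an assumption; change the separator if your setup expects single newlines.

```python
def _single_newlines(text):
    # Collapse blank lines until no "\n\n" remains in the text.
    while "\n\n" in text:
        text = text.replace("\n\n", "\n")
    return text.strip()

def qa_prompt(question):
    # Ends at "Answer:" with nothing after the colon.
    return f"Question: {_single_newlines(question)}\n\nAnswer:"

def instruction_prompt(instruction, input_text):
    # Ends at "Response:" with nothing after the colon.
    return (
        f"Instruction: {_single_newlines(instruction)}\n\n"
        f"Input: {_single_newlines(input_text)}\n\n"
        "Response:"
    )
```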
!!! There should not be any space after your final ":", or you will upset the tokenizer and see non-English responses !!!
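One way to guard against this mistake is to validate a prompt before sending it to the model. The helper below is an illustrative sketch, not part of the rwkv package; `check_prompt_ending` is a hypothetical name.

```python
def check_prompt_ending(prompt):
    """Return the prompt unchanged, or raise ValueError if it ends badly."""
    # Any trailing space, tab, or newline after the last ":" is rejected.
    if prompt != prompt.rstrip():
        raise ValueError("prompt has trailing whitespace after the final ':'")
    # The templates in this README all end at a colon such as "Assistant:".
    if not prompt.endswith(":"):
        raise ValueError("prompt should end with ':' (e.g. 'Assistant:')")
    return prompt
```

Calling it as `check_prompt_ending("Question: xxx\n\nAnswer:")` returns the prompt, while `"Answer: "` with a trailing space raises `ValueError`.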

BIN
RWKV-x070-World-0.1B-v2.8-20241210-ctx4096.pth (Stored with Git LFS) Normal file

Binary file not shown.

BIN
RWKV-x070-World-0.4B-v2.9-20250107-ctx4096.pth (Stored with Git LFS) Normal file

Binary file not shown.

1
configuration.json Normal file

@ -0,0 +1 @@
{"framework":"Pytorch","task":"text-generation"}