
License: other

Auto Regressive Thinker (Art) v0 3B

Art v0 3B is the inaugural model in the Art series, fine-tuned from Qwen/Qwen2.5-3B-Instruct on a specialized dataset generated with Gemini 2.0 Flash Thinking. Read more about the Art series.

Model Details

  • Base Model: Qwen2.5-3B-Instruct
  • Architecture: Transformer
  • Size: 3B parameters

Usage

The model incorporates a reasoning mechanism using specific tags:

<|start_reasoning|> model's reasoning process <|end_reasoning|> model's response
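Because the reasoning span is delimited by these tags, it can be separated from the final response with simple string processing. The sketch below is illustrative (the helper name `split_reasoning` is not part of the model release); it assumes the decoded output keeps the special tokens intact:

```python
import re

START = "<|start_reasoning|>"
END = "<|end_reasoning|>"

def split_reasoning(text):
    """Split decoded model output into (reasoning, response).

    Returns (None, text) when no reasoning span is present.
    """
    match = re.search(re.escape(START) + r"(.*?)" + re.escape(END), text, re.DOTALL)
    if not match:
        return None, text.strip()
    reasoning = match.group(1).strip()
    response = text[match.end():].strip()
    return reasoning, response

decoded = "<|start_reasoning|> 2 + 2 equals 4 <|end_reasoning|> The answer is 4."
reasoning, response = split_reasoning(decoded)
print(reasoning)  # → 2 + 2 equals 4
print(response)   # → The answer is 4.
```

This lets an application show or hide the reasoning trace independently of the answer.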

Recommendations

  • Use the model without quantization
  • Use the tokenizer's chat template
  • Use a low temperature (0.1-0.3) and a repetition_penalty of 1.1
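A minimal inference sketch following these recommendations, using the Hugging Face `transformers` library. The model path `"Art-v0-3B"` is a placeholder (substitute the actual hub id or local directory), and the sampling values shown are the ones recommended above:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Art-v0-3B"  # placeholder: replace with the hub id or a local path

# Load in the native dtype, i.e. without quantization, as recommended.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

messages = [{"role": "user", "content": "How many r's are in 'strawberry'?"}]

# Use the tokenizer's chat template, as recommended.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(
    inputs,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.2,        # within the recommended 0.1-0.3 range
    repetition_penalty=1.1,  # recommended value
)

# Keep special tokens so the <|start_reasoning|>/<|end_reasoning|> tags survive.
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=False))
```

Decoding with `skip_special_tokens=False` preserves the reasoning tags so the reasoning span can be inspected or stripped downstream.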

Training Details

This experimental model was trained on a curated dataset generated using Gemini 2.0 Flash Thinking. Detailed training methodology, dataset, and code are available exclusively to our community members.

About Us

We are a community-funded AI research lab focused on advancing open-source AGI development. Our community members support us through Patreon donations.

Community Access

Our supporters get exclusive access to:

  • Training dataset
  • Training code and methodology
  • Behind-the-scenes development insights
  • Future model previews

Join Our Community