Art-v0-3B/README.md

40 lines
1.3 KiB
Markdown

---
license: other
---
# Auto Regressive Thinker (Art) v0 3B
Art v0 3B is our inaugural model in the Art series, fine-tuned from **Qwen/Qwen2.5-3B-Instruct** using a specialized dataset generated with **Gemini 2.0 Flash Thinking**.
[Read more about the Art series](https://blog.agi-0.com/posts/art-series)
## Model Details
- **Base Model:** Qwen2.5-3B-Instruct
- **Architecture:** Transformer
- **Size:** 3B parameters
## Usage
The model incorporates a reasoning mechanism using specific tags:
```python
<|start_reasoning|> model's reasoning process <|end_reasoning|> model's response
```
### Recommendations
- Use the model without quantization
- Use the tokenizer chat template
- Use a low temperature 0.1-0.3 and repetition_penalty of 1.1
## Training Details
This experimental model was trained on a curated dataset generated using Gemini 2.0 Flash Thinking. Detailed training methodology, dataset, and code are available exclusively to our community members.
## About Us
We are a community-funded AI research lab focused on advancing open-source AGI development. Our community members support us through Patreon donations.
## Community Access
Our supporters get exclusive access to:
- Training dataset
- Training code and methodology
- Behind-the-scenes development insights
- Future model previews
[Join Our Community](https://blog.agi-0.com/posts/join-us)