40 lines
1.3 KiB
Markdown
40 lines
1.3 KiB
Markdown
---
|
|
license: other
|
|
---
|
|
|
|
# Auto Regressive Thinker (Art) v0 3B
|
|
|
|
Art v0 3B is our inaugural model in the Art series, fine-tuned from **Qwen/Qwen2.5-3B-Instruct** using a specialized dataset generated with **Gemini 2.0 Flash Thinking**.
|
|
[Read more about the Art series](https://blog.agi-0.com/posts/art-series)
|
|
|
|
## Model Details
|
|
- **Base Model:** Qwen2.5-3B-Instruct
|
|
- **Architecture:** Transformer
|
|
- **Size:** 3B parameters
|
|
|
|
## Usage
|
|
|
|
The model incorporates a reasoning mechanism using specific tags:
|
|
```python
|
|
<|start_reasoning|> model's reasoning process <|end_reasoning|> model's response
|
|
```
|
|
|
|
### Recommendations
|
|
- Use the model without quantization
|
|
- Use the tokenizer chat template
|
|
- Use a low temperature 0.1-0.3 and repetition_penalty of 1.1
|
|
|
|
## Training Details
|
|
This experimental model was trained on a curated dataset generated using Gemini 2.0 Flash Thinking. Detailed training methodology, dataset, and code are available exclusively to our community members.
|
|
|
|
## About Us
|
|
We are a community-funded AI research lab focused on advancing open-source AGI development. Our community members support us through Patreon donations.
|
|
|
|
## Community Access
|
|
Our supporters get exclusive access to:
|
|
- Training dataset
|
|
- Training code and methodology
|
|
- Behind-the-scenes development insights
|
|
- Future model previews
|
|
|
|
[Join Our Community](https://blog.agi-0.com/posts/join-us) |