first commit

xxl 2025-03-17 10:04:14 +08:00
parent 5c779f0aa9
commit 586431db13
3 changed files with 52 additions and 0 deletions

@@ -105,6 +105,52 @@ The EuroBERT family exhibits strong multilingual performance across domains and
<img src="img/long_context.png" width="100%" alt="EuroBERT" />
</div>
### Suggested Fine-Tuning Hyperparameters
If you plan to fine-tune this model on a downstream task, you can start from the hyperparameters reported in our paper.
#### Base Hyperparameters (unchanged across tasks)
- Warmup Ratio: 0.1
- Learning Rate Scheduler: Linear
- Adam Beta 1: 0.9
- Adam Beta 2: 0.95
- Adam Epsilon: 1e-5
- Weight Decay: 0.1
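
As a rough illustration, the base settings above can be collected in a plain dictionary, together with a small function showing what a linear scheduler with a 0.1 warmup ratio does to the learning rate. This is a sketch, not the training code from the paper: the key names are chosen to mirror Hugging Face `TrainingArguments` fields, and decaying linearly to zero after warmup is an assumption, as is the helper name `linear_schedule_lr`.

```python
# Base fine-tuning hyperparameters from the list above.
# Key names mirror Hugging Face `TrainingArguments` fields (an assumption
# about how you will pass them on; the values come from this README).
BASE_HPARAMS = {
    "warmup_ratio": 0.1,
    "lr_scheduler_type": "linear",
    "adam_beta1": 0.9,
    "adam_beta2": 0.95,
    "adam_epsilon": 1e-5,
    "weight_decay": 0.1,
}


def linear_schedule_lr(step: int, total_steps: int, peak_lr: float,
                       warmup_ratio: float = BASE_HPARAMS["warmup_ratio"]) -> float:
    """Learning rate at `step`: linear warmup to `peak_lr`, then linear decay.

    Hypothetical helper for illustration. Decay to zero at `total_steps`
    is assumed, not confirmed by this README.
    """
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Warmup phase: ramp from 0 up to the peak learning rate.
        return peak_lr * step / max(1, warmup_steps)
    # Decay phase: ramp from the peak learning rate down to 0.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)
```

For a 1,000-step run, the first 100 steps are warmup; `linear_schedule_lr(50, 1000, 3.6e-5)` is half of the peak learning rate, and the rate returns to zero at step 1,000.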
#### Task-Specific Learning Rates
##### Sequence Classification:
| Dataset | EuroBERT-210m | EuroBERT-610m | EuroBERT-2.1B |
|--------------------------------------|----------------|----------------|----------------|
| XNLI | 3.6e-05 | 3.6e-05 | 2.8e-05 |
| PAWS-X | 3.6e-05 | 4.6e-05 | 3.6e-05 |
| QAM | 3.6e-05 | 2.8e-05 | 2.2e-05 |
| AmazonReviews | 3.6e-05 | 2.8e-05 | 3.6e-05 |
| MassiveIntent | 6.0e-05 | 4.6e-05 | 2.8e-05 |
| CodeDefect | 3.6e-05 | 2.8e-05 | 1.3e-05 |
| CodeComplexity | 3.6e-05 | 3.6e-05 | 1.0e-05 |
| MathShepherd | 7.7e-05 | 2.8e-05 | 1.7e-05 |
##### Sequence Regression:
| Dataset | EuroBERT-210m | EuroBERT-610m | EuroBERT-2.1B |
|--------------------------|----------------|----------------|----------------|
| SeaHorse | 3.6e-05 | 3.6e-05 | 2.8e-05 |
| SummevalMultilingual | 3.6e-05 | 2.8e-05 | 3.6e-05 |
| WMT | 2.8e-05 | 2.8e-05 | 1.3e-05 |
##### Retrieval:
| Dataset | EuroBERT-210m | EuroBERT-610m | EuroBERT-2.1B |
|-----------------------------------------|----------------|----------------|----------------|
| MIRACL | 4.6e-05 | 3.6e-05 | 2.8e-05 |
| MLDR | 2.8e-05 | 2.2e-05 | 4.6e-05 |
| CC-News | 4.6e-05 | 4.6e-05 | 3.6e-05 |
| Wikipedia | 2.8e-05 | 3.6e-05 | 2.8e-05 |
| CodeSearchNet | 4.6e-05 | 2.8e-05 | 3.6e-05 |
| CqaDupStackMath | 4.6e-05 | 2.8e-05 | 3.6e-05 |
| MathFormula | 1.7e-05 | 3.6e-05 | 3.6e-05 |
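
To pick a learning rate programmatically, the three tables above can be transcribed into a lookup structure. The sketch below copies the values as listed; the task keys (`sequence_classification`, `sequence_regression`, `retrieval`) and the helper `suggested_lr` are hypothetical names introduced for illustration.

```python
# Column order used in the tables above.
MODEL_SIZES = ("EuroBERT-210m", "EuroBERT-610m", "EuroBERT-2.1B")

# Learning rates transcribed from the tables above, one tuple per dataset
# in MODEL_SIZES order. Task keys are illustrative names, not an official API.
LEARNING_RATES = {
    "sequence_classification": {
        "XNLI": (3.6e-05, 3.6e-05, 2.8e-05),
        "PAWS-X": (3.6e-05, 4.6e-05, 3.6e-05),
        "QAM": (3.6e-05, 2.8e-05, 2.2e-05),
        "AmazonReviews": (3.6e-05, 2.8e-05, 3.6e-05),
        "MassiveIntent": (6.0e-05, 4.6e-05, 2.8e-05),
        "CodeDefect": (3.6e-05, 2.8e-05, 1.3e-05),
        "CodeComplexity": (3.6e-05, 3.6e-05, 1.0e-05),
        "MathShepherd": (7.7e-05, 2.8e-05, 1.7e-05),
    },
    "sequence_regression": {
        "SeaHorse": (3.6e-05, 3.6e-05, 2.8e-05),
        "SummevalMultilingual": (3.6e-05, 2.8e-05, 3.6e-05),
        "WMT": (2.8e-05, 2.8e-05, 1.3e-05),
    },
    "retrieval": {
        "MIRACL": (4.6e-05, 3.6e-05, 2.8e-05),
        "MLDR": (2.8e-05, 2.2e-05, 4.6e-05),
        "CC-News": (4.6e-05, 4.6e-05, 3.6e-05),
        "Wikipedia": (2.8e-05, 3.6e-05, 2.8e-05),
        "CodeSearchNet": (4.6e-05, 2.8e-05, 3.6e-05),
        "CqaDupStackMath": (4.6e-05, 2.8e-05, 3.6e-05),
        "MathFormula": (1.7e-05, 3.6e-05, 3.6e-05),
    },
}


def suggested_lr(task: str, dataset: str, model: str) -> float:
    """Return the tabulated learning rate; raises KeyError/ValueError if unknown."""
    return LEARNING_RATES[task][dataset][MODEL_SIZES.index(model)]
```

For example, `suggested_lr("retrieval", "MIRACL", "EuroBERT-210m")` returns the MIRACL entry for the 210m model. For a dataset that is not in the tables, a value from the closest listed task is only a starting point and may need its own sweep.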
## License

BIN
model.safetensors (Stored with Git LFS) Normal file

Binary file not shown.

BIN
pytorch_model.bin (Stored with Git LFS) Normal file

Binary file not shown.