first commit
parent 5c779f0aa9
commit 586431db13
46 README.md
@@ -105,6 +105,52 @@ The EuroBERT family exhibits strong multilingual performance across domains and
<img src="img/long_context.png" width="100%" alt="EuroBERT" />
</div>

### Suggested Fine-Tuning Hyperparameters

If you plan to fine-tune this model on a downstream task, you can use the hyperparameters we report in our paper.

#### Base Hyperparameters (unchanged across tasks)

- Warmup Ratio: 0.1
- Learning Rate Scheduler: Linear
- Adam Beta 1: 0.9
- Adam Beta 2: 0.95
- Adam Epsilon: 1e-5
- Weight Decay: 0.1
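
As a rough illustration, the base settings above can be collected into a plain Python dictionary. The key names mirror keyword arguments of `transformers.TrainingArguments`, which is an assumption about the training setup and not something this README specifies. `linear_lr` is a hypothetical helper that sketches what a 0.1 warmup ratio combined with a linear scheduler implies for the learning rate at a given step: a linear ramp up to the peak, followed by a linear decay to zero.

```python
import math

# Base hyperparameters from the list above. Key names follow
# transformers.TrainingArguments keywords (an assumption about your setup).
BASE_HPARAMS = {
    "warmup_ratio": 0.1,
    "lr_scheduler_type": "linear",
    "adam_beta1": 0.9,
    "adam_beta2": 0.95,
    "adam_epsilon": 1e-5,
    "weight_decay": 0.1,
}


def linear_lr(step, total_steps, peak_lr, warmup_ratio=BASE_HPARAMS["warmup_ratio"]):
    """Hypothetical helper: learning rate at `step` for linear warmup, then linear decay."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Warmup phase: ramp linearly from 0 up to the peak learning rate.
        return peak_lr * step / max(1, warmup_steps)
    # Decay phase: go linearly from the peak down to 0 at the final step.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Example with an arbitrary 1000-step run and a peak learning rate of 3.6e-05.
    for step in (0, 50, 100, 550, 1000):
        print(step, linear_lr(step, total_steps=1000, peak_lr=3.6e-05))
```

If you train with the Hugging Face `Trainer`, a dictionary like this could be unpacked into `TrainingArguments` together with a task-specific `learning_rate`; check the keyword names against the version of `transformers` you have installed.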

#### Task-Specific Learning Rates

##### Sequence Classification:

| Dataset                              | EuroBERT-210m  | EuroBERT-610m  | EuroBERT-2.1B  |
|--------------------------------------|----------------|----------------|----------------|
| XNLI                                 | 3.6e-05        | 3.6e-05        | 2.8e-05        |
| PAWS-X                               | 3.6e-05        | 4.6e-05        | 3.6e-05        |
| QAM                                  | 3.6e-05        | 2.8e-05        | 2.2e-05        |
| AmazonReviews                        | 3.6e-05        | 2.8e-05        | 3.6e-05        |
| MassiveIntent                        | 6.0e-05        | 4.6e-05        | 2.8e-05        |
| CodeDefect                           | 3.6e-05        | 2.8e-05        | 1.3e-05        |
| CodeComplexity                       | 3.6e-05        | 3.6e-05        | 1.0e-05        |
| MathShepherd                         | 7.7e-05        | 2.8e-05        | 1.7e-05        |

##### Sequence Regression:

| Dataset                  | EuroBERT-210m  | EuroBERT-610m  | EuroBERT-2.1B  |
|--------------------------|----------------|----------------|----------------|
| SeaHorse                 | 3.6e-05        | 3.6e-05        | 2.8e-05        |
| SummevalMultilingual     | 3.6e-05        | 2.8e-05        | 3.6e-05        |
| WMT                      | 2.8e-05        | 2.8e-05        | 1.3e-05        |

##### Retrieval:

| Dataset                                 | EuroBERT-210m  | EuroBERT-610m  | EuroBERT-2.1B  |
|-----------------------------------------|----------------|----------------|----------------|
| MIRACL                                  | 4.6e-05        | 3.6e-05        | 2.8e-05        |
| MLDR                                    | 2.8e-05        | 2.2e-05        | 4.6e-05        |
| CC-News                                 | 4.6e-05        | 4.6e-05        | 3.6e-05        |
| Wikipedia                               | 2.8e-05        | 3.6e-05        | 2.8e-05        |
| CodeSearchNet                           | 4.6e-05        | 2.8e-05        | 3.6e-05        |
| CqaDupStackMath                         | 4.6e-05        | 2.8e-05        | 3.6e-05        |
| MathFormula                             | 1.7e-05        | 3.6e-05        | 3.6e-05        |
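
For convenience, the three tables above can be transcribed into a small lookup structure. The sketch below is only one possible layout: the names `MODEL_SIZES`, `LEARNING_RATES`, and `suggested_lr` are hypothetical and not part of any EuroBERT tooling, and the values are copied from the tables in this section.

```python
# Column order used by every tuple below (matches the table headers).
MODEL_SIZES = ("EuroBERT-210m", "EuroBERT-610m", "EuroBERT-2.1B")

# Learning rates transcribed from the tables above, grouped by task family.
LEARNING_RATES = {
    "sequence_classification": {
        "XNLI": (3.6e-05, 3.6e-05, 2.8e-05),
        "PAWS-X": (3.6e-05, 4.6e-05, 3.6e-05),
        "QAM": (3.6e-05, 2.8e-05, 2.2e-05),
        "AmazonReviews": (3.6e-05, 2.8e-05, 3.6e-05),
        "MassiveIntent": (6.0e-05, 4.6e-05, 2.8e-05),
        "CodeDefect": (3.6e-05, 2.8e-05, 1.3e-05),
        "CodeComplexity": (3.6e-05, 3.6e-05, 1.0e-05),
        "MathShepherd": (7.7e-05, 2.8e-05, 1.7e-05),
    },
    "sequence_regression": {
        "SeaHorse": (3.6e-05, 3.6e-05, 2.8e-05),
        "SummevalMultilingual": (3.6e-05, 2.8e-05, 3.6e-05),
        "WMT": (2.8e-05, 2.8e-05, 1.3e-05),
    },
    "retrieval": {
        "MIRACL": (4.6e-05, 3.6e-05, 2.8e-05),
        "MLDR": (2.8e-05, 2.2e-05, 4.6e-05),
        "CC-News": (4.6e-05, 4.6e-05, 3.6e-05),
        "Wikipedia": (2.8e-05, 3.6e-05, 2.8e-05),
        "CodeSearchNet": (4.6e-05, 2.8e-05, 3.6e-05),
        "CqaDupStackMath": (4.6e-05, 2.8e-05, 3.6e-05),
        "MathFormula": (1.7e-05, 3.6e-05, 3.6e-05),
    },
}


def suggested_lr(task, dataset, model):
    """Hypothetical helper: look up the suggested learning rate for one table cell."""
    column = MODEL_SIZES.index(model)  # raises ValueError for an unknown model name
    return LEARNING_RATES[task][dataset][column]


if __name__ == "__main__":
    print(suggested_lr("retrieval", "MIRACL", "EuroBERT-210m"))
```

For datasets that are not listed, these values are best treated as starting points for a small learning-rate sweep, not as fixed settings.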

## License

Binary file not shown.
Binary file not shown.