first commit
parent 5c779f0aa9
commit 586431db13
46
README.md
@@ -105,6 +105,52 @@ The EuroBERT family exhibits strong multilingual performance across domains and
<img src="img/long_context.png" width="100%" alt="EuroBERT" />
</div>

### Suggested Fine-Tuning Hyperparameters

If you plan to fine-tune this model on a downstream task, you can start from the hyperparameters reported in our paper.

#### Base Hyperparameters (unchanged across tasks)

- Warmup Ratio: 0.1
- Learning Rate Scheduler: Linear
- Adam Beta 1: 0.9
- Adam Beta 2: 0.95
- Adam Epsilon: 1e-5
- Weight Decay: 0.1
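
As a minimal sketch, the shared settings above can be collected in a plain dictionary. The keys are chosen to match the corresponding `transformers.TrainingArguments` parameter names; the helper `build_training_kwargs` is a hypothetical name introduced here for illustration and is not part of any library.

```python
# Base hyperparameters from the list above, keyed by the matching
# `transformers.TrainingArguments` parameter names.
BASE_HYPERPARAMETERS = {
    "warmup_ratio": 0.1,
    "lr_scheduler_type": "linear",
    "adam_beta1": 0.9,
    "adam_beta2": 0.95,
    "adam_epsilon": 1e-5,
    "weight_decay": 0.1,
}


def build_training_kwargs(learning_rate: float) -> dict:
    """Merge the shared settings with a task-specific learning rate.

    Hypothetical helper: it returns a new dict and leaves the base
    settings untouched.
    """
    if learning_rate <= 0:
        raise ValueError("learning_rate must be positive")
    return {**BASE_HYPERPARAMETERS, "learning_rate": learning_rate}


# Example: 3.6e-05 is the value listed for EuroBERT-210m on XNLI.
kwargs = build_training_kwargs(3.6e-05)
print(kwargs)
```

If `transformers` is installed, the resulting dictionary could be passed along the lines of `TrainingArguments(output_dir="out", **kwargs)`, where `"out"` is a placeholder path. Batch size, number of epochs, and other settings are not listed in this section and would still need to be chosen separately.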

#### Task-Specific Learning Rates

##### Sequence Classification:

| Dataset        | EuroBERT-210m | EuroBERT-610m | EuroBERT-2.1B |
|----------------|---------------|---------------|---------------|
| XNLI           | 3.6e-05       | 3.6e-05       | 2.8e-05       |
| PAWS-X         | 3.6e-05       | 4.6e-05       | 3.6e-05       |
| QAM            | 3.6e-05       | 2.8e-05       | 2.2e-05       |
| AmazonReviews  | 3.6e-05       | 2.8e-05       | 3.6e-05       |
| MassiveIntent  | 6.0e-05       | 4.6e-05       | 2.8e-05       |
| CodeDefect     | 3.6e-05       | 2.8e-05       | 1.3e-05       |
| CodeComplexity | 3.6e-05       | 3.6e-05       | 1.0e-05       |
| MathShepherd   | 7.7e-05       | 2.8e-05       | 1.7e-05       |

##### Sequence Regression:

| Dataset              | EuroBERT-210m | EuroBERT-610m | EuroBERT-2.1B |
|----------------------|---------------|---------------|---------------|
| SeaHorse             | 3.6e-05       | 3.6e-05       | 2.8e-05       |
| SummevalMultilingual | 3.6e-05       | 2.8e-05       | 3.6e-05       |
| WMT                  | 2.8e-05       | 2.8e-05       | 1.3e-05       |

##### Retrieval:

| Dataset         | EuroBERT-210m | EuroBERT-610m | EuroBERT-2.1B |
|-----------------|---------------|---------------|---------------|
| MIRACL          | 4.6e-05       | 3.6e-05       | 2.8e-05       |
| MLDR            | 2.8e-05       | 2.2e-05       | 4.6e-05       |
| CC-News         | 4.6e-05       | 4.6e-05       | 3.6e-05       |
| Wikipedia       | 2.8e-05       | 3.6e-05       | 2.8e-05       |
| CodeSearchNet   | 4.6e-05       | 2.8e-05       | 3.6e-05       |
| CqaDupStackMath | 4.6e-05       | 2.8e-05       | 3.6e-05       |
| MathFormula     | 1.7e-05       | 3.6e-05       | 3.6e-05       |
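
To pick a value programmatically, the three tables above can be encoded as a nested lookup. This is only a sketch: the task keys (`"sequence_classification"`, `"sequence_regression"`, `"retrieval"`) and the function name `suggested_learning_rate` are hypothetical names chosen for illustration, while the numbers are copied from the tables.

```python
# Column order used in the tables above.
MODELS = ("EuroBERT-210m", "EuroBERT-610m", "EuroBERT-2.1B")

# task -> dataset -> learning rates in MODELS order (copied from the tables).
LEARNING_RATES = {
    "sequence_classification": {
        "XNLI": (3.6e-05, 3.6e-05, 2.8e-05),
        "PAWS-X": (3.6e-05, 4.6e-05, 3.6e-05),
        "QAM": (3.6e-05, 2.8e-05, 2.2e-05),
        "AmazonReviews": (3.6e-05, 2.8e-05, 3.6e-05),
        "MassiveIntent": (6.0e-05, 4.6e-05, 2.8e-05),
        "CodeDefect": (3.6e-05, 2.8e-05, 1.3e-05),
        "CodeComplexity": (3.6e-05, 3.6e-05, 1.0e-05),
        "MathShepherd": (7.7e-05, 2.8e-05, 1.7e-05),
    },
    "sequence_regression": {
        "SeaHorse": (3.6e-05, 3.6e-05, 2.8e-05),
        "SummevalMultilingual": (3.6e-05, 2.8e-05, 3.6e-05),
        "WMT": (2.8e-05, 2.8e-05, 1.3e-05),
    },
    "retrieval": {
        "MIRACL": (4.6e-05, 3.6e-05, 2.8e-05),
        "MLDR": (2.8e-05, 2.2e-05, 4.6e-05),
        "CC-News": (4.6e-05, 4.6e-05, 3.6e-05),
        "Wikipedia": (2.8e-05, 3.6e-05, 2.8e-05),
        "CodeSearchNet": (4.6e-05, 2.8e-05, 3.6e-05),
        "CqaDupStackMath": (4.6e-05, 2.8e-05, 3.6e-05),
        "MathFormula": (1.7e-05, 3.6e-05, 3.6e-05),
    },
}


def suggested_learning_rate(task: str, dataset: str, model: str) -> float:
    """Return the tabulated learning rate for one task, dataset, and model.

    Raises KeyError for an unknown task or dataset and ValueError for an
    unknown model name.
    """
    rates = LEARNING_RATES[task][dataset]
    return rates[MODELS.index(model)]


print(suggested_learning_rate("retrieval", "MLDR", "EuroBERT-2.1B"))  # 4.6e-05
```

For a dataset that is not listed, the values for the closest task type may serve as a starting point for a small search, though that is a judgment call rather than something this section states.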

## License
Binary file not shown.
Binary file not shown.