AceMath-7B-Instruct

xxl 540df61db4 first commit		2025-01-21 10:48:19 +08:00
README.md	first commit	2025-01-21 10:48:19 +08:00
calculate_scores.py	first commit	2025-01-21 10:48:19 +08:00
grader.py	first commit	2025-01-21 10:48:19 +08:00

README.md

Introduction

This is the evaluation script used to reproduce the math benchmark scores of the AceMath-1.5B/7B/72B-Instruct models from their outputs. The benchmarks can be downloaded from Qwen2.5-Math.

Calculate Scores

python calculate_scores.py
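
For readers unfamiliar with this kind of evaluation, the following is a minimal sketch of what scoring model outputs against reference answers can look like. It is not the logic of calculate_scores.py or grader.py, which is not reproduced here. The function names (`extract_boxed`, `normalize`, `accuracy`), the assumption that final answers appear in `\boxed{...}`, and the simple string normalization are all hypothetical and chosen for illustration only. A real grader would typically also check mathematical equivalence rather than string equality.

```python
def extract_boxed(text):
    """Return the content of the last \\boxed{...} in text, or None.

    Assumption for this sketch: the model writes its final answer
    inside \\boxed{...}. Nested braces are handled by depth counting.
    """
    marker = "\\boxed{"
    start = text.rfind(marker)
    if start == -1:
        return None
    depth = 1
    chars = []
    for ch in text[start + len(marker):]:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return "".join(chars).strip()
        chars.append(ch)
    return None  # braces never closed


def normalize(answer):
    """Hypothetical normalization: drop spaces, '$', and a trailing period."""
    return answer.replace(" ", "").replace("$", "").rstrip(".")


def accuracy(outputs, references):
    """Fraction of outputs whose extracted answer matches the reference."""
    if not references:
        return 0.0
    correct = 0
    for output, reference in zip(outputs, references):
        predicted = extract_boxed(output)
        if predicted is not None and normalize(predicted) == normalize(reference):
            correct += 1
    return correct / len(references)


# Toy data, invented for illustration only.
outputs = [
    "The answer is \\boxed{\\frac{1}{2}}.",
    "So x = \\boxed{ 42 }",
    "I could not solve this one.",
]
references = ["\\frac{1}{2}", "42", "7"]
score = accuracy(outputs, references)
print(f"accuracy: {score:.4f}")
```

String matching of this kind would mark `0.5` and `\frac{1}{2}` as different answers, which is why a dedicated grader is normally used for the comparison step.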