
lm_eval_harness_hf_model_benchmark_with_simple_evaluate.py

python

Evaluates a Hugging Face model on specific tasks using the simple_evaluate function.

import lm_eval
from lm_eval.utils import make_table

# Run evaluation of GPT-2 on the ARC-Easy and HellaSwag benchmarks
results = lm_eval.simple_evaluate(
    model="hf",                    # Hugging Face transformers backend
    model_args="pretrained=gpt2",  # model name or path on the Hub
    tasks=["arc_easy", "hellaswag"],
    device="cuda:0",
    batch_size=8,
)

# Print results in a table format
print(make_table(results))
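Beyond the formatted table, the dictionary returned by simple_evaluate keys per-task metrics under "results", which is handy for programmatic post-processing. A minimal sketch of pulling out one metric per task, using a hand-built stand-in dictionary (running the real evaluation requires a model download and a CUDA device) whose layout is assumed to mirror lm-eval's output:

```python
# Stand-in for the dict simple_evaluate returns; the metric values here
# are made up for illustration, but the "results" -> task -> metric
# nesting follows lm-eval's output structure.
results = {
    "results": {
        "arc_easy": {"acc,none": 0.43, "acc_stderr,none": 0.01},
        "hellaswag": {"acc,none": 0.29, "acc_norm,none": 0.31},
    }
}

# Flatten to one accuracy number per task for easy comparison/logging
summary = {
    task: metrics.get("acc,none")
    for task, metrics in results["results"].items()
}
print(summary)
```

This kind of flat summary is useful when sweeping over checkpoints and logging a single scalar per task to a tracking tool.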