
lm_eval_harness_hf_model_benchmark_with_simple_evaluate.py

python

Evaluates a Hugging Face model on specific tasks using the simple_evaluate function.

import lm_eval
from lm_eval.utils import make_table

# Run evaluation of GPT-2 on the ARC-Easy and HellaSwag benchmarks
results = lm_eval.simple_evaluate(
    model="hf",                    # Hugging Face transformers backend
    model_args="pretrained=gpt2",  # model name or path on the Hub
    tasks=["arc_easy", "hellaswag"],
    device="cuda:0",
    batch_size=8,
)

# Print results in a table format
print(make_table(results))
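Beyond the formatted table, the dictionary returned by simple_evaluate keys per-task metrics under "results", which is handy for programmatic post-processing. A minimal sketch of pulling out one metric per task, using a hand-built stand-in dictionary (running the real evaluation requires a model download and a CUDA device) whose layout is assumed to mirror lm-eval's output:

```python
# Stand-in for the dict simple_evaluate returns; the metric values here
# are made up for illustration, but the "results" -> task -> metric
# nesting follows lm-eval's output structure.
results = {
    "results": {
        "arc_easy": {"acc,none": 0.43, "acc_stderr,none": 0.01},
        "hellaswag": {"acc,none": 0.29, "acc_norm,none": 0.31},
    }
}

# Flatten to one accuracy number per task for easy comparison/logging
summary = {
    task: metrics.get("acc,none")
    for task, metrics in results["results"].items()
}
print(summary)
```

This kind of flat summary is useful when sweeping over checkpoints and logging a single scalar per task to a tracking tool.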