NLP LLM Leaderboard

Embrace Federated LLM Fine-Tuning and Secure Your Spot on the Leaderboard!

| Rank | Team | Base Model | Comm. Costs | Average (↑) | STEM | Social Sciences | Humanities | Code | Date (DD.MM.YY) |
|------|------|------------|-------------|-------------|------|-----------------|------------|------|-----------------|
| 1 | Gachon Cognitive Computing Lab | Internlm3-8b-instruct | 2.9 GB | 69.19 | 66.22 | 80.56 | 60.80 | link | 19.03.25 |
| 2 | T-IoI@UNR | Gemma2-9b-cpt-sahabatai-v1-instruct | 0.7 GB | 67.78 | 59.75 | 81.11 | 62.48 | link | 09.03.25 |
| 3 | Gachon Cognitive Computing Lab | Gemma2-9B-instruct | 0.7 GB | 64.84 | 54.33 | 79.92 | 60.28 | link | 10.12.24 |
| 4 | Massimo R. Scamarcia | Phi-4 | 44.7 GB | 55.64 | 40.66 | 74.52 | 51.75 | link | 12.01.25 |
| 5 | Alessandro Pinto | Qwen2.5-1.5B-Instruct | 2.1 GB | 52.77 | 44.49 | 63.89 | 49.92 | link | 14.03.25 |
| 6 | Baseline | Llama-3.2-3B | 27.4 GB | 21.68 | 22.20 | 25.32 | 17.53 | link | 09.12.24 |
| 7 | Baseline | Mistral-7B-v0.3 | 40.7 GB | 12.82 | 12.37 | 13.49 | 12.60 | link | 01.10.24 |
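
Note: the Average (↑) column is consistent with the unweighted mean of the STEM, Social Sciences, and Humanities scores; for the rank-1 entry, (66.22 + 80.56 + 60.80) / 3 ≈ 69.19.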

In the realm of Natural Language Processing (NLP), developing models that can effectively understand and generate human language is foundational. Federated fine-tuning of LLMs trained on general NLP tasks is vital because it democratizes LLM training across a diverse set of downstream tasks while preserving data privacy. This approach helps ensure that fine-tuned language models are not only robust and generalizable across various linguistic contexts but also attuned to the nuances and colloquialisms present in different datasets.
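
As a rough illustration of the mechanism behind these entries, the sketch below shows plain federated averaging (FedAvg) over small adapter weights in Python. It is a minimal, framework-agnostic toy, not the leaderboard's actual pipeline: the adapter shapes, the number of clients and rounds, and the `local_update` stub are all assumptions made up for this example, and real submissions may train and aggregate differently.

```python
# Minimal sketch of FedAvg over lightweight adapter weights (illustrative only).
import numpy as np

rng = np.random.default_rng(0)

def local_update(adapter: dict[str, np.ndarray], client_id: int) -> dict[str, np.ndarray]:
    """Stand-in for one client's local fine-tuning step on its private data."""
    # A real client would run gradient steps on its own dataset; here we just
    # perturb the adapter so the sketch stays self-contained and runnable.
    return {name: w + 0.01 * rng.standard_normal(w.shape) for name, w in adapter.items()}

# Server-side global adapter: hypothetical low-rank matrices attached to a frozen base model.
global_adapter = {
    "lora_A": np.zeros((8, 4096)),
    "lora_B": np.zeros((4096, 8)),
}

num_clients, num_rounds = 4, 3  # illustrative values, not leaderboard settings
for rnd in range(num_rounds):
    # Each client fine-tunes locally and returns only the small adapter,
    # never its raw data and never the full base-model weights.
    client_adapters = [local_update(global_adapter, cid) for cid in range(num_clients)]
    # FedAvg: the server averages the returned adapters parameter-wise.
    global_adapter = {
        name: np.mean([ca[name] for ca in client_adapters], axis=0)
        for name in global_adapter
    }
    n_params = sum(w.size for w in global_adapter.values())
    print(f"round {rnd}: aggregated {n_params} adapter parameters from {num_clients} clients")
```

Exchanging only adapter-sized updates rather than full model weights is one common way an entry can keep its Comm. Costs low, though each team's actual recipe may differ.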

👉 Check out the other Leaderboards