NLP LLM Leaderboard
Embrace Federated LLM Fine-Tuning and Secure Your Spot on the Leaderboard!
Rank | Team | Base Model | Comm. Costs | Average (↑) | STEM | Social Sciences | Humanities | Code | Date |
---|---|---|---|---|---|---|---|---|---|
1 | Gachon Cognitive Computing Lab | Gemma2-9B-instruct | 0.7 GB | 64.84 | 54.33 | 79.92 | 60.28 | link | 10.12.24 |
2 | Massimo R. Scamarcia | Phi-4 | 44.7 GB | 55.64 | 40.66 | 74.52 | 51.75 | link | 12.01.25 |
3 | Baseline | Llama-3.2-3B | 27.4 GB | 21.68 | 22.20 | 25.32 | 17.53 | link | 09.12.24 |
4 | Baseline | Mistral-7B-v0.3 | 40.7 GB | 12.82 | 12.37 | 13.49 | 12.60 | link | 01.10.24 |
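For each submission, the Average column appears to be the unweighted mean of the three benchmark-category scores (STEM, Social Sciences, Humanities), rounded to two decimals. A minimal check against the top entry, purely illustrative:

```python
# Illustrative check (not an official leaderboard script): the Average column
# appears to be the unweighted mean of the STEM, Social Sciences, and
# Humanities scores, rounded to two decimals.
scores = {"STEM": 54.33, "Social Sciences": 79.92, "Humanities": 60.28}  # rank-1 entry

average = round(sum(scores.values()) / len(scores), 2)
print(average)  # 64.84, matching the reported Average for rank 1
```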
In the realm of Natural Language Processing (NLP), developing models that can effectively understand and generate human language is foundational. Federated fine-tuning of LLMs pre-trained on general NLP tasks is vital because it democratizes LLM training across a diverse set of downstream tasks while preserving data privacy. This approach ensures that the fine-tuned language models are not only robust and generalizable across various linguistic contexts but also attuned to the nuances and colloquialisms present in different datasets.
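To make the workflow concrete, below is a minimal, framework-agnostic sketch of the federated averaging step that underlies federated LLM fine-tuning: each client fine-tunes locally on its private data, and only the resulting model parameters (never the raw text) are aggregated on the server. The function name, the use of plain NumPy arrays, and the toy data are illustrative assumptions, not the leaderboard's reference implementation.

```python
# Minimal sketch of federated averaging (FedAvg) over model parameters.
# Assumptions: each client returns its locally fine-tuned parameters as a
# dict of NumPy arrays; aggregation is weighted by local dataset size.
# This illustrates the idea only; it is not the official training code.
from typing import Dict, List, Tuple

import numpy as np

ClientUpdate = Tuple[Dict[str, np.ndarray], int]  # (parameters, num_local_examples)


def fedavg(updates: List[ClientUpdate]) -> Dict[str, np.ndarray]:
    """Aggregate client parameters into a new global model via a weighted mean."""
    total_examples = sum(n for _, n in updates)
    aggregated: Dict[str, np.ndarray] = {}
    for params, n in updates:
        weight = n / total_examples
        for name, value in params.items():
            aggregated[name] = aggregated.get(name, 0.0) + weight * value
    return aggregated


# Toy round with two clients and a single parameter tensor:
client_a = ({"adapter.weight": np.ones((2, 2))}, 100)
client_b = ({"adapter.weight": np.zeros((2, 2))}, 300)
new_global = fedavg([client_a, client_b])
print(new_global["adapter.weight"])  # every entry is 0.25 (100 / 400 weighting)
```

Because only parameter updates travel between clients and the server, the amount of data exchanged per training run is a key efficiency metric, which is what the Comm. Costs column above reflects for each submission.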