Papers With Code 2


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


SHAKTI: A 2.5 Billion Parameter Small Language Model Optimized for Edge AI and Low-Resource Environments

Syed Abdul Gaffar Shakhadri, Kruthika KR, Rakshit Aralimatti

2024-10-15 · Question Answering · Language Modelling
Paper · PDF

Abstract

We introduce Shakti, a 2.5 billion parameter language model specifically optimized for resource-constrained environments such as edge devices, including smartphones, wearables, and IoT systems. Shakti combines high-performance NLP with optimized efficiency and precision, making it ideal for real-time AI applications where computational resources and memory are limited. With support for vernacular languages and domain-specific tasks, Shakti excels in industries such as healthcare, finance, and customer service. Benchmark evaluations demonstrate that Shakti performs competitively against larger models while maintaining low latency and on-device efficiency, positioning it as a leading solution for edge AI.

Results

Task               | Dataset    | Metric   | Value | Model
Question Answering | HellaSwag  | Accuracy | 52.4  | Shakti-LLM (2.5B)
Question Answering | MedQA      | Accuracy | 60.3  | Shakti-LLM (2.5B)
Question Answering | BBH        | Accuracy | 58.2  | Shakti-LLM (2.5B)
Question Answering | MMLU       | Accuracy | 71.8  | Qwen-LLM 7B
Question Answering | TruthfulQA | Accuracy | 68.4  | Shakti-LLM (2.5B)
Question Answering | PIQA       | Accuracy | 86.2  | Shakti-LLM (2.5B)
Question Answering | BoolQ      | Accuracy | 61.1  | Shakti-LLM (2.5B)
Question Answering | TriviaQA   | EM       | 58.2  | Shakti-LLM (2.5B)
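The table reports two metric types: Accuracy (fraction of correct answers, used for the multiple-choice benchmarks) and exact match (EM, used for TriviaQA). As a minimal sketch of how EM-based scoring typically works (normalization details vary by benchmark; the function names here are illustrative, not from the paper):

```python
def exact_match(prediction: str, reference: str) -> bool:
    """Exact match (EM): normalize case and whitespace, then compare.
    Real TriviaQA scoring also strips punctuation and articles."""
    norm = lambda s: " ".join(s.lower().split())
    return norm(prediction) == norm(reference)

def accuracy(predictions, references) -> float:
    """Fraction of predictions that match their reference."""
    matches = sum(exact_match(p, r) for p, r in zip(predictions, references))
    return matches / len(predictions)

preds = ["Paris", "the  Nile", "Mars"]
refs  = ["paris", "The Nile", "Venus"]
print(round(accuracy(preds, refs), 3))  # 2 of 3 match → 0.667
```

A reported EM of 58.2 thus means 58.2% of the model's answers matched the reference string after normalization.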

Related Papers

Visual-Language Model Knowledge Distillation Method for Image Quality Assessment (2025-07-21)
From Roots to Rewards: Dynamic Tree Reasoning with RL (2025-07-17)
Enter the Mind Palace: Reasoning and Planning for Long-term Active Embodied Question Answering (2025-07-17)
Vision-and-Language Training Helps Deploy Taxonomic Knowledge but Does Not Fundamentally Alter It (2025-07-17)
City-VLM: Towards Multidomain Perception Scene Understanding via Multimodal Incomplete Learning (2025-07-17)
Making Language Model a Hierarchical Classifier and Generator (2025-07-17)
VisionThink: Smart and Efficient Vision Language Model via Reinforcement Learning (2025-07-17)
The Generative Energy Arena (GEA): Incorporating Energy Awareness in Large Language Model (LLM) Human Evaluations (2025-07-17)