Papers With Code 2


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


SHAKTI: A 2.5 Billion Parameter Small Language Model Optimized for Edge AI and Low-Resource Environments

Syed Abdul Gaffar Shakhadri, Kruthika KR, Rakshit Aralimatti

2024-10-15 · Question Answering · Language Modelling
Paper · PDF

Abstract

We introduce Shakti, a 2.5 billion parameter language model specifically optimized for resource-constrained environments such as edge devices, including smartphones, wearables, and IoT systems. Shakti combines high-performance NLP with optimized efficiency and precision, making it ideal for real-time AI applications where computational resources and memory are limited. With support for vernacular languages and domain-specific tasks, Shakti excels in industries such as healthcare, finance, and customer service. Benchmark evaluations demonstrate that Shakti performs competitively against larger models while maintaining low latency and on-device efficiency, positioning it as a leading solution for edge AI.

Results

Task               | Dataset    | Metric   | Value | Model
Question Answering | HellaSwag  | Accuracy | 52.4  | Shakti-LLM (2.5B)
Question Answering | MedQA      | Accuracy | 60.3  | Shakti-LLM (2.5B)
Question Answering | BBH        | Accuracy | 58.2  | Shakti-LLM (2.5B)
Question Answering | MMLU       | Accuracy | 71.8  | Qwen-LLM 7B
Question Answering | TruthfulQA | Accuracy | 68.4  | Shakti-LLM (2.5B)
Question Answering | PIQA       | Accuracy | 86.2  | Shakti-LLM (2.5B)
Question Answering | BoolQ      | Accuracy | 61.1  | Shakti-LLM (2.5B)
Question Answering | TriviaQA   | EM       | 58.2  | Shakti-LLM (2.5B)
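The table reports two metric types: Accuracy (fraction of correct answers, used for the multiple-choice benchmarks) and exact match (EM, used for TriviaQA). As a minimal sketch of how EM-based scoring typically works (normalization details vary by benchmark; the function names here are illustrative, not from the paper):

```python
def exact_match(prediction: str, reference: str) -> bool:
    """Exact match (EM): normalize case and whitespace, then compare.
    Real TriviaQA scoring also strips punctuation and articles."""
    norm = lambda s: " ".join(s.lower().split())
    return norm(prediction) == norm(reference)

def accuracy(predictions, references) -> float:
    """Fraction of predictions that match their reference."""
    matches = sum(exact_match(p, r) for p, r in zip(predictions, references))
    return matches / len(predictions)

preds = ["Paris", "the  Nile", "Mars"]
refs  = ["paris", "The Nile", "Venus"]
print(round(accuracy(preds, refs), 3))  # 2 of 3 match → 0.667
```

A reported EM of 58.2 thus means 58.2% of the model's answers matched the reference string after normalization.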

Related Papers

Visual-Language Model Knowledge Distillation Method for Image Quality Assessment (2025-07-21)
From Roots to Rewards: Dynamic Tree Reasoning with RL (2025-07-17)
Enter the Mind Palace: Reasoning and Planning for Long-term Active Embodied Question Answering (2025-07-17)
Vision-and-Language Training Helps Deploy Taxonomic Knowledge but Does Not Fundamentally Alter It (2025-07-17)
City-VLM: Towards Multidomain Perception Scene Understanding via Multimodal Incomplete Learning (2025-07-17)
Making Language Model a Hierarchical Classifier and Generator (2025-07-17)
VisionThink: Smart and Efficient Vision Language Model via Reinforcement Learning (2025-07-17)
The Generative Energy Arena (GEA): Incorporating Energy Awareness in Large Language Model (LLM) Human Evaluations (2025-07-17)