Transformer-Based Named Entity Recognition for Automated Server Provisioning

Hossein Damavandi, Hasan Jalali, Boshra Pishgoo

2025-04-01 | Conference 2025 4 | Speech Recognition, Speech-to-Text, Named Entity Recognition (NER)


Abstract

This paper introduces a novel method for automated server provisioning by integrating Transformer-based Named Entity Recognition models with Automated Speech Detection using OpenAI's Whisper. Leveraging advanced Transformer architectures (BERT, RoBERTa, and DeBERTa) combined with robust speech-to-text capabilities, our approach enables IT professionals to provision cloud servers efficiently via natural spoken commands. A custom-annotated dataset containing real-world and AI-generated provisioning requests is presented, labeled using the BIO tagging scheme across fourteen entity categories relevant to cloud infrastructure provisioning. Model performance and robustness were evaluated under realistic conditions, including controlled transcription noise to simulate practical speech recognition errors. While all tested models achieved high performance on clean test data, results from noisy test scenarios revealed notable disparities in model generalization. Specifically, DeBERTa proved the most resilient, maintaining an F1 score of 96.23% under adverse conditions. These findings highlight the practicality and robustness of combining speech-to-text processing with advanced Named Entity Recognition models for real-time, voice-driven IT automation workflows.
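To make the BIO tagging scheme mentioned in the abstract concrete, the following is a minimal sketch of how token-level BIO tags might be decoded into entity spans for a transcribed provisioning request. The entity labels (`COUNT`, `OS`, `RAM`), the example sentence, and the helper `bio_to_spans` are hypothetical illustrations; the abstract does not list the fourteen categories, and this is not the authors' implementation.

```python
from typing import List, Tuple


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (label, text) entity spans.

    B-X opens a new entity of type X, I-X continues the currently open
    entity of type X, and anything else (O, or an I- tag that does not
    match the open entity) closes the open entity and is skipped.
    """
    spans: List[Tuple[str, str]] = []
    current_label = None
    current_tokens: List[str] = []

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # Close any open entity before starting a new one.
            if current_label is not None:
                spans.append((current_label, " ".join(current_tokens)))
            current_label = tag[2:]
            current_tokens = [token]
        elif tag.startswith("I-") and current_label == tag[2:]:
            # Continuation of the entity that is currently open.
            current_tokens.append(token)
        else:
            # Outside any entity, or a stray I- tag: close what is open.
            if current_label is not None:
                spans.append((current_label, " ".join(current_tokens)))
            current_label = None
            current_tokens = []

    # Flush an entity that runs to the end of the sentence.
    if current_label is not None:
        spans.append((current_label, " ".join(current_tokens)))
    return spans


# Hypothetical transcript and labels, for illustration only.
tokens = "create two ubuntu servers with 16 gb ram".split()
tags = ["O", "B-COUNT", "B-OS", "O", "O", "B-RAM", "I-RAM", "O"]

print(bio_to_spans(tokens, tags))
```

In a pipeline of the kind the abstract describes, the tags would come from a token-classification model applied to the speech-to-text output, and the decoded spans would then be mapped to provisioning parameters.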
