Converting the Point of View of Messages Spoken to Virtual Assistants

Isabelle G. Lee, Vera Zu, Sai Srujana Buddi, Dennis Liang, Jack G. M. FitzGerald

2020-10-06, Findings of the Association for Computational Linguistics 2020
Tasks: Text Classification, Machine Translation, NMT, Part-Of-Speech Tagging, Constituency Parsing, Translation, Language Modelling


Abstract

Virtual assistants can be quite literal at times. If the user says "tell Bob I love him," most virtual assistants will extract the message "I love him" and send it to the user's contact named Bob, rather than properly converting the message to "I love you." We designed a system to allow virtual assistants to take a voice message from one user, convert the point of view of the message, and then deliver the result to its target user. We developed a rule-based model, which integrates a linear text classification model, part-of-speech tagging, and constituency parsing with rule-based transformation methods. We also investigated Neural Machine Translation (NMT) approaches, including LSTMs, CopyNet, and T5. We explored five metrics to gauge both naturalness and faithfulness automatically, and we chose to use BLEU plus METEOR for faithfulness and relative perplexity using a separately trained language model (GPT) for naturalness. Transformer-CopyNet and T5 performed similarly on faithfulness metrics, with T5 achieving a slight edge: a BLEU score of 63.8 and a METEOR score of 83.0. CopyNet was the most natural, with a relative perplexity of 1.59. CopyNet also has 37 times fewer parameters than T5. We have publicly released our dataset, which is composed of 46,565 crowd-sourced samples.
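The "tell Bob I love him" example can be illustrated with a toy sketch of the rule-based idea. This is not the authors' model, which combines a text classifier, part-of-speech tagging, and constituency parsing. The function name `convert_pov`, the pronoun table, and the carrier-phrase pattern are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical pronoun table: third-person references to the recipient
# become second person. "her" is deliberately omitted because it is
# ambiguous (object "tell Ann I miss her" vs. possessive "her keys"),
# which is the kind of case that needs part-of-speech information.
THIRD_TO_SECOND = {
    "he": "you",
    "she": "you",
    "him": "you",
    "his": "your",
    "himself": "yourself",
    "herself": "yourself",
}

# Assumed carrier phrase: "tell <name> [that] <message>".
CARRIER = re.compile(r"tell\s+(\w+)\s+(?:that\s+)?(.*)", flags=re.IGNORECASE)


def convert_pov(utterance: str) -> str:
    """Strip the carrier phrase and swap recipient pronouns to second person.

    Utterances that do not match the carrier pattern are returned unchanged.
    Verb agreement ("he is" -> "you are") is not handled in this sketch.
    """
    match = CARRIER.match(utterance.strip())
    if match is None:
        return utterance
    message = match.group(2)

    def swap(token: re.Match) -> str:
        word = token.group(0)
        return THIRD_TO_SECOND.get(word.lower(), word)

    return re.sub(r"[A-Za-z']+", swap, message)


if __name__ == "__main__":
    print(convert_pov("tell Bob I love him"))
```

A lookup table like this breaks down quickly on ambiguous pronouns, verb agreement, and messages that mention third parties, which is presumably part of the motivation for adding syntactic analysis and for comparing against learned sequence-to-sequence models.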

Results

Task	Dataset	Metric	Value	Model
Machine Translation	Alexa Point of View	BLEU	63.8	T5
