Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


ConvFinQA: Exploring the Chain of Numerical Reasoning in Conversational Finance Question Answering

Zhiyu Chen, Shiyang Li, Charese Smiley, Zhiqiang Ma, Sameena Shah, William Yang Wang

2022-10-07 | Question Answering, Conversational Question Answering

Abstract

With the recent advances in large pre-trained language models, researchers have achieved record performance on NLP tasks that mostly focus on language pattern matching. The community is experiencing a shift in the challenge, from how to model language to how to imitate the complex reasoning abilities of human beings. In this work, we investigate the application domain of finance, which involves real-world, complex numerical reasoning. We propose a new large-scale dataset, ConvFinQA, aiming to study the chain of numerical reasoning in conversational question answering. Our dataset poses a great challenge in modeling long-range, complex numerical reasoning paths in real-world conversations. We conduct comprehensive experiments and analyses with both neural symbolic methods and prompting-based methods to provide insights into the reasoning mechanisms of these two divisions. We believe our new dataset should serve as a valuable resource to push forward the exploration of real-world, complex reasoning tasks as the next research focus. Our dataset and code are publicly available at https://github.com/czyssrs/ConvFinQA.

Results

Task                               Dataset    Metric              Value  Model
Question Answering                 ConvFinQA  Execution Accuracy  68.9   FinQANet (RoBERTa-large)
Question Answering                 ConvFinQA  Program Accuracy    68.24  FinQANet (RoBERTa-large)
Conversational Question Answering  ConvFinQA  Execution Accuracy  68.9   FinQANet (RoBERTa-large)
Conversational Question Answering  ConvFinQA  Program Accuracy    68.24  FinQANet (RoBERTa-large)
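
The two metrics in the table can differ because a predicted reasoning program may reach the right number by a different route than the reference program. The sketch below illustrates that distinction on a toy example. It is not the official ConvFinQA evaluation script: the program syntax (`op(a, b)` steps with `#k` referring to the result of step k), the helper names, the numbers, and the strict step-by-step comparison used for the program match are all assumptions made for illustration. An official scorer may, for instance, accept programs that are equivalent but written differently.

```python
import math
import re

# Toy operation set (assumed for illustration; not taken from the page above).
OPS = {
    "add": lambda a, b: a + b,
    "subtract": lambda a, b: a - b,
    "multiply": lambda a, b: a * b,
    "divide": lambda a, b: a / b,
}

# Matches one step of the form op(arg1, arg2).
STEP_RE = re.compile(r"(\w+)\(([^,()]+),([^,()]+)\)")


def parse_program(program):
    """Split 'op(a, b), op(#0, c)' into a list of (op, arg1, arg2) tuples."""
    return [(op, a.strip(), b.strip()) for op, a, b in STEP_RE.findall(program)]


def execute(program):
    """Run the steps in order; an argument '#k' is the result of step k."""
    results = []
    for op, a, b in parse_program(program):
        args = [results[int(x[1:])] if x.startswith("#") else float(x) for x in (a, b)]
        results.append(OPS[op](*args))
    return results[-1]


def execution_match(pred, gold_answer, tol=1e-4):
    """True if the predicted program evaluates to the gold answer."""
    try:
        return math.isclose(execute(pred), gold_answer, rel_tol=tol, abs_tol=tol)
    except (KeyError, IndexError, ValueError, ZeroDivisionError):
        return False  # malformed or non-executable programs count as wrong


def program_match(pred, gold):
    """Strict step-by-step comparison (a simplification of program equivalence)."""
    return parse_program(pred) == parse_program(gold)


# Hypothetical example: a relative change computed in two different ways.
gold = "subtract(5829, 5735), divide(#0, 5735)"
pred_a = "subtract(5829, 5735), divide(#0, 5735)"  # same program as gold
pred_b = "divide(5829, 5735), subtract(#0, 1)"     # different program, same value
gold_answer = execute(gold)

for name, pred in [("pred_a", pred_a), ("pred_b", pred_b)]:
    print(name, execution_match(pred, gold_answer), program_match(pred, gold))
```

Here `pred_b` counts as correct on execution but not on the strict program comparison, which is one way an execution accuracy can end up higher than the corresponding program accuracy.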

Related Papers

From Roots to Rewards: Dynamic Tree Reasoning with RL (2025-07-17)
Enter the Mind Palace: Reasoning and Planning for Long-term Active Embodied Question Answering (2025-07-17)
Vision-and-Language Training Helps Deploy Taxonomic Knowledge but Does Not Fundamentally Alter It (2025-07-17)
City-VLM: Towards Multidomain Perception Scene Understanding via Multimodal Incomplete Learning (2025-07-17)
Describe Anything Model for Visual Question Answering on Text-rich Images (2025-07-16)
Is This Just Fantasy? Language Model Representations Reflect Human Judgments of Event Plausibility (2025-07-16)
Warehouse Spatial Question Answering with LLM Agent (2025-07-14)
Evaluating Attribute Confusion in Fashion Text-to-Image Generation (2025-07-09)