Zhenyu Pan, Haozheng Luo, Manling Li, Han Liu
We present a Chain-of-Action (CoA) framework for multimodal and retrieval-augmented Question-Answering (QA). Compared to prior work, CoA overcomes two major challenges of current QA applications: (i) unfaithful hallucination that is inconsistent with real-time or domain facts and (ii) weak reasoning performance over compositional information. Our key contribution is a novel reasoning-retrieval mechanism that decomposes a complex question into a reasoning chain via systematic prompting and pre-designed actions. Methodologically, we propose three types of domain-adaptable "Plug-and-Play" actions for retrieving real-time information from heterogeneous sources. We also propose a multi-reference faith score (MRFS) to verify and resolve conflicts in the answers. Empirically, we use both public benchmarks and a Web3 case study to demonstrate the capability of CoA over other methods.
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Question Answering | StrategyQA | EM | 79.2 | CoA |
| Question Answering | StrategyQA | EM | 77 | SearchChain |
| Question Answering | StrategyQA | EM | 70.6 | CoA w/o actions |
| Question Answering | StrategyQA | EM | 65.8 | Least-to-Most |
| Question Answering | WebQuestions | EM | 70.7 | CoA |
| Question Answering | WebQuestions | EM | 64.7 | CoA w/o actions |
| Question Answering | WebQuestions | EM | 59.4 | DSP |
| Question Answering | WebQuestions | EM | 44.7 | Few-shot |
| Question Answering | WebQuestions | EM | 43 | Zero-shot |
| Question Answering | WebQuestions | EM | 42.5 | CoT |
| Question Answering | WebQuestions | EM | 38.3 | ReAct |
| Question Answering | WebQuestions | EM | 31.1 | Self-Ask |
| Question Answering | WebQuestions | EM | 26.3 | ToT |
| Question Answering | TruthfulQA | EM | 67.3 | CoA |
| Question Answering | TruthfulQA | EM | 63.3 | CoA w/o actions |
| Question Answering | FEVER | EM | 68.9 | CoA |
| Question Answering | FEVER | EM | 64.2 | Self-Ask |
| Question Answering | FEVER | EM | 62.2 | DSP |
| Question Answering | FEVER | EM | 54.2 | CoA w/o actions |
| Question Answering | FEVER | EM | 50 | Zero-shot |
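
The EM column in the table above is not defined on this page. As a minimal sketch, assuming it denotes the common exact-match accuracy (the percentage of predictions that equal at least one reference answer after light normalization), it could be computed as follows. The normalization steps and the function names `normalize`, `exact_match`, and `em_score` are illustrative assumptions, not the authors' evaluation code.

```python
import re
import string


def normalize(text: str) -> str:
    """Lowercase, drop ASCII punctuation and English articles, collapse whitespace.

    Assumption: a SQuAD-style normalization; the evaluation behind the
    table may normalize answers differently.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, references: list[str]) -> bool:
    """True if the normalized prediction equals any normalized reference."""
    pred = normalize(prediction)
    return any(pred == normalize(ref) for ref in references)


def em_score(predictions: list[str], references: list[list[str]]) -> float:
    """Exact-match accuracy as a percentage over paired predictions/references."""
    if not predictions:
        return 0.0
    hits = sum(exact_match(p, refs) for p, refs in zip(predictions, references))
    return 100.0 * hits / len(predictions)


if __name__ == "__main__":
    # Hypothetical toy data, not drawn from any benchmark in the table.
    preds = ["The Eiffel Tower.", "Paris"]
    refs = [["Eiffel Tower"], ["London"]]
    print(em_score(preds, refs))
```

Under this definition a prediction scores only when it matches a reference in full, so partially correct answers count as misses.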