In this work, we present a simple yet better variant of Self-Critical Sequence Training. We make a simple change to the choice of baseline function in the REINFORCE algorithm. Compared to the greedy decoding baseline, the new baseline yields better performance at no extra cost.
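The baseline change described above can be illustrated with a small sketch. The abstract does not spell out the new baseline, so the code below assumes it is the mean reward of the other captions sampled for the same image (a leave-one-out baseline); the function names and the toy reward values are hypothetical and are not taken from the paper's code.

```python
from typing import List


def greedy_baseline_advantages(sample_rewards: List[float],
                               greedy_reward: float) -> List[float]:
    """Self-critical advantage: sampled reward minus the reward of the
    greedily decoded caption (needs one extra decoding pass)."""
    return [r - greedy_reward for r in sample_rewards]


def leave_one_out_advantages(sample_rewards: List[float]) -> List[float]:
    """Assumed new baseline: for each sample, subtract the mean reward of
    the *other* samples drawn for the same image (no greedy pass needed)."""
    k = len(sample_rewards)
    if k < 2:
        raise ValueError("need at least two samples per image")
    total = sum(sample_rewards)
    return [r - (total - r) / (k - 1) for r in sample_rewards]


def reinforce_loss(log_probs: List[float], advantages: List[float]) -> float:
    """REINFORCE surrogate loss: negative mean of advantage * log-probability.
    In a real model, log_probs would be differentiable sequence log-likelihoods."""
    weighted = [a * lp for a, lp in zip(advantages, log_probs)]
    return -sum(weighted) / len(weighted)


# Toy rewards (e.g. CIDEr-like scores) for four captions sampled for one image.
rewards = [1.0, 2.0, 3.0, 6.0]
loo_adv = leave_one_out_advantages(rewards)
greedy_adv = greedy_baseline_advantages(rewards, greedy_reward=2.5)
```

Under this assumption, the advantages within one image sum to zero, so above-average samples are pushed up and below-average ones pushed down, and the only inputs are rewards of captions that were already sampled.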
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Image Captioning | COCO Captions | BLEU-1 | 80.7 | Transformer_NSC |
| Image Captioning | COCO Captions | BLEU-2 | 65.6 | Transformer_NSC |
| Image Captioning | COCO Captions | BLEU-3 | 51.3 | Transformer_NSC |
| Image Captioning | COCO Captions | BLEU-4 | 39.4 | Transformer_NSC |
| Image Captioning | COCO Captions | CIDEr | 129.6 | Transformer_NSC |
| Image Captioning | COCO Captions | METEOR | 28.9 | Transformer_NSC |
| Image Captioning | COCO Captions | ROUGE-L | 58.7 | Transformer_NSC |
| Image Captioning | COCO Captions | SPICE | 22.8 | Transformer_NSC |