James P. Beno
Bidirectional transformers excel at sentiment analysis, and Large Language Models (LLMs) are effective zero-shot learners. Might they perform better as a team? This paper explores collaborative approaches between ELECTRA and GPT-4o for three-way sentiment classification. We fine-tuned (FT) four models (ELECTRA Base/Large, GPT-4o/4o-mini) using a mix of reviews from Stanford Sentiment Treebank (SST) and DynaSent. We passed ELECTRA's output to GPT as input in three forms: the predicted label, class probabilities, and retrieved examples. Sharing ELECTRA Base FT predictions with GPT-4o-mini significantly improved performance over either model alone (82.50 macro F1 vs. 79.14 ELECTRA Base FT, 79.41 GPT-4o-mini) and yielded the lowest cost/performance ratio (\$0.12/F1 point). However, when GPT models were fine-tuned, including predictions decreased performance. GPT-4o FT-M was the top performer (86.99), with GPT-4o-mini FT close behind (86.70) at much lower cost (\$0.38 vs. \$1.59/F1 point). Our results show that augmenting prompts with predictions from fine-tuned encoders is an efficient way to boost performance, and a fine-tuned GPT-4o-mini is nearly as good as GPT-4o FT at 76% less cost. Both are affordable options for projects with limited resources.
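The core idea of the collaborative setup is straightforward: the fine-tuned encoder's prediction is injected into the LLM's prompt before classification. The sketch below illustrates one plausible way to assemble such an augmented prompt; the exact wording, field names, and function are illustrative assumptions, not the paper's actual prompt template.

```python
def build_augmented_prompt(text, electra_label, electra_probs, examples=None):
    """Assemble a three-way sentiment prompt that shares a fine-tuned
    encoder's prediction with the LLM. Hypothetical format: the real
    prompt used in the paper may differ."""
    lines = [
        "Classify the sentiment of the text as positive, negative, or neutral.",
        f"Text: {text}",
        # Predicted label from the fine-tuned ELECTRA classifier.
        f"A fine-tuned ELECTRA classifier predicts: {electra_label}",
        # Class probabilities, formatted to two decimals.
        "Class probabilities: "
        + ", ".join(f"{k}: {v:.2f}" for k, v in electra_probs.items()),
    ]
    if examples:
        # Optionally include retrieved labeled examples similar to the input.
        lines.append("Similar labeled examples:")
        lines += [f"- {ex_text!r} -> {ex_label}" for ex_text, ex_label in examples]
    lines.append("Label:")
    return "\n".join(lines)


prompt = build_augmented_prompt(
    "A gripping, beautifully acted drama.",
    "positive",
    {"positive": 0.91, "negative": 0.03, "neutral": 0.06},
    examples=[("A triumph of quiet storytelling.", "positive")],
)
print(prompt)
```

The resulting string would be sent as the user message to GPT-4o or GPT-4o-mini; when the GPT model is itself fine-tuned, the paper finds that including these predictions hurts rather than helps.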
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Sentiment Analysis | SST-3 | Macro F1 | 75.68 | GPT-4o-mini Fine-Tuned |
| Sentiment Analysis | SST-3 | Macro F1 | 73.99 | GPT-4o Fine-Tuned (Minimal) |
| Sentiment Analysis | SST-3 | Macro F1 | 72.94 | GPT-4o + ELECTRA Large FT |
| Sentiment Analysis | SST-3 | Macro F1 | 72.20 | GPT-4o (Prompt) |
| Sentiment Analysis | SST-3 | Macro F1 | 72.06 | GPT-4o + ELECTRA Large FT (Prompt, Label, Examples) |
| Sentiment Analysis | SST-3 | Macro F1 | 71.98 | GPT-4o-mini + ELECTRA Large FT (Prompt, Label, Examples) |
| Sentiment Analysis | SST-3 | Macro F1 | 71.72 | GPT-4o-mini + ELECTRA Base FT |
| Sentiment Analysis | SST-3 | Macro F1 | 70.99 | GPT-4o-mini + ELECTRA Large FT (Prompt, Label) |
| Sentiment Analysis | SST-3 | Macro F1 | 70.90 | ELECTRA Large Fine-Tuned |
| Sentiment Analysis | SST-3 | Macro F1 | 70.67 | GPT-4o-mini (Prompt) |
| Sentiment Analysis | SST-3 | Macro F1 | 69.95 | ELECTRA Base Fine-Tuned |
| Sentiment Analysis | Sentiment Merged | Macro F1 | 86.99 | GPT-4o Fine-Tuned (Minimal) |
| Sentiment Analysis | Sentiment Merged | Macro F1 | 86.77 | GPT-4o-mini Fine-Tuned |
| Sentiment Analysis | Sentiment Merged | Macro F1 | 83.49 | GPT-4o-mini + ELECTRA Large FT (Prompt, Label) |
| Sentiment Analysis | Sentiment Merged | Macro F1 | 83.09 | GPT-4o + ELECTRA Large FT (Prompt, Label, Examples) |
| Sentiment Analysis | Sentiment Merged | Macro F1 | 82.74 | GPT-4o-mini + ELECTRA Base FT (Prompt, Label) |
| Sentiment Analysis | Sentiment Merged | Macro F1 | 82.36 | ELECTRA Large Fine-Tuned |
| Sentiment Analysis | Sentiment Merged | Macro F1 | 81.57 | GPT-4o + ELECTRA Large FT (Prompt, Label) |
| Sentiment Analysis | Sentiment Merged | Macro F1 | 80.14 | GPT-4o (Prompt) |
| Sentiment Analysis | Sentiment Merged | Macro F1 | 79.52 | GPT-4o-mini (Prompt) |
| Sentiment Analysis | Sentiment Merged | Macro F1 | 79.29 | ELECTRA Base Fine-Tuned |
| Sentiment Analysis | DynaSent | Macro F1 | 89.00 | GPT-4o Fine-Tuned (Minimal) |
| Sentiment Analysis | DynaSent | Macro F1 | 86.90 | GPT-4o-mini Fine-Tuned |
| Sentiment Analysis | DynaSent | Macro F1 | 81.53 | GPT-4o + ELECTRA Large FT (Prompt, Label, Examples) |
| Sentiment Analysis | DynaSent | Macro F1 | 80.22 | GPT-4o (Prompt) |
| Sentiment Analysis | DynaSent | Macro F1 | 79.72 | GPT-4o-mini + ELECTRA Large FT (Prompt, Label, Probabilities) |
| Sentiment Analysis | DynaSent | Macro F1 | 77.94 | GPT-4o-mini + ELECTRA Large FT (Prompt, Label) |
| Sentiment Analysis | DynaSent | Macro F1 | 77.69 | GPT-4o + ELECTRA Large FT |
| Sentiment Analysis | DynaSent | Macro F1 | 77.35 | GPT-4o-mini (Prompt) |
| Sentiment Analysis | DynaSent | Macro F1 | 76.29 | ELECTRA Large Fine-Tuned |
| Sentiment Analysis | DynaSent | Macro F1 | 76.19 | GPT-4o-mini + ELECTRA Base FT |
| Sentiment Analysis | DynaSent | Macro F1 | 71.83 | ELECTRA Base Fine-Tuned |
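All scores above are macro F1: the unweighted mean of the per-class F1 scores, so each of the three sentiment classes counts equally regardless of how many examples it has. A minimal reference implementation (equivalent to scikit-learn's `f1_score(..., average="macro")`):

```python
def macro_f1(y_true, y_pred):
    """Macro-averaged F1: compute F1 per class, then take the unweighted mean."""
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for label in labels:
        # Count true positives, false positives, false negatives for this class.
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if (precision + recall) else 0.0)
        f1_scores.append(f1)
    return sum(f1_scores) / len(f1_scores)


y_true = ["positive", "positive", "negative", "neutral"]
y_pred = ["positive", "negative", "negative", "neutral"]
print(round(100 * macro_f1(y_true, y_pred), 2))  # → 77.78
```

Because the mean is unweighted, a model that ignores a minority class is penalized heavily, which is why macro F1 is the standard choice for class-imbalanced benchmarks like SST-3 and DynaSent.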