Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Mixing ADAM and SGD: a Combined Optimization Method

Nicola Landro, Ignazio Gallo, Riccardo La Grassa

Published: 2020-11-16
Tasks: Document Classification, Stochastic Optimization
Links: Paper, PDF, Code (official)

Abstract

Optimization methods (optimizers) receive special attention in deep learning because they determine how efficiently neural networks train. The literature contains many papers that compare neural models trained with different optimizers. Each shows that, for a particular problem, one optimizer is better than the others, but as the problem changes this result no longer holds and the search must start from scratch. In this paper we propose combining two very different optimizers that, used simultaneously, can outperform either one alone across very different problems. We propose a new optimizer called MAS (Mixing ADAM and SGD) that integrates SGD and ADAM simultaneously by weighing the contribution of each through constant weights. Rather than trying to improve SGD or ADAM, we exploit both at the same time, taking the best of each. We conducted several experiments on image and text-document classification using various CNNs, and demonstrated experimentally that the proposed MAS optimizer produces better performance than SGD or ADAM alone. The source code and all experimental results are available online at https://gitlab.com/nicolalandro/multi_optimizer
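The core idea of the abstract is that each update is a constant-weight mix of an SGD displacement and an Adam displacement. The sketch below is a minimal, hypothetical single-parameter illustration in plain Python, not the authors' implementation (that lives in the linked GitLab repository); the function name `mas_step` and the mixing weights `lambda_sgd`/`lambda_adam` are assumed names for exposition.

```python
import math

def mas_step(x, grad, state, lr=0.05, lambda_sgd=0.5, lambda_adam=0.5,
             beta1=0.9, beta2=0.999, eps=1e-8):
    """One hypothetical MAS-style update for a scalar parameter:
    a constant-weight mix of a plain SGD step and an Adam step."""
    state["t"] += 1
    t = state["t"]
    # Adam's exponential moving averages of the gradient and its square
    state["m"] = beta1 * state["m"] + (1 - beta1) * grad
    state["v"] = beta2 * state["v"] + (1 - beta2) * grad * grad
    m_hat = state["m"] / (1 - beta1 ** t)   # bias-corrected first moment
    v_hat = state["v"] / (1 - beta2 ** t)   # bias-corrected second moment
    adam_dir = m_hat / (math.sqrt(v_hat) + eps)
    sgd_dir = grad
    # Weighted combination of the two update directions
    return x - lr * (lambda_sgd * sgd_dir + lambda_adam * adam_dir)

# Toy usage: minimize f(x) = (x - 3)^2, whose gradient is 2 * (x - 3)
x = 0.0
state = {"t": 0, "m": 0.0, "v": 0.0}
for _ in range(2000):
    x = mas_step(x, 2 * (x - 3), state)
```

With equal weights of 0.5 the mix reduces neither to pure SGD nor to pure Adam; in the paper the weights are fixed hyperparameters rather than learned quantities.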

Results

| Task                    | Dataset   | Metric          | Value | Model    |
|-------------------------|-----------|-----------------|-------|----------|
| Stochastic Optimization | AG News   | Accuracy (max)  | 93.99 | Bert     |
| Stochastic Optimization | AG News   | Accuracy (mean) | 93.86 | Bert     |
| Stochastic Optimization | CIFAR-10  | Accuracy (max)  | 86.85 | Resnet18 |
| Stochastic Optimization | CIFAR-10  | Accuracy (mean) | 85.89 | Resnet18 |
| Stochastic Optimization | CIFAR-10  | Accuracy (max)  | 86.14 | Resnet34 |
| Stochastic Optimization | CIFAR-10  | Accuracy (mean) | 85.75 | Resnet34 |
| Stochastic Optimization | CIFAR-100 | Accuracy (max)  | 58.48 | Resnet18 |
| Stochastic Optimization | CIFAR-100 | Accuracy (mean) | 58.01 | Resnet18 |
| Stochastic Optimization | CIFAR-100 | Accuracy (max)  | 54.5  | Resnet34 |
| Stochastic Optimization | CIFAR-100 | Accuracy (mean) | 53.06 | Resnet34 |
| Stochastic Optimization | CoLA      | Accuracy (max)  | 86.34 | Bert     |
| Stochastic Optimization | CoLA      | Accuracy (mean) | 87.66 | Bert     |

Related Papers

- First-order methods for stochastic and finite-sum convex optimization with deterministic constraints (2025-06-25)
- Convergence of Momentum-Based Optimization Algorithms with Time-Varying Parameters (2025-06-13)
- Underage Detection through a Multi-Task and MultiAge Approach for Screening Minors in Unconstrained Imagery (2025-06-12)
- The Sample Complexity of Parameter-Free Stochastic Convex Optimization (2025-06-12)
- "What are my options?": Explaining RL Agents with Diverse Near-Optimal Alternatives (Extended) (2025-06-11)
- PADAM: Parallel averaged Adam reduces the error for stochastic optimization in scientific machine learning (2025-05-28)
- Online distributed optimization for spatio-temporally constrained real-time peer-to-peer energy trading (2025-05-28)
- Distribution free M-estimation (2025-05-28)