
Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

0/1 Deep Neural Networks via Block Coordinate Descent

HUI ZHANG, Shenglong Zhou, Geoffrey Ye Li, Naihua Xiu

2022-06-19

Abstract

The step function is one of the simplest and most natural activation functions for deep neural networks (DNNs). Because it outputs 1 for positive inputs and 0 otherwise, its intrinsic characteristics (e.g., discontinuity and the absence of useful subgradient information) have impeded its development for several decades. Although there is an impressive body of work on designing DNNs with continuous activation functions that can be regarded as surrogates of the step function, the step function itself retains some advantageous properties, such as complete robustness to outliers and the ability to attain the best learning-theoretic guarantee of predictive accuracy. Hence, in this paper, we aim to train DNNs with the step function as the activation function (dubbed 0/1 DNNs). We first reformulate 0/1 DNNs as an unconstrained optimization problem and then solve it by a block coordinate descent (BCD) method. Moreover, we derive closed-form solutions for the sub-problems of BCD and establish its convergence properties. Furthermore, we integrate $\ell_{2,0}$-regularization into 0/1 DNNs to accelerate the training process and compress the network scale. As a result, the proposed algorithm achieves desirable performance on classifying the MNIST, Fashion-MNIST, CIFAR-10, and CIFAR-100 datasets.
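
To make the abstract's terminology concrete, here is a minimal Python sketch of a forward pass with the 0/1 step activation and of a group-sparsity count in the spirit of $\ell_{2,0}$. It is an illustration under stated assumptions, not the authors' implementation: the function names (`step`, `dense`, `forward_01`, `l20_norm`), the linear output layer, the choice to count nonzero rows, and the toy weights are all hypothetical. The BCD training procedure itself is not reproduced here.

```python
# Illustrative sketch only; all names and numbers are hypothetical, not from the paper.

def step(t):
    """0/1 step activation: 1 for positive inputs, 0 otherwise."""
    return 1.0 if t > 0 else 0.0

def dense(W, b, x):
    """Affine map W x + b, with W given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def forward_01(layers, x):
    """Forward pass: step activation on hidden layers, linear output (assumed)."""
    for W, b in layers[:-1]:
        x = [step(t) for t in dense(W, b, x)]
    W, b = layers[-1]
    return dense(W, b, x)

def l20_norm(W):
    """Count rows of W that are not entirely zero (row grouping is an assumption)."""
    return sum(1 for row in W if any(w != 0 for w in row))

# Toy network: one hidden layer with 3 units, one output unit.
layers = [
    ([[1.0, -1.0], [0.0, 0.0], [-2.0, 0.5]], [0.0, 0.0, 1.0]),
    ([[1.0, 1.0, 1.0]], [0.0]),
]

clean = forward_01(layers, [2.0, 1.0])
outlier = forward_01(layers, [2000.0, 1.0])  # same input with one extreme coordinate
hidden_nonzero_rows = l20_norm(layers[0][0])
```

In this toy case the hidden activations are identical for both inputs, because the step function only records the sign of each pre-activation. That is an informal way to see why its response stays bounded under extreme inputs, and also why it carries no gradient signal for training. The all-zero second row of the hidden weight matrix corresponds to a unit that could be pruned, which is the kind of structure a row-wise sparsity penalty rewards.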

Results

Task | Dataset | Metric | Value | Model
Facial Recognition and Modelling | !(()&&!|*|*| | 0L | 100 | nyenye
Domain Adaptation | Office-Home | Average Accuracy | 71.4 | DisClusterDA
Image Enhancement | LOL | BSQ-rate over MS-SSIM | 0.2 | rr
3D Reconstruction | 1 | 0L | 99 | STYLE
Question Answering | MultiTQ | Hits@1 | 72.8 | TimeR4
Question Answering | NewsQA | EM | 81.44 | OpenAI/o1-2024-12-17-high
Question Answering | NewsQA | F1 | 88.72 | OpenAI/o1-2024-12-17-high
Emotion Recognition | IEMOCAP-4 | Weighted F1 | 74.1 | bc-LSTM
Object Detection | COCO (Common Objects in Context) | box AP | 57.1 | D-FINE-L+
Object Detection | GRAZPEDWRI-DX | Fracture Sensitivity | 91 | YOLOv5s
Object Detection | GRAZPEDWRI-DX | Fracture Sensitivity | 89 | YOLOv6s
Image Classification | CUB-200-2011 | Accuracy | 91.8 | IELT
Face Reconstruction | !(()&&!|*|*| | 0L | 100 | nyenye
Facial Expression Recognition (FER) | !(()&&!|*|*| | 0L | 100 | nyenye
3D | COCO (Common Objects in Context) | box AP | 57.1 | D-FINE-L+
3D | GRAZPEDWRI-DX | Fracture Sensitivity | 91 | YOLOv5s
3D | GRAZPEDWRI-DX | Fracture Sensitivity | 89 | YOLOv6s
3D | 1 | 0L | 99 | STYLE
3D | T$^3$Bench | Avg | 43.3 | ProlificDreamer
3D | !(()&&!|*|*| | 0L | 100 | nyenye
3D | FaceWarehouse | 0..5sec | 1 | face
DeepFake Detection | 1 | 0L | 99 | STYLE
Fine-Grained Image Classification | CUB-200-2011 | Accuracy | 91.8 | IELT
3D Face Modelling | !(()&&!|*|*| | 0L | 100 | nyenye
Contrastive Learning | 10,000 People - Human Pose Recognition Data | 0..5sec | 1 | 1
3D Face Reconstruction | !(()&&!|*|*| | 0L | 100 | nyenye
Unsupervised Domain Adaptation | Office-Home | Average Accuracy | 71.4 | DisClusterDA
2D Classification | COCO (Common Objects in Context) | box AP | 57.1 | D-FINE-L+
2D Classification | GRAZPEDWRI-DX | Fracture Sensitivity | 91 | YOLOv5s
2D Classification | GRAZPEDWRI-DX | Fracture Sensitivity | 89 | YOLOv6s
2D Object Detection | COCO (Common Objects in Context) | box AP | 57.1 | D-FINE-L+
2D Object Detection | GRAZPEDWRI-DX | Fracture Sensitivity | 91 | YOLOv5s
2D Object Detection | GRAZPEDWRI-DX | Fracture Sensitivity | 89 | YOLOv6s
Robot Manipulation | The COLOSSEUM | Average decrease average across all perturbations | -14.5 | RVT
Text to Image Generation | T$^3$Bench | Avg | 43.3 | ProlificDreamer
Text to 3D | T$^3$Bench | Avg | 43.3 | ProlificDreamer
Multimodal Emotion Recognition | IEMOCAP-4 | Weighted F1 | 74.1 | bc-LSTM
10-shot image generation | FlyingThings3D | 0..5sec | 1 | 1
3D Shape Reconstruction from Videos | 1 | 0L | 99 | STYLE
16k | COCO (Common Objects in Context) | box AP | 57.1 | D-FINE-L+
16k | GRAZPEDWRI-DX | Fracture Sensitivity | 91 | YOLOv5s
16k | GRAZPEDWRI-DX | Fracture Sensitivity | 89 | YOLOv6s
