Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Methods

5,489 machine learning methods and techniques

All · Audio · Computer Vision · General · Graphs · Natural Language Processing · Reinforcement Learning · Sequential

Meta-augmentation

Meta-augmentation generates more varied tasks for a single example in meta-learning. It is best distinguished from data augmentation in classical machine learning, whose aim is to generate more varied examples within a single task: meta-augmentation has the exact opposite aim of generating more varied tasks for a single example, forcing the learner to quickly learn a new task from feedback. In meta-augmentation, adding randomness discourages the base learner and model from learning trivial solutions that do not generalize to new tasks.
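As an illustrative sketch (the episode format and function name here are hypothetical, not from any particular implementation), one common form of meta-augmentation randomly permutes the class-to-label assignment so that the same inputs define a new task:

```python
import numpy as np

def meta_augment_episode(x, y, n_classes, rng):
    """Create a new task from the same examples by randomly permuting
    which label id each class maps to. The inputs are unchanged; only
    the input-to-label mapping varies across episodes, so the
    meta-learner must infer the mapping from the support set instead of
    memorizing fixed labels."""
    perm = rng.permutation(n_classes)   # e.g. class 0 -> perm[0]
    return x, perm[y]

rng = np.random.default_rng(0)
x = np.arange(10.0).reshape(5, 2)       # 5 toy examples
y = np.array([0, 1, 2, 0, 1])
x_aug, y_aug = meta_augment_episode(x, y, n_classes=3, rng=rng)
```

Examples that shared a class before augmentation still share one afterwards; only the label identities change.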

General · Introduced 2000 · 4 papers

Adan

Adaptive Nesterov Momentum

Adan (Adaptive Nesterov Momentum) is an optimizer for deep networks that reformulates Nesterov acceleration to estimate the first- and second-order moments of the gradient, aiming at faster convergence than Adam-style optimizers.

General · Introduced 2000 · 4 papers

Teacher-Tutor-Student Knowledge Distillation

Teacher-Tutor-Student Knowledge Distillation is a method for image virtual try-on models. It treats fake images produced by the parser-based method as "tutor knowledge", whose artifacts can be corrected by real "teacher knowledge", extracted from real person images in a self-supervised way. Beyond using real images as supervision, knowledge distillation is formulated in the try-on problem as distilling the appearance flows between the person image and the garment image, enabling the finding of dense correspondences between them to produce high-quality results.

General · Introduced 2000 · 4 papers

Decorrelated Batch Normalization

Decorrelated Batch Normalization (DBN) is a normalization technique which not only centers and scales activations but also whitens them. ZCA whitening is employed instead of PCA whitening, since PCA whitening causes a problem called stochastic axis swapping, which is detrimental to learning.
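A minimal NumPy sketch of the ZCA whitening at the heart of DBN (per-batch only; the learned scale/shift and running statistics of the full method are omitted, and the function name is illustrative):

```python
import numpy as np

def zca_whiten(x, eps=1e-5):
    """ZCA-whiten a batch x of shape (batch, features): center, then
    multiply by Sigma^{-1/2} = U diag(1/sqrt(l + eps)) U^T. Unlike PCA
    whitening, ZCA rotates back to the original basis, which avoids the
    stochastic axis swapping DBN is designed around."""
    xc = x - x.mean(axis=0, keepdims=True)        # center
    cov = xc.T @ xc / x.shape[0]                  # batch covariance
    eigvals, eigvecs = np.linalg.eigh(cov)
    w = eigvecs @ np.diag(1.0 / np.sqrt(eigvals + eps)) @ eigvecs.T
    return xc @ w

x = np.random.default_rng(0).normal(size=(256, 4))
xw = zca_whiten(x)
cov_w = xw.T @ xw / xw.shape[0]   # approximately the identity matrix
```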

General · Introduced 2000 · 4 papers

SimCLRv2

SimCLRv2 is a semi-supervised learning method for learning from few labeled examples while making the best use of a large amount of unlabeled data. It is a modification of a recently proposed contrastive learning framework, SimCLR, and improves upon it in three major ways: 1. To fully leverage the power of general pre-training, larger ResNet models are explored. Unlike SimCLR and other previous work, whose largest model is ResNet-50 (4×), SimCLRv2 trains models that are deeper but less wide. The largest model trained is a 152-layer ResNet with 3× wider channels and selective kernels (SK), a channel-wise attention mechanism that improves the parameter efficiency of the network. By scaling up the model from ResNet-50 to ResNet-152 (3×+SK), a 29% relative improvement is obtained in top-1 accuracy when fine-tuned on 1% of labeled examples. 2. The capacity of the non-linear network (a.k.a. projection head) is increased by making it deeper. Furthermore, instead of throwing it away entirely after pre-training as in SimCLR, fine-tuning occurs from a middle layer. This small change yields a significant improvement for both linear evaluation and fine-tuning with only a few labeled examples. Compared to SimCLR with a 2-layer projection head, using a 3-layer projection head and fine-tuning from the 1st layer of the projection head results in as much as a 14% relative improvement in top-1 accuracy when fine-tuned on 1% of labeled examples. 3. The memory mechanism of MoCo v2 is incorporated, which designates a memory network (with a moving average of weights for stabilization) whose output is buffered as negative examples. Since training is based on large mini-batches, which already supply many contrasting negative examples, this change yields an improvement of ~1% for linear evaluation as well as when fine-tuning on 1% of labeled examples.

General · Introduced 2000 · 4 papers

RotNet

RotNet is a self-supervision approach that relies on predicting image rotations as the pretext task in order to learn image representations.
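The pretext-task construction can be sketched in a few lines (the function name and batch format are illustrative assumptions, not RotNet's released code):

```python
import numpy as np

def rotation_pretext_batch(images, rng):
    """Build a RotNet-style pretext batch: rotate each image by a random
    multiple of 90 degrees; the rotation index in {0, 1, 2, 3} becomes
    the classification target for the pretext task."""
    labels = rng.integers(0, 4, size=len(images))
    rotated = np.stack([np.rot90(img, k) for img, k in zip(images, labels)])
    return rotated, labels

rng = np.random.default_rng(0)
imgs = rng.normal(size=(8, 32, 32))
x, y = rotation_pretext_batch(imgs, rng)
```

A network trained to predict `y` from `x` must learn object orientation and structure, which is what makes the resulting features useful downstream.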

General · Introduced 2000 · 4 papers

m-arcsinh

m-arcsinh (modified arcsinh) is an activation and kernel function derived from the inverse hyperbolic sine (arcsinh), modified by weighting it with the square root of the absolute input.

General · Introduced 2000 · 3 papers

Rational Activation function

Rational activation functions are learnable activation functions parameterized as the ratio of two polynomials, with the polynomial coefficients trained jointly with the network weights.

General · Introduced 2000 · 3 papers

Tree-structured Parzen Estimator Approach (TPE)

The Tree-structured Parzen Estimator is a sequential model-based (Bayesian) hyperparameter optimization approach. Instead of modeling the objective directly, TPE models the density of configurations whose objective falls below a quantile of the observed values, l(x), and the density of the remaining configurations, g(x), and suggests new candidates that maximize the ratio l(x)/g(x).
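TPE suggests new hyperparameters by modeling good and bad past trials with two Parzen densities l(x) and g(x) and maximizing l/g. A toy sketch of one suggestion step on a 1-D search space (the function name, Gaussian kernels, and fixed bandwidth are simplifying assumptions):

```python
import numpy as np

def tpe_suggest(xs, ys, rng, gamma=0.25, n_candidates=64, bw=0.1):
    """One TPE step: split past trials at the gamma-quantile of the
    objective, fit a Parzen (Gaussian KDE) density to each side, and
    return the candidate maximizing l(x) / g(x)."""
    xs, ys = np.asarray(xs), np.asarray(ys)
    cut = np.quantile(ys, gamma)
    good, bad = xs[ys <= cut], xs[ys > cut]

    def kde(points, q):
        # average of Gaussian kernels centered at past observations
        return np.exp(-0.5 * ((q[:, None] - points[None, :]) / bw) ** 2).mean(axis=1)

    cand = rng.uniform(-2, 2, size=n_candidates)
    score = kde(good, cand) / (kde(bad, cand) + 1e-12)
    return cand[np.argmax(score)]

rng = np.random.default_rng(0)
xs = rng.uniform(-2, 2, size=50)
ys = (xs - 1.0) ** 2              # toy objective with minimum at x = 1
x_next = tpe_suggest(xs, ys, rng)
```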

General · Introduced 2000 · 3 papers


GPFL

Graph Path Feature Learning

Graph Path Feature Learning is a probabilistic rule learner optimized to mine instantiated first-order logic rules from knowledge graphs. Instantiated rules contain constants extracted from KGs. Compared to abstract rules that contain no constants, instantiated rules are capable of explaining and expressing concepts in more detail. GPFL utilizes a novel two-stage rule generation mechanism that first generalizes extracted paths into templates that are acyclic abstract rules until a certain degree of template saturation is achieved, then specializes the generated templates into instantiated rules.

General · Introduced 2000 · 3 papers


EESP

Extremely Efficient Spatial Pyramid of Depth-wise Dilated Separable Convolutions

An EESP Unit, or Extremely Efficient Spatial Pyramid of Depth-wise Dilated Separable Convolutions, is an image model block designed for edge devices. It was proposed as part of the ESPNetv2 CNN architecture. This building block is based on a reduce-split-transform-merge strategy. The EESP unit first projects the high-dimensional input feature maps into low-dimensional space using groupwise pointwise convolutions and then learns the representations in parallel using depthwise dilated separable convolutions with different dilation rates. Different dilation rates in each branch allow the EESP unit to learn the representations from a large effective receptive field. To remove the gridding artifacts caused by dilated convolutions, the EESP fuses the feature maps using hierarchical feature fusion (HFF).

General · Introduced 2000 · 3 papers

Conditional Instance Normalization

Conditional Instance Normalization is a normalization technique where all convolutional weights of a style transfer network are shared across many styles. The goal of the procedure is to transform a layer's activations x into a normalized activation z specific to a painting style s. Building off instance normalization, the γ and β parameters are augmented so that they are N × C matrices, where N is the number of styles being modeled and C is the number of output feature maps. Conditioning on a style s is achieved as follows:

z = γ_s · (x − μ) / σ + β_s

where μ and σ are x's mean and standard deviation taken across spatial axes, and γ_s and β_s are obtained by selecting the row corresponding to s in the γ and β matrices. One added benefit of this approach is that one can stylize a single image into N painting styles with a single feed-forward pass of the network with a batch size of N.
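A NumPy sketch of the forward pass (function name and tensor layout are assumptions; a real implementation learns γ and β during training):

```python
import numpy as np

def conditional_instance_norm(x, gamma, beta, style, eps=1e-5):
    """Conditional Instance Normalization.

    x:     (batch, C, H, W) activations
    gamma: (N, C) per-style scales; beta: (N, C) per-style shifts
    style: (batch,) integer style index s selecting a row of gamma/beta
    """
    mu = x.mean(axis=(2, 3), keepdims=True)     # per-sample, per-channel
    sigma = x.std(axis=(2, 3), keepdims=True)
    g = gamma[style][:, :, None, None]          # z = gamma_s (x - mu)/sigma + beta_s
    b = beta[style][:, :, None, None]
    return g * (x - mu) / (sigma + eps) + b

rng = np.random.default_rng(0)
x = rng.normal(size=(2, 3, 8, 8))
gamma = rng.normal(size=(4, 3))   # 4 styles, 3 channels
beta = rng.normal(size=(4, 3))
y = conditional_instance_norm(x, gamma, beta, style=np.array([0, 2]))
```

Switching styles at inference is just a different row lookup; the convolutional weights never change.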

General · Introduced 2000 · 3 papers

Spatial & Temporal Attention

Spatial & temporal attention combines the advantages of spatial attention and temporal attention, adaptively selecting both important regions and key frames. Some works compute temporal attention and spatial attention separately, while others produce joint spatio-temporal attention maps. Further works focus on capturing pairwise relations.

General · Introduced 2000 · 3 papers

RReLU

Randomized Leaky Rectified Linear Units

Randomized Leaky Rectified Linear Units, or RReLU, is an activation function that randomly samples the slope for negative activation values. It was first proposed and used in the Kaggle NDSB Competition. During training, the slope term a_ji is a random number sampled from a uniform distribution U(l, u):

y_ji = x_ji if x_ji ≥ 0, else x_ji / a_ji, with a_ji ~ U(l, u)

In the test phase, we take the average of all the a_ji in training, similar to dropout, and thus set a_ji to (l + u) / 2 to get a deterministic result. As suggested by the NDSB competition winner, a_ji is sampled from U(3, 8). At test time, we use:

y_ji = x_ji / ((l + u) / 2)
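A NumPy sketch of both phases, using the NDSB convention where the negative part is divided by the sampled value (the function name is illustrative):

```python
import numpy as np

def rrelu(x, lower=3.0, upper=8.0, training=False, rng=None):
    """RReLU: the negative part is x / a, with a sampled uniformly from
    U(lower, upper) during training and fixed to the mean
    (lower + upper) / 2 at test time for a deterministic output."""
    if training:
        a = rng.uniform(lower, upper, size=x.shape)
    else:
        a = (lower + upper) / 2.0
    return np.where(x >= 0, x, x / a)

x = np.array([-2.0, -0.5, 0.0, 1.5])
y_test = rrelu(x)                                          # negatives / 5.5
y_train = rrelu(x, training=True, rng=np.random.default_rng(1))
```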

General · Introduced 2000 · 3 papers

TaBERT

TaBERT is a pretrained language model (LM) that jointly learns representations for natural language sentences and (semi-)structured tables. TaBERT is trained on a large corpus of 26 million tables and their English contexts. In summary, TaBERT's process for learning representations is as follows: given an utterance and a table, TaBERT first creates a content snapshot of the table. This snapshot consists of sampled rows that summarize the information in the table most relevant to the input utterance. The model then linearizes each row in the snapshot, concatenates each linearized row with the utterance, and uses the concatenated string as input to a Transformer model, which outputs row-wise encoding vectors of utterance tokens and cells. The encodings for all the rows in the snapshot are fed into a series of vertical self-attention layers, where a cell representation (or an utterance token representation) is computed by attending to vertically-aligned vectors of the same column (or the same NL token). Finally, representations for each utterance token and column are generated from a pooling layer.

General · Introduced 2000 · 3 papers

TURL

TURL: Table Understanding through Representation Learning

Relational tables on the Web store a vast amount of knowledge. Owing to the wealth of such tables, there has been tremendous progress on a variety of tasks in the area of table understanding. However, existing work generally relies on heavily-engineered task-specific features and model architectures. In this paper, we present TURL, a novel framework that introduces the pre-training/fine-tuning paradigm to relational Web tables. During pre-training, our framework learns deep contextualized representations on relational tables in an unsupervised manner. Its universal model design with pre-trained representations can be applied to a wide range of tasks with minimal task-specific fine-tuning. Specifically, we propose a structure-aware Transformer encoder to model the row-column structure of relational tables, and present a new Masked Entity Recovery (MER) objective for pre-training to capture the semantics and knowledge in large-scale unlabeled data. We systematically evaluate TURL with a benchmark consisting of 6 different tasks for table understanding (e.g., relation extraction, cell filling). We show that TURL generalizes well to all tasks and substantially outperforms existing methods in almost all instances.

General · Introduced 2000 · 3 papers

Polyak Averaging

Polyak Averaging is an optimization technique that sets the final parameters to an average of the (recent) parameters visited in the optimization trajectory. Specifically, if in t iterations we have parameters θ_1, θ_2, …, θ_t, then Polyak Averaging suggests setting

θ̄_t = (1/t) Σ_{i=1}^{t} θ_i

Image Credit: Shubhendu Trivedi & Risi Kondor
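A sketch on a 1-D quadratic, where gradient descent also tracks the running average of its iterates (function names are illustrative):

```python
import numpy as np

def gd_with_polyak_average(grad, theta0, lr=0.1, steps=100):
    """Gradient descent that also maintains the running average of all
    visited parameters; Polyak Averaging returns the average rather
    than the last iterate."""
    theta = np.asarray(theta0, dtype=float)
    avg = theta.copy()
    for t in range(1, steps + 1):
        theta = theta - lr * grad(theta)
        avg += (theta - avg) / (t + 1)   # incremental mean of t + 1 points
    return theta, avg

# quadratic bowl 0.5 * (theta - 3)^2, so grad(theta) = theta - 3
last, polyak = gd_with_polyak_average(lambda th: th - 3.0, np.array([0.0]))
```

In practice an exponentially decaying average of recent parameters is often used instead of the plain mean, so that early, far-from-optimal iterates do not drag the estimate.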

General · Introduced 1991 · 3 papers

DetNAS

DetNAS is a neural architecture search algorithm for the design of better backbones for object detection. It is based on the technique of one-shot supernet, which contains all possible networks in the search space. The supernet is trained under the typical detector training schedule: ImageNet pre-training and detection fine-tuning. Then, the architecture search is performed on the trained supernet, using the detection task as the guidance. DetNAS uses evolutionary search as opposed to RL-based methods or gradient-based methods.

General · Introduced 2000 · 3 papers

Attention-augmented Convolution

Attention-augmented Convolution is a type of convolution with a two-dimensional relative self-attention mechanism that can replace convolutions as a stand-alone computational primitive for image classification. It employs scaled dot-product attention and multi-head attention as with Transformers, and works by concatenating the convolutional and attentional feature maps. To see this, consider an original convolution operator with kernel size k, F_in input filters and F_out output filters. The corresponding attention-augmented convolution can be written as

AAConv(X) = Concat[Conv(X), MHA(X)]

Multi-head attention originates from an input tensor of shape (H, W, F_in), which is flattened to shape (HW, F_in) and passed into a multi-head attention module, alongside the convolution (see above). Similarly to the convolution, the attention-augmented convolution 1) is equivariant to translation and 2) can readily operate on inputs of different spatial dimensions.

General · Introduced 2000 · 3 papers

KIP

Kernel Inducing Points

Kernel Inducing Points, or KIP, is a meta-learning algorithm for learning datasets that can mitigate the challenges which occur for naturally occurring datasets without a significant sacrifice in performance. KIP uses kernel ridge regression to learn ε-approximate datasets. It can be regarded as an adaptation of the inducing point method for Gaussian processes to the case of kernel ridge regression.

General · Introduced 2000 · 3 papers

Leverage Learning

Leverage learning suggests that it is possible to strategically use minimal task-specific data to enhance task-specific capabilities, while non-specific capabilities can be learned from more general data.

General · Introduced 2000 · 3 papers

QuantTree

QuantTree histograms

Given a training set drawn from an unknown multivariate probability distribution π, QuantTree constructs a histogram by recursively splitting the input space. The splits are defined by a stochastic process so that each bin contains a predefined proportion of the training set. These histograms can be used to define test statistics (e.g., the Pearson statistic) to tell whether a batch of data is drawn from π or not. The most crucial property of QuantTree is that the distribution of any statistic based on QuantTree histograms is independent of π, thus enabling nonparametric statistical testing.

General · Introduced 2000 · 3 papers

Batch Nuclear-norm Maximization

Batch Nuclear-norm Maximization is an approach for aiding classification in label insufficient situations. It involves maximizing the nuclear-norm of the batch output matrix. The nuclear-norm of a matrix is an upper bound of the Frobenius-norm of the matrix. Maximizing nuclear-norm ensures large Frobenius-norm of the batch matrix, which leads to increased discriminability. The nuclear-norm of the batch matrix is also a convex approximation of the matrix rank, which refers to the prediction diversity.
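The quantity being maximized is easy to compute with an SVD; a small NumPy illustration (function name is an assumption):

```python
import numpy as np

def batch_nuclear_norm(probs):
    """Nuclear norm (sum of singular values) of a batch prediction
    matrix of shape (batch, classes). Maximizing it encourages both
    discriminability (it upper-bounds the Frobenius norm) and diversity
    (it is a convex surrogate for the matrix rank)."""
    return np.linalg.svd(probs, compute_uv=False).sum()

# confident & diverse predictions versus ambiguous ones
confident = np.eye(4)               # one-hot rows, rank 4
ambiguous = np.full((4, 4), 0.25)   # uniform rows, rank 1
```

The one-hot batch scores 4.0 while the uniform batch scores 1.0, matching the intuition that BNM pushes outputs toward confident, class-diverse predictions.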

General · Introduced 2000 · 3 papers

SCAN-clustering

Semantic Clustering by Adopting Nearest Neighbours

SCAN automatically groups images into semantically meaningful clusters when ground-truth annotations are absent. SCAN is a two-step approach where feature learning and clustering are decoupled. First, a self-supervised task is employed to obtain semantically meaningful features. Second, the obtained features are used as a prior in a learnable clustering approach. Image source: Gansbeke et al.

General · Introduced 2000 · 3 papers

CIDA

Continuously Indexed Domain Adaptation

Continuously Indexed Domain Adaptation combines traditional adversarial adaptation with a novel discriminator that models the encoding-conditioned domain index distribution. Image Source: Wang et al.

General · Introduced 2000 · 3 papers

Hermite Activation

Hermite Polynomial Activation

Hermite activations are a type of activation function which uses a smooth finite Hermite polynomial basis as a substitute for non-smooth ReLUs. Relevant paper: Lokhande et al.

General · Introduced 2019 · 3 papers

GradDrop

Gradient Sign Dropout

GradDrop, or Gradient Sign Dropout, is a probabilistic masking procedure which samples gradients at an activation layer based on their level of consistency. It is applied as a layer in any standard network forward pass, usually on the final layer before the prediction head to save on compute overhead and maximize benefits during backpropagation. Below, we develop the GradDrop formalism. Throughout, ∘ denotes elementwise multiplication after any necessary tiling operations (if any) are completed. To implement GradDrop, we first define the Gradient Positive Sign Purity P as

P = (1/2) (1 + Σ_i ∇_i / Σ_i |∇_i|)

P is bounded by [0, 1]. For multiple gradient values ∇_i at some scalar, we see that P = 1 if ∇_i > 0 for all i, while P = 0 if ∇_i < 0 for all i. Thus, P is a measure of how many positive gradients are present at any given value. We then form a mask for each gradient as follows:

M_i = I[f(P) > U] ∘ I[∇_i > 0] + I[f(P) < U] ∘ I[∇_i < 0]

for I the standard indicator function and f some monotonically increasing function (often just the identity) that maps [0, 1] to [0, 1] and is odd around (0.5, 0.5). U is a tensor composed of i.i.d. U(0, 1) random variables. The mask M_i is then used to produce a final gradient Σ_i M_i ∘ ∇_i.
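A NumPy sketch of the masking with f taken as the identity (the function name and gradient-list format are assumptions):

```python
import numpy as np

def graddrop(grads, rng):
    """Gradient Sign Dropout: at each position, keep either the positive
    or the negative per-task gradients, chosen with probability given by
    the Gradient Positive Sign Purity P (f = identity)."""
    g = np.stack(grads)                                      # (n_tasks, ...)
    purity = 0.5 * (1.0 + g.sum(axis=0) / (np.abs(g).sum(axis=0) + 1e-12))
    u = rng.uniform(size=purity.shape)                       # i.i.d. U(0, 1)
    keep_positive = purity > u
    masks = np.where(g > 0, keep_positive, ~keep_positive)   # per-sign mask
    return (masks * g).sum(axis=0)

rng = np.random.default_rng(0)
g1 = np.array([1.0, 1.0, -1.0])
g2 = np.array([1.0, -1.0, -1.0])
g = graddrop([g1, g2], rng)   # positions 0 and 2 are sign-consistent
```

Sign-consistent positions pass through untouched; the conflicting position keeps exactly one sign at random.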

General · Introduced 2000 · 3 papers

CRISS

CRISS, or Cross-lingual Retrieval for Iterative Self-Supervised Training, is a self-supervised learning method for multilingual sequence generation. CRISS is developed based on the finding that the encoder outputs of a multilingual denoising autoencoder can be used as language-agnostic representations to retrieve parallel sentence pairs, and that training the model on these retrieved sentence pairs can further improve its sentence retrieval and translation capabilities in an iterative manner. Using only unlabeled data from many different languages, CRISS iteratively mines for parallel sentences across languages, trains a new better multilingual model using these mined sentence pairs, mines again for better parallel sentences, and repeats.

General · Introduced 2000 · 3 papers

Hopfield Layer

A Hopfield Layer is a module that enables a network to associate two sets of vectors. This general functionality allows for transformer-like self-attention, for decoder-encoder attention, for time series prediction (maybe with positional encoding), for sequence analysis, for multiple instance learning, for learning with point sets, for combining data sources by associations, for constructing a memory, for averaging and pooling operations, and for many more. In particular, the Hopfield layer can readily be used as a plug-in replacement for existing layers like pooling layers (max-pooling or average pooling), permutation-equivariant layers, GRU & LSTM layers, and attention layers. The Hopfield layer is based on modern Hopfield networks with continuous states that have very high storage capacity and converge after one update.

General · Introduced 2000 · 3 papers

Source Hypothesis Transfer

Source Hypothesis Transfer, or SHOT, is a representation learning framework for unsupervised domain adaptation. SHOT freezes the classifier module (hypothesis) of the source model and learns the target-specific feature extraction module by exploiting both information maximization and self-supervised pseudo-labeling to implicitly align representations from the target domains to the source hypothesis.

General · Introduced 2000 · 3 papers

CVRL

Contrastive Video Representation Learning

Contrastive Video Representation Learning, or CVRL, is a self-supervised contrastive learning framework for learning spatiotemporal visual representations from unlabeled videos. Representations are learned using a contrastive loss, where two clips from the same short video are pulled together in the embedding space, while clips from different videos are pushed away. Data augmentations are designed involving spatial and temporal cues. Concretely, a temporally consistent spatial augmentation method is used to impose strong spatial augmentations on each frame of the video while maintaining the temporal consistency across frames. A sampling-based temporal augmentation method is also used to avoid overly enforcing invariance on clips that are distant in time. End-to-end, from a raw video, we first sample a temporal interval from a monotonically decreasing distribution. The temporal interval represents the number of frames between the start points of two clips, and we sample two clips from a video according to this interval. Afterwards we apply a temporally consistent spatial augmentation to each of the clips and feed them into a 3D backbone with an MLP head. The contrastive loss is used to train the network to attract the clips from the same video and repel the clips from different videos in the embedding space.

General · Introduced 2000 · 3 papers

TILDEv2

TILDEv2 is a BERT-based re-ranking method that stems from TILDE but addresses its limitations. It relies on contextualized exact term matching with expanded passages. This requires storing in the index only the scores of tokens that appear in the expanded passages (rather than the whole vocabulary), producing indexes that are 99% smaller than those of the original. Specifically, TILDE is modified in the following aspects:

- Exact Term Matching. The query likelihood matching originally employed in TILDE expands passages to the size of the BERT vocabulary, resulting in large indexes. To overcome this issue, relevance scores are estimated with contextualized exact term matching, which allows the model to index only tokens present in the passage, thus reducing the index size. In addition, the query likelihood loss function is replaced with the noise contrastive estimation (NCE) loss, which better leverages negative training samples.
- Passage Expansion. To overcome the vocabulary mismatch problem that affects exact term matching methods, passage expansion is used to expand the original passage collection. Passages in the collection are expanded using deep LMs with a limited number of tokens, so TILDEv2 only needs to index a few extra tokens in addition to those in the original passages.

General · Introduced 2000 · 3 papers

ACGPN

Adaptive Content Generating and Preserving Network

ACGPN, or Adaptive Content Generating and Preserving Network, is a generative adversarial network for virtual try-on clothing applications. In Step I, the Semantic Generation Module (SGM) takes the target clothing image, the pose map, and the fused body part mask as input to predict the semantic layout, outputting the synthesized body part mask and the target clothing mask. In Step II, the Clothes Warping Module (CWM) warps the target clothing image according to the predicted semantic layout, where a second-order difference constraint is introduced to stabilize the warping process. In Steps III and IV, the Content Fusion Module (CFM) first produces the composited body part mask using the original clothing mask, the synthesized clothing mask, the body part mask, and the synthesized body part mask, and then exploits a fusion network to generate the try-on image from the outputs of the previous steps together with the body part image.

General · Introduced 2000 · 3 papers

Channel Squeeze and Spatial Excitation

Channel Squeeze and Spatial Excitation (sSE)

Inspired by the widely known spatial squeeze and channel excitation (SE) block, the sSE block performs channel squeeze and spatial excitation to recalibrate the feature maps spatially and achieve more fine-grained image segmentation.

General · Introduced 2000 · 3 papers

ComiRec

ComiRec is a multi-interest framework for sequential recommendation. The multi-interest module captures multiple interests from user behavior sequences, which can be exploited for retrieving candidate items from the large-scale item pool. These items are then fed into an aggregation module to obtain the overall recommendation. The aggregation module leverages a controllable factor to balance the recommendation accuracy and diversity.

General · Introduced 2000 · 3 papers

Online Normalization

Online Normalization is a normalization technique for training deep neural networks. It replaces the arithmetic averages over the full dataset used in batch normalization with exponentially decaying averages of online samples; the decay factors for the forward and backward passes are hyperparameters of the technique. Incoming samples, such as images, may have multiple scalar components, with feature-wide mean and variance estimated per feature; the algorithm also applies to outputs of fully connected layers with only one scalar output per feature, in which case the statistics reduce to the scalar case. Running estimates of mean and variance are maintained across all samples, indexed by the time steps at which new incoming samples are processed. Online Normalization uses this ongoing process during the forward pass to estimate activation means and variances: it implements the standard online computation of mean and variance, generalized to processing multi-value samples and exponential averaging of sample statistics. The resulting estimates directly lead to an affine normalization transform.
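A sketch of the forward-pass statistics only (the class name and the exact update form are assumptions; the paper's backward-pass control process is omitted):

```python
import numpy as np

class OnlineNorm1d:
    """Per-feature mean and variance tracked with exponentially decaying
    averages (decay alpha_f) instead of batch statistics; each sample is
    normalized with the estimates available before it arrives."""
    def __init__(self, n_features, alpha_f=0.99, eps=1e-5):
        self.mu = np.zeros(n_features)
        self.var = np.ones(n_features)
        self.alpha, self.eps = alpha_f, eps

    def __call__(self, x):   # x: (n_features,), one sample at a time
        y = (x - self.mu) / np.sqrt(self.var + self.eps)
        d = x - self.mu      # update running estimates afterwards
        self.var = self.alpha * self.var + self.alpha * (1 - self.alpha) * d ** 2
        self.mu = self.mu + (1 - self.alpha) * d
        return y

norm = OnlineNorm1d(3)
rng = np.random.default_rng(0)
for sample in rng.normal(5.0, 2.0, size=(5000, 3)):
    y = norm(sample)   # estimates converge toward mean 5, std 2
```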

General · Introduced 2000 · 3 papers

DCN-V2

DCN-V2 is an architecture for learning-to-rank that improves upon the original DCN model. It first learns explicit feature interactions of the inputs (typically the embedding layer) through cross layers, and then combines them with a deep network to learn complementary implicit interactions. The core of DCN-V2 is the cross layers, which inherit the simple structure of the cross network from DCN but are significantly more expressive at learning explicit and bounded-degree cross features.
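A single cross layer is compact enough to sketch directly (the function name is illustrative; real models learn W and b, and DCN-V2 also offers a low-rank variant of W):

```python
import numpy as np

def cross_layer(x0, xl, w, b):
    """One DCN-V2 cross layer: x_{l+1} = x0 * (W @ xl + b) + xl.
    Stacking l layers yields explicit feature interactions up to degree
    l + 1 while preserving the input dimension."""
    return x0 * (w @ xl + b) + xl

rng = np.random.default_rng(0)
d = 4
x0 = rng.normal(size=d)           # embedding-layer input
w = rng.normal(size=(d, d))
b = np.zeros(d)
x1 = cross_layer(x0, x0, w, b)    # degree-2 interactions
x2 = cross_layer(x0, x1, w, b)    # up to degree-3 interactions
```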

General · Introduced 2000 · 3 papers

Attention Free Transformer

Attention Free Transformer, or AFT, is an efficient variant of a multi-head attention module that eschews dot-product self-attention. In an AFT layer, the key and value are first combined with a set of learned position biases, the result of which is multiplied with the query in an element-wise fashion. This new operation has a memory complexity linear w.r.t. both the context size and the dimension of features, making it compatible with both large input and model sizes. Given the input X, AFT first linearly transforms it into Q = XW_Q, K = XW_K, V = XW_V, then performs the following operation:

Y_t = σ(Q_t) ⊙ Σ_{t'} exp(K_{t'} + w_{t,t'}) ⊙ V_{t'} / Σ_{t'} exp(K_{t'} + w_{t,t'})

where ⊙ is the element-wise product, σ is the nonlinearity applied to the query (with the default being sigmoid), and w is the learned pair-wise position bias. Explained in words: for each target position t, AFT performs a weighted average of values, the result of which is combined with the query by element-wise multiplication. In particular, the weighting is simply composed of the keys and a set of learned pair-wise position biases. This provides the immediate advantage of not needing to compute and store the expensive attention matrix, while maintaining the global interactions between query and values as MHA does.
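A naive NumPy reference of the AFT-full formula (the function name is an assumption; the (T, T, d) tensor here is materialized for clarity only, whereas the paper evaluates the same expression with memory linear in T and d):

```python
import numpy as np

def aft_full(q, k, v, w):
    """Reference AFT-full for (T, d) query/key/value matrices and a
    (T, T) pair-wise position bias w:
        Y_t = sigmoid(Q_t) * sum_t'(exp(K_t' + w[t, t']) * V_t')
                           / sum_t'(exp(K_t' + w[t, t']))"""
    weights = np.exp(k[None, :, :] + w[:, :, None])   # (T, T, d)
    num = (weights * v[None, :, :]).sum(axis=1)       # weighted value sum
    den = weights.sum(axis=1)
    return (1.0 / (1.0 + np.exp(-q))) * (num / den)   # sigmoid-gated query

rng = np.random.default_rng(0)
T, d = 4, 3
q, k, v = (rng.normal(size=(T, d)) for _ in range(3))
y = aft_full(q, k, v, np.zeros((T, T)))   # zero bias for easy checking
```

With a zero position bias, every target position sees the same key-weighted average of values, differing only through its query gate.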

General · Introduced 2000 · 3 papers

ShakeDrop

ShakeDrop regularization extends Shake-Shake regularization and can be applied not only to ResNeXt but also to ResNet, Wide ResNet, and PyramidNet. ShakeDrop is given as

G(x) = x + (b_l + α − b_l α) F(x)

where b_l is a Bernoulli random variable with probability p_l given by the linear decay rule in each layer, and α and β (α in the forward pass, β in the backward pass) are independent uniform random variables in each element. The most effective ranges of α and β were experimentally found to be different from those of Shake-Shake: α ∈ [−1, 1] and β ∈ [0, 1].
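A sketch of the training-time forward rule and its deterministic inference counterpart (function names, and taking the inference coefficient as the expectation of the training one, are assumptions):

```python
import numpy as np

def shakedrop_forward(x, residual, p_keep, rng, alpha_range=(-1.0, 1.0)):
    """Training-time ShakeDrop on a residual branch F(x):
    G(x) = x + (b + alpha - b*alpha) * F(x), with b ~ Bernoulli(p_keep)
    per layer and alpha uniform per element. When b = 1 the coefficient
    is exactly 1 (plain residual); when b = 0 the branch is shaken by
    alpha. (The backward pass uses an independent uniform beta.)"""
    b = float(rng.random() < p_keep)
    alpha = rng.uniform(alpha_range[0], alpha_range[1], size=residual.shape)
    return x + (b + alpha - b * alpha) * residual

def shakedrop_inference(x, residual, p_keep, alpha_mean=0.0):
    # expected value of the training-time coefficient
    return x + (p_keep + alpha_mean - p_keep * alpha_mean) * residual

x = np.ones(4)
residual = np.full(4, 2.0)
y_keep = shakedrop_forward(x, residual, p_keep=1.0, rng=np.random.default_rng(0))
y_drop = shakedrop_forward(x, residual, p_keep=0.0, rng=np.random.default_rng(0))
```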

General · Introduced 2000 · 3 papers


PowerSGD

PowerSGD is a distributed optimization technique that computes a low-rank approximation of the gradient using a generalized power iteration (known as subspace iteration). The approximation is computationally lightweight, avoiding any prohibitively expensive singular value decomposition. To improve the quality of the efficient approximation, the authors warm-start the power iteration by reusing the approximation from the previous optimization step.
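A minimal single-worker sketch of the compression step, using hypothetical helper names. The real method additionally all-reduces the factors across workers and applies error feedback; this only shows the warm-started rank-r power iteration.

```python
import numpy as np

def powersgd_compress(grad, prev_q):
    """One PowerSGD step: low-rank approximation of a gradient matrix
    via a single subspace (power) iteration, warm-started with the
    Q factor carried over from the previous optimization step."""
    p = grad @ prev_q        # project onto previous subspace: (n, r)
    p, _ = np.linalg.qr(p)   # cheap orthonormalization, no SVD needed
    q_new = grad.T @ p       # refine the subspace: (m, r)
    return p, q_new          # workers would all-reduce p and q_new

def powersgd_decompress(p, q):
    return p @ q.T           # reconstruct grad ≈ P Q^T

# Usage: compress a 64x32 "gradient", reusing Q across steps.
rng = np.random.default_rng(0)
grad = rng.normal(size=(64, 32))
q = rng.normal(size=(32, 2))          # cold-start Q, rank 2
for _ in range(3):                    # warm-started across steps
    p, q = powersgd_compress(grad, q)
approx = powersgd_decompress(p, q)
```

When the true gradient is (near) low-rank, a single warm-started iteration already recovers it almost exactly, which is why reusing the previous step's Q works so well in practice.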

GeneralIntroduced 20003 papers

Aging Evolution

Aging Evolution, or Regularized Evolution, is an evolutionary algorithm for neural architecture search. Whereas in tournament selection, the best architectures are kept, in aging evolution we associate each genotype with an age, and bias the tournament selection to choose the younger genotypes. In the context of architecture search, aging evolution allows us to explore the search space more, instead of zooming in on good models too early, as non-aging evolution would.

GeneralIntroduced 20003 papers

Accuracy-Robustness Area (ARA)

Accuracy-Robustness Area

In the space of adversarial perturbation against classifier accuracy, the ARA is the area between a classifier's curve and the straight line defined by a naive classifier's maximum accuracy. Intuitively, the ARA measures a combination of the classifier's predictive power and its ability to overcome an adversary. Importantly, when contrasted against existing robustness metrics, the ARA takes into account the classifier's performance against all adversarial examples, without bounding them by some arbitrary perturbation budget.
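As a minimal sketch under assumed inputs, the area can be estimated from accuracy measured at a grid of perturbation sizes, relative to the naive classifier's horizontal line; the function name and trapezoidal integration are illustrative, not the paper's exact estimator.

```python
import numpy as np

def accuracy_robustness_area(eps, acc, naive_acc):
    """ARA sketch: signed area between the classifier's
    accuracy-vs-perturbation curve and the horizontal line at the
    naive classifier's maximum accuracy (trapezoidal rule)."""
    eps = np.asarray(eps, dtype=float)
    gap = np.asarray(acc, dtype=float) - naive_acc
    return float(np.sum((gap[1:] + gap[:-1]) / 2.0 * np.diff(eps)))

# Accuracy decays as the perturbation grows; a naive classifier
# sits at 0.1 accuracy regardless of the perturbation size.
ara = accuracy_robustness_area(
    eps=[0.0, 0.1, 0.2, 0.3, 0.4],
    acc=[0.95, 0.80, 0.55, 0.30, 0.12],
    naive_acc=0.1)
```

A more robust classifier keeps its curve above the naive line for longer, which directly enlarges the area.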

GeneralIntroduced 20003 papers

Playstyle Distance

This method first discretizes observations and then measures the distance between the two playstyles' action distributions on comparable cases (intersection states, i.e., discretized states observed under both playstyles).

GeneralIntroduced 20003 papers

GPSA

Gated Positional Self-Attention

Gated Positional Self-Attention (GPSA) is a self-attention module for vision transformers, used in the ConViT architecture, that can be initialized as a convolutional layer -- helping a ViT learn inductive biases about locality.
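A single-head NumPy sketch of the gating idea, with hypothetical names: a learned scalar gate blends content attention with a positional attention map. Initializing the gate to favor a convolution-like local `pos_scores` gives the locality bias, which training can later unlearn; this is a simplification of the ConViT formulation, not its exact parameterization.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def gpsa(x, wq, wk, wv, pos_scores, gate_lambda):
    """Gated positional self-attention sketch (one head): a sigmoid
    gate mixes the content attention map with a positional one."""
    d = wq.shape[1]
    content = softmax((x @ wq) @ (x @ wk).T / np.sqrt(d))  # (n, n)
    positional = softmax(pos_scores)                       # (n, n)
    g = 1.0 / (1.0 + np.exp(-gate_lambda))                 # sigmoid gate
    attn = (1.0 - g) * content + g * positional
    return attn @ (x @ wv)

# Usage: 5 tokens, width 8; pos_scores strongly favors each token
# attending to itself (a degenerate "local" pattern).
rng = np.random.default_rng(0)
x = rng.normal(size=(5, 8))
wq, wk, wv = (rng.normal(size=(8, 8)) for _ in range(3))
out = gpsa(x, wq, wk, wv, pos_scores=50.0 * np.eye(5), gate_lambda=10.0)
```

With a strongly positive gate, the output reduces to the positional pattern (here, each token keeps its own value projection), which is exactly how a convolution-like initialization dominates early in training.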

GeneralIntroduced 20003 papers

Seesaw Loss

Seesaw Loss is a loss function for long-tailed instance segmentation. It dynamically re-balances the gradients of positive and negative samples on a tail class with two complementary factors: a mitigation factor and a compensation factor. The mitigation factor reduces punishments on tail categories w.r.t. the ratio of cumulative training instances between different categories. Meanwhile, the compensation factor increases the penalty on misclassified instances to avoid false positives of tail categories. The synergy of the two factors enables Seesaw Loss to mitigate the overwhelming punishments on tail classes while compensating for the risk of misclassification caused by diminished penalties. A tunable balancing factor S_ij adjusts the punishment on class j induced by positive samples of class i. Seesaw Loss determines S_ij as the product of a mitigation factor M_ij and a compensation factor C_ij: S_ij = M_ij * C_ij. The mitigation factor decreases the penalty on tail class j according to the ratio of instance numbers between tail class j and head class i, while the compensation factor increases the penalty on class j whenever an instance of class i is misclassified as class j.
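The two factors can be sketched as follows; the exponents `p` and `q` and the exact clamping mirror the published formulation as I understand it, but treat the function and its signature as a hypothetical illustration rather than the reference implementation.

```python
import numpy as np

def seesaw_factors(counts, logits, labels, p=0.8, q=2.0):
    """Sketch of the Seesaw balancing factor S_ij = M_ij * C_ij.
    M_ij (mitigation) shrinks the negative penalty on a rarer
    class j coming from a positive sample of class i; C_ij
    (compensation) boosts it when the sample is currently more
    confident about class j than about its true label i."""
    counts = np.asarray(counts, dtype=float)
    # Mitigation: (N_j / N_i)^p when class j is rarer, else 1.
    ratio = counts[None, :] / counts[:, None]     # ratio[i, j] = N_j / N_i
    m = np.where(ratio < 1.0, ratio ** p, 1.0)
    probs = np.exp(logits - logits.max(1, keepdims=True))
    probs /= probs.sum(1, keepdims=True)
    s = np.ones_like(np.asarray(logits, dtype=float))
    for k, i in enumerate(labels):
        # Compensation: (p_j / p_i)^q for classes j currently
        # scored above the true class i, else 1.
        c = np.where(probs[k] > probs[k, i],
                     (probs[k] / probs[k, i]) ** q, 1.0)
        s[k] = m[i] * c
        s[k, i] = 1.0          # the positive class itself is unscaled
    return s
```

For a head-class sample that is classified correctly, the factor on a tail class falls below 1 (penalty mitigated); once the sample is misclassified toward that tail class, the compensation term pushes the factor back above 1.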

GeneralIntroduced 20003 papers

PipeDream-2BW

PipeDream-2BW is an asynchronous pipeline parallel method that supports memory-efficient pipeline parallelism, a hybrid form of parallelism that combines data and model parallelism with input pipelining. PipeDream-2BW uses a novel pipelining and weight gradient coalescing strategy, combined with the double buffering of weights, to ensure high throughput, low memory footprint, and weight update semantics similar to data parallelism. In addition, PipeDream-2BW automatically partitions the model over the available hardware resources, while respecting hardware constraints such as memory capacities of accelerators, and topologies and bandwidths of interconnects. PipeDream-2BW also determines when to employ existing memory-savings techniques, such as activation recomputation, that trade off extra computation for lower memory footprint. Its two main features, the double-buffered weight update (2BW) scheme and flush mechanisms, ensure high throughput. PipeDream-2BW splits models into stages over multiple workers, and each stage is replicated an equal number of times (with data-parallel updates across replicas of the same stage). Such parallel pipelines work well for models where each layer is repeated a fixed number of times (e.g., transformer models).
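A toy single-stage sketch of the 2BW update semantics (no actual pipelining or communication): coalesced gradients are computed against a weight version that is one update stale, so only two weight versions ever need to be buffered. The function is purely illustrative.

```python
def train_2bw(w, grad_fn, lr, steps):
    """Toy sketch of 2BW weight-update semantics on one stage:
    in-flight microbatches compute gradients with the previous
    weight version while the latest version is being updated,
    so a double buffer of two versions suffices."""
    versions = [w, w]                    # [previous, current]
    for _ in range(steps):
        g = grad_fn(versions[0])         # gradient on the stale version
        new_w = versions[1] - lr * g     # update the latest version
        versions = [versions[1], new_w]  # retire the oldest buffer
    return versions[1]

# Usage: minimize f(w) = w^2 (gradient 2w) with 1-stale updates.
w_final = train_2bw(1.0, lambda w: 2.0 * w, lr=0.1, steps=50)
```

Despite the one-step staleness, the iteration still converges here; the paper's point is that this bounded staleness buys pipeline throughput while keeping updates close to data-parallel semantics.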

GeneralIntroduced 20003 papers

Branch attention

Branch attention can be seen as a dynamic branch selection mechanism that decides which branch to pay attention to; it is used with multi-branch structures.
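A minimal sketch of the selection mechanism in the style of selective-kernel attention, with hypothetical names: fuse the branch outputs into one descriptor, score each branch with a small gate, and recombine with softmax weights.

```python
import numpy as np

def branch_attention(branch_outputs, w_gate):
    """Branch attention sketch: softmax weights over k branches,
    computed from a fused descriptor of all branch outputs."""
    stacked = np.stack(branch_outputs)   # (k, c) branch features
    pooled = stacked.sum(axis=0)         # fused descriptor: (c,)
    scores = w_gate @ pooled             # one score per branch: (k,)
    e = np.exp(scores - scores.max())
    weights = e / e.sum()                # softmax over branches
    return weights @ stacked             # weighted combination: (c,)

# Usage: dynamically mix two branch outputs of width 4.
out = branch_attention(
    [np.array([1.0, 0.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0, 0.0])],
    w_gate=np.zeros((2, 4)))             # zero gate -> uniform mix
```

Because the weights depend on the input-derived descriptor, the network selects branches dynamically per input rather than with fixed mixing coefficients.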

GeneralIntroduced 20003 papers
Page 9 of 110