Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Methods

5,489 machine learning methods and techniques

All · Audio · Computer Vision · General · Graphs · Natural Language Processing · Reinforcement Learning · Sequential

Neural adjoint

Neural adjoint method

The NA method can be divided into two steps: (i) training a neural network approximation of $f$, and (ii) inference of $\hat{x}$. Step (i) is conventional and involves training a generic neural network on a dataset of input/output pairs from the simulator, denoted $D$, resulting in $\hat{f}$, an approximation of the forward model. In step (ii), the goal is to use $\partial \hat{f}/\partial x$ to gradually adjust $x$ so that it achieves a desired output of the forward model, $y$. This is similar to many classical inverse modeling approaches, such as the popular Adjoint method [8, 9]. For many practical inverse problems, however, obtaining $\partial f/\partial x$ requires significant expertise and/or effort, making these approaches challenging. Crucially, $\hat{f}$ from step (i) provides a closed-form differentiable expression for the simulator, from which it is trivial to compute $\partial \hat{f}/\partial x$; furthermore, modern deep learning software packages can efficiently estimate gradients given a loss function $\mathcal{L}$. More formally, let $y$ be the target output and $\hat{x}_i$ the current estimate of the solution, where $i$ indexes each solution obtained in an iterative gradient-based estimation procedure. Then $\hat{x}_{i+1}$ is computed by a gradient step on the loss with respect to the input: $\hat{x}_{i+1} = \hat{x}_i - \alpha\, \partial \mathcal{L}/\partial x\,\big|_{x=\hat{x}_i}$, where $\alpha$ is a step size.
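
As a concrete illustration, here is a minimal PyTorch sketch of step (ii), assuming a small MLP stands in for the trained surrogate $\hat{f}$; the architecture, shapes, and hyperparameters are illustrative, not those of the paper:

```python
import torch

# Step (i) is assumed done: f_hat is a trained surrogate of the simulator.
f_hat = torch.nn.Sequential(
    torch.nn.Linear(4, 64), torch.nn.ReLU(), torch.nn.Linear(64, 2)
)
f_hat.eval()
for p in f_hat.parameters():
    p.requires_grad_(False)          # only the input x is optimized in step (ii)

y_target = torch.tensor([0.3, -1.2])          # desired forward-model output y
x_hat = torch.randn(4, requires_grad=True)    # initial guess for the solution
opt = torch.optim.Adam([x_hat], lr=1e-2)

for _ in range(1000):
    opt.zero_grad()
    loss = torch.nn.functional.mse_loss(f_hat(x_hat), y_target)
    loss.backward()                  # autograd supplies dL/dx through f_hat
    opt.step()                       # gradient step on x_hat, not on weights
```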

General · Introduced 2000 · 3 papers

PyTorch DDP

PyTorch DDP (Distributed Data Parallel) is a distributed data parallel implementation for PyTorch. To guarantee mathematical equivalence, all replicas start from the same initial values for model parameters and synchronize gradients to keep parameters consistent across training iterations. To minimize intrusiveness, the implementation exposes the same forward API as the user model, allowing applications to seamlessly replace occurrences of a user model with the distributed data parallel model object with no additional code changes. Several techniques are integrated into the design to deliver high-performance training, including bucketing gradients, overlapping communication with computation, and skipping gradient synchronization.
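
A minimal single-node sketch of the DDP workflow, typically launched with `torchrun`, which sets the required environment variables; the toy model and hyperparameters are illustrative:

```python
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    dist.init_process_group(backend="gloo")  # use "nccl" for multi-GPU training

    model = torch.nn.Linear(10, 1)
    ddp_model = DDP(model)  # exposes the same forward API as the wrapped model

    opt = torch.optim.SGD(ddp_model.parameters(), lr=0.1)
    for _ in range(3):
        opt.zero_grad()
        out = ddp_model(torch.randn(8, 10))   # each replica sees its own shard
        loss = out.square().mean()
        loss.backward()  # gradients are bucketed and all-reduced here,
                         # overlapping communication with the backward pass
        opt.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```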

General · Introduced 2000 · 3 papers

reSGLD

Replica exchange stochastic gradient Langevin Dynamics

reSGLD simulates a high-temperature particle for exploration and a low-temperature particle for exploitation, and allows the two to swap. A correction term is included in the swapping rate to avoid the bias introduced by noisy energy estimates.
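
A minimal NumPy sketch on a toy quadratic energy; the constant `correction` is a stand-in for the paper's variance-based bias correction, whose exact form (and the exact swapping rule) is given in the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def U(theta):        # energy; a noisy estimate in the mini-batch setting
    return 0.5 * float(theta @ theta)

def grad_U(theta):   # its (stochastic) gradient
    return theta

def sgld_step(theta, tau, eta=1e-3):
    # Langevin update: theta - eta * grad(U) + sqrt(2 * eta * tau) * noise
    return theta - eta * grad_U(theta) + np.sqrt(2 * eta * tau) * rng.normal(size=theta.shape)

tau_low, tau_high = 0.01, 1.0   # exploitation / exploration temperatures
correction = 0.1                # stand-in for the bias-correction term
low, high = rng.normal(size=2), rng.normal(size=2)

for step in range(10_000):
    low, high = sgld_step(low, tau_low), sgld_step(high, tau_high)
    # Swap test: the correction compensates for the bias that noisy
    # energy estimates introduce into the acceptance ratio.
    log_s = (1 / tau_low - 1 / tau_high) * (U(low) - U(high) - correction)
    if np.log(rng.uniform()) < min(0.0, log_s):
        low, high = high, low
```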

General · Introduced 2000 · 3 papers

3D SA

3 Dimensional Soft Attention

General · Introduced 2000 · 3 papers

E-swish

General · Introduced 2000 · 2 papers

L2M

Learning to Match

L2M is a learning algorithm that can work for most cross-domain distribution matching tasks. It automatically learns the cross-domain distribution matching without relying on hand-crafted priors on the matching loss. Instead, L2M reduces the inductive bias by using a meta-network to learn the distribution matching loss in a data-driven way.

General · Introduced 2000 · 2 papers

ProxyAnchorLoss

Proxy Anchor Loss for Deep Metric Learning

General · Introduced 2000 · 2 papers

Targeted Dropout

General · Introduced 2000 · 2 papers

AdEMAMix

Adaptive EMA Mixture

General · Introduced 2000 · 2 papers

Adversarial Soft Advantage Fitting (ASAF)

General · Introduced 2000 · 2 papers

DSelect-k

DSelect-k is a continuously differentiable and sparse gate for Mixture-of-Experts (MoE), based on a novel binary encoding formulation. Given a user-specified parameter $k$, the gate selects at most $k$ out of the $n$ experts. The gate can be trained using first-order methods, such as stochastic gradient descent, and offers explicit control over the number of experts to select. This explicit control over sparsity leads to a cardinality-constrained optimization problem, which is computationally challenging. To circumvent this challenge, the authors use an unconstrained reformulation that is equivalent to the original problem. The reformulated problem uses a binary encoding scheme to implicitly enforce the cardinality constraint. By carefully smoothing the binary encoding variables, the reformulated problem can be effectively optimized using first-order methods such as SGD. The motivation for this method is that existing sparse gates, such as Top-k, are not smooth; the lack of smoothness can lead to convergence and statistical performance issues when training with gradient-based methods.
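
The binary-encoding idea can be sketched as follows, assuming a static (input-independent) gate over a power-of-two number of experts; the class and parameter names are hypothetical, and the paper's per-example conditioning and regularizers are omitted:

```python
import torch

def smooth_step(t, gamma=1.0):
    """Cubic smooth-step: 0 for t <= -gamma/2, 1 for t >= gamma/2, smooth between."""
    out = -2 / gamma**3 * t**3 + 3 / (2 * gamma) * t + 0.5
    return torch.where(t <= -gamma / 2, torch.zeros_like(t),
                       torch.where(t >= gamma / 2, torch.ones_like(t), out))

class DSelectKGate(torch.nn.Module):
    # Hypothetical minimal variant; assumes n_experts is a power of two.
    def __init__(self, n_experts, k):
        super().__init__()
        self.m = (n_experts - 1).bit_length()          # bits per expert id
        self.z = torch.nn.Parameter(torch.randn(k, self.m) * 0.1)  # relaxed codes
        self.w = torch.nn.Parameter(torch.zeros(k))    # weights over the k selectors
        codes = [[(i >> j) & 1 for j in range(self.m)] for i in range(n_experts)]
        self.register_buffer("codes", torch.tensor(codes, dtype=torch.float32))

    def forward(self):
        s = smooth_step(self.z)                        # (k, m) relaxed bits in [0, 1]
        # Selector r assigns expert i the product of per-bit match probabilities.
        probs = torch.prod(
            self.codes.unsqueeze(0) * s.unsqueeze(1)
            + (1 - self.codes.unsqueeze(0)) * (1 - s.unsqueeze(1)),
            dim=-1,
        )                                              # (k, n_experts)
        return torch.softmax(self.w, dim=0) @ probs    # sparse in the limit

gate = DSelectKGate(n_experts=8, k=2)
weights = gate()   # differentiable; converges to at most k nonzero entries
```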

General · Introduced 2000 · 2 papers

MFEC

Model-Free Episodic Control

Non-parametric approximation of Q-values by storing all visited states and doing inference through k-Nearest Neighbors.
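
A minimal sketch of the idea, with illustrative names; the full method additionally keeps the maximum return per state, works on learned or random-projection embeddings, and bounds the buffer size:

```python
import numpy as np

class MFEC:
    """Minimal episodic-control sketch: per-action memory of (state, return)."""
    def __init__(self, n_actions, k=5):
        self.k = k
        self.memory = [([], []) for _ in range(n_actions)]  # states, returns

    def q_value(self, state, action):
        states, returns = self.memory[action]
        if not states:
            return float("inf")  # optimistic default encourages trying the action
        d = np.linalg.norm(np.asarray(states) - state, axis=1)
        if d.min() == 0.0:
            return returns[int(d.argmin())]          # exact match: stored value
        nn = d.argsort()[: self.k]
        return float(np.mean([returns[i] for i in nn]))  # kNN average otherwise

    def act(self, state):
        return int(np.argmax([self.q_value(state, a)
                              for a in range(len(self.memory))]))

    def update(self, state, action, episodic_return):
        # Store the visited state with its discounted episodic return.
        self.memory[action][0].append(np.asarray(state))
        self.memory[action][1].append(episodic_return)
```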

General · Introduced 2000 · 2 papers

Channel & Spatial attention

Channel & spatial attention combines the advantages of channel attention and spatial attention. It adaptively selects both important objects and regions.

General · Introduced 2000 · 2 papers

MPSO

Motion-Encoded Particle Swarm Optimization

General · Introduced 2000 · 2 papers

ZLPR Loss

Zero-bounded Log-sum-exp & Pairwise Rank-based Loss

General · Introduced 2000 · 2 papers

Fraternal Dropout

Fraternal Dropout is a regularization method for recurrent neural networks that trains two identical copies of an RNN (with shared parameters) with different dropout masks while minimizing the difference between their (pre-softmax) predictions. This encourages the representations of the RNNs to be invariant to the dropout mask, making them more robust.
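
A minimal sketch of the objective, using a feed-forward network with dropout for brevity (the paper targets RNNs); `kappa` and all shapes are illustrative:

```python
import torch

# Two forward passes of the same network sample different dropout masks;
# the kappa-weighted penalty ties their pre-softmax outputs together.
model = torch.nn.Sequential(
    torch.nn.Linear(32, 64), torch.nn.ReLU(),
    torch.nn.Dropout(p=0.3), torch.nn.Linear(64, 10)
)
criterion = torch.nn.CrossEntropyLoss()
kappa = 0.1  # regularization strength (illustrative value)

x = torch.randn(16, 32)
y = torch.randint(0, 10, (16,))

logits_a = model(x)  # dropout draws a fresh mask on each call
logits_b = model(x)  # same parameters, different mask
loss = (0.5 * (criterion(logits_a, y) + criterion(logits_b, y))
        + kappa * (logits_a - logits_b).pow(2).mean())
loss.backward()
```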

General · Introduced 2000 · 2 papers

Ternary Weight Splitting

Ternary Weight Splitting is the ternarization approach used in BinaryBERT, which exploits the flatness of the ternary loss landscape as an optimization proxy for the binary model. A half-sized ternary BERT is first trained to convergence, and then both the latent full-precision weights and the quantized weights are split into their binary counterparts via the TWS operator. To inherit the performance of the ternary model after splitting, the TWS operator requires splitting equivalency (i.e., the same output given the same input): the split binary weights must reproduce the ternary model's output. While the solution to this constraint is not unique, the latent full-precision weights after splitting are constrained to satisfy it. See the paper for more details.

General · Introduced 2000 · 2 papers

RMN

Residual Masking Network

It uses a segmentation network to refine feature maps, enabling the network to focus on relevant information to make correct decisions.

General · Introduced 2000 · 2 papers

SIFA

Synergistic Image and Feature Alignment

Synergistic Image and Feature Alignment is an unsupervised domain adaptation framework that conducts synergistic alignment of domains from both image and feature perspectives. In SIFA, we simultaneously transform the appearance of images across domains and enhance domain-invariance of the extracted features by leveraging adversarial learning in multiple aspects and with a deeply supervised mechanism. The feature encoder is shared between both adaptive perspectives to leverage their mutual benefits via end-to-end learning.

General · Introduced 2000 · 2 papers

MSGAN

Multi-source Sentiment Generative Adversarial Network

Multi-source Sentiment Generative Adversarial Network is a multi-source domain adaptation (MDA) method for visual sentiment classification. It is composed of three pipelines, i.e., image reconstruction, image translation, and cycle-reconstruction. To handle data from multiple source domains, it learns to find a unified sentiment latent space where data from both the source and target domains share a similar distribution. This is achieved via cycle consistent adversarial learning in an end-to-end manner. Notably, thanks to the unified sentiment latent space, MSGAN requires a single classification network to handle data from different source domains.

General · Introduced 2000 · 2 papers

AHAF

Adaptive Hybrid Activation Function

Trainable activation function as a sigmoid-based generalization of ReLU, Swish and SiLU.

General · Introduced 2000 · 2 papers

Auto-Classifier

General · Introduced 2000 · 2 papers

IMGEP

Intrinsically Motivated Goal Exploration Processes

Population-based intrinsically motivated goal exploration algorithms applied to real world robot learning of complex skills like tool use.

General · Introduced 2000 · 2 papers

AutoML-Zero

AutoML-Zero is an AutoML technique that aims to search a fine-grained space simultaneously for the model, optimization procedure, initialization, and so on, permitting much less human design and even allowing the discovery of non-neural-network algorithms. It represents ML algorithms as computer programs comprised of three component functions, Setup, Predict, and Learn, that perform initialization, prediction, and learning. The instructions in these functions apply basic mathematical operations on a small memory. The operation and memory addresses used by each instruction are free parameters in the search space, as is the size of the component functions. While this reduces expert design, the consequent sparsity means that random search cannot make enough progress. To overcome this difficulty, the authors use small proxy tasks and migration techniques to build an optimized infrastructure capable of searching through 10,000 models/second/CPU core. Evolutionary methods can find solutions in the AutoML-Zero search space despite its enormous size and sparsity. The authors show that by randomly modifying the programs and periodically selecting the best-performing ones on given tasks/datasets, AutoML-Zero discovers reasonable algorithms. Starting from empty programs and using data labeled by “teacher” neural networks with random weights, they demonstrate that evolution can discover neural networks trained by gradient descent. Following this, they minimize bias toward known algorithms by switching to binary classification tasks extracted from CIFAR-10 and allowing a larger set of possible operations. This discovers interesting techniques like multiplicative interactions, normalized gradients, and weight averaging. Finally, they show it is possible for evolution to adapt the algorithm to the type of task provided: for example, dropout-like operations emerge when the task needs regularization, and learning-rate decay appears when the task requires faster convergence.

General · Introduced 2000 · 2 papers

Cosine Normalization

Multi-layer neural networks traditionally use dot products between the output vector of the previous layer and the incoming weight vector as the input to the activation function. The result of a dot product is unbounded. To bound the dot product and decrease the variance, Cosine Normalization uses cosine similarity or centered cosine similarity (the Pearson correlation coefficient) instead of dot products in neural networks. Using cosine normalization, the output of a hidden unit is computed by $o = f(net_{norm}) = f\left(\frac{\vec{w} \cdot \vec{x}}{\left|\vec{w}\right| \left|\vec{x}\right|}\right)$, where $net_{norm}$ is the normalized pre-activation, $\vec{w}$ is the incoming weight vector, $\vec{x}$ is the input vector, $(\cdot)$ indicates the dot product, and $f$ is a nonlinear activation function. Cosine normalization bounds the pre-activation between -1 and 1.
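
A minimal PyTorch sketch of a cosine-normalized linear layer; the class name and epsilon are illustrative:

```python
import torch

class CosineLinear(torch.nn.Module):
    """Linear layer using cosine similarity instead of a raw dot product."""
    def __init__(self, in_features, out_features, eps=1e-8):
        super().__init__()
        self.weight = torch.nn.Parameter(torch.randn(out_features, in_features))
        self.eps = eps

    def forward(self, x):
        w = self.weight / (self.weight.norm(dim=1, keepdim=True) + self.eps)
        x = x / (x.norm(dim=1, keepdim=True) + self.eps)
        return x @ w.t()   # pre-activations bounded in [-1, 1]

layer = CosineLinear(128, 10)
out = layer(torch.randn(4, 128))  # every entry lies in [-1, 1]
```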

General · Introduced 2000 · 2 papers

AdaFisher

Adaptive Second Order Optimization via Fisher Information

AdaFisher – an adaptive second-order optimizer that leverages a block-diagonal approximation to the Fisher information matrix for adaptive gradient preconditioning.

General · Introduced 2000 · 2 papers

Concurrent Spatial and Channel Squeeze & Excitation

Concurrent Spatial and Channel Squeeze & Excitation (scSE)

Combines the channel attention of the widely known spatial squeeze and channel excitation (SE) block and the spatial attention of the channel squeeze and spatial excitation (sSE) block to build a spatial and channel attention mechanism for image segmentation tasks.
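
A minimal PyTorch sketch of the block, merging the two recalibrated maps by addition (one of the aggregation strategies considered in the paper); the class name and reduction ratio are illustrative:

```python
import torch

class SCSE(torch.nn.Module):
    """Concurrent spatial (sSE) and channel (cSE) squeeze & excitation."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.cse = torch.nn.Sequential(           # channel excitation
            torch.nn.AdaptiveAvgPool2d(1),
            torch.nn.Conv2d(channels, channels // reduction, 1),
            torch.nn.ReLU(inplace=True),
            torch.nn.Conv2d(channels // reduction, channels, 1),
            torch.nn.Sigmoid(),
        )
        self.sse = torch.nn.Sequential(           # spatial excitation
            torch.nn.Conv2d(channels, 1, 1),
            torch.nn.Sigmoid(),
        )

    def forward(self, x):
        return x * self.cse(x) + x * self.sse(x)  # additive aggregation

block = SCSE(64)
y = block(torch.randn(2, 64, 32, 32))  # same shape as the input
```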

General · Introduced 2000 · 2 papers

SCA-CNN

Spatial and Channel-wise Attention-based Convolutional Neural Network

As CNN features are naturally spatial, channel-wise, and multi-layer, Chen et al. proposed a novel spatial and channel-wise attention-based convolutional neural network (SCA-CNN). It was designed for the task of image captioning and uses an encoder-decoder framework, where a CNN first encodes an input image into a vector and then an LSTM decodes the vector into a sequence of words. Given an input feature map $X$ and the previous time-step LSTM hidden state $h_{t-1}$, a spatial attention mechanism pays more attention to the semantically useful regions, guided by the LSTM hidden state $h_{t-1}$. The spatial attention model is: \begin{align} a(h_{t-1}, X) &= \tanh\big(\mathrm{Conv}_1^{1 \times 1}(X) \oplus W_1 h_{t-1}\big) \end{align} \begin{align} \Phi_s(h_{t-1}, X) &= \mathrm{Softmax}\big(\mathrm{Conv}_2^{1 \times 1}(a(h_{t-1}, X))\big) \end{align} where $\oplus$ represents the addition of a matrix and a vector. Similarly, channel-wise attention aggregates global information first, and then computes a channel-wise attention weight vector using the hidden state $h_{t-1}$: \begin{align} b(h_{t-1}, X) &= \tanh\big((W_2\,\mathrm{GAP}(X) + b_2) \oplus W_1 h_{t-1}\big) \end{align} \begin{align} \Phi_c(h_{t-1}, X) &= \mathrm{Softmax}\big(W_3\, b(h_{t-1}, X) + b_3\big) \end{align} Overall, the SCA mechanism can be written in one of two ways. If channel-wise attention is applied before spatial attention, we have \begin{align} Y &= f\big(X, \Phi_s(h_{t-1}, X\,\Phi_c(h_{t-1}, X)), \Phi_c(h_{t-1}, X)\big) \end{align} and if spatial attention comes first: \begin{align} Y &= f\big(X, \Phi_s(h_{t-1}, X), \Phi_c(h_{t-1}, X\,\Phi_s(h_{t-1}, X))\big) \end{align} where $f(\cdot)$ denotes the modulate function, which takes the feature map $X$ and the attention maps as input and outputs the modulated feature map $Y$. Unlike previous attention mechanisms, which consider each image region equally and use global spatial information to tell the network where to focus, SCA-CNN leverages the semantic vector to produce the spatial attention map as well as the channel-wise attention weight vector. Beyond being a powerful attention model, SCA-CNN also provides a better understanding of where and what the model focuses on during sentence generation.

General · Introduced 2000 · 2 papers

PAU

Padé Activation Units

Parametrized learnable activation function, based on the Padé approximant.
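
A minimal sketch of the "safe" PAU variant, whose absolute-value denominator avoids poles; the degrees and initialization are illustrative:

```python
import torch

class PAU(torch.nn.Module):
    """Padé Activation Unit sketch: a learnable rational function
    P(x) / (1 + |Q(x)|) with numerator degree m and denominator degree n."""
    def __init__(self, m=5, n=4):
        super().__init__()
        self.a = torch.nn.Parameter(torch.randn(m + 1) * 0.1)  # numerator coeffs
        self.b = torch.nn.Parameter(torch.randn(n) * 0.1)      # denominator coeffs

    def forward(self, x):
        num = sum(a_j * x**j for j, a_j in enumerate(self.a))
        den = 1 + torch.abs(sum(b_k * x**(k + 1) for k, b_k in enumerate(self.b)))
        return num / den   # pole-free because the denominator is >= 1

act = PAU()
y = act(torch.randn(8))  # drop-in replacement for a fixed activation
```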

General · Introduced 2000 · 2 papers

EMEA

Entropy Minimized Ensemble of Adapters

Entropy Minimized Ensemble of Adapters, or EMEA, is a method that optimizes the ensemble weights of pretrained language adapters for each test sentence by minimizing the entropy of its predictions. The intuition behind the method is that good adapter weights for a test input $x$ should make the model more confident in its prediction for $x$; that is, they should lead to lower model entropy on that input.
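
A minimal sketch of the test-time weight optimization; for brevity it ensembles the final logits of a single input, whereas the method ensembles adapter outputs inside the model, and all names and hyperparameters are illustrative:

```python
import torch

def emea_weights(adapter_logits, steps=10, lr=1.0):
    """Optimize ensemble weights over per-adapter logits by minimizing
    the entropy of the mixed prediction for one test input.
    adapter_logits: tensor of shape (n_adapters, n_classes)."""
    alpha = torch.zeros(adapter_logits.size(0), requires_grad=True)
    opt = torch.optim.SGD([alpha], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        mixed = torch.softmax(alpha, 0) @ adapter_logits   # weighted ensemble
        p = torch.softmax(mixed, dim=-1)
        entropy = -(p * torch.log(p + 1e-12)).sum()
        entropy.backward()   # lower entropy = more confident prediction
        opt.step()
    return torch.softmax(alpha.detach(), 0)

w = emea_weights(torch.randn(3, 5))  # weights over 3 language adapters
```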

General · Introduced 2000 · 2 papers

Global Sub-Sampled Attention

Global Sub-Sampled Attention, or GSA, is an attention mechanism used in the Twins-SVT architecture. A single representative is used to summarize the key information for each of the $\frac{H}{k_1} \times \frac{W}{k_2}$ sub-windows, and the representative is used to communicate with other sub-windows (serving as the key in self-attention), which reduces the cost to $\mathcal{O}\!\left(\frac{H^2 W^2 d}{k_1 k_2}\right)$. This is essentially equivalent to using the sub-sampled feature maps as the key in attention operations, and thus it is termed global sub-sampled attention (GSA). If LSA and GSA are used alternately, like separable convolutions (depth-wise + point-wise), the total computation cost is $\mathcal{O}\!\left(HWd\left(k_1 k_2 + \frac{HW}{k_1 k_2}\right)\right)$, whose minimum is obtained when $k_1 k_2 = \sqrt{HW}$. Note that $H = W = 224$ is popular in classification. Without loss of generality, square sub-windows are used, i.e., $k_1 = k_2$, so $k_1 = k_2 = 15$ is close to the global minimum for $H = W = 224$. However, the network is designed to include several stages with variable resolutions; stage 1 has feature maps of $56 \times 56$, for which the minimum is obtained at $k_1 = k_2 \approx 7.5$. Theoretically, the optimal $k_1$ and $k_2$ could be calibrated for each stage; for simplicity, $k_1 = k_2 = 7$ is used everywhere. For stages with lower resolutions, the summarizing window size of GSA is controlled to avoid generating too few keys: specifically, sizes of 4, 2, and 1 are used for the last three stages respectively.

General · Introduced 2000 · 2 papers

SSTDA

Self-Supervised Temporal Domain Adaptation

Self-Supervised Temporal Domain Adaptation (SSTDA) is a method for action segmentation with self-supervised temporal domain adaptation. It contains two self-supervised auxiliary tasks (binary and sequential domain prediction) to jointly align cross-domain feature spaces embedded with local and global temporal dynamics.

General · Introduced 2000 · 2 papers

Recurrent Entity Network

The Recurrent Entity Network is equipped with a dynamic long-term memory which allows it to maintain and update a representation of the state of the world as it receives new data. For language understanding tasks, it can reason on-the-fly as it reads text, not just when it is required to answer a question or respond, as is the case for a Memory Network. Like a Neural Turing Machine or Differentiable Neural Computer, it maintains a fixed-size memory and can learn to perform location- and content-based read and write operations. However, unlike those models it has a simple parallel architecture in which several memory locations can be updated simultaneously. The model consists of a fixed number of dynamic memory cells, each containing a vector key $w_j$ and a vector value (or content) $h_j$. Each cell is associated with its own processor, a simple gated recurrent network that may update the cell value given an input. If each cell learns to represent a concept or entity in the world, one can imagine a gating mechanism that, based on the key and content of the memory cells, will only modify the cells that concern the entities mentioned in the input. There is no direct interaction between the memory cells, hence the system can be seen as multiple identical processors functioning in parallel, with distributed local memory. The sharing of these parameters reflects an invariance of these laws across object instances, similarly to how the weight-tying scheme in a CNN reflects an invariance of image statistics across locations. A cell's hidden state is updated only when new information relevant to its concept is received, and remains otherwise unchanged. The keys used in the addressing/gating mechanism also correspond to concepts or entities, but are modified only during learning, not during inference.

General · Introduced 2000 · 2 papers

Mesh-TensorFlow

Mesh-TensorFlow is a language for specifying a general class of distributed tensor computations. Where data-parallelism can be viewed as splitting tensors and operations along the "batch" dimension, in Mesh-TensorFlow, the user can specify any tensor dimensions to be split across any dimensions of a multi-dimensional mesh of processors. A Mesh-TensorFlow graph compiles into a SPMD program consisting of parallel operations coupled with collective communication primitives such as Allreduce.

General · Introduced 2000 · 2 papers

SAFRAN

SAFRAN - Scalable and fast non-redundant rule application

SAFRAN is a rule application framework which aggregates rules through a scalable clustering algorithm.

General · Introduced 2000 · 2 papers

Spatially Separable Self-Attention

Spatially Separable Self-Attention, or SSSA, is an attention module used in the Twins-SVT architecture that aims to reduce the computational complexity of vision transformers for dense prediction tasks (given high-resolution inputs). SSSA is composed of locally-grouped self-attention (LSA) and global sub-sampled attention (GSA). Formally, spatially separable self-attention can be written as: \begin{align} \hat{z}_{ij}^{l} &= \mathrm{LSA}\big(\mathrm{LayerNorm}(z_{ij}^{l-1})\big) + z_{ij}^{l-1} \end{align} \begin{align} z_{ij}^{l} &= \mathrm{FFN}\big(\mathrm{LayerNorm}(\hat{z}_{ij}^{l})\big) + \hat{z}_{ij}^{l} \end{align} \begin{align} \hat{z}^{l+1} &= \mathrm{GSA}\big(\mathrm{LayerNorm}(z^{l})\big) + z^{l} \end{align} \begin{align} z^{l+1} &= \mathrm{FFN}\big(\mathrm{LayerNorm}(\hat{z}^{l+1})\big) + \hat{z}^{l+1} \end{align} where LSA means locally-grouped self-attention within a sub-window ($\hat{z}_{ij}^{l}$ denoting the sub-window at position $(i, j)$ in layer $l$), and GSA is the global sub-sampled attention that interacts with the representative keys (generated by the sub-sampling function) from each sub-window. Both LSA and GSA have multiple heads, as in standard self-attention.

General · Introduced 2000 · 2 papers

AdaptiveBins

Adaptive Bins

General · Introduced 2000 · 2 papers

Crossbow

Crossbow is a single-server multi-GPU system for training deep learning models that enables users to freely choose their preferred batch size—however small—while scaling to multiple GPUs. Crossbow uses many parallel model replicas and avoids reduced statistical efficiency through a new synchronous training method. SMA, a synchronous variant of model averaging, is used in which replicas independently explore the solution space with gradient descent, but adjust their search synchronously based on the trajectory of a globally-consistent average model.

General · Introduced 2000 · 2 papers

CELU

Continuously Differentiable Exponential Linear Units

Exponential Linear Units (ELUs) are a useful rectifier for constructing deep learning architectures, as they may speed up and otherwise improve learning by virtue of not having vanishing gradients and by having mean activations near zero. However, the ELU activation as parametrized in [1] is not continuously differentiable with respect to its input when the shape parameter alpha is not equal to 1. CELU is an alternative parametrization which is C1 continuous for all values of alpha, making the rectifier easier to reason about and making alpha easier to tune. This alternative parametrization has several other useful properties that the original parametrization of ELU does not: 1) its derivative with respect to x is bounded, 2) it contains both the linear transfer function and ReLU as special cases, and 3) it is scale-similar with respect to alpha.
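
A minimal sketch of the activation itself, checked against PyTorch's built-in implementation:

```python
import torch

def celu(x, alpha=1.0):
    """CELU(x) = max(0, x) + min(0, alpha * (exp(x / alpha) - 1));
    continuously differentiable for every alpha > 0."""
    return (torch.clamp(x, min=0)
            + torch.clamp(alpha * (torch.exp(x / alpha) - 1), max=0))

x = torch.linspace(-3, 3, 7)
assert torch.allclose(celu(x, alpha=0.5),
                      torch.nn.functional.celu(x, alpha=0.5))
```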

General · Introduced 2000 · 2 papers

End-To-End Memory Network

An End-to-End Memory Network is a neural network with a recurrent attention model over a possibly large external memory. The architecture is a form of Memory Network, but unlike the model in that work, it is trained end-to-end, and hence requires significantly less supervision during training. It can also be seen as an extension of RNNsearch to the case where multiple computational steps (hops) are performed per output symbol. The model takes a discrete set of inputs $x_1, \dots, x_n$ that are to be stored in the memory, a query $q$, and outputs an answer $a$. Each of the $x_i$, $q$, and $a$ contains symbols coming from a dictionary with $V$ words. The model writes all $x$ to the memory up to a fixed buffer size, and then finds a continuous representation for the $x$ and $q$. The continuous representation is then processed via multiple hops to output $a$.
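
A single-hop sketch of the memory read, with illustrative names, shapes, and bag-of-words embeddings:

```python
import torch

# Sentences x_i and query q are embedded; match scores give attention
# weights, and the weighted sum of output embeddings is added to the query.
V, d, n = 100, 20, 8                 # vocab size, embedding dim, memory size
A = torch.nn.Embedding(V, d)         # input (key) memory embedding
C = torch.nn.Embedding(V, d)         # output (value) memory embedding
B = torch.nn.Embedding(V, d)         # query embedding
W = torch.nn.Linear(d, V)            # final answer projection

x = torch.randint(0, V, (n, 6))      # n memory sentences of 6 tokens each
q = torch.randint(0, V, (1, 6))      # the query

m = A(x).sum(dim=1)                  # (n, d) memory keys (bag-of-words)
c = C(x).sum(dim=1)                  # (n, d) memory values
u = B(q).sum(dim=1)                  # (1, d) internal query state
p = torch.softmax(u @ m.t(), dim=-1) # (1, n) attention over memories
o = p @ c                            # (1, d) response vector
a = torch.softmax(W(o + u), dim=-1)  # predicted answer distribution
# Multiple hops stack this read, feeding u_{k+1} = u_k + o_k back in.
```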

General · Introduced 2000 · 2 papers

FRbE

Fuzzy Rank-based Ensemble

The motivation for this ensembling approach is to fully utilize the confidence factors generated by the base learners by mapping them onto non-linear functions. One of the mapped values signifies closeness to 1, and the other signifies deviation from 1, which overcomes a shortcoming of conventional ranking methods. The scores from the base learners are mapped onto two functions with different concavities to generate non-linear fuzzy ranks, and the two ranks are combined into a fused score that quantifies the total deviation from the expected output. The lower the deviation, the higher the confidence towards a particular class; the class with the lowest deviation value is assigned as the final class. Pre-trained CNN models serve as the base learners.

General · Introduced 2000 · 2 papers

Class Activation Guided Attention Mechanism

Class Activation Guided Attention Mechanism (CAGAM)

CAGAM is a form of spatial attention mechanism that propagates attention from known context features to unknown context features, thereby enhancing the unknown context for relevant pattern discovery. Usually the known context feature is a class activation map (CAM).

General · Introduced 2000 · 2 papers

SkipInit

SkipInit is a method that aims to allow normalization-free training of neural networks by downscaling residual branches at initialization. This is achieved by including a learnable scalar multiplier at the end of each residual branch, initialized to zero. The method is motivated by theoretical findings that batch normalization downscales the hidden activations on the residual branch by a factor on the order of the square root of the network depth (at initialization). Therefore, as the depth of a residual network is increased, the residual blocks are increasingly dominated by the skip connection, which drives the functions computed by residual blocks closer to the identity, preserving signal propagation and ensuring well-behaved gradients. This leads to the proposed method, which achieves this property through an initialization strategy rather than a normalization strategy.
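
A minimal sketch of a normalization-free residual block with the SkipInit scalar; the branch architecture is illustrative:

```python
import torch

class SkipInitBlock(torch.nn.Module):
    """Residual block without normalization: the residual branch is scaled
    by a learnable scalar initialized to zero, so at initialization the
    block is exactly the identity."""
    def __init__(self, channels):
        super().__init__()
        self.branch = torch.nn.Sequential(
            torch.nn.Conv2d(channels, channels, 3, padding=1),
            torch.nn.ReLU(inplace=True),
            torch.nn.Conv2d(channels, channels, 3, padding=1),
        )
        self.alpha = torch.nn.Parameter(torch.zeros(1))  # the SkipInit scalar

    def forward(self, x):
        return x + self.alpha * self.branch(x)

block = SkipInitBlock(16)
x = torch.randn(2, 16, 8, 8)
assert torch.allclose(block(x), x)  # identity at initialization
```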

General · Introduced 2000 · 2 papers

LFME

Learning From Multiple Experts

Learning From Multiple Experts is a self-paced knowledge distillation framework that aggregates the knowledge from multiple 'Experts' to learn a unified student model. Specifically, the proposed framework involves two levels of adaptive learning schedules: Self-paced Expert Selection and Curriculum Instance Selection, so that the knowledge is adaptively transferred to the 'Student'. The self-paced expert selection automatically controls the impact of knowledge distillation from each expert, so that the learned student model will gradually acquire the knowledge from the experts, and finally exceed the expert. The curriculum instance selection, on the other hand, designs a curriculum for the unified model where the training samples are organized from easy to hard, so that the unified student model will receive a less challenging learning schedule, and gradually learns from easy to hard samples.

General · Introduced 2000 · 2 papers

CV-MIM

Contrastive Cross-View Mutual Information Maximization

CV-MIM, or Contrastive Cross-View Mutual Information Maximization, is a representation learning method to disentangle pose-dependent as well as view-dependent factors from 2D human poses. The method trains a network using cross-view mutual information maximization, which maximizes mutual information of the same pose performed from different viewpoints in a contrastive learning manner. It further utilizes two regularization terms to ensure disentanglement and smoothness of the learned representations.

General · Introduced 2000 · 2 papers

CCAC

Confidence Calibration with an Auxiliary Class

Confidence Calibration with an Auxiliary Class, or CCAC, is a post-hoc confidence calibration method for DNN classifiers on OOD datasets. The key feature of CCAC is an auxiliary class in the calibration model that separates mis-classified samples from correctly classified ones, thus effectively mitigating the target DNN being confidently wrong. The number of free parameters in CCAC can also be reduced to facilitate transfer to a new unseen dataset.

General · Introduced 2000 · 2 papers
Page 10 of 110