Papers With Code 2


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Chimera

General · Introduced 2021 · 16 papers
Source Paper

Description

Chimera is a pipeline model parallelism scheme that combines bidirectional pipelines to train large-scale models efficiently. The key idea is to combine two pipelines running in opposite directions (a down pipeline and an up pipeline).

Denote N as the number of micro-batches executed by each worker within a training iteration, D as the number of pipeline stages (depth), and P as the number of workers.

The figure shows an example with four pipeline stages (i.e., D = 4). Here we assume there are D micro-batches executed by each worker within a training iteration, namely N = D, which is the minimum needed to keep all stages active.

In the down pipeline, stage_0 through stage_3 are mapped to workers P_0 through P_3 linearly, while in the up pipeline the stages are mapped in the opposite order. The N micro-batches (assuming an even number) are partitioned equally between the two pipelines. Each pipeline schedules its N/2 micro-batches using the 1F1B strategy, as shown in the left part of the figure. Merging these two pipelines then yields the Chimera pipeline schedule. Given an even number of stages D (easily satisfied in practice), it is guaranteed that there is no conflict during merging, i.e., at most one micro-batch occupies any given time slot on each worker.
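The stage-to-worker mappings and micro-batch partitioning described above can be sketched as follows. This is an illustrative sketch, not the authors' code; the variable names (`down_map`, `up_map`, etc.) are made up for the example, and the scheduling of time slots is omitted.

```python
# Chimera's two pipeline mappings for D = 4 stages on D workers,
# with N = D micro-batches per training iteration.

D = 4                                  # pipeline stages (== workers here)
N = D                                  # micro-batches per iteration

# Down pipeline: stage_s runs on worker P_s (linear mapping).
down_map = {s: s for s in range(D)}
# Up pipeline: stages mapped in the opposite order, stage_s on P_{D-1-s}.
up_map = {s: D - 1 - s for s in range(D)}

# The N micro-batches are split evenly between the two directions;
# each half is scheduled with the 1F1B strategy within its pipeline.
down_micros = list(range(N // 2))      # first half on the down pipeline
up_micros = list(range(N // 2, N))     # second half on the up pipeline

# Every worker hosts exactly one stage from each direction, so each
# worker keeps the weights of two stages: stage w and stage D-1-w.
for w in range(D):
    down_stages = [s for s, wk in down_map.items() if wk == w]
    up_stages = [s for s, wk in up_map.items() if wk == w]
    assert down_stages == [w] and up_stages == [D - 1 - w]
```

A consequence of this mapping is that each worker stores two stages' weights, which is the memory cost Chimera pays for keeping more workers busy at the start and end of each iteration.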

Papers Using This Method

CHIMERA: A Knowledge Base of Idea Recombination in Scientific Literature (2025-05-27)
A CMOS Probabilistic Computing Chip With In-situ hardware Aware Learning (2025-04-18)
Chimera: A Block-Based Neural Architecture Search Framework for Event-Based Object Detection (2024-12-27)
Language model driven: a PROTAC generation pipeline with dual constraints of structure and property (2024-12-12)
Chimera: Improving Generalist Model with Domain-Specific Experts (2024-12-08)
Chimera: Accurate retrosynthesis prediction by ensembling models with diverse inductive biases (2024-12-06)
Chimera: Effectively Modeling Multivariate Time Series with 2-Dimensional State Space Models (2024-06-06)
Chimera: A Lossless Decoding Method for Accelerating Large Language Models Inference by Fusing all Tokens (2024-02-24)
Artificial Bee Colony optimization of Deep Convolutional Neural Networks in the context of Biomedical Imaging (2024-02-23)
Spoofing-Resilient LiDAR-GPS Factor Graph Localization with Chimera Authentication (2023-07-10)
Investigating the generative dynamics of energy-based neural networks (2023-05-11)
Chimera: A Hybrid Machine Learning Driven Multi-Objective Design Space Exploration Tool for FPGA High-Level Synthesis (2022-07-03)
Physics-inspired Ising Computing with Ring Oscillator Activated p-bits (2022-05-15)
Complex dynamics of a heterogeneous network of Hindmarsh-Rose neurons (2022-05-03)
Convex Non-negative Matrix Factorization Through Quantum Annealing (2022-03-28)
Chimera: Efficiently Training Large-Scale Neural Networks with Bidirectional Pipelines (2021-07-14)