Speech Separation on WSJ0-2mix
Metric: SI-SDRi (higher is better)
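For readers unfamiliar with the metric, the following is a minimal sketch of how SI-SDR and its improvement over the unprocessed mixture (SI-SDRi, in dB) are commonly computed for a single source. It is an illustration under stated assumptions, not the evaluation code behind any entry on this page: the function names `si_sdr` and `si_sdr_improvement` are hypothetical, signals are plain Python lists, both signals are mean-centred first, and permutation solving and averaging over speakers and utterances are left out. Individual papers may differ in these details.

```python
import math


def si_sdr(estimate, reference, eps=1e-12):
    """Scale-invariant SDR in dB between two equal-length sample sequences."""
    if len(estimate) != len(reference):
        raise ValueError("estimate and reference must have the same length")
    n = len(reference)
    # Assumption: remove the mean of both signals before comparing them.
    est_mean = sum(estimate) / n
    ref_mean = sum(reference) / n
    est = [x - est_mean for x in estimate]
    ref = [x - ref_mean for x in reference]
    # Project the estimate onto the reference to get the scaled target.
    ref_energy = sum(r * r for r in ref) + eps
    alpha = sum(e * r for e, r in zip(est, ref)) / ref_energy
    target = [alpha * r for r in ref]
    # Everything in the estimate that is not the scaled target counts as error.
    target_energy = sum(t * t for t in target)
    error_energy = sum((e - t) ** 2 for e, t in zip(est, target))
    return 10.0 * math.log10((target_energy + eps) / (error_energy + eps))


def si_sdr_improvement(estimate, mixture, reference):
    """SI-SDRi: SI-SDR of the estimate minus SI-SDR of the raw mixture."""
    return si_sdr(estimate, reference) - si_sdr(mixture, reference)


# Toy example: two orthogonal sinusoids stand in for two speakers.
n = 100
s1 = [math.sin(2 * math.pi * 5 * t / n) for t in range(n)]
s2 = [math.sin(2 * math.pi * 13 * t / n) for t in range(n)]
mixture = [a + b for a, b in zip(s1, s2)]
# A hypothetical separator output that leaves 10% of the interferer amplitude.
estimate = [a + 0.1 * b for a, b in zip(s1, s2)]
improvement_db = si_sdr_improvement(estimate, mixture, s1)
```

In this toy case the mixture has equal target and interferer energy, and the estimate suppresses the interferer amplitude by a factor of ten, so `improvement_db` comes out at about 20 dB. Because the estimate is projected onto the reference, rescaling the estimate does not change the score.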
Results
| # | Model | SI-SDRi (dB) | Extra Data | SOTA badge | Paper | Date | Code |
|---|-------|--------------|------------|------------|-------|------|------|
| 1 | TF-Locoformer (L) + DM | 25.1 | No | No | TF-Locoformer: Transformer with Local Modeling by Convolution for Speech Separation and Enhancement | 2024-08-06 | Code |
| 2 | SepReformer-L | 25.1 | No | Yes | Separate and Reconstruct: Asymmetric Encoder-Decoder for Speech Separation | 2024-06-10 | Code |
| 3 | TF-Locoformer (M) + DM | 24.6 | No | No | TF-Locoformer: Transformer with Local Modeling by Convolution for Speech Separation and Enhancement | 2024-08-06 | Code |
| 4 | TF-Locoformer (L) | 24.2 | No | No | TF-Locoformer: Transformer with Local Modeling by Convolution for Speech Separation and Enhancement | 2024-08-06 | Code |
| 5 | MossFormer2 (L) | 24.1 | No | No | - | - | Code |
| 6 | SepTDA (L=12) | 24 | No | Yes | Boosting Unknown-number Speaker Separation with Transformer Decoder-based Attractor | 2024-01-23 | - |
| 7 | Separate And Diffuse | 23.9 | No | Yes | Separate And Diffuse: Using a Pretrained Diffusion Model for Improving Source Separation | 2023-01-25 | - |
| 8 | TF-Locoformer (M) | 23.6 | No | No | TF-Locoformer: Transformer with Local Modeling by Convolution for Speech Separation and Enhancement | 2024-08-06 | Code |
| 9 | TF-Locoformer (S) + DM | 22.8 | No | No | TF-Locoformer: Transformer with Local Modeling by Convolution for Speech Separation and Enhancement | 2024-08-06 | Code |
| 10 | MossFormer (L) + DM | 22.8 | No | No | MossFormer: Pushing the Performance Limit of Monaural Speech Separation using Gated Single-Head Transformer with Convolution-Augmented Joint Self-Attentions | 2023-02-23 | Code |
| 11 | SepMamba + DM (M) | 22.7 | No | No | SepMamba: State-space models for speaker separation using Mamba | 2024-10-28 | Code |
| 12 | SPGM + DM | 22.7 | No | No | SPGM: Prioritizing Local Features for enhanced speech separation performance | 2023-09-22 | Code |
| 13 | MossFormer (M) + DM | 22.5 | No | No | MossFormer: Pushing the Performance Limit of Monaural Speech Separation using Gated Single-Head Transformer with Convolution-Augmented Joint Self-Attentions | 2023-02-23 | Code |
| 14 | SepIt | 22.4 | No | Yes | SepIt: Approaching a Single Channel Speech Separation Bound | 2022-05-24 | - |
| 15 | SepFormer | 22.3 | No | Yes | Attention is All You Need in Speech Separation | 2020-10-25 | Code |
| 16 | Wavesplit v2 | 22.2 | No | Yes | Wavesplit: End-to-End Speech Separation by Speaker Clustering | 2020-02-20 | - |
| 17 | SPGM | 22.1 | No | No | SPGM: Prioritizing Local Features for enhanced speech separation performance | 2023-09-22 | Code |
| 18 | TF-Locoformer (S) | 22 | No | No | TF-Locoformer: Transformer with Local Modeling by Convolution for Speech Separation and Enhancement | 2024-08-06 | Code |
| 19 | DPTNet (Libri1Mix speech enhancement pre-trained) | 21.3 | Yes | No | Stabilizing Label Assignment for Speech Separation by Self-supervised Pre-training | 2020-10-29 | Code |
| 20 | SepMamba + DM (S) | 21.2 | No | No | SepMamba: State-space models for speaker separation using Mamba | 2024-10-28 | Code |
| 21 | TD-Conformer (XL) + DM | 21.2 | No | No | On Time Domain Conformer Models for Monaural Speech Separation in Noisy Reverberant Acoustic Environments | 2023-10-09 | Code |
| 22 | Sandglasset | 21 | No | No | Sandglasset: A Light Multi-Granularity Self-attentive Network For Time-Domain Speech Separation | 2021-03-01 | Code |
| 23 | GALR | 20.3 | No | No | Effective Low-Cost Time-Domain Audio Separation Using Globally Attentive Locally Recurrent Networks | 2021-01-13 | Code |
| 24 | DPTNet | 20.2 | No | No | - | - | Code |
| 25 | Gated DualPathRNN | 20.12 | No | No | Voice Separation with an Unknown Number of Multiple Speakers | 2020-02-29 | Code |
| 26 | Sudo rm -rf (U=36) | 19.5 | No | No | Compute and memory efficient universal sound source separation | 2021-03-03 | Code |
| 27 | Wavesplit v1 | 19 | No | No | Wavesplit: End-to-End Speech Separation by Speaker Clustering | 2020-02-20 | - |
| 28 | Sudo rm -rf XL | 18.9 | No | No | Sudo rm -rf: Efficient Networks for Universal Audio Source Separation | 2020-07-14 | Code |
| 29 | Dual-path RNN | 18.8 | No | Yes | Dual-path RNN: efficient long sequence modeling for time-domain single-channel speech separation | 2019-10-14 | Code |
| 30 | DeepCASA | 17.7 | No | Yes | Divide and Conquer: A Deep CASA Approach to Talker-independent Monaural Speaker Separation | 2019-04-25 | Code |
| 31 | IAC-PIT Tasnet | 17.5 | No | No | Interrupted and cascaded permutation invariant training for speech separation | 2019-10-28 | Code |
| 32 | Deformable TCN + Dynamic Mixing | 17.2 | No | No | Deformable Temporal Convolutional Networks for Monaural Noisy Reverberant Speech Separation | 2022-10-27 | Code |
| 33 | Hybrid-Tasnet | 16.6 | No | Yes | Improved Speech Separation with Time-and-Frequency Cross-domain Joint Embedding and Clustering | 2019-04-16 | Code |
| 34 | Deformable TCN + Shared Weights + Dynamic Mixing | 16.1 | No | No | Deformable Temporal Convolutional Networks for Monaural Noisy Reverberant Speech Separation | 2022-10-27 | Code |
| 35 | Two-step Conv-TasNet | 16.1 | No | No | Two-Step Sound Source Separation: Training on Learned Latent Targets | 2019-10-22 | Code |
| 36 | Conv-TasNet | 15.3 | No | Yes | Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech Separation | 2018-09-20 | Code |
| 37 | TasNet v2 | 13.2 | No | No | - | - | Code |
| 38 | Chimera++ | 11.5 | No | No | - | - | Code |
| 39 | TasNet | 10.8 | No | No | TasNet: time-domain audio separation network for real-time, single-channel speech separation | 2017-11-01 | Code |
| 40 | Deep Clustering ++ | 10.8 | No | Yes | Deep clustering: Discriminative embeddings for segmentation and separation | 2015-08-18 | Code |