Speech Separation on WSJ0-2mix

Metric: SI-SDRi, the scale-invariant signal-to-distortion ratio improvement in dB (higher is better)
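
As a rough illustration of what the metric measures, the sketch below computes SI-SDR and its improvement over the unprocessed mixture in plain Python. It assumes the common definition (project the estimate onto the reference, then compare target energy to residual energy) with mean removal. The function names `si_sdr` and `si_sdr_improvement` and the toy sine signals are illustrative only; they are not taken from any of the listed papers or their evaluation scripts.

```python
import math

def si_sdr(estimate, reference):
    """Scale-invariant SDR in dB between two equal-length signals."""
    if len(estimate) != len(reference):
        raise ValueError("signals must have the same length")
    # Remove the mean of each signal (a common convention, assumed here).
    est_mean = sum(estimate) / len(estimate)
    ref_mean = sum(reference) / len(reference)
    est = [x - est_mean for x in estimate]
    ref = [x - ref_mean for x in reference]
    # Project the estimate onto the reference to obtain the scaled target.
    scale = sum(e * r for e, r in zip(est, ref)) / sum(r * r for r in ref)
    target = [scale * r for r in ref]
    residual = [e - t for e, t in zip(est, target)]
    target_energy = sum(t * t for t in target)
    residual_energy = sum(n * n for n in residual)
    if residual_energy == 0.0:
        return math.inf  # perfect estimate up to a scale factor
    return 10.0 * math.log10(target_energy / residual_energy)

def si_sdr_improvement(estimate, mixture, reference):
    """SI-SDRi: SI-SDR of the estimate minus SI-SDR of the raw mixture."""
    return si_sdr(estimate, reference) - si_sdr(mixture, reference)

# Toy example: two orthogonal sine waves stand in for two speakers.
n = 800
reference = [math.sin(2 * math.pi * 5 * t / n) for t in range(n)]
interferer = [math.sin(2 * math.pi * 13 * t / n) for t in range(n)]
mixture = [r + i for r, i in zip(reference, interferer)]
# Hypothetical separator output: the interferer is attenuated by a factor of 10.
estimate = [r + 0.1 * i for r, i in zip(reference, interferer)]

improvement = si_sdr_improvement(estimate, mixture, reference)
print(f"SI-SDRi: {improvement:.2f} dB")  # close to 20 dB for this toy case
```

Because the estimate is projected onto the reference before the ratio is taken, rescaling the estimate leaves the score unchanged, which is what "scale-invariant" refers to.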

Results

#	Model	SI-SDRi (dB)	Extra Data	SOTA badge	Paper	Date	Code
1	TF-Locoformer (L) + DM	25.1	No	No	TF-Locoformer: Transformer with Local Modeling by Convolution for Speech Separation and Enhancement	2024-08-06	Yes
2	SepReformer-L	25.1	No	Yes	Separate and Reconstruct: Asymmetric Encoder-Decoder for Speech Separation	2024-06-10	Yes
3	TF-Locoformer (M) + DM	24.6	No	No	TF-Locoformer: Transformer with Local Modeling by Convolution for Speech Separation and Enhancement	2024-08-06	Yes
4	TF-Locoformer (L)	24.2	No	No	TF-Locoformer: Transformer with Local Modeling by Convolution for Speech Separation and Enhancement	2024-08-06	Yes
5	MossFormer2 (L)	24.1	No	No	-	-	Yes
6	SepTDA (L=12)	24.0	No	Yes	Boosting Unknown-number Speaker Separation with Transformer Decoder-based Attractor	2024-01-23	No
7	Separate And Diffuse	23.9	No	Yes	Separate And Diffuse: Using a Pretrained Diffusion Model for Improving Source Separation	2023-01-25	No
8	TF-Locoformer (M)	23.6	No	No	TF-Locoformer: Transformer with Local Modeling by Convolution for Speech Separation and Enhancement	2024-08-06	Yes
9	TF-Locoformer (S) + DM	22.8	No	No	TF-Locoformer: Transformer with Local Modeling by Convolution for Speech Separation and Enhancement	2024-08-06	Yes
10	MossFormer (L) + DM	22.8	No	No	MossFormer: Pushing the Performance Limit of Monaural Speech Separation using Gated Single-Head Transformer with Convolution-Augmented Joint Self-Attentions	2023-02-23	Yes
11	SepMamba + DM (M)	22.7	No	No	SepMamba: State-space models for speaker separation using Mamba	2024-10-28	Yes
12	SPGM + DM	22.7	No	No	SPGM: Prioritizing Local Features for enhanced speech separation performance	2023-09-22	Yes
13	MossFormer (M) + DM	22.5	No	No	MossFormer: Pushing the Performance Limit of Monaural Speech Separation using Gated Single-Head Transformer with Convolution-Augmented Joint Self-Attentions	2023-02-23	Yes
14	SepIt	22.4	No	Yes	SepIt: Approaching a Single Channel Speech Separation Bound	2022-05-24	No
15	SepFormer	22.3	No	Yes	Attention is All You Need in Speech Separation	2020-10-25	Yes
16	Wavesplit v2	22.2	No	Yes	Wavesplit: End-to-End Speech Separation by Speaker Clustering	2020-02-20	No
17	SPGM	22.1	No	No	SPGM: Prioritizing Local Features for enhanced speech separation performance	2023-09-22	Yes
18	TF-Locoformer (S)	22.0	No	No	TF-Locoformer: Transformer with Local Modeling by Convolution for Speech Separation and Enhancement	2024-08-06	Yes
19	DPTNet (Libri1Mix speech enhancement pre-trained)	21.3	Yes	No	Stabilizing Label Assignment for Speech Separation by Self-supervised Pre-training	2020-10-29	Yes
20	SepMamba + DM (S)	21.2	No	No	SepMamba: State-space models for speaker separation using Mamba	2024-10-28	Yes
21	TD-Conformer (XL) + DM	21.2	No	No	On Time Domain Conformer Models for Monaural Speech Separation in Noisy Reverberant Acoustic Environments	2023-10-09	Yes
22	Sandglasset	21.0	No	No	Sandglasset: A Light Multi-Granularity Self-attentive Network For Time-Domain Speech Separation	2021-03-01	Yes
23	GALR	20.3	No	No	Effective Low-Cost Time-Domain Audio Separation Using Globally Attentive Locally Recurrent Networks	2021-01-13	Yes
24	DPTNet	20.2	No	No	-	-	Yes
25	Gated DualPathRNN	20.12	No	No	Voice Separation with an Unknown Number of Multiple Speakers	2020-02-29	Yes
26	Sudo rm -rf (U=36)	19.5	No	No	Compute and memory efficient universal sound source separation	2021-03-03	Yes
27	Wavesplit v1	19.0	No	No	Wavesplit: End-to-End Speech Separation by Speaker Clustering	2020-02-20	No
28	Sudo rm -rf XL	18.9	No	No	Sudo rm -rf: Efficient Networks for Universal Audio Source Separation	2020-07-14	Yes
29	Dual-path RNN	18.8	No	Yes	Dual-path RNN: efficient long sequence modeling for time-domain single-channel speech separation	2019-10-14	Yes
30	DeepCASA	17.7	No	Yes	Divide and Conquer: A Deep CASA Approach to Talker-independent Monaural Speaker Separation	2019-04-25	Yes
31	IAC-PIT Tasnet	17.5	No	No	Interrupted and cascaded permutation invariant training for speech separation	2019-10-28	Yes
32	Deformable TCN + Dynamic Mixing	17.2	No	No	Deformable Temporal Convolutional Networks for Monaural Noisy Reverberant Speech Separation	2022-10-27	Yes
33	Hybrid-Tasnet	16.6	No	Yes	Improved Speech Separation with Time-and-Frequency Cross-domain Joint Embedding and Clustering	2019-04-16	Yes
34	Deformable TCN + Shared Weights + Dynamic Mixing	16.1	No	No	Deformable Temporal Convolutional Networks for Monaural Noisy Reverberant Speech Separation	2022-10-27	Yes
35	Two-step Conv-TasNet	16.1	No	No	Two-Step Sound Source Separation: Training on Learned Latent Targets	2019-10-22	Yes
36	Conv-TasNet	15.3	No	Yes	Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech Separation	2018-09-20	Yes
37	TasNet v2	13.2	No	No	-	-	Yes
38	Chimera++	11.5	No	No	-	-	Yes
39	TasNet	10.8	No	No	TasNet: time-domain audio separation network for real-time, single-channel speech separation	2017-11-01	Yes
40	Deep Clustering ++	10.8	No	Yes	Deep clustering: Discriminative embeddings for segmentation and separation	2015-08-18	Yes