TasksSotADatasetsPapersMethodsSubmitAbout
Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Explore

Notable BenchmarksAll SotADatasetsPapersMethods

Community

Submit ResultsAbout

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Papers/A Novel Estimator of Mutual Information for Learning to Di...

A Novel Estimator of Mutual Information for Learning to Disentangle Textual Representations

Pierre Colombo, Chloe Clavel, Pablo Piantanida

2021-05-06ACL 2021 5Style TransferAttributeText Style TransferDisentanglement
PaperPDF

Abstract

Learning disentangled representations of textual data is essential for many natural language tasks such as fair classification, style transfer and sentence generation, among others. The existent dominant approaches in the context of text data {either rely} on training an adversary (discriminator) that aims at making attribute values difficult to be inferred from the latent code {or rely on minimising variational bounds of the mutual information between latent code and the value attribute}. {However, the available methods suffer of the impossibility to provide a fine-grained control of the degree (or force) of disentanglement.} {In contrast to} {adversarial methods}, which are remarkably simple, although the adversary seems to be performing perfectly well during the training phase, after it is completed a fair amount of information about the undesired attribute still remains. This paper introduces a novel variational upper bound to the mutual information between an attribute and the latent code of an encoder. Our bound aims at controlling the approximation error via the Renyi's divergence, leading to both better disentangled representations and in particular, a precise control of the desirable degree of disentanglement {than state-of-the-art methods proposed for textual data}. Furthermore, it does not suffer from the degeneracy of other losses in multi-class scenarios. We show the superiority of this method on fair classification and on textual style transfer tasks. Additionally, we provide new insights illustrating various trade-offs in style transfer when attempting to learn disentangled representations and quality of the generated sentence.

Results

TaskDatasetMetricValueModel
Text GenerationYelp Review Dataset (Large)BLEU30StyleEmb
Text Style TransferYelp Review Dataset (Large)BLEU30StyleEmb
2D Semantic SegmentationYelp Review Dataset (Large)BLEU30StyleEmb

Related Papers

CSD-VAR: Content-Style Decomposition in Visual Autoregressive Models2025-07-18MGFFD-VLM: Multi-Granularity Prompt Learning for Face Forgery Detection with VLM2025-07-16Non-Adaptive Adversarial Face Generation2025-07-16Attributes Shape the Embedding Space of Face Recognition Models2025-07-15COLIBRI Fuzzy Model: Color Linguistic-Based Representation and Interpretation2025-07-15Transferring Styles for Reduced Texture Bias and Improved Robustness in Semantic Segmentation Networks2025-07-14Ref-Long: Benchmarking the Long-context Referencing Capability of Long-context Language Models2025-07-13Model Parallelism With Subnetwork Data Parallelism2025-07-11