Weihua Hu, Yiwen Yuan, Zecheng Zhang, Akihiro Nitta, Kaidi Cao, Vid Kocijan, Jinu Sunil, Jure Leskovec, Matthias Fey
We present PyTorch Frame, a PyTorch-based framework for deep learning over multi-modal tabular data. PyTorch Frame makes tabular deep learning easy by providing a PyTorch-based data structure to handle complex tabular data, introducing a model abstraction to enable modular implementation of tabular models, and allowing external foundation models to be incorporated to handle complex columns (e.g., LLMs for text columns). We demonstrate the usefulness of PyTorch Frame by implementing diverse tabular models in a modular way, successfully applying these models to complex multi-modal tabular data, and integrating our framework with PyTorch Geometric, a PyTorch library for Graph Neural Networks (GNNs), to perform end-to-end learning over relational databases.
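
The idea of a data structure that handles heterogeneous columns can be illustrated with a small, library-free sketch. It is not the PyTorch Frame API: `ToyFrame`, `build_toy_frame`, and the semantic-type labels are hypothetical names chosen for illustration. The sketch only assumes that columns are grouped by semantic type so that each group can later be handed to its own encoder.

```python
from dataclasses import dataclass
from typing import Any, Dict, List

# Hypothetical semantic-type labels (illustrative, not library constants).
NUMERICAL = "numerical"
CATEGORICAL = "categorical"


@dataclass
class ToyFrame:
    """Toy container: columns grouped by semantic type."""
    col_names_dict: Dict[str, List[str]]   # stype -> column names
    feat_dict: Dict[str, List[List[Any]]]  # stype -> rows x columns values


def build_toy_frame(rows: List[Dict[str, Any]],
                    col_to_stype: Dict[str, str]) -> ToyFrame:
    # Group column names by their declared semantic type.
    col_names_dict: Dict[str, List[str]] = {}
    for col, stype in col_to_stype.items():
        col_names_dict.setdefault(stype, []).append(col)

    feat_dict: Dict[str, List[List[Any]]] = {}
    for stype, cols in col_names_dict.items():
        if stype == CATEGORICAL:
            # Map each category to an integer index, per column
            # (sorted so the mapping is deterministic).
            vocab = {
                c: {v: i for i, v in enumerate(sorted({r[c] for r in rows}))}
                for c in cols
            }
            feat_dict[stype] = [[vocab[c][r[c]] for c in cols] for r in rows]
        else:
            # Numerical columns are stored as floats.
            feat_dict[stype] = [[float(r[c]) for c in cols] for r in rows]
    return ToyFrame(col_names_dict, feat_dict)


rows = [
    {"age": 31, "income": 52.0, "city": "Oslo"},
    {"age": 45, "income": 61.5, "city": "Lima"},
    {"age": 28, "income": 40.0, "city": "Oslo"},
]
col_to_stype = {"age": NUMERICAL, "income": NUMERICAL, "city": CATEGORICAL}
frame = build_toy_frame(rows, col_to_stype)
```

Grouping by semantic type keeps each group homogeneous, so a text column could, under the same assumption, be routed to an external embedding model while numerical and categorical columns go to lightweight encoders.
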
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Text Classification | Civil Comments | AUROC | 0.97 | ResNet + RoBERTa finetune |
| Text Classification | Civil Comments | AUROC | 0.947 | Trompt + OpenAI embedding |
| Text Classification | Civil Comments | AUROC | 0.945 | ResNet + OpenAI embedding |
| Text Classification | Civil Comments | AUROC | 0.885 | Trompt + RoBERTa embedding |
| Text Classification | Civil Comments | AUROC | 0.882 | ResNet + RoBERTa embedding |
| Text Classification | Civil Comments | AUROC | 0.865 | LightGBM + RoBERTa embedding |
| Binary Classification | kickstarter | AUROC | 0.81 | Trompt + OpenAI embedding |
| Binary Classification | kickstarter | AUROC | 0.786 | ResNet + RoBERTa finetune |
| Binary Classification | kickstarter | AUROC | 0.767 | LightGBM + RoBERTa embedding |
| Binary Classification | fake | AUROC | 0.979 | Trompt + OpenAI embedding |
| Binary Classification | fake | AUROC | 0.966 | LightGBM + OpenAI embedding |
| Binary Classification | fake | AUROC | 0.96 | FTTransformer + RoBERTa finetune |
| Binary Classification | fake | AUROC | 0.954 | LightGBM + RoBERTa embedding |
| Binary Classification | fake | AUROC | 0.936 | FTTransformer + RoBERTa embedding |
| Binary Classification | fake | AUROC | 0.934 | ResNet + RoBERTa embedding |
| Binary Classification | fake | AUROC | 0.923 | ResNet + OpenAI embedding |
| Binary Classification | fake | AUROC | 0.911 | FTTransformer + OpenAI embedding |
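
For readers unfamiliar with the metric in the table, AUROC for binary labels can be read as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as one half. The following is a minimal sketch of that pairwise definition. The `auroc` helper and the toy labels and scores are illustrative assumptions, not the evaluation code behind the reported values.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Pairwise AUROC: fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Toy data (made up for illustration).
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.1]
value = auroc(labels, scores)  # 3 of 4 pairs are ranked correctly
```

This quadratic-time form is only meant to make the definition concrete. For real evaluation, a sorting-based implementation from an established library is the usual choice.
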