Jaesung Choe, Chunghyun Park, Francois Rameau, Jaesik Park, In So Kweon
MLP-Mixer has recently emerged as a challenger to CNNs and transformers. Despite being simpler than a transformer, its combination of channel-mixing MLPs and token-mixing MLPs achieves noticeable performance in visual recognition tasks. Unlike images, point clouds are inherently sparse, unordered, and irregular, which limits the direct use of MLP-Mixer for point cloud understanding. In this paper, we propose PointMixer, a universal point set operator that facilitates information sharing among unstructured 3D points. By simply replacing token-mixing MLPs with a softmax function, PointMixer can "mix" features within and between point sets. As a result, PointMixer can be used throughout a network for inter-set mixing, intra-set mixing, and pyramid mixing. Extensive experiments show that PointMixer performs competitively with, or better than, transformer-based methods in semantic segmentation, classification, and point reconstruction.
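The idea of replacing token-mixing MLPs with a softmax can be illustrated with a small NumPy sketch. This is not the authors' implementation. It assumes that each point's set is formed from its k nearest neighbors, that relative positions are appended to the neighbor features, and that a single linear layer stands in for each channel-mixing MLP. The names `knn_indices`, `softmax_mixing`, `w_score`, and `w_value` are hypothetical and chosen only for this example.

```python
import numpy as np


def knn_indices(points, k):
    """Return the indices of the k nearest points for every point (self included)."""
    diff = points[:, None, :] - points[None, :, :]   # (N, N, 3)
    dist2 = np.sum(diff * diff, axis=-1)             # (N, N) squared distances
    return np.argsort(dist2, axis=1)[:, :k]          # (N, k)


def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    x = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(x)
    return e / np.sum(e, axis=axis, keepdims=True)


def softmax_mixing(points, feats, k, w_score, w_value):
    """Mix features inside each point's neighbor set (illustrative assumption).

    points:  (N, 3) coordinates
    feats:   (N, C) per-point features
    w_score: (C + 3, 1) linear layer standing in for a channel-mixing MLP
    w_value: (C, C_out) linear layer applied to neighbor features
    """
    idx = knn_indices(points, k)                     # (N, k)
    neigh_feats = feats[idx]                         # (N, k, C)
    rel_pos = points[idx] - points[:, None, :]       # (N, k, 3)

    # Channel mixing: acts on each neighbor independently, so set size may vary.
    x = np.concatenate([neigh_feats, rel_pos], axis=-1)
    scores = x @ w_score                             # (N, k, 1)

    # Softmax over the set replaces a fixed-size token-mixing MLP.
    weights = softmax(scores, axis=1)                # (N, k, 1), sums to 1 over k

    values = neigh_feats @ w_value                   # (N, k, C_out)
    mixed = np.sum(weights * values, axis=1)         # (N, C_out)
    return mixed, weights
```

Because the softmax normalizes over whatever set it is given and the weighted sum does not depend on neighbor order, this kind of operator places no requirement on the number or ordering of points, which is the property the abstract relies on for unordered, irregular point clouds.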
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Semantic Segmentation | S3DIS Area5 | mAcc | 77.4 | PointMixer |
| Semantic Segmentation | S3DIS Area5 | mIoU | 71.4 | PointMixer |
| Shape Representation Of 3D Point Clouds | ModelNet40 | Mean Accuracy | 91.4 | PointMixer |
| Shape Representation Of 3D Point Clouds | ModelNet40 | Overall Accuracy | 93.6 | PointMixer |
| 3D Point Cloud Classification | ModelNet40 | Mean Accuracy | 91.4 | PointMixer |
| 3D Point Cloud Classification | ModelNet40 | Overall Accuracy | 93.6 | PointMixer |
| 3D Point Cloud Reconstruction | ModelNet40 | Mean Accuracy | 91.4 | PointMixer |
| 3D Point Cloud Reconstruction | ModelNet40 | Overall Accuracy | 93.6 | PointMixer |