Penghao Zhou, Chong Zhou, Pai Peng, Junlong Du, Xing Sun, Xiaowei Guo, Feiyue Huang
Greedy-NMS inherently raises a dilemma: a lower NMS threshold can reduce recall, while a higher threshold introduces more false positives. The problem is more severe in pedestrian detection because instance density varies more widely. However, previous work on NMS either ignores or only vaguely considers the existence of nearby pedestrians. We therefore propose the Nearby Objects Hallucinator (NOH), which pinpoints the objects near each proposal with a Gaussian distribution, together with NOH-NMS, which dynamically eases suppression in regions that are highly likely to contain other objects. Compared to Greedy-NMS, our method achieves state-of-the-art results on CrowdHuman, improving AP by $3.9\%$, Recall by $5.1\%$, and $\text{MR}^{-2}$ by $0.8\%$, to $89.0\%$ AP, $92.9\%$ Recall, and $43.9\%$ $\text{MR}^{-2}$, respectively.
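The threshold dilemma described above can be illustrated with a minimal sketch of standard Greedy-NMS in plain Python. This is not NOH-NMS and not the authors' code; the helper names (`iou`, `greedy_nms`), the box coordinates, and the scores are hypothetical values chosen for illustration. Boxes 0 and 1 stand for two overlapping pedestrians, and box 2 is a near-duplicate detection of box 0.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), hypothetical layout


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes: List[Box], scores: List[float], iou_threshold: float) -> List[int]:
    """Return indices of kept boxes, visiting boxes in descending score order.

    A box is kept only if its IoU with every already-kept box does not
    exceed iou_threshold.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


# Illustrative data (assumed, not from the paper).
boxes = [
    (0.0, 0.0, 10.0, 20.0),   # pedestrian A
    (4.0, 0.0, 14.0, 20.0),   # pedestrian B, IoU with A is about 0.43
    (0.5, 0.0, 10.5, 20.0),   # duplicate detection of A, IoU with A is about 0.90
]
scores = [0.9, 0.8, 0.7]

low = greedy_nms(boxes, scores, iou_threshold=0.3)    # B is suppressed: a missed pedestrian
mid = greedy_nms(boxes, scores, iou_threshold=0.5)    # A and B kept, duplicate removed
high = greedy_nms(boxes, scores, iou_threshold=0.95)  # duplicate survives: a false positive
print(low, mid, high)
```

In this toy scene a single threshold of 0.5 happens to work, but in a denser crowd two true pedestrians can overlap more than a duplicate overlaps its source, so no fixed threshold separates the two cases. That is the gap a density-aware suppression rule is meant to address.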
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Autonomous Vehicles | CityPersons | Bare MR^-2 | 6.6 | NOH-NMS |
| Autonomous Vehicles | CityPersons | Heavy MR^-2 | 53 | NOH-NMS |
| Autonomous Vehicles | CityPersons | Partial MR^-2 | 11.2 | NOH-NMS |
| Autonomous Vehicles | CityPersons | Reasonable MR^-2 | 10.8 | NOH-NMS |
| Object Detection | CrowdHuman (full body) | AP | 89 | NOH-NMS |
| Object Detection | CrowdHuman (full body) | mMR | 43.9 | NOH-NMS |
| Pedestrian Detection | CityPersons | Bare MR^-2 | 6.6 | NOH-NMS |
| Pedestrian Detection | CityPersons | Heavy MR^-2 | 53 | NOH-NMS |
| Pedestrian Detection | CityPersons | Partial MR^-2 | 11.2 | NOH-NMS |
| Pedestrian Detection | CityPersons | Reasonable MR^-2 | 10.8 | NOH-NMS |
| 2D Object Detection | CrowdHuman (full body) | AP | 89 | NOH-NMS |
| 2D Object Detection | CrowdHuman (full body) | mMR | 43.9 | NOH-NMS |