Gabriel Van Zandycke, Christophe De Vleeschouwer
This paper considers the task of detecting the ball from a single viewpoint in the challenging but common case where the ball interacts frequently with players while being poorly contrasted with respect to the background. We propose a novel approach by formulating the problem as a segmentation task solved by an efficient CNN architecture. To take advantage of the ball dynamics, the network is fed with a pair of consecutive images. Our inference model can run in real time without the delay induced by a temporal analysis. We also show that test-time data augmentation significantly increases the detection accuracy. As an additional contribution, we publicly release the dataset on which this work is based.
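The two ideas in the abstract, feeding a pair of consecutive images and averaging predictions at test time, can be illustrated with a minimal NumPy sketch. Everything named here is hypothetical: `toy_segmenter` is a frame-difference stand-in for the CNN, not the paper's network, and horizontal flipping is only one assumed choice of test-time augmentation, since the abstract does not state which augmentations are used.

```python
import numpy as np


def make_input(prev_frame, curr_frame):
    """Stack two consecutive RGB frames (H, W, 3) into one (H, W, 6) input."""
    return np.concatenate([prev_frame, curr_frame], axis=-1)


def toy_segmenter(x):
    """Hypothetical stand-in for the CNN: returns an (H, W) ball heatmap.

    It scores each pixel by the mean absolute difference between the two
    frames, which only mimics the idea of exploiting ball motion.
    """
    prev = x[..., :3].astype(np.float32)
    curr = x[..., 3:].astype(np.float32)
    return np.abs(curr - prev).mean(axis=-1)


def predict_with_flip_tta(segmenter, x):
    """Average the heatmaps of the input and its horizontal mirror.

    The flip is an assumed augmentation chosen for illustration.
    """
    heat = segmenter(x)
    heat_mirrored = segmenter(x[:, ::-1])          # mirror along the width axis
    return 0.5 * (heat + heat_mirrored[:, ::-1])   # un-mirror before averaging


def locate_ball(heatmap):
    """Return the (row, col) of the highest-scoring pixel."""
    row, col = np.unravel_index(np.argmax(heatmap), heatmap.shape)
    return int(row), int(col)


# Toy usage: a single bright pixel appears in the second frame.
prev = np.zeros((8, 8, 3), dtype=np.uint8)
curr = prev.copy()
curr[2, 5] = 255
heat = predict_with_flip_tta(toy_segmenter, make_input(prev, curr))
print(locate_ball(heat))  # (2, 5)
```

Because the input only needs the current frame and the one before it, a prediction of this kind is available as soon as the current frame arrives, which is consistent with the abstract's claim of avoiding the delay of a longer temporal analysis.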
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Object Tracking | Tennis | Accuracy (%) | 57.5 | BallSeg |
| Object Tracking | Tennis | Average Precision (%) | 56.8 | BallSeg |
| Object Tracking | Tennis | F1 (%) | 71.7 | BallSeg |
| Object Tracking | Soccer | Accuracy (%) | 92.6 | BallSeg |
| Object Tracking | Soccer | Average Precision (%) | 20 | BallSeg |
| Object Tracking | Soccer | F1 (%) | 36.1 | BallSeg |
| Object Tracking | Badminton | Accuracy (%) | 72.2 | BallSeg |
| Object Tracking | Badminton | Average Precision (%) | 68.4 | BallSeg |
| Object Tracking | Badminton | F1 (%) | 79.9 | BallSeg |
| Object Tracking | Volleyball | Accuracy (%) | 17.5 | BallSeg |
| Object Tracking | Volleyball | Average Precision (%) | 8.5 | BallSeg |
| Object Tracking | Volleyball | F1 (%) | 19.5 | BallSeg |
| Object Tracking | Basketball | Accuracy (%) | 20.5 | BallSeg |
| Object Tracking | Basketball | Average Precision (%) | 5.3 | BallSeg |
| Object Tracking | Basketball | F1 (%) | 16.8 | BallSeg |