Nam Vo, Nathan Jacobs, James Hays
Image geolocalization, inferring the geographic location of an image, is a challenging computer vision problem with many potential applications. The recent state-of-the-art approach to this problem is a deep image classification approach in which the world is spatially divided into cells and a deep network is trained to predict the correct cell for a given image. We propose to combine this approach with the original Im2GPS approach in which a query image is matched against a database of geotagged images and the location is inferred from the retrieved set. We estimate the geographic location of a query image by applying kernel density estimation to the locations of its nearest neighbors in the reference database. Interestingly, we find that the best features for our retrieval task are derived from networks trained with classification loss even though we do not use a classification approach at test time. Training with classification loss outperforms several deep feature learning methods (e.g., Siamese networks with contrastive or triplet loss) more typical for retrieval applications. Our simple approach achieves state-of-the-art geolocalization accuracy while also requiring significantly less training data.
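
The retrieval step described in the abstract (find the nearest neighbors of a query in a geotagged database, then apply kernel density estimation to their locations) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes cosine similarity for retrieval, a Gaussian kernel over great-circle distance, and a density evaluated only at the neighbor locations. The names `geolocate`, `kde_mode`, `knn_indices`, and the parameter `bandwidth_km` are hypothetical, and `bandwidth_km` is not claimed to correspond to the sigma value listed in the results table.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between points given in degrees (broadcasts)."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))


def knn_indices(query_feat, db_feats, k):
    """Indices of the k database features most similar to the query.

    Assumption: cosine similarity on L2-normalized features.
    """
    q = query_feat / np.linalg.norm(query_feat)
    d = db_feats / np.linalg.norm(db_feats, axis=1, keepdims=True)
    return np.argsort(-(d @ q))[:k]


def kde_mode(latlons, bandwidth_km):
    """Return the neighbor location with the highest kernel density.

    Assumption: each neighbor contributes an unweighted Gaussian kernel,
    and the density is evaluated only at the neighbor locations.
    """
    lat, lon = latlons[:, 0], latlons[:, 1]
    dist = haversine_km(lat[:, None], lon[:, None], lat[None, :], lon[None, :])
    density = np.exp(-0.5 * (dist / bandwidth_km) ** 2).sum(axis=1)
    return latlons[np.argmax(density)]


def geolocate(query_feat, db_feats, db_latlons, k=5, bandwidth_km=100.0):
    """Estimate (lat, lon) for a query from its k nearest database images."""
    idx = knn_indices(query_feat, db_feats, k)
    return kde_mode(db_latlons[idx], bandwidth_km)
```

In this sketch a cluster of retrieved images around one place outweighs isolated matches elsewhere, which is the intuition behind using a density estimate rather than the single nearest neighbor.
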
| Task | Dataset | Metric | Value (%) | Model |
|---|---|---|---|---|
| Image Classification | Im2GPS3k | City level (25 km) | 19.4 | Im2GPS (kNN, sigma = 4) |
| Image Classification | Im2GPS3k | Continent level (2500 km) | 55.9 | Im2GPS (kNN, sigma = 4) |
| Image Classification | Im2GPS3k | Country level (750 km) | 38.9 | Im2GPS (kNN, sigma = 4) |
| Image Classification | Im2GPS3k | Region level (200 km) | 26.9 | Im2GPS (kNN, sigma = 4) |
| Image Classification | Im2GPS3k | Street level (1 km) | 7.2 | Im2GPS (kNN, sigma = 4) |
| Image Classification | Im2GPS3k | City level (25 km) | 14.8 | Im2GPS ([L] 7011C) |
| Image Classification | Im2GPS3k | Continent level (2500 km) | 52.4 | Im2GPS ([L] 7011C) |
| Image Classification | Im2GPS3k | Country level (750 km) | 32.6 | Im2GPS ([L] 7011C) |
| Image Classification | Im2GPS3k | Region level (200 km) | 21.4 | Im2GPS ([L] 7011C) |
| Image Classification | Im2GPS3k | Street level (1 km) | 4 | Im2GPS ([L] 7011C) |
| Image Classification | Im2GPS3k | City level (25 km) | 14.2 | Im2GPS ([M] 7011C) |
| Image Classification | Im2GPS3k | Continent level (2500 km) | 52.7 | Im2GPS ([M] 7011C) |
| Image Classification | Im2GPS3k | Country level (750 km) | 33.5 | Im2GPS ([M] 7011C) |
| Image Classification | Im2GPS3k | Region level (200 km) | 21.3 | Im2GPS ([M] 7011C) |
| Image Classification | Im2GPS3k | Street level (1 km) | 3.7 | Im2GPS ([M] 7011C) |
| Image Classification | Im2GPS | City level (25 km) | 33.3 | Im2GPS (... 28m database) |
| Image Classification | Im2GPS | Continent level (2500 km) | 73.4 | Im2GPS (... 28m database) |
| Image Classification | Im2GPS | Country level (750 km) | 61.6 | Im2GPS (... 28m database) |
| Image Classification | Im2GPS | Region level (200 km) | 47.7 | Im2GPS (... 28m database) |
| Image Classification | Im2GPS | Street level (1 km) | 14.4 | Im2GPS (... 28m database) |
| Image Classification | Im2GPS | City level (25 km) | 33.3 | Im2GPS ([L] KNN, sigma=4) |
| Image Classification | Im2GPS | Continent level (2500 km) | 71.3 | Im2GPS ([L] KNN, sigma=4) |
| Image Classification | Im2GPS | Country level (750 km) | 57.4 | Im2GPS ([L] KNN, sigma=4) |
| Image Classification | Im2GPS | Region level (200 km) | 44.3 | Im2GPS ([L] KNN, sigma=4) |
| Image Classification | Im2GPS | Street level (1 km) | 12.2 | Im2GPS ([L] KNN, sigma=4) |
| Image Classification | Im2GPS | City level (25 km) | 21.9 | Im2GPS ([L] 7011C) |
| Image Classification | Im2GPS | Continent level (2500 km) | 63.7 | Im2GPS ([L] 7011C) |
| Image Classification | Im2GPS | Country level (750 km) | 49.4 | Im2GPS ([L] 7011C) |
| Image Classification | Im2GPS | Region level (200 km) | 34.6 | Im2GPS ([L] 7011C) |
| Image Classification | Im2GPS | Street level (1 km) | 6.8 | Im2GPS ([L] 7011C) |
| Image Classification | YFCC4k | City (25 km) | 5.7 | [L]kNN, σ = 4 |
| Image Classification | YFCC4k | Continent (2500 km) | 42 | [L]kNN, σ = 4 |
| Image Classification | YFCC4k | Country (750 km) | 23.5 | [L]kNN, σ = 4 |
| Image Classification | YFCC4k | Region (200 km) | 11 | [L]kNN, σ = 4 |
| Image Classification | YFCC4k | Street (1 km) | 2.3 | [L]kNN, σ = 4 |
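
The metrics in the table are distance thresholds (1 km, 25 km, 200 km, 750 km, 2500 km). A minimal sketch of how such numbers could be computed follows, assuming each value is the percentage of test images whose predicted location lies within the threshold of the ground truth, measured by great-circle distance. The function names and the inclusive comparison (`<=`) are assumptions made for illustration, not details taken from the source.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0

# Threshold names and distances as listed in the Metric column.
THRESHOLDS_KM = {
    "Street level (1 km)": 1.0,
    "City level (25 km)": 25.0,
    "Region level (200 km)": 200.0,
    "Country level (750 km)": 750.0,
    "Continent level (2500 km)": 2500.0,
}


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between points given in degrees."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))


def accuracy_at_thresholds(pred_latlons, true_latlons):
    """Percentage of predictions within each distance threshold.

    Both inputs are arrays of shape (n, 2) holding (lat, lon) in degrees.
    """
    pred = np.asarray(pred_latlons, dtype=float)
    true = np.asarray(true_latlons, dtype=float)
    err_km = great_circle_km(pred[:, 0], pred[:, 1], true[:, 0], true[:, 1])
    return {name: 100.0 * float(np.mean(err_km <= t))
            for name, t in THRESHOLDS_KM.items()}
```

Because the thresholds are nested, the accuracy can only stay equal or increase from street level to continent level, which matches the ordering of the values within each model's rows.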