Kite
Images, Speech, Texts | Introduced 2019-07-02
The Kite database is a multimodal dataset for the control of unmanned aerial vehicles (UAVs). It contains three modalities:
- Language, represented by the commands issued to the UAV
- Audio, represented by the spoken instantiation of the commands
- Visual, represented by an image that is likely to be seen when the command is issued
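As a rough illustration of how the three modalities listed above might be paired in code, here is a minimal Python sketch. It is not the dataset's actual schema or loader: the `KiteSample` class, its field names, the `group_by_command` helper, and the example command strings and file paths are all hypothetical placeholders, since the description does not specify the on-disk layout.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class KiteSample:
    """One hypothetical record pairing the three modalities of a command."""

    command_text: str  # language: the command issued to the UAV
    audio_path: Path   # audio: a recording of the spoken command
    image_path: Path   # visual: an image likely to be seen when the command is issued


def group_by_command(samples):
    """Map each command text to the list of samples that share it."""
    grouped = {}
    for sample in samples:
        grouped.setdefault(sample.command_text, []).append(sample)
    return grouped


if __name__ == "__main__":
    # Placeholder values for illustration only; they are not taken from the dataset.
    samples = [
        KiteSample("example command A", Path("audio/a1.wav"), Path("images/a1.jpg")),
        KiteSample("example command A", Path("audio/a2.wav"), Path("images/a2.jpg")),
        KiteSample("example command B", Path("audio/b1.wav"), Path("images/b1.jpg")),
    ]
    for text, items in group_by_command(samples).items():
        print(text, len(items))
```

Grouping by command text reflects only the assumption that one textual command may have several spoken and visual instances; the description does not state the actual cardinality.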
The dataset was created by members of the SpeeD team.
Source: Kite Dataset