Kite

Modalities: Images, Speech, Texts
Introduced: 2019-07-02

The Kite database is a multi-modal dataset for controlling unmanned aerial vehicles (UAVs). It contains three modalities:

  • Language, represented by the commands issued to the UAV
  • Audio, represented by the spoken instantiation of the commands
  • Visual, represented by an image that is likely to be seen when the command is issued
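
To make the three-modality structure concrete, here is a minimal sketch of how one sample could be represented in Python. The class name `KiteSample`, its field names, the helper `group_by_command`, and the idea that audio and images are stored as files are all assumptions made for illustration; the actual layout of the Kite release may differ.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class KiteSample:
    """Hypothetical record tying together the three modalities of one command."""

    command_text: str  # Language: the command issued to the UAV
    audio_path: Path   # Audio: a spoken version of that command
    image_path: Path   # Visual: an image likely to be seen when the command is issued


def group_by_command(samples):
    """Group samples by command text, ignoring case and surrounding whitespace."""
    groups = {}
    for sample in samples:
        key = sample.command_text.strip().lower()
        groups.setdefault(key, []).append(sample)
    return groups
```

Grouping by the text of the command is one plausible way to collect several spoken or visual instances of the same instruction, for example when several speakers utter the same command.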

The dataset was created by the members of the SpeeD team.

Source: Kite Dataset