TREK-100

Videos

The dataset consists of 100 video sequences densely annotated with 60K bounding boxes, 17 sequence attributes, 13 action verb attributes, and 29 target object attributes.

Source: Is First Person Vision Challenging for Object Tracking? The TREK-100 Benchmark Dataset