
The dataset consists of multimodal facial images of 52 people (14 females, 38 males) captured with a Kinect sensor. The data was collected in two sessions held about half a month apart. In each session, every person is recorded in 9 states covering different facial expressions, lighting, and occlusion conditions: neutral, smile, open mouth, left profile, right profile, occlusion eyes, occlusion mouth, occlusion paper, and light on [Figure 1]. All images are provided in three modalities: the RGB color image, the depth map (in two forms: a bitmap depth image and a text file containing the original depth levels sensed by the Kinect), and 3D data. In addition, the dataset includes manual landmarks for 6 positions on the face: left eye, right eye, tip of the nose, left side of the mouth, right side of the mouth, and chin [Figure 2]. Further information about each person is also available: gender, year of birth, whether the person wears glasses, and the capture time of each session.
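
The structure described above can be sketched as a simple sample index. This is a minimal illustration, not the dataset's official loader: the names `STATES`, `LANDMARKS`, and `build_index` are hypothetical, the subject and session numbering is assumed, and the sketch assumes every person was recorded in all 9 states in both sessions.

```python
from itertools import product

# Counts and labels taken from the dataset description.
NUM_SUBJECTS = 52
SESSIONS = (1, 2)  # assumed numbering for the two capture sessions
STATES = (
    "neutral", "smile", "open mouth", "left profile", "right profile",
    "occlusion eyes", "occlusion mouth", "occlusion paper", "light on",
)
LANDMARKS = (
    "left eye", "right eye", "nose tip",
    "mouth left", "mouth right", "chin",
)


def build_index():
    """Enumerate one record per (subject, session, state) combination.

    Hypothetical helper: assumes full coverage, i.e. every subject
    appears in every state in both sessions.
    """
    return [
        {"subject": subject, "session": session, "state": state}
        for subject, session, state in product(
            range(1, NUM_SUBJECTS + 1), SESSIONS, STATES
        )
    ]


index = build_index()
# 52 subjects x 2 sessions x 9 states
print(len(index))
```

Under the full-coverage assumption this yields 52 × 2 × 9 = 936 records, each of which would correspond to one capture available in every modality (RGB, depth, 3D) together with its 6 landmark positions.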

Source: KinectFaceDB