Surgical Scene Graph Generation

Introduced 2020-07-07

The training subset consists of 15 robotic nephrectomy procedures captured on the da Vinci X or Xi system. Each video sequence contains 149 frames, and each frame is 1280x1024 pixels. Segmentation annotations cover 10 classes, including instruments, kidneys, and other objects in the surgical scene. The main differences from the 2017 instrument segmentation dataset are the annotation of kidney parenchyma, of surgical objects such as suturing needles, suturing thread, and clips, and of additional instruments. Drawing on our clinical expertise with the da Vinci Xi robotic system, we annotated a graph representation of the interactions between the surgical instruments and the defective tissue in each scene. We also delineated bounding boxes to identify all surgical objects. The kidney and the instruments are represented as nodes, and each active edge is annotated with an interaction class. In total, 12 kinds of interaction were identified to generate the scene graph representation: grasping, retraction, tissue manipulation, tool manipulation, cutting, cauterization, suction, looping, suturing, clipping, stapling, and ultrasound sensing. We split the newly annotated dataset into 12/3 video sequences (1788/449 frames) for training and cross-validation.
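
To make the annotation structure concrete, the following is a minimal Python sketch of how one annotated frame could be held in memory: nodes carry a label and a bounding box, and edges carry one of the 12 interaction classes. The class names (`Node`, `Edge`, `SceneGraph`), the node labels, the box coordinates, and the `(x_min, y_min, x_max, y_max)` box convention are all assumptions made for illustration; they do not reflect the dataset's actual file format or label strings.

```python
from dataclasses import dataclass
from typing import List, Tuple

# The 12 interaction classes named in the description (label spellings are assumed).
INTERACTIONS = (
    "grasping", "retraction", "tissue manipulation", "tool manipulation",
    "cutting", "cauterization", "suction", "looping", "suturing",
    "clipping", "stapling", "ultrasound sensing",
)


@dataclass(frozen=True)
class Node:
    label: str                       # hypothetical label, e.g. "kidney"
    bbox: Tuple[int, int, int, int]  # assumed (x_min, y_min, x_max, y_max) in pixels


@dataclass(frozen=True)
class Edge:
    source: int       # index of the instrument node
    target: int       # index of the node it interacts with
    interaction: str  # one of INTERACTIONS


@dataclass
class SceneGraph:
    nodes: List[Node]
    edges: List[Edge]

    def validate(self, width: int = 1280, height: int = 1024) -> None:
        """Check boxes lie inside the frame and edges are well formed."""
        for node in self.nodes:
            x0, y0, x1, y1 = node.bbox
            if not (0 <= x0 < x1 <= width and 0 <= y0 < y1 <= height):
                raise ValueError(f"bounding box outside frame: {node}")
        for edge in self.edges:
            if edge.interaction not in INTERACTIONS:
                raise ValueError(f"unknown interaction: {edge.interaction!r}")
            for index in (edge.source, edge.target):
                if not 0 <= index < len(self.nodes):
                    raise ValueError(f"edge refers to missing node: {edge}")


# Illustrative frame: one instrument grasping the kidney, one instrument idle.
graph = SceneGraph(
    nodes=[
        Node("kidney", (300, 200, 900, 800)),
        Node("bipolar_forceps", (100, 400, 450, 700)),
        Node("suction_instrument", (850, 100, 1200, 500)),
    ],
    edges=[Edge(source=1, target=0, interaction="grasping")],
)
graph.validate()
```

Only active interactions are stored as edges in this sketch, so an instrument that is present but idle appears as a node with no outgoing edge.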