Challenge-Categories - Surgical Visual Understanding

This challenge is divided into two categories.

Category 1: Surgical tool classification and localization

This category requires teams to train semi-supervised or weakly supervised models. The model should classify the tools present in each frame of the test-set video clips and localize them with bounding boxes. For training, teams are given noisy tool presence labels in the training set and bounding box labels in a small validation set.
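To make the gap between the weak training signal and the required output concrete, the following is a minimal sketch of how the two kinds of annotation might be represented. Every name here (`ClipPresenceLabel`, `ToolBox`, `FramePrediction`, `presence_agreement`), the clip-level granularity of the presence labels, the pixel-corner box layout, and the tool names are assumptions made for illustration; they are not the challenge's official data format or evaluation metric.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class ClipPresenceLabel:
    """Weak (possibly noisy) training label: tools assumed present in a clip."""
    clip_id: str
    tools_present: List[str]


@dataclass
class ToolBox:
    """One localized tool in a frame: class name plus assumed pixel corners."""
    tool: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float
    score: float = 1.0  # hypothetical confidence value


@dataclass
class FramePrediction:
    """Model output for a single frame of a clip."""
    clip_id: str
    frame_index: int
    boxes: List[ToolBox] = field(default_factory=list)


def predicted_tools(frames: List[FramePrediction]) -> Set[str]:
    """Collapse frame-level boxes into the set of tool classes predicted for the clip."""
    return {box.tool for frame in frames for box in frame.boxes}


def presence_agreement(label: ClipPresenceLabel, frames: List[FramePrediction]) -> float:
    """Jaccard overlap between the weak presence label and the predicted tool set."""
    predicted = predicted_tools(frames)
    labelled = set(label.tools_present)
    union = predicted | labelled
    # Two empty sets agree trivially.
    return len(predicted & labelled) / len(union) if union else 1.0


if __name__ == "__main__":
    # Hypothetical clip ID and tool names, for illustration only.
    label = ClipPresenceLabel("clip_0001", ["needle driver", "forceps"])
    frames = [
        FramePrediction("clip_0001", 0, [ToolBox("needle driver", 10, 20, 110, 140)]),
        FramePrediction("clip_0001", 1, [ToolBox("forceps", 200, 50, 300, 180)]),
    ]
    print(presence_agreement(label, frames))
```

A consistency check of this kind only compares class sets; it says nothing about whether the boxes are correctly placed, which is why the bounding box labels in the validation set matter for localization.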

Category 2: Surgical visual question answering

This category also requires teams to train weakly supervised models. Here the model should generate answers to open-ended questions about 30-second video clips. The training labels are tool presence labels, surgical steps, and a description of the surgical step categories.