Cross-modal self-supervised representation learning for gesture and skill recognition in robotic surgery.
Jie Ying Wu, Aniruddha Tamhane, Peter Kazanzides, Mathias Unberath
Published in: International Journal of Computer Assisted Radiology and Surgery (2021)
Predicting the synchronous kinematics sequence yields optical flow representations of surgical scenes that separate well even for tasks the model had not seen before. While the representations are immediately useful for a variety of tasks, the self-supervised learning paradigm may also enable research in lifelong and user-specific learning.