Fast and scalable search of whole-slide images via self-supervised deep learning
Conducted on whole-slide histology datasets from The Cancer Genome Atlas, the Clinical Proteomic Tumour Analysis Consortium and the Brigham and Women's Hospital, this study evaluates the performance of a self-supervised learning algorithm for rapidly searching, independently of repository size, for histology slide images that share common morphologic features, and for aiding the diagnosis of different disease types, including rare cancers.
The adoption of digital pathology has enabled the curation of large repositories of gigapixel whole-slide images (WSIs). Computationally identifying WSIs with similar morphologic features within large repositories without requiring supervised training can have significant applications. However, the retrieval speeds of algorithms for searching similar WSIs often scale with the repository size, which limits their clinical and research potential. Here we show that self-supervised deep learning can be leveraged to search for and retrieve WSIs at speeds that are independent of repository size. The algorithm, which we named SISH (for self-supervised image search for histology) and provide as an open-source package, requires only slide-level annotations for training, encodes WSIs into meaningful discrete latent representations and leverages a tree data structure for fast searching followed by an uncertainty-based ranking algorithm for WSI retrieval. We evaluated SISH on multiple tasks (including retrieval tasks based on tissue-patch queries) and on datasets spanning over 22,000 patient cases and 56 disease subtypes. SISH can also be used to aid the diagnosis of rare cancer types for which the number of available WSIs is often insufficient to train supervised deep-learning models.
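The abstract describes encoding each WSI into discrete latent codes and using a tree-style index so that lookup time does not grow with repository size. The sketch below is a hypothetical toy illustration of that idea only, not the SISH implementation: slides are represented by small sets of integer codes (assumed to come from some encoder), and an inverted index over those codes returns candidate slides at a cost that depends on the query, not on how many slides are stored.

```python
from collections import defaultdict

class DiscreteCodeIndex:
    """Toy inverted index over discrete slide codes (illustrative only)."""

    def __init__(self):
        # Map each integer code to the set of slide ids containing it.
        self.index = defaultdict(set)

    def insert(self, slide_id, codes):
        # Codes would come from a (hypothetical) self-supervised encoder.
        for c in codes:
            self.index[c].add(slide_id)

    def query(self, codes):
        # Candidate slides sharing at least one discrete code with the
        # query; lookup cost scales with the query, not the repository.
        hits = set()
        for c in codes:
            hits |= self.index[c]
        return hits

idx = DiscreteCodeIndex()
idx.insert("slide_A", [101, 205, 333])
idx.insert("slide_B", [205, 777])
print(sorted(idx.query([205])))  # prints ['slide_A', 'slide_B']
```

A candidate set retrieved this way would still need a ranking step (the paper uses an uncertainty-based ranking algorithm) to order results by relevance.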
Nature Biomedical Engineering, open-access article, 2022