Variational patches for high-resolution medical image segmentation with vision transformers
Author(s)
Enzhi Zhang | Hokkaido University
Isaac Lyngaas | Oak Ridge National Laboratory
Peng Chen | RIKEN Center for Computational Science
Xiao Wang | Oak Ridge National Laboratory
Yuankai Huo | Vanderbilt University
Masaharu Munetomo | RIKEN Center for Computational Science
Mohamed Wahib | RIKEN Center for Computational Science
Abstract
Attention-based models have become increasingly popular in image analytics tasks such as segmentation. Typically, images are divided into patches and fed into transformer encoders as linear sequences of tokens. However, for high-resolution images, such as microscopic pathology images, the quadratic growth of computational and memory requirements makes the use of attention-based models challenging, especially when smaller patch sizes are needed for accurate segmentation. Existing solutions involve either complex multi-resolution models or approximate attention mechanisms. In this work, we propose a novel approach inspired by Adaptive Mesh Refinement (AMR) methods used in high-performance computing. Our method adaptively selects patches based on image details, significantly reducing the number of patches while maintaining fine-grained segmentation accuracy. This pre-processing technique introduces minimal overhead and can be seamlessly integrated with any attention-based model. We demonstrate improved segmentation performance on real-world pathology datasets, achieving a geometric mean speedup of $6.9\times$ for images up to $64K^2$ resolution, utilizing up to $2,048$ GPUs, compared to state-of-the-art segmentation models.
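The abstract's core idea, refining patches only where the image has fine detail, mirrors quadtree-style Adaptive Mesh Refinement. The sketch below is a minimal illustration of that principle, not the authors' implementation: the variance criterion, threshold, and patch sizes are assumptions chosen for demonstration.

```python
import numpy as np

def adaptive_patches(img, top, left, size, min_size=8, var_thresh=0.01):
    """Quadtree-style patch selection: recursively split a square region
    into four quadrants while its pixel variance is high (fine detail);
    keep smooth regions as single large patches."""
    region = img[top:top + size, left:left + size]
    if size > min_size and region.var() > var_thresh:
        half = size // 2
        patches = []
        for dt in (0, half):
            for dl in (0, half):
                patches += adaptive_patches(img, top + dt, left + dl,
                                            half, min_size, var_thresh)
        return patches
    return [(top, left, size)]  # (row, col, side length) of one patch

# Example: a smooth 64x64 image with fine detail in one 16x16 corner.
np.random.seed(0)
img = np.zeros((64, 64), dtype=np.float32)
img[:16, :16] = np.random.rand(16, 16)
patches = adaptive_patches(img, 0, 0, 64)
# Only the detailed corner is refined down to 8x8 patches; a uniform
# 8x8 grid would produce 64 patches for the same image.
```

With this toy input the refinement yields 10 patches instead of the 64 a fixed 8x8 grid would generate, which is the kind of token-count reduction the paper exploits to keep attention cost manageable at high resolution.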
Description
Date and Location: 2/4/2025 | 11:00 AM - 11:20 AM | Regency A
Primary Session Chair:
Yuankai Huo | Vanderbilt University
Paper Number: HPCI-182