Research Topics | Haoyu Dong

Foundation Models

Foundation models are large-scale, pre-trained networks that have proven effective across a wide range of applications. My focus in this direction is twofold: (1) adapting foundation models pre-trained on natural images to the medical domain, and (2) developing medical-specific foundation models. For the first direction, we have evaluated the performance of SAM [PDF] and SAM2 [PDF] in medical applications. For the second direction, we have fine-tuned foundation models to segment bones [PDF] and muscles [ongoing], provided a guideline for fine-tuning foundation models [PDF], and are developing a new model that has shown promise for MRI-based segmentation (4% improvement in DSC on average compared to SAM-based fine-tuning) [ongoing].

Image Harmonization

Medical images of the same anatomical region can vary significantly in appearance due to differences in acquisition procedures, such as scanner type and protocol. One approach to addressing this variability is image harmonization, i.e., making images from different domains similar in appearance. Toward this goal, we have first developed a few intermediate steps, such as generating anatomically controllable images [PDF] and adapting networks to different domains at test time (2.9% improvement in DSC over the runner-up) [PDF].

Anomaly Detection

Label acquisition for medical images can be costly. Anomaly detection addresses this challenge by training networks solely on normal images and classifying unseen patterns as abnormalities. In this direction, I have developed two methods to enhance detection performance: (1) sliding-window partitioning (outperforming the runner-up method by 8.03% IoU) [PDF], and (2) pluralistic image completion [PDF].

Multi-Modal Learning

In this direction, I am particularly interested in the intersection of vision and language. I have developed a text-guided retrieval algorithm [PDF] and a multi-modal agent that surpasses GPT-4o [PDF].