Jing HAO

I'm pursuing my Ph.D. degree at the Faculty of Dentistry, the University of Hong Kong (ranked 2nd in the world), specializing in Medical AI and MLLMs, supervised by Prof. Kuo Feng Hung and Prof. James Kit Hon Tsoi.

Previously, I worked as a Computer Vision Engineer at Baidu VIS from 2022.07 to 2024.08. I received my M.S. degree from Huazhong University of Science and Technology (HUST, 2022) and my B.S. degree from China University of Mining and Technology (CUMT, 2020).

My research interests span computer vision, self-supervised pre-training, multimodal large language models (MLLMs), and AI4Science.


News

[Jun. 2026] Two papers have been accepted to MICCAI 2026 ! Many thanks to Jiamin Wu and Ming Hu.

[Feb. 2026] One paper OralGPT-Omni has been accepted to CVPR 2026. Congratulations 🎉🎉🎉

[Feb. 2026] One paper OralGPT-Plus has been accepted to CVPR 2026. Congratulations 🎉🎉🎉

[Feb. 2026] One paper Enhancing Descriptive Captions (EDC) has been accepted to CVPR 2026. Congratulations 🎉🎉🎉

[Nov. 2025] One paper HiFi-Mamba has been accepted to AAAI 2026 ! Many thanks to Yingxuan.

[Oct. 2025] One paper T-Mamba has been accepted to IEEE TMM !

[Sep. 2025] One paper MMOral & OralGPT has been accepted to NeurIPS 2025. Congratulations 🎉🎉🎉

[Jul. 2025] One paper has been accepted to DMFR (IF=4.1). Many thanks to Joe.

[Jun. 2025] One paper has been accepted to npj Digital Medicine (IF=15.1). Congratulations 🎉🎉🎉

[Nov. 2024] One paper SemiT-SAM has been accepted to MICCAI 2024 Workshop !

[Oct. 2024] One paper GEM has been accepted to IEEE TMM !

[Oct. 2024] I got 6th place in ToothFairy2 : Semi-supervised Teeth Segmentation, held at MICCAI 2024 !

[Sep. 2024] One paper FullAnno has been released on arXiv !

[Jul. 2024] One paper METR has been accepted to Neural Networks !

[Aug. 2023] I have received my IELTS scores 7(6) !

Publications

See full list at Google Scholar. (* indicates equal contribution, # indicates corresponding author)

	OralAgent: Integrating Reasoning, Tools, and Knowledge for Interactive Dental Image Analysis Jing Hao, Siyuan Dai, Yongxin Zhang, ..., Linlin Shen, Junjun He, Kuo Feng Hung [paper] [code] \| Under Review We present OralAgent, the first dental-specialized AI agent that unifies multimodal reasoning, tool-based decision-making, and knowledge-grounded retrieval within an end-to-end automated framework.
	OralGPT-Omni: A Versatile Dental Multimodal Large Language Model Jing Hao, Yuci Liang, Lizhuo Lin, ..., Linlin Shen, Kuo Feng Hung [paper] [code] \| CVPR, 2026 (CCF-A) We present OralGPT-Omni, the first dental-specialized MLLM designed for comprehensive and trustworthy analysis across diverse dental imaging modalities and clinical tasks. We also introduce MMOral-Uni, the first unified multimodal benchmark for dental image analysis.
	OralGPT-Plus: Learning to Use Visual Tools via Reinforcement Learning for Panoramic X-ray Analysis Yuxuan Fan, Jing Hao, Hong Chen, Jiahao Bao, ..., Hao Tang [paper] [code] \| CVPR, 2026 (CCF-A) We present OralGPT-Plus, an agentic vision–language model designed to perform iterative and symmetry-aware diagnostic reasoning for panoramic dental radiograph analysis.
	Towards Better Dental AI: A Multimodal Benchmark and Instruction Dataset for Panoramic X-ray Analysis Jing Hao, Yuxuan Fan, Yanpeng Sun, ..., Hao Tang, Kuo Feng Hung [paper] [Project Page] [code] \| NeurIPS, 2025 (CCF-A) We introduce MMOral, the first large-scale multimodal instruction dataset and benchmark tailored for panoramic X-ray interpretation. We also propose OralGPT, a multimodal vision-language model for panoramic X-ray analysis.
	Characteristics, licensing, and ethical considerations of openly accessible oral-maxillofacial imaging datasets: a systematic review Jing Hao, ..., Michael M. Bornstein, James Kit Hon Tsoi, Kuo Feng Hung [paper] npj Digital Medicine, 2025 (JCR Q1, IF=15.2) Open-source oral-maxillofacial imaging datasets were identified through electronic databases and dataset platforms. 105 datasets with 437538 images and 100 intraoral videos from patients across twenty-one countries were included.
	T-Mamba: A unified framework with Long-Range Dependency in dual-domain for 2D & 3D Tooth Segmentation Jing Hao, Yonghui Zhu, Lei He, Moyun Liu, Kuo Feng Hung [paper] [dataset] [code] \| IEEE Transactions on Multimedia (TMM), 2025 (JCR Q1, IF=9.7) T-Mamba is the first work to introduce frequency-based features into Vision Mamba. Its flexibility allows it to process both 2D and 3D tooth data without separate modules.
	GEM: Boost Simple Network for Glass Surface Segmentation via Vision Foundation Models Jing Hao, Moyun Liu, Jinrong Yang, Kuo Feng Hung [paper] [dataset] [code] \| IEEE Transactions on Multimedia (TMM), 2024 (JCR Q1, IF=9.7) The first work to approach glass surface segmentation by fully harnessing the capabilities of existing vision foundation models (VFMs).
	Language-aware Multiple Datasets Detection Pretraining for DETRs Jing Hao, Song Chen [paper] [code] \| Neural Networks, 2024 (JCR Q1, IF=6.3) A strong framework for using multiple datasets to pretrain DETR-like detectors without manual label-space integration.
	SemiT-SAM: Building a Visual Foundation Model for Tooth Instance Segmentation on Panoramic Radiographs Jing Hao, Moyun Liu, Lei He, Lei Yao, James Kit Hon Tsoi, Kuo Feng Hung [paper] [dataset] [code] \| MICCAI 2024 Workshop We participated in the “MICCAI STS 2024: Panoramic X-ray Images” challenge and ranked 6th among all submitted teams.
	FullAnno: A Data Engine for Enhancing Image Comprehension of MLLMs Jing Hao, Yuxiang Zhao, Song Chen, Yanpeng Sun, Qiang Chen, Jingdong Wang [paper] arXiv preprint We designed FullAnno, a data engine that automatically generates large-scale, high-quality, and fine-grained image caption datasets.
	A semi-supervised transformer-based deep learning framework for automated tooth segmentation and identification on panoramic radiographs Jing Hao, Lun M Wong, Qi Yong H. Ai, ..., James Kit Hon Tsoi, Kuo Feng Hung # [paper] [code] \| Diagnostics, 2024 (JCR Q1, IF=3.0) This study proposed a novel semi-supervised transformer-based framework designed for automated tooth segmentation and identification on panoramic radiographs.
	Simple Parameter-free Self-attention Approximation Yuwen Zhai, Jing Hao, Liang Gao, Xinyu Li, Yiping Gao, Shumin Han [paper] ICLR Tiny Paper, 2023 A self-attention approximation with no trainable parameters that captures global spatial features with linear complexity.
	A Stronger Stitching Algorithm for Fisheye Images based on Deblurring and Registration Jing Hao, Jingming Xie, Jinyuan Zhang, Moyun Liu [paper] IEEE Sensors Letters, 2023 (JCR Q3, IF=2.2) A stronger stitching algorithm for fisheye images by combining the traditional image processing method with deep learning.
	A Lightweight and Accurate Recognition Framework for Signs of X-ray Weld Images Moyun Liu, Jingming Xie, Jing Hao, Yang Zhang, Xuzhan Chen, Youping Chen [paper] Computers in Industry, 2022 (JCR Q1, IF=8.2) A sign recognition framework for weld images based on convolutional neural networks (CNNs).