Education

Shenzhen University

M.S. · Computer Science & Technology

2026.09 – 2029.06 Shenzhen, Guangdong

Guilin University of Electronic Technology

B.S. · Artificial Intelligence

2022.09 – 2026.06 Guilin, Guangxi

Research Interests

Diffusion Models · Flow Matching · LLM · VLM · MLLM · Medical Image Segmentation · Agent

News

Publications

MSConvFormer: A Multi-Scale Depthwise Convolution Transformer for Medical Time Series Classification

Author list to be updated.

Digital Signal Processing (JCR-Q2), under review.

Abstract

Abstract to be updated.

Hierarchical Progressive Cross-modal Information Interaction for Incomplete Multimodal Brain Tumor Segmentation

Author list to be updated.

MICCAI 2026 (CCF-B), under review.

Abstract

Abstract to be updated.

DMKformer: A Dual-Path Routing Transformer with Hybrid Mamba and KAN for Medical Time Series

Zhizun Zeng, Qiuyuan Gan, Wei Zhang, Yujie Li

Signal, Image and Video Processing, 2026. (JCR-Q3)

Abstract

The Transformer has shown great potential in modeling global dependencies in sequential data, and research on Transformer-based variant architectures for medical time series analysis has attracted growing attention. However, challenges remain in capturing complex local patterns and long-term dependencies in medical time series due to their high-dimensional nonlinearities and temporal correlations. To address this, we propose DMKformer, a Transformer model that incorporates a new feature fusion strategy named 2R-Attention. Specifically, building on the original Transformer, we adopt STAR and the self-attention structure to enhance the ability of the attention module to capture and use long-term dependency information in the global signal. Meanwhile, a one-dimensional DWC module is introduced to improve local information capture and reduce overfitting. Additionally, we employ Mamba to optimize the FFN for long sequences and use a KAN network to strengthen classification capabilities. We conduct extensive experiments on three publicly available datasets under both subject-independent and subject-dependent settings. The experimental results demonstrate that our method outperforms 11 benchmark models, achieving an average F1 score improvement of 2%, which validates its effectiveness.

Gaze Estimation based on Visual State Space Model with Hybrid Features

Yujie Li, Rongjie Liu, Zhizun Zeng, Ziwen Wang, Yuhang Hong, Benying Tan

ACM Transactions on Sensor Networks, 2025. (JCR-Q1/CCF-B)

Abstract

Visual State Space Model (denoted as VMamba), a vision model proposed to introduce Mamba into computer vision, has shown strong performance in recent work on computer vision tasks. However, its performance in gaze estimation remains to be explored. In this paper, we propose two VMamba-based gaze estimation approaches: GazeVM-Pure, based on pure VMamba, and GazeVM-Hybrid, based on hybrid VMamba. GazeVM-Pure estimates gaze direction using the original VMamba structure. GazeVM-Hybrid combines a Convolutional Neural Network (CNN) with VMamba, where the Visual State Space (VSS) Block (the core module of VMamba) is used as a complementary component of the CNN. In GazeVM-Hybrid, the convolutional layers of ResNet-34 learn local feature maps from face images, and the VSS Block captures global relations from those feature maps. The experimental results show that GazeVM-Hybrid outperforms existing state-of-the-art techniques, with a decrease of nearly 0.11 in angle error compared with the Static Transformer Temporal Differential Network (STTDN) on the EyeDiap dataset.

GA-YOLOv8: A Road Defect Detection Model Based on Improved YOLOv8

Zhizun Zeng, Rongjie Liu, Yonghang Huang, Yujie Li, Benying Tan

VSIP 2025. (EI-Core/ACM)

Abstract

Timely and accurate detection of pavement defects is essential for maintaining road safety. Traditional manual detection methods are labor-intensive, costly, and inefficient. To address the challenges posed by varying defect scales, complex backgrounds, and subtle visibility of defects, this paper proposes an enhanced pavement defect detection model, GA-YOLOv8s, based on YOLOv8s. Our approach contains several key improvements. First, we add a P6 layer to YOLOv8s, introducing a multi-scale detection strategy that enables the model to capture deeper semantic features and improve the detection of small targets such as cracks. Second, we propose the C2f-GDC module, which maintains the model’s ability to extract fine-grained features, especially for small defects such as cracks, while reducing GFLOPs. Finally, we integrate the Adaptive Graphic Channel Attention (AGCA) mechanism to improve cross-layer feature fusion and effectively balance deep semantic information and shallow geometric details. Experimental results show that the mAP value is improved by 3.5% compared to the YOLOv8s model. Compared with other algorithms, the proposed method performs better in detecting small and mild damage.

MamKanformer: A hybrid Mamba and KAN model based on Medformer for Medical Time-Series Classification

Zhizun Zeng, Qiuyuan Gan, Zhoukun Yan, Wei Zhang

CAIT 2024. (EI-Core/IEEE)

Abstract

Rapid advances in sensing technology and artificial intelligence have created many new demands and challenges in medical time series analysis. Notably, effectively extracting and using long-term and short-term dependencies, along with the features at various granularities contained in high-dimensional multivariate time series, has emerged as a pressing issue. To address this, we propose MamKanformer, a deep learning model for multichannel EEG signal processing. The model first uses the Mamba architecture to handle long- and short-term dependencies, which the feed-forward network in the Transformer struggles to model effectively. KAN is then used to capture critical feature information at different granularities in the time series data and thereby enhance the output of the model. Finally, extensive experiments on three public datasets verify the efficacy of the proposed method. Specifically, MamKanformer outperforms the other 11 baselines, achieving an average top ranking across all six metrics on the three datasets.

Awards & Honors

Academic Excellence

  • GPA 3.84 / 5

    Top 2% of major

  • First-Class University Scholarship

    Annual award

Competitions

  • Huawei ICT Competition · Ascend AI Track

    Global · First Prize

  • American College Mathematical Modeling Competition

    Honorable Mention

  • Challenge Cup "Cluster Assault" Virtual-Real Linkage Competition

    National · Second Prize

  • "Blue Bridge Cup" National Software Competition (Group B)

    National · Third Prize

  • National Mathematical Modeling Contest

    National · Second Prize

  • National University Mathematics Competition

    Provincial · First Prize