About - Zhizun Zeng

Education

Shenzhen University

M.S. · Computer Science & Technology

2026.09 – 2029.06 Shenzhen, Guangdong

Guilin University of Electronic Technology

B.S. · Artificial Intelligence

2022.09 – 2026.06 Guilin, Guangxi

Research Interests

Diffusion Models · Flow Matching · LLM · VLM · MLLM · Medical Image Segmentation · Agent

News

Publications

MSConvFormer: A Multi-Scale Depthwise Convolution Transformer for Medical Time Series Classification

Author list to be updated.

Digital Signal Processing (JCR-Q2), under review.

Abstract

Abstract to be updated.

BibTeX

@article{msconvformer2026,
  title = {MSConvFormer: A Multi-Scale Depthwise Convolution Transformer for Medical Time Series Classification},
  author = {To be updated},
  journal = {Digital Signal Processing},
  year = {2026},
  note = {Under review}
}

Hierarchical Progressive Cross-modal Information Interaction for Incomplete Multimodal Brain Tumor Segmentation

Author list to be updated.

MICCAI 2026 (CCF-B), under review.

Abstract

Abstract to be updated.

BibTeX

@inproceedings{hierarchicalprogressive2026,
  title = {Hierarchical Progressive Cross-modal Information Interaction for Incomplete Multimodal Brain Tumor Segmentation},
  author = {To be updated},
  booktitle = {MICCAI},
  year = {2026},
  note = {Under review}
}

DMKformer: A Dual-Path Routing Transformer with Hybrid Mamba and KAN for Medical Time Series

Zhizun Zeng, Qiuyuan Gan, Wei Zhang, Yujie Li

Signal, Image and Video Processing, 2026. (JCR-Q3)

Abstract

Transformers have shown great potential in modeling global dependencies in sequential data, and research on Transformer-based variant architectures for medical time series analysis has attracted growing attention. However, challenges remain in capturing complex local patterns and long-term dependencies in medical time series due to their high-dimensional nonlinearities and temporal correlations. To address this, we propose DMKformer, a Transformer model that incorporates a new feature fusion strategy named 2R-Attention. Specifically, building on the original Transformer, we adopt STAR and the self-attention structure to enhance the attention module's ability to capture and use long-term dependency information in the global signal. Meanwhile, a one-dimensional DWC module is introduced to improve local information capture and reduce overfitting. Additionally, we employ Mamba to optimize the FFN for long sequences and use a KAN network to strengthen classification capabilities. We conduct extensive experiments on three publicly available datasets under both subject-independent and subject-dependent settings. The experimental results demonstrate that our method outperforms 11 benchmark models, achieving an average F1 score improvement of 2%, which validates its effectiveness.

BibTeX

@article{Zeng2026DmkformerAD,
  title={{DMKformer}: A Dual-Path Routing {Transformer} with Hybrid {Mamba} and {KAN} for Medical Time Series},
  author={Zhizun Zeng and Qiuyuan Gan and Wei Zhang and Yujie Li},
  journal={Signal, Image and Video Processing},
  year={2026},
  url={https://api.semanticscholar.org/CorpusID:286478032}
}

Gaze Estimation based on Visual State Space Model with Hybrid Features

Yujie Li, Rongjie Liu, Zhizun Zeng, Ziwen Wang, Yuhang Hong, Benying Tan

ACM Transactions on Sensor Networks, 2025. (JCR-Q1/CCF-B)

Abstract

Visual State Space Model (denoted as VMamba), a vision model proposed to bring Mamba into computer vision, has shown strong performance in recent work on computer vision tasks. However, the performance of VMamba in gaze estimation remains unexplored. In this paper, we propose two VMamba-based gaze estimation approaches: GazeVM-Pure, based on pure VMamba, and GazeVM-Hybrid, based on hybrid VMamba. GazeVM-Pure estimates gaze direction using the original VMamba structure. GazeVM-Hybrid combines a Convolutional Neural Network (CNN) with VMamba, where the Visual State Space (VSS) Block (the core module of VMamba) serves as a complementary component of the CNN. In GazeVM-Hybrid, the convolutional layers of ResNet-34 learn local feature maps from face images, and the VSS Block captures global relations from those feature maps. The experimental results show that GazeVM-Hybrid outperforms existing state-of-the-art techniques, reducing angle error by nearly 0.11 compared with the Static Transformer Temporal Differential Network (STTDN) on the EyeDiap dataset.

BibTeX

@article{10.1145/3765746,
author = {Li, Yujie and Liu, Rongjie and Zeng, Zhizun and Wang, Ziwen and Hong, Yuhang and Tan, Benying},
title = {Gaze Estimation Based on Visual State Space Model with Hybrid Features},
year = {2025},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
issn = {1550-4859},
url = {https://doi.org/10.1145/3765746},
doi = {10.1145/3765746},
note = {Just Accepted},
journal = {ACM Trans. Sen. Netw.},
month = sep,
keywords = {Gaze Estimation, State Space Model, VMamba, Hybrid Model}
}

GA-YOLOv8: A Road Defect Detection Model Based on Improved YOLOv8

ZhiZun Zeng, RongJie Liu, YongHang Huang, YuJie Li, BenYing Tan

VSIP 2025. (EI-Core/ACM)

Abstract

Timely and accurate detection of pavement defects is essential for maintaining road safety. Traditional manual detection methods are labor-intensive, costly, and inefficient. To address the challenges posed by various defect scales, complex backgrounds, and subtle visibility of defects, this paper proposes an enhanced pavement defect detection model, GA-YOLOv8s, based on YOLOv8s. Our approach contains several key improvements. First, we add an additional P6 layer to YOLOv8s, thus introducing a multi-scale detection strategy that enables the model to capture deeper semantic features and improve the detection of small targets such as cracks. Second, we propose the C2f-GDC module, which maintains the model’s ability to extract fine-grained features, especially for small defects such as cracks, while reducing GFLOPs. Finally, we integrate the Adaptive Graphic Channel Attention (AGCA) mechanism to improve cross-layer feature fusion to effectively balance deep semantic information and shallow geometric details. Experimental results show that the mAP value is improved by 3.5% compared to the YOLOv8s model. Compared with other algorithms, the method proposed in this paper performs better in small and mild damage detection.

BibTeX

@inproceedings{10.1145/3708568.3708574,
author = {Zeng, ZhiZun and Liu, RongJie and Huang, YongHang and Li, YuJie and Tan, BenYing},
title = {GA-YOLOv8: A Road Defect Detection Model Based on Improved YOLOv8},
year = {2025},
isbn = {9798400709647},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3708568.3708574},
doi = {10.1145/3708568.3708574},
booktitle = {Proceedings of the 2024 6th International Conference on Video, Signal and Image Processing},
pages = {36--41},
numpages = {6},
keywords = {Object Detection, Cross-Layer Feature Fusion, C2f-GDC Module, Adaptive Graphic Channel Attention (AGCA)},
series = {VSIP '24}
}

MamKanformer: A hybrid Mamba and KAN model based on Medformer for Medical Time-Series Classification

Zhizun Zeng, Qiuyuan Gan, Zhoukun Yan, Wei Zhang

CAIT 2024. (EI-Core/IEEE)

Abstract

Rapid advances in sensing technology and artificial intelligence have created many new demands and challenges in medical time series analysis. In particular, effectively extracting and using the long-term and short-term dependencies, along with the multi-granularity features contained in high-dimensional multivariate time series, has become a pressing problem. To address this, we propose MamKanformer, a deep learning model for multichannel EEG signal processing. The model first uses the Mamba architecture to capture long- and short-term dependencies that the feed-forward layer in the Transformer struggles to model effectively. KAN is then used to capture critical features at different granularities in the time series data and to enhance the model's output. Extensive experiments on three public datasets verify the effectiveness of the proposed method: MamKanformer outperforms 11 baselines and achieves the top average ranking across all six metrics on the three datasets.

BibTeX

@INPROCEEDINGS{10962940,
  author={Zeng, Zhizun and Gan, Qiuyuan and Yan, Zhoukun and Zhang, Wei},
  booktitle={2024 5th International Conference on Computers and Artificial Intelligence Technology (CAIT)}, 
  title={MamKanformer: A hybrid Mamba and KAN model based on Medformer for Medical Time-Series Classification}, 
  year={2024},
  pages={212--217},
  keywords={Deep learning;Time series analysis;Brain modeling;Transformers;Feature extraction;Electroencephalography;Stability analysis;Feeds;Artificial intelligence;Signal resolution;EEG;multivariate time series;Transformer;Mamba;Kan},
  doi={10.1109/CAIT64506.2024.10962940}}

Awards & Honors

Academic Excellence

GPA 3.84 / 5: Top 2% of major

First-Class University Scholarship: Annual award

Competitions

Huawei ICT Competition · Ascend AI Track: Global · First Prize

American College Mathematical Modeling Competition: Honorable Mention

Challenge Cup "Cluster Assault" Virtual-Real Linkage Competition: National · Second Prize

"Blue Bridge Cup" National Software Competition (Group B): National · Third Prize

National Mathematical Modeling Contest: National · Second Prize

National University Mathematics Competition: Provincial · First Prize
