Shenzhen University
M.S. · Computer Science & Technology
B.S. · Artificial Intelligence
Digital Signal Processing (JCR-Q2), under review.
Abstract to be updated.
@article{msconvformer2026,
title = {MSConvFormer: A Multi-Scale Depthwise Convolution Transformer for Medical Time Series Classification},
author = {To be updated},
journal = {Digital Signal Processing},
year = {2026},
note = {Under review}
}
MICCAI 2026 (CCF-B), under review.
Abstract to be updated.
@inproceedings{hierarchicalprogressive2026,
title = {Hierarchical Progressive Cross-modal Information Interaction for Incomplete Multimodal Brain Tumor Segmentation},
author = {To be updated},
booktitle = {MICCAI},
year = {2026},
note = {Under review}
}
Signal, Image and Video Processing, 2026. (JCR-Q3)
The Transformer has shown great potential in modeling global dependencies in sequential data, and research on Transformer-based variant architectures for medical time series analysis has attracted growing attention. However, challenges remain in capturing complex local patterns and long-term dependencies in medical time series due to their high-dimensional nonlinearities and temporal correlations. To address this, we propose DMKformer, a Transformer model that incorporates a new feature fusion strategy named 2R-Attention. Specifically, building on the original Transformer, we adopt STAR and the self-attention structure to enhance the attention module's ability to capture and use long-term dependencies in the global signal. Meanwhile, a one-dimensional DWC module is introduced to improve local information capture and reduce overfitting. Additionally, we employ Mamba to optimize the FFN for long sequences and use a KAN network to strengthen classification capability. We conduct extensive experiments on three publicly available datasets under both subject-independent and subject-dependent settings. The experimental results demonstrate that our method outperforms 11 benchmark models, achieving an average F1 score improvement of 2%, validating its effectiveness.
@article{Zeng2026DmkformerAD,
title={Dmkformer: a dual-path routing Transformer with hybrid Mamba and KAN for medical time series},
author={Zhizun Zeng and Qiuyuan Gan and Wei Zhang and Yujie Li},
journal={Signal, Image and Video Processing},
year={2026},
url={https://api.semanticscholar.org/CorpusID:286478032}
}
ACM Transactions on Sensor Networks, 2025. (JCR-Q1/CCF-B)
Visual State Space Model (denoted as VMamba), a vision-based model proposed to introduce Mamba into computer vision, has shown strong performance in recent work on computer vision tasks. However, the performance of VMamba in gaze estimation is still to be explored. In this paper, we propose two VMamba-based gaze estimation approaches: GazeVM-Pure based on pure VMamba and GazeVM-Hybrid based on hybrid VMamba. GazeVM-Pure is used to estimate gaze direction according to the original VMamba structure. GazeVM-Hybrid combines Convolutional Neural Network (CNN) and VMamba, where the Visual State Space (VSS) Block (the core module of VMamba) is used as a complementary component of the CNN. In GazeVM-Hybrid, the convolutional layers of ResNet-34 are used to learn local feature maps from face images, and VSS Block is used to capture global relations from feature maps. The experimental results show that GazeVM-Hybrid exhibits superior performance compared with existing state-of-the-art techniques with a nearly 0.11 decrease in angle error compared with Static Transformer Temporal Differential Network (STTDN) on the EyeDiap dataset.
@article{10.1145/3765746,
author = {Li, Yujie and Liu, Rongjie and Zeng, Zhizun and Wang, Ziwen and Hong, Yuhang and Tan, Benying},
title = {Gaze Estimation Based on Visual State Space Model with Hybrid Features},
year = {2025},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
issn = {1550-4859},
url = {https://doi.org/10.1145/3765746},
doi = {10.1145/3765746},
note = {Just Accepted},
journal = {ACM Trans. Sen. Netw.},
month = sep,
keywords = {Gaze Estimation, State Space Model, VMamba, Hybrid Model}
}
VSIP 2025. (EI-Core/ACM)
Timely and accurate detection of pavement defects is essential for maintaining road safety. Traditional manual detection methods are labor-intensive, costly, and inefficient. To address the challenges posed by various defect scales, complex backgrounds, and subtle visibility of defects, this paper proposes an enhanced pavement defect detection model, GA-YOLOv8s, based on YOLOv8s. Our approach contains several key improvements. First, we add an additional P6 layer to YOLOv8s, thus introducing a multi-scale detection strategy that enables the model to capture deeper semantic features and improve the detection of small targets such as cracks. Second, we propose the C2f-GDC module, which maintains the model’s ability to extract fine-grained features, especially for small defects such as cracks, while reducing GFLOPs. Finally, we integrate the Adaptive Graphic Channel Attention (AGCA) mechanism to improve cross-layer feature fusion to effectively balance deep semantic information and shallow geometric details. Experimental results show that the mAP value is improved by 3.5% compared to the YOLOv8s model. Compared with other algorithms, the method proposed in this paper performs better in small and mild damage detection.
@inproceedings{10.1145/3708568.3708574,
author = {Zeng, ZhiZun and Liu, RongJie and Huang, YongHang and Li, YuJie and Tan, BenYing},
title = {GA-YOLOv8: A Road Defect Detection Model Based on Improved YOLOv8},
year = {2025},
isbn = {9798400709647},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3708568.3708574},
doi = {10.1145/3708568.3708574},
booktitle = {Proceedings of the 2024 6th International Conference on Video, Signal and Image Processing},
pages = {36--41},
numpages = {6},
keywords = {Object Detection, Cross-Layer Feature Fusion, C2f-GDC Module, Adaptive Graphic Channel Attention (AGCA)},
series = {VSIP '24}
}
CAIT 2024. (EI-Core/IEEE)
Rapid advances in sensing technology and artificial intelligence have created many new demands and challenges in medical time series analysis. Notably, effectively extracting and using the long-term and short-term dependencies, along with the features at various granularities, contained in high-dimensional multivariate time series has become a pressing issue. To address this, we propose MamKanformer, a deep learning model for multichannel EEG signal processing. The model first uses the Mamba architecture to handle long- and short-term dependencies, which are difficult for the feed-forward layer in the Transformer to model effectively. KAN is then used to capture critical feature information at different granularities in the time series data and thereby enhance the model's output. Finally, extensive experiments on three public datasets verify the effectiveness of the proposed method. Specifically, MamKanformer outperforms the other 11 baselines, achieving the top average ranking across all six metrics on the three datasets, which attests to the capability of the proposed method.
@INPROCEEDINGS{10962940,
author={Zeng, Zhizun and Gan, Qiuyuan and Yan, Zhoukun and Zhang, Wei},
booktitle={2024 5th International Conference on Computers and Artificial Intelligence Technology (CAIT)},
title={MamKanformer: A hybrid Mamba and KAN model based on Medformer for Medical Time-Series Classification},
year={2024},
pages={212--217},
keywords={Deep learning;Time series analysis;Brain modeling;Transformers;Feature extraction;Electroencephalography;Stability analysis;Feeds;Artificial intelligence;Signal resolution;EEG;multivariate time series;Transformer;Mamba;Kan},
doi={10.1109/CAIT64506.2024.10962940}}
GPA 3.84 / 5
Top 2% of major
First-Class University Scholarship
Annual award
Huawei ICT Competition · Ascend AI Track
Global · First Prize
American College Mathematical Modeling Competition
Honorable Mention
Challenge Cup "Cluster Assault" Virtual-Real Linkage Competition
National · Second Prize
"Blue Bridge Cup" National Software Competition (Group B)
National · Third Prize
National Mathematical Modeling Contest
National · Second Prize
National University Mathematics Competition
Provincial · First Prize