Citation: Di Yu, Xue Yang. Using Homotopy Multi-Hierarchical Encoder Representation from Transformers (HMHERT) for Time Series Chaos Classification[J]. Journal of Applied Analysis & Computation, 2026, 16(1): 246-269. doi: 10.11948/20250101
The Transformer architecture, renowned for its exceptional capacity to process long-sequence data, inspires our framework, which leverages the self-attention mechanism to classify time series through relational analysis between sequence elements. In this article, we propose a Homotopy Multi-Hierarchical Encoder Representation from Transformers (HMHERT) for chaotic/non-chaotic sequence classification. Empirical investigation of the linear combination coefficients in multi-head attention reveals that constrained homotopy coefficients significantly enhance model performance, with homotopy-constrained configurations outperforming their unconstrained counterparts. Through systematic comparative analysis of confusion matrices, classification accuracy, F1-scores, and the Matthews Correlation Coefficient (MCC), HMHERT exhibits significantly enhanced generalization, outperforming conventional models including Time-Delayed Reservoir Computing (RC), the Fully Connected Neural Network (FCNN), Long Short-Term Memory (LSTM), and the Convolutional Neural Network (CNN) by 0.5097 to 0.9204 in MCC. Furthermore, compared to the baseline Transformer encoder architecture, HMHERT achieves a performance improvement, demonstrating the critical role of the proposed architectural modifications in chaotic pattern recognition.
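The section does not include source code; as a rough illustration of the homotopy-constrained linear combination of attention heads described above, the sketch below merges per-head outputs through coefficients forced to be non-negative and to sum to one (via a softmax). Class and parameter names such as HomotopyMultiHeadAttention and head_logits are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class HomotopyMultiHeadAttention(nn.Module):
    """Multi-head self-attention whose heads are merged by a convex
    (homotopy-style) combination: coefficients >= 0 and summing to 1."""

    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        assert d_model % n_heads == 0
        self.n_heads = n_heads
        self.d_head = d_model // n_heads
        self.qkv = nn.Linear(d_model, 3 * d_model)   # joint Q, K, V projection
        self.out = nn.Linear(d_model, d_model)
        # Unconstrained logits; the softmax below yields the constrained coefficients.
        self.head_logits = nn.Parameter(torch.zeros(n_heads))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        B, T, _ = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)

        def split(t):
            # (B, T, d_model) -> (B, n_heads, T, d_head)
            return t.view(B, T, self.n_heads, self.d_head).transpose(1, 2)

        q, k, v = split(q), split(k), split(v)
        attn = torch.softmax(q @ k.transpose(-2, -1) / self.d_head ** 0.5, dim=-1)
        heads = attn @ v                              # (B, n_heads, T, d_head)
        # Homotopy-constrained coefficients: non-negative and summing to one.
        coeff = F.softmax(self.head_logits, dim=0)    # (n_heads,)
        weighted = coeff.view(1, -1, 1, 1) * heads    # weight each head's output
        # Concatenate the weighted heads and project back to d_model.
        merged = weighted.transpose(1, 2).reshape(B, T, self.n_heads * self.d_head)
        return self.out(merged)
```

A layer built this way can stand in for a standard multi-head attention block, e.g. HomotopyMultiHeadAttention(64, 4) applied to a (batch, sequence, feature) tensor; the unconstrained counterpart mentioned in the abstract would correspond to weighting the heads with free coefficients instead of their softmax.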
Homotopy multi-hierarchical multi-head attention
HMHERT model architecture
Trajectory of OSC: (a) phase-space plot, (b) time-domain plot
Trajectory of DOSC: (a) phase-space plot, (b) time-domain plot
Trajectory of IOSC: (a) phase-space plot, (b) time-domain plot
Trajectory of QPS_1: Time-domain plot
Trajectory of QPS_2: Time-domain plot
Trajectory of DS_1: (a) phase-space plot, (b) time-domain plot
Trajectory of DS_2: (a) phase-space plot, (b) time-domain plot
Trajectory of CHA_1: (a) phase-space plot, (b) time-domain plot
Trajectory of CHA_2: (a) phase-space plot, (b) time-domain plot
Trajectory of CHA_3: (a) phase-space plot, (b) time-domain plot
Training and validation loss curves of HMHERT's multi-hierarchical multi-head attention mechanism under different homotopy coefficients. The x-axis represents the training epoch and the y-axis the value of the loss function, plotted on a logarithmic scale
Training and validation loss curves of the different neural networks (Time-Delayed RC is trained differently from the other networks, so no loss curve is shown for it). The x-axis represents the training epoch and the y-axis the value of the loss function, plotted on a logarithmic scale
Training and validation loss curves for HMHERT under different training methods, together with the curves obtained when the multi-hierarchical multi-head attention coefficients are unconstrained. The x-axis represents the training epoch and the y-axis the value of the loss function, plotted on a logarithmic scale
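The three captions above share one plotting convention: epochs on the x-axis and loss on a logarithmic y-axis. A minimal sketch of that convention, assuming the per-epoch loss histories are already available as Python sequences (the function name plot_losses is illustrative):

```python
import matplotlib.pyplot as plt


def plot_losses(train_loss, val_loss):
    """train_loss, val_loss: per-epoch loss values (assumed to be given)."""
    epochs = range(1, len(train_loss) + 1)
    plt.plot(epochs, train_loss, label="training loss")
    plt.plot(epochs, val_loss, label="validation loss")
    plt.yscale("log")                 # logarithmic loss axis, as in the figures
    plt.xlabel("training epoch")
    plt.ylabel("loss (log scale)")
    plt.legend()
    plt.show()
```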