Citation: Amina Benabid, Dan Su, Dao-Hong Xiang. LARGE MARGIN UNIFIED MACHINES WITH NON-I.I.D. PROCESS[J]. Journal of Applied Analysis & Computation, 2022, 12(5): 2110-2132. doi: 10.11948/20220222

LARGE MARGIN UNIFIED MACHINES WITH NON-I.I.D. PROCESS

  • Corresponding author: Email: daohongxiang@zjnu.cn (D. Xiang)
  • Fund Project: The authors are partially supported by the National Natural Science Foundation of China (Nos. 11871438 and U20A2068).
  • In this paper, we investigate the convergence theory of large margin unified machines (LUMs) under non-i.i.d. sampling. We decompose the total error into sample error, regularization error and drift error; the drift error arises from the non-identical sampling. Under suitable mixing conditions, sequences of independent blocks are constructed so that the analysis of the dependent samples reduces to that of independent blocks (a sketch of this construction is given after this list). To handle the non-identical sampling, we further assume polynomial convergence of the marginal distributions. A novel projection operator is introduced to overcome the technical difficulty caused by the unboundedness of the target function. Learning rates are derived explicitly under mild conditions on the approximation ability and the capacity of the reproducing kernel Hilbert space.

    MSC: 68Q32, 41A46
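For context, the LUM family of losses studied here is, following Liu, Zhang and Wu (2011), commonly parametrized by $a > 0$ and $c \ge 0$; the form below is a sketch of that standard parametrization, and the paper's exact notation may differ:

$$V(u) = \begin{cases} 1 - u, & u < \dfrac{c}{1+c}, \\[1ex] \dfrac{1}{1+c}\left(\dfrac{a}{(1+c)u - c + a}\right)^{a}, & u \ge \dfrac{c}{1+c}, \end{cases}$$

a family that interpolates between soft, DWD-type classifiers and, as $c \to \infty$, the hinge loss of the support vector machine.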
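The independent-block device mentioned in the abstract goes back to Yu (1994): a dependent sample of length $T$ is cut into alternating blocks of length $b$, and under a mixing condition the odd-numbered blocks, taken together, can be analysed as if they formed an independent sample, up to a correction controlled by the mixing coefficient at lag $b$. The Python sketch below shows only the bookkeeping of this construction (the names odd_even_blocks and block_length are illustrative, not the paper's notation); the probabilistic comparison itself is where the mixing assumption enters.

    import numpy as np

    def odd_even_blocks(sample, block_length):
        """Split a dependent sample into alternating blocks.

        Under a mixing condition, the collection of odd-numbered blocks
        can be compared to an i.i.d. collection of blocks, up to a term
        controlled by the mixing coefficient at lag block_length.
        """
        n_blocks = len(sample) // block_length  # drop an incomplete tail block
        blocks = np.array_split(sample[: n_blocks * block_length], n_blocks)
        # Odd blocks are kept for the analysis; even blocks absorb the dependence.
        return blocks[0::2], blocks[1::2]

    # Toy usage: an AR(1) sequence is mixing, so the construction applies.
    rng = np.random.default_rng(0)
    x = np.zeros(1000)
    for t in range(1, 1000):
        x[t] = 0.5 * x[t - 1] + rng.standard_normal()
    odd_blocks, even_blocks = odd_even_blocks(x, block_length=20)
    print(len(odd_blocks), len(even_blocks))  # 25 and 25

In the error analysis, the block length $b$ is chosen to balance the effective sample size $T/(2b)$ against the mixing correction, which is how the learning rates come to depend on the mixing conditions.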
