CPQ Medicine (2021) 11:6
Research Article

Analysis of Excess Consumption in College Students by KPAG Method During the COVID-19 Pandemic


Xia Jiang1 & Bin Zhao2*

1Hubei University of Technology, Hospital, University in Wuhan, P.R. China
2School of Science, Hubei University of Technology, University in Wuhan, P.R. China

*Correspondence to: Dr. Bin Zhao, School of Science, Hubei University of Technology, University in Wuhan, P.R. China.

Copyright © 2021 Dr. Bin Zhao, et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Received: 07 April 2021
Published: 22 April 2021

Keywords: Data Science in Education; Consumption Risk; KPAG Algorithm; Early-Warning Model


Abstract

With the arrival of COVID-19, some areas are under closed management, bringing about changes in the way people consume. It also leads to the excessive consumption of some people, especially college students. In order to give early warning to unreasonable consumption behavior, this study designed KPAG algorithm to give early warning to consumption risk. Using particle swarm optimization (PSO) kernel principal component analysis (KPCA) parameter optimization, optimal polynomial kernel to delete data information, and ant colony genetic algorithm (association) clustering analysis of data dimensionality reduction, according to the consumption behavior of college students are divided into three categories, for the consumption behavior of college students to build an early warning model. Through the classification and verification experiment of real data, the results show that compared with the traditional PCA data fitting method, the accuracy of the model in this paper can reach 90%, which is more reliable than the traditional algorithm, and the accuracy of the model is improved by nearly 20%, which can be used for effective early warning.

Introduction
With the global outbreak of COVID-19 in 2019, the way people shop has partly changed. College students, as a special group of consumer groups, have the characteristics of low economic burden and high consumption power, which should be paid more attention to. At present, studies have proved that college students are more inclined to consume online than ordinary consumers (Kuswanto Heri, 2010), and online shopping festivals are an important factor affecting consumer behavior [1]. Moreover, with the closed management brought by the epidemic, the trend of college students’ online consumption and unhealthy consumption has also increased. Bollen Zoé (2021) [2] conducted a survey among Belgian college students and found that some college students consumed more alcohol during the closed management period, which required the government to pay attention and give early warning. Therefore, it is very necessary to evaluate the consumption behavior of college students and give early warning to unreasonable behaviors in the period of closed management, so as to prevent a series of campus risks such as excessive campus loans. In terms of research methods, many scholars used to establish logistic regression model to study the main factors affecting consumer consumption [3]. However, there is a high correlation between the related factors affecting college students’ consumption behavior, and in most cases, it is impossible to describe and analyze college students’ online consumption directly by using the original measurement indicators. Among many current research methods, PCA elimination is a common method to eliminate multicollinearity [4]. In view of the disadvantage that PCA cannot deal with nonlinear problems, some scholars have found that introducing kernel function for processing can obtain higher precision processing effect [5]. Some studies have established an excellent economic management model based on KPCA to solve the problem of information entanglement between data [6,7]. Nadia Souilem (2017) [8] designed an appropriate pre-filter for the optimization of the core function of KPCA algorithm. However, all the above studies only processed the data and lacked the continuous evaluation and classification of the corresponding processed objects. In addition, KPCA still has the disadvantages of setting parameters and difficulty in evaluating and classifying indexes. Ag Abo Khalil (2020) [9] found that the optimization algorithm could obtain better results in parameter optimization of kernel function. Zhao Min (2009) [10] designed a cultural particle swarm optimization algorithm. Rongyi Li (2010) [11] used particle swarm optimization to select kernel function parameters. The above research provides ideas for parameter optimization in this paper. The establishment of academic early warning model based on KPCA (Zhen Junling, 2015) [12] and fault detection model (Xiao Yiran, 2017) [13] also provides an important reference for the establishment of cluster early warning model in this study.

In this study, combined with the research content, the traditional PCA - linear regression is carried out on the consumption of evaluation model of optimization: three times by introducing kernel function solve the problem of nonlinear data processing, the introduction of optimization algorithm for the optimal kernel parameters optimization solve the defect of the parameters need to be set in combination with clustering algorithm to process the data and implement evaluation and classification of the early warning function.

KPAG-Method
Aiming at the problems of the traditional consumption warning model based on PA-Logistics, such as single warning direction, poor data dimension reduction effect, multiple fitting factors and low accuracy, this paper creatively designed the KPAG method. First of all, referred to the dimensionality reduction data processing idea of the kernel method. The sample data set was established, default kernel parameters were set, and the sample set was mapped to a higher dimension by the kernel function. First reference nuclear dimension reduction of the data processing methods, we established X sample data set and design for the default nuclear σ, then using the kernel function K to higher dimensional mapping of sample set, Φ(xi) sample data to feature space is obtained, then the principal component is obtained by principal component analysis and the sample on the principal component characteristic vector projection μ̅i, finally calculated feature space sample data within the class of discrete degree of w and discrete degree b’s difference value between classes. The default kernel parameters are judged according to the characteristic accumulative value of the first three principal components. If the requirements are not met, the objective function J will be established, and the difference between w and w will be taken as the fitness function J(σ), and the particle swarm optimization algorithm will be used to iteratively solve the optimal nuclear parameters σ. KPCA processing is carried out with σ. Finally, by referring to the idea of ant colony clustering algorithm, the early warning objective function J(w,c) is established through characteristic parameters, the uniform two-point crossover operator is introduced to update the information matrix, and the early warning result is obtained through iterative calculation. The algorithm flow chart is shown in Figure 1.


Figure 1: Algorithm flow chart

Modeling and Derivation
In the derivation, the original consumption data set is represented by X, and xi is the K-dimensional consumption impact factor. Remember that the space of KxN dimension vector is the input space, and the nonlinear mapping Φ is used to map the data sample X ={x1,x2 ,... x3} to the characteristic space H. Then, the high-dimensional consumption data set Φ(x) ={φ1, φ2,...,φ3} is obtained. By default, the high-dimensional consumption data set in the feature space has completed centralized processing, satisfying Φ(xi)Φ(xi)T=0. By PCA analysis of the high-dimensional consumption factor, the covariance matrix C can be calculated, where C satisfies relation λV = CV. After diagonalizing C, the consumption eigenvalue λ of C and the corresponding eigenvector V∈H are obtained. Since V ∈span{ Φ(x1), Φ(x2),... Φ(xi)}, it can be known that a group of coefficients α12,...,αn have the following relationship:

For the defined kernel function: K: Kij=(Φ(xi) • Φ(xj))=K(Xi,Xj), λV=CV is converted to nλα=Kα. through the kernel function, and the kernel function K(Xi,Xj) can be calculated in the original space. In order to meet the default centralization hypothesis mentioned above, the kernel matrix was adjusted to k̂ = K - InK - KIn + InKIn in this study, with r representing the number of principal components obtained through dimensionality reduction processing, k representing the contribution of the extracted k-principal component to the data set, αik representing the ith element of the feature vector, and the kth principal component of data point X representing αik K(xi,x). The first three principal component accumulation values were taken as the representative degree of the original data set. In order to optimize the kernel parameters of the introduced Gaussian kernel function K(x,xi)=exp(-‖x-xi2/σ2) and polynomial kernel function K(x,xi)=[(x•xi)+1]q, the problem was transformed into an optimization problem and PSO algorithm was introduced. There are altogether N samples of consumption data x∈K in , so the mean vector in the highdimensional consumption data space is

Calculate the difference δ=|μ̅1 - μ̅2|=|ηT(μ12)| between the pairwise mean values of the projection direction, calculate the discrete matrix Sb between classes and the discrete Sw matrix within classes:

According to the requirement of consumption warning model that the first three principal components should be greater than 0.85 and the first ten principal components should be greater than 0.95, the corresponding objective function is constructed:

min σ > 0

Taking as fitness function J(σ) , the value range of kernel parameter max min (σmaxmin) is determined by the limit of accumulation value of principal component in iteration. Set the number of particles N inertia weight and the maximum number of iterations T, randomly generate the initial population and continuously update the particle position until reaching the upper limit of iteration, output the optimal nuclear parameter σ, and complete the dimension reduction of the original consumption data. After obtaining the characteristic samples, xij represents the p characteristic parameter in the i sample, and cip represents the p characteristic parameter in the j category of consumption behavior center. According to the ant colony genetic algorithm, ant colony design clustering target is constructed. The consumption warning model of college students is as follows:

According to the transition probability formula, the ant solution is updated and the uniform two-point crossover operator is introduced to iterate and update the information matrix. When the clustering target is reached or the maximum number of iterations is reached, the clustering results are output, the ratio of characteristic data and income level is set as the threshold, the extreme samples are tagged into the model, and then different warnings are given to the characteristics of different consumer groups after classification.

Experiment and Analysis
From July to November after the COVID-19 outbreak, a questionnaire survey was conducted among undergraduates in a university in Hubei province by random sampling. According to the consumption structure ratio of contemporary college students, the design problems include nutrition, life, clothing, entertainment, and excessive consumption, with a total of 17 consumption influencing factors. Considering the different consumption evaluation of families with different incomes, the household income option is added to the questionnaire as one of the criteria to determine the warning interval in the following part. A total of 89 valid papers were recovered. Four grades were determined according to the Linkert Scale method. According to the extreme values in the questionnaire, 6 unreasonable answers were screened out, and 83 valid data were finally obtained. Part of data are shown in Table 1 below:

Table 1: Consumption questionnaire survey raw data table

Source: Questionnaires collected

Through data transcoding, the indicators are converted into 1,2,3,4 rating according to the consumption amount. Considering the special situation of consumption amount 0, the option of consumption amount 0 is separately defined as level 0. In order to represent the data indicators of college students’ consumption obtained in the questionnaire, Pearson correlation analysis was carried out on the data and covariance was calculated. Relevant data of various factors were obtained through Matlab programming, as shown in Table 2:

Table 2: Data factor correlation analysis data table


According to Table 2, it can be found that there is information entanglements among various impact factors, and the impact factors need to be dimension-reduced to reflect the consumption behavior of college students. Among all the indicators, the highest correlation with the total consumption of college students is online shopping of clothing, online skin care products and the amount of spending on Singles’ Day. It is reasonable to think that today’s online shopping culture has become an important part of contemporary university consumption and influence factors.

Table 3 is the PSO optimization parameter table. The kernel function is optimized by programming with Matlab. Set the default multinomial kernel parameter of 10 and Gaussian kernel parameter of 287 as the control group for comparison test.

Table 3: PSO optimization algorithm parameter table


After dimension-reduction processing, it can be found that KPCA is significantly better than PCA algorithm in data processing. After optimizing the Gaussian kernel function and polynomial kernel parameters respectively, it is found that when the kernel parameter is 8, the best dimension reduction effect can be obtained. The comparison of dimensionality reduction effect before and after PSO optimization is shown in Figure 2.


Figure 2: Comparison of effects before and after PSO optimization

It can be seen from Fig. 2 that when the kernel parameter σ is 8, the first three principal components of multiple linear kernel can achieve a cumulative contribution rate of 89.6%, which reaches the expectation. At the same time, the first three principal components obtained can get good aggregation effect in space, so the kernel function is considered to be the most suitable for this model.


Figure 3: Composition contribution accumulation chart

Ant colony genetic algorithm was adopted for recognition and classification, and the maximum iteration number was set as 3000. 83 samples were trained by 100 ants under 3 classification modes, and the average training time was 152 seconds. The maximum value of the ratio between consumption amount and income was set as the threshold value. Three samples of obvious consumption amount were too high, four were normal, and three samples of online shopping consumption amount was too high were labeled. After being substituted into the model, the recognition accuracy is 80%, which meets the expected requirements. PCA- regression fitting was used to calculate the consumption fitting curve. K1, K2 and K3 were used to represent the first three principal components respectively to fit the monthly consumption amount to get the consumption function:


Figure 4: Classification results of KPAG method


Figure 5: Traditional PCA-Regression consumption curve

It can be seen from Figure 3 that the obtained consumption model can divide college students into three categories according to their consumption behaviors and give early warning for different consumption behaviors. In the fitting image in Fig. 4, the traditional algorithm can partially fit the original monthly total consumption, and the fitting accuracy of the traditional PCA fitting regression algorithm is 58%. Comparison effect of the algorithm is shown in Table 4 [14-16].

Table 4: Comparison of algorithm effects


Conclusion
Through the collection of relevant data and the corresponding mathematical optimization analysis, the following conclusions are drawn:

(1) Online shopping of clothing, online shopping of consumer goods and the consumption amount of the Double 11 are the important influencing factors of college students’ consumption, which indicates that the current online consumption has had a significant impact on college students’ consumption behavior, which should be paid more attention to.
(2) Compared with the traditional PCA processing idea (Fig. 3), 7 principal components need to be extracted if the standard is met. However, the method adopted in this paper can reduce.
(3) The principal components to 2, and the cumulative contribution rate is 89.6%, which well maintains the data characteristics
(4) In this paper, the processing module of KPCA optimized by PSO can retain the features of the original data more completely and remove miscellaneous information (Fig.2). Compared with the case of no optimization, it shows a higher linear classification effect.
(5) Compared with the traditional algorithm, the consumption risk warning model designed in this paper based on KPAG method is richer in direction and can provide corresponding warnings for different consumption behaviors (Figure 4, Figure 5). The accuracy of the algorithm is higher than that of the traditional algorithm, and it has better feasibility and practical value.

Disclosure Statement
We have no conflict of interests to disclose and the manuscript has been read and approved by all named authors.

Formatting of Funding Sources
This work was supported by the Philosophical and Social Sciences Research Project of Hubei Education Department (19Y049), and the Staring Research Foundation for the Ph.D. of Hubei University of Technology (BSQD2019054), Hubei Province, China.

Bibliography

  1. Qian Shang, Jia Jin & Junping Qiu (2020). Utilitarian or hedonic: Event-related potential evidence of purchase intention bias during online shopping festivals. Neuroscience Letters, 715(134665).
  2. Bollen Zoé, Pabst Arthur, Creupelandt Coralie, Fontesse Sullivan, Lannoy Séverine & Maurage Pierre (2021). Prior drinking motives predict alcohol consumption during the COVID-19 lockdown: A cross-sectional online survey among Belgian college students. Addictive Behaviors, 115(106772), 1-5.
  3. Chen Shulan & Yan Li (2011). Analysis on Influencing Factors of College Students' Consumption Level in Guangxi Based on Order Logistic Model. Journal of Guangxi Academy of Sciences, (03), 186-189.
  4. Moeller Steen, Pisharady Pramod Kumar, Ramanna Sudhir, Lenglet Christophe, Wu Xiaoping, Dowdle Logan & Akçakaya Mehmet (2021). Noise reduction with Distribution Corrected (NORDIC) PCA in dMRI with complex-valued parameter-free locally low-rank processing. NeuroImage, 226(117539).
  5. Alsenan, S. A., Alturaiki, I. M. & Hafez, A. M. (2020). Auto-kpca: a two-step hybrid feature extraction technique for quantitative structure-activity relationship modeling. IEEE Access, 21(9), 1-2.
  6. Guo Yuanyuan (2014). Multicollinearity Elimination Problem Based on Kernel Principal Component Regression. (Doctoral dissertation, Hebei United University).
  7. Pan, W. Y. & Wang, Z. J. (2016). Evaluation of low carbon economic development level based on kernel principal component analysis. Finance and Economics, 000(004), 55-59, 91.
  8. Souilem, N., Elaissi, I. & Messaoud, H. (2017). On the use of KPCA pre-filtering for KCCA method. International Journal of Advanced Manufacturing Technology, 91, 4331-4340.
  9. Abo-Khalil, A. G. (2020). Maximum power point tracking for a PV system using tuned support vector regression by particle swarm optimization. Journal of Engineering Research, 8(4), 139-152.
  10. Zhao Min, Yang Huixian & Ou Xunyong. (2009). KPCA feature extraction based on cultural particle swarm optimization. Computer Application Research (08), 2908-2911.
  11. Li, R. Y., Zhao, J. & Shen, Z. Y. (2010). Kernel parameter optimization based on particle swarm optimization. Journal of Jiangnan University (Natural Science Edition)(04), 444-447.
  12. Zheng Junling (2015). The college students' academic warning model based on KPCA and application(a master's degree thesis, north China university of technology).
  13. Yiran Xiao & Xiaolin Tian (2017). Dark Background Image-denosing Based on KPCA Method.(eds.) Proceedings of the 2016 4th International Conference on Machinery, Materials and Information Technology Applications (pp.1128-1131). Proceedings of the 2016 4th International Conference on Machinery, Materials and Information Technology Applications.
  14. Kuswanto Heri, Hadi Pratama, Widyan Bima & Ahmad Imam Safawi (2020). Survey data on students' online shopping behaviour: A focus on selected university students in Indonesia. Data in brief., 29(105073).
  15. Yuan Hongfang, Xing Shuang & Wang Huaqing (2015). Fault diagnosis of gearbox based on kpca and improved ant colony genetic algorithm. Measurement and Control Technology, 034(006), 17-20.
  16. Fan, R., Chen, J., Xue & Zhao, Y. (2020). Quality Classification and Evaluation of Human-Machine Composite Translations of Scientific Text Based on KPCA, 2020 IEEE 3rd International Conference on Computer and Communication Engineering Technology (CCET).

Total Articles Published

8
1
1


Total Citations:

1
2
5




Highlights


Cient Periodique is a ‘Gold’ open access publisher that aspires to offer absolute free, unrestricted access to the valuable research information

We welcome all the eminent authors to submit your valuable paper

Cient Periodique invites the participation of honourable Editors and Authors

CPQ Journals provide Certificates for publication

Cient Periodique also offers memberships for potential Authors

Best Articles will be appreciated with the provision of corresponding Certificate