
  1. School of Marxism, Wuhan Textile University, Wuhan, Hubei 430073, China (lyj994598@163.com)



Keywords: Gaussian mixture model, Learning mode, LSTM, Performance prediction, Self-attention

1. Introduction

Online education is gradually becoming the key to the development of education in the Internet era [1,2]. The advantages of e-learning lie in its abundant resources and the freedom it gives learners to choose where and when to study [3]. Owing to these advantages, e-learning is spreading rapidly all over the world, providing good prospects for the development of modern education [4]. However, many studies have found that, precisely because of this freedom, most students learn inefficiently or even ineffectively, which is mainly reflected in low course completion rates and failing test scores. These problems have become obstacles to the development of online learning [5-7]. Learning behavior data can be used as predictive information to understand students' current learning status, and massive amounts of such data can be obtained from the learning logs of online platforms [8,9]. Because of the volume and variety of these data, no single statistical method has proven sufficient for classifying them. Unlike traditional classrooms, e-learning platforms attract learners who differ in motivation, training, and learning style. Learning style is often the most critical factor behind a learner's individual differences, and it has a significant impact on both learning behavior and learning results [10]. Different learning modes often reflect different learning motivations. Therefore, an in-depth analysis of learning modes and learners' motivations helps to better understand learning conditions and predict learning results. To improve the effectiveness and efficiency of e-learning, this research identifies students' learning-behavior patterns with a Gaussian mixture model and predicts performance with a bidirectional long short-term memory (Bi-LSTM) model, so that both students and teachers can track students' learning status over time.

2. Related Work

Many scholars have conducted high-quality research into predicting learning achievement. Song et al. proposed an academic achievement prediction model based on sequential engagement, consisting of a joint detector and a sequence predictor; the results showed better performance than existing methods owing to the design of the detection mechanism [11]. Xu et al. analyzed the impact of general online learning behavior on student performance. The results showed that students' online behavior can predict their grades, and that as a course progresses, the predictions become more stable and reliable [12]. Hao et al. built a Bayesian network for predicting student achievement using a hill-climbing algorithm and maximum likelihood estimation; experiments showed the method can accurately predict students' performance in a massive open online course (MOOC) [13]. Other studies built prediction models of students' grade-point averages based on machine learning, showing that factors such as learning progress and personal characteristics are related to academic achievement [14]. Nuankaew and Sararat analyzed achievement data mined from high school students in rural schools; the results showed that a reasonable distribution of student academic achievement consists of four clusters [15]. Scholars have also carried out many studies based on data mining. Long argued that art education has not received enough attention and, in order to provide reference standards for teaching art, used the decision-tree-based ID3 algorithm to build a data mining model for analyzing art examination results [16]. The results showed that this method has advantages in extracting information on student characteristics. To improve accuracy in evaluating the teaching quality of physical education courses in colleges and universities, Zeng conducted an in-depth study based on the hybrid technology of data mining and the hidden Markov model, considering two aspects: teachers' teaching ability and students' learning effect. The experimental analysis showed that the model helps improve the accuracy of evaluating teaching quality in college physical education [17]. Martín-García et al. analyzed the stages of adoption of blended learning methods in higher education and evaluated the relationship between these stages and a set of variables related to personal and professional characteristics, perceptual attributes, and background factors. The results showed that the intention to use blended learning is the most important predictor across all application models [18]. Safitri et al. adopted clustering analysis based on the K-means and decision tree algorithms: K-means was used to find groups of students with similar learning patterns, and the decision tree was used to model the clustering results for analysis and decision-making [19].

To sum up, for learning achievement prediction, scholars have adopted sequential predictors or Bayesian network models, while data mining studies have mainly analyzed personal characteristics or teaching effects. There are still few studies on recognizing learning-behavior patterns in online learning, and few that introduce bidirectional LSTM into performance prediction. In view of this, this study clusters potential learning groups with a Gaussian mixture model and uses Bi-LSTM to predict students' performance.

3. Learning-pattern Recognition and Results Prediction

3.1 Pattern Recognition based on the Gaussian Mixture Model

At present, conclusions drawn from analyzing the learning behavior data of individual college students tend to be one-sided, because online learning differs from in-class learning in covering a wider range of topics and including more students. As the number of samples grows, analyzing the behavioral data of student groups becomes more meaningful than analyzing individuals. To better understand the characteristics of students' group learning, this study first defines potential learning groups, learning modes, and learning motivation. A learning group refers to individuals with similar learning behaviors; a learning mode represents the common learning characteristics of a potential online learning group; and learning motivation refers to students' intention and desire to learn independently. Table 1 summarizes the online learning behavior data of students in different courses at one university.

As shown in Table 1, the online learning behavior data selected for this study include the numbers of students, homework exercises, unit tests, videos watched, discussions, and study weeks. Because data defects and noise interfere with the recognition model, these data were transformed and cleaned in advance: data transformation deletes invalid records and unifies the dimensions of different measures, and data cleaning uses the mean method to fill in missing values. Based on these data, this study characterizes student learning behavior with two kinds of features: learning effort and learning harvest. The effort feature corresponds to the behavioral effort invested in learning, and the harvest feature corresponds to the learning effect. Eq. (1) is the mathematical expression of learning effort.

(1)
$ effort^{w}=\frac{\sum _{i=1}^{n}a_{i}\times ef_{i}^{w}}{\sum _{i=1}^{n}a_{i}} $

In Eq. (1), $effort^{w}$ represents the amount of learning behavioral effort in week $w$, $a_{i}$ represents the weight of the $i$-th learning behavior, calculated with the Pearson correlation coefficient method, $ef_{i}^{w}$ represents the amount of the $i$-th learning behavior the student performs in week $w$, and $n$ is the total number of learning behavior categories. Eq. (2) is the mathematical expression of learning harvest.

(2)
$ effect^{w}=\frac{\widehat{effect}^{w}-effect_{\min }^{w}}{effect_{\max }^{w}-effect_{\min }^{w}} $

In Eq. (2), $effect^{w}$ represents the normalized learning harvest of the student in week $w$, $\widehat{effect}^{w}$ is the raw harvest in that week, and $effect_{\max }^{w}$ and $effect_{\min }^{w}$ represent the maximum and minimum harvest values used for the min-max normalization. Eq. (3) is the mathematical expression of the Pearson correlation coefficient.

(3)
$ R=\frac{1}{n-1}\sum _{i=1}^{n}\left(\frac{X_{i}-\overline{X}}{\delta _{X}}\right)\left(\frac{Y_{i}-\overline{Y}}{\delta _{Y}}\right) $

In Eq. (3), $X_{i}$ and $Y_{i}$ are the $i$-th values of the two variables being correlated; $\left(\frac{X_{i}-\overline{X}}{\delta _{X}}\right)$ is the standard score of $X_{i}$, in which $\overline{X}$ represents the mean of the sample data and $\delta _{X}$ is its standard deviation. Eq. (4) is the learning efficiency calculation.

(4)
$ ratio^{w}=\frac{effect^{w}}{effort^{w}} $

In Eq. (4), $ratio^{w}$ represents the weekly learning efficiency of a student in a given course, i.e., the ratio of the weekly gain to the weekly effort. The learning efficiency sequence for a course is made up of the weekly efficiencies, recorded as $E_{ratio}=\left(ratio^{1},\,\,ratio^{2},\,\,\ldots ,\,\,ratio^{w}\right)$. This sequence is the input to the Gaussian mixture model (GMM), which performs cluster analysis with a linear combination of Gaussian components. A minimal sketch of the feature construction in Eqs. (1)-(4) is given below, and Eq. (5) is the mathematical expression of the GMM.
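To make the feature construction concrete, the following minimal Python/NumPy sketch walks through Eqs. (1)-(4) for a single student. The behavior counts, the clipping of negative correlations to zero, and the normalization range (here taken over the weeks of one student, whereas the paper's range is not fully specified) are illustrative assumptions rather than details from the study.

import numpy as np

# Weekly behavior counts for one student: rows = weeks, columns = behavior types
# (e.g., homework exercises, unit tests, videos watched, discussions).
# These numbers are placeholders, not data from the paper.
ef = np.array([[3, 1, 10, 2],
               [4, 1, 12, 1],
               [2, 0,  8, 3]], dtype=float)          # ef[w, i] in Eq. (1)
raw_effect = np.array([0.62, 0.71, 0.55])            # raw weekly learning harvest

# Behavior weights a_i from the Pearson correlation between each behavior and the
# harvest (Eq. (3)); clipping negative correlations to zero is an assumption.
a = np.array([max(np.corrcoef(ef[:, i], raw_effect)[0, 1], 0.0)
              for i in range(ef.shape[1])])

effort = ef @ a / a.sum()                             # Eq. (1): weighted weekly effort
effect = (raw_effect - raw_effect.min()) / (raw_effect.max() - raw_effect.min())   # Eq. (2)
ratio = np.divide(effect, effort, out=np.zeros_like(effect), where=effort > 0)     # Eq. (4)
E_ratio = ratio                                       # learning efficiency sequence for the GMM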

(5)
$ \left\{\begin{array}{l} P\left(E_{ratio}\left| \theta \right.\right)=\sum _{k=1}^{K}\alpha _{k}\phi \left(E_{ratio}\left| \theta \right._{k}\right)\\ \phi \left(E_{ratio}\left| \theta \right._{k}\right)=\frac{1}{\sqrt{2\pi }\delta _{k}}\exp \left(-\frac{\left(E_{ratio}-\mu _{k}\right)^{2}}{2{\delta _{k}}^{2}}\right) \end{array}\right. $

In Eq. (5), $P\left(E_{ratio}\left| \theta \right.\right)$ represents the probability density of the GMM, $\phi \left(E_{ratio}\left| \theta \right._{k}\right)$ represents the $k$-th Gaussian component, and $\alpha _{k}$ is a non-negative mixing weight (the weights sum to 1). The expression $\theta _{k}=\left(\mu _{k},{\delta _{k}}^{2}\right)$ denotes the parameters of the $k$-th component, in which $\mu _{k}$ and ${\delta _{k}}^{2}$ are its mean and variance, and $K$ is the number of clusters. The output of the GMM in this study is each student's learning-behavior mode. Fig. 1 is a schematic of the student learning motivation prediction model.
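Before turning to the motivation model in Fig. 1, the clustering step of Eq. (5) can be approximated with an off-the-shelf mixture model. The sketch below uses scikit-learn's GaussianMixture on placeholder efficiency sequences; the diagonal covariance, the random data, and the fixed choice of four components (the number of modes found in Section 4.1) are assumptions for illustration only.

import numpy as np
from sklearn.mixture import GaussianMixture

# Each row is one student's learning efficiency sequence E_ratio over W weeks;
# the matrix here is random placeholder data, not the study's log data.
rng = np.random.default_rng(0)
X = rng.random((200, 16))                  # 200 students, 16-week course

gmm = GaussianMixture(n_components=4,      # K = 4 learning modes
                      covariance_type="diag",
                      random_state=0)
modes = gmm.fit_predict(X)                 # hard cluster label (learning mode) per student
probs = gmm.predict_proba(X)               # soft assignment to each Gaussian component

In practice, the number of components could also be selected with a criterion such as BIC rather than fixed in advance.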

Fig. 1 illustrates the capsule neural network used in this study to predict students' learning motivation. The network includes a feature capsule layer, a motivation capsule layer, and an output layer. In terms of network flow, the feature capsule layer is the input layer; it combines the students' learning behavior feature vector and the learning efficiency sequence vector to obtain the feature capsules. The core of the motivation capsule layer is a dynamic routing unit and four independent capsule units: the dynamic routing unit transmits the feature information of the previous layer, and the capsule units retain the learning motivation information. The output layer uses a fully connected layer and normalization to obtain the classification results. Eq. (6) is the loss function of the output layer.

(6)
$ loss=\frac{1}{n}\sum _{i=1}^{n}\left(y_{i}-\overset{\frown }{y}_{i}\right)^{2} $

In Eq. (6), $n$ is the number of training samples, and $y_{i}$ and $\overset{\frown }{y}_{i}$ represent the ground-truth label and the predicted label, respectively. The capsule neural network takes vectors as both its input and its output, which helps represent the characteristic information of the learning data.
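The capsule layers themselves are outside the scope of a short example, but the output layer and the MSE loss of Eq. (6) can be sketched in PyTorch as follows; the capsule encoder is abstracted away, and all dimensions, the softmax normalization, and the one-hot targets are assumptions for illustration.

import torch
import torch.nn as nn

# Simplified stand-in for the output layer in Fig. 1: a fully connected layer with
# normalization, trained against motivation labels with the MSE loss of Eq. (6).
class MotivationHead(nn.Module):
    def __init__(self, feat_dim=64, n_motivations=4):
        super().__init__()
        self.fc = nn.Linear(feat_dim, n_motivations)

    def forward(self, capsule_features):
        return torch.softmax(self.fc(capsule_features), dim=-1)   # normalized scores

features = torch.randn(32, 64)                        # placeholder capsule-layer outputs
targets = torch.eye(4)[torch.randint(0, 4, (32,))]    # one-hot motivation labels
pred = MotivationHead()(features)
loss = nn.functional.mse_loss(pred, targets)          # Eq. (6)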

Fig. 1. Schematic diagram of the student learning motivation prediction model.
../../Resources/ieie/IEIESPC.2023.12.5.404/fig1.png
Table 1. Data Sheet of Students' Online Learning Behavior in Different Courses.

Curriculum | # of students | # of homework exercises | # of unit tests | # of videos watched | # of discussions | # of study weeks
Advanced Math | 11228 | 11125 | 102281 | 11373 | 6724 | 16
College Physics | 8550 | 9328 | 82940 | 10859 | 5411 | 12
English Listening and Speaking | 9305 | 11111 | 90245 | 11242 | 2925 | 12
Computer Fundamentals | 10768 | 11114 | 67650 | 11330 | 3467 | 10
College Chinese | 6308 | 3231 | 50877 | 8809 | 3493 | 10
Sports | 6875 | 3554 | 53194 | 4573 | 3035 | 12

3.2 Prediction from LSTM and the Self-attention Mechanism

The input data of the performance prediction model include the learning-behavior mode and the learning motivation obtained in the previous section. Because predicting students' performance is a multi-class classification task, this research analyzes the relationships within students' learning behavior data using Bi-LSTM and the self-attention mechanism (SAM) to predict performance. Fig. 2 is a schematic of the Bi-LSTM network structure.

In Fig. 2, Bi-LSTM is composed of a forward LSTM and a backward LSTM, stacked one above the other. The input layer feeds the sequence features into the forward and backward LSTMs simultaneously, and the output of the whole network is the concatenation of the outputs of these two networks. Compared with a one-way LSTM, Bi-LSTM carries information both from front to back and from back to front. This two-way structure helps the model capture the dynamic relationships among the features of the learning sequence and identify dependencies in the data more accurately. Eq. (7) is the mathematical expression of the Bi-LSTM network.

(7)
$ \left\{\begin{array}{l} \overset{\rightarrow }{h}_{t}=\overset{\rightarrow }{LSTM}\left(W_{t},\overset{\rightarrow }{h}_{t-1}\right)\\ \overset{\leftarrow }{h}_{t}=\overset{\leftarrow }{LSTM}\left(W_{t},\overset{\leftarrow }{h}_{t+1}\right)\\ h_{t}=\left(\overset{\rightarrow }{h}_{t},\overset{\leftarrow }{h}_{t}\right) \end{array}\right. $

In Eq. (7), $\overset{\rightarrow }{LSTM}$ and $\overset{\leftarrow }{LSTM}$ represent the forward and backward LSTM networks, $\overset{\rightarrow }{h}_{t}$ and $\overset{\leftarrow }{h}_{t}$ represent the state values of the forward and backward hidden layers at time $t$, and $h_{t}$ is the Bi-LSTM state value at time $t$, obtained by concatenating the forward and backward hidden states. Although Bi-LSTM can capture the dynamic relationships of the learning feature sequence, it has no unit for weighting the sequence features, so this study introduces the self-attention mechanism to weight them. A minimal Bi-LSTM sketch is shown below, and Fig. 3 is a schematic of weight vector generation in the SAM.
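As a minimal illustration of Eq. (7), the following PyTorch sketch runs a bidirectional LSTM over a weekly learning-feature sequence; the batch size, sequence length, and feature dimensions are assumed values, not the study's settings.

import torch
import torch.nn as nn

# Bi-LSTM over weekly learning features; each output step concatenates the
# forward and backward hidden states, as in Eq. (7).
bilstm = nn.LSTM(input_size=8, hidden_size=32,
                 num_layers=1, batch_first=True, bidirectional=True)

W = torch.randn(16, 12, 8)         # 16 students, 12 weeks, 8 features per week
h, _ = bilstm(W)                   # h has shape (16, 12, 64) = (batch, weeks, 2 * hidden_size)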

In Fig. 3, the difference between the self-attention mechanism and the general attention mechanism lies in the introduction of three weight vectors: the query weight vector (Q), the key weight vector (K), and the value weight vector (V). Since the SAM generates these three weight vectors with the same input value, it has the ability to analyze the internal relationship of the input sequence characteristics. Eq. (8) expresses the weighted sequence feature vector.

(8)
$ A\left(Q,K,V\right)=\text{softmax}\left(\frac{QK^{T}}{\sqrt{d_{k}}}\right)V $

In Eq. (8), $d_{k}$ is the dimension of vectors $Q$ and $K$, and $A\left(Q,K,V\right)$ is the weighted sequence feature vector. Eq. (8) shows that $A\left(Q,K,V\right)$ is calculated by first taking the dot product $QK^{T}$, then scaling it by $\sqrt{d_{k}}$ and normalizing with softmax, and finally multiplying the normalized result by vector $V$. A minimal sketch of this computation is given below, and Eq. (9) is the output of the performance prediction model.
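The scaled dot-product self-attention of Eq. (8) can be sketched as follows; the input here is a placeholder tensor standing in for Bi-LSTM hidden states, and the random projection matrices are illustrative assumptions.

import torch
import torch.nn.functional as F

def self_attention(x, Wq, Wk, Wv):
    # Q, K, and V are generated from the same input x, as in the SAM.
    Q, K, V = x @ Wq, x @ Wk, x @ Wv
    d_k = Q.size(-1)
    scores = Q @ K.transpose(-2, -1) / d_k ** 0.5     # QK^T / sqrt(d_k)
    return F.softmax(scores, dim=-1) @ V              # A(Q, K, V) in Eq. (8)

x = torch.randn(16, 12, 64)                           # e.g., Bi-LSTM hidden states
d = 64
Wq, Wk, Wv = (torch.randn(d, d) * d ** -0.5 for _ in range(3))
weighted = self_attention(x, Wq, Wk, Wv)              # weighted sequence features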

(9)
$ y=\text{softmax}\left(W_{s}\alpha ^{T}\right) $

In Eq. (9), $\alpha $ is the input vector, $W_{s}$ is the training parameter matrix, and $y$ is the output. Eq. (10) expresses the loss function the model uses to optimize the output.

(10)
$ loss_{\text{softmax}}=-\sum _{j=1}^{N}y_{j}\log p_{j} $

In Eq. (10), $N$ is the number of output categories, and $y_{j}$ and $p_{j}$ represent the true label and the predicted probability of class $j$, respectively. Fig. 4 shows the flow chart for performance prediction.

As shown in Fig. 4, the Bi-LSTM network obtains the hidden-layer output from the weekly learning feature sequence. This output, together with the learning mode, learning motivation, and basic attribute feature vectors, is fed to the self-attention mechanism for the weighting calculation, and the weighted results then pass through the fully connected layer and the normalization layer to produce the prediction. A hedged end-to-end sketch of this flow follows.
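The sketch below is not the authors' implementation: PyTorch's single-head MultiheadAttention stands in for the SAM of Eq. (8), the static features (learning mode, motivation, and basic attributes) are simply repeated across weeks before concatenation, mean pooling over weeks precedes the output layer, and all dimensions are assumed.

import torch
import torch.nn as nn

class GradePredictor(nn.Module):
    def __init__(self, feat_dim=8, static_dim=10, hidden=32, n_grades=5):
        super().__init__()
        self.bilstm = nn.LSTM(feat_dim, hidden, batch_first=True, bidirectional=True)
        self.attn = nn.MultiheadAttention(2 * hidden + static_dim, num_heads=1,
                                          batch_first=True)      # stand-in for the SAM
        self.fc = nn.Linear(2 * hidden + static_dim, n_grades)

    def forward(self, weekly, static):
        h, _ = self.bilstm(weekly)                          # (batch, weeks, 2 * hidden)
        s = static.unsqueeze(1).expand(-1, h.size(1), -1)   # repeat static features per week
        z = torch.cat([h, s], dim=-1)
        a, _ = self.attn(z, z, z)                           # self-attention weighting
        return self.fc(a.mean(dim=1))                       # logits; softmax applied in the loss

model = GradePredictor()
weekly = torch.randn(16, 12, 8)                             # placeholder weekly log features
static = torch.randn(16, 10)                                # mode + motivation + attributes
labels = torch.randint(0, 5, (16,))                         # placeholder grade classes
loss = nn.CrossEntropyLoss()(model(weekly, static), labels) # cross-entropy, cf. Eq. (10)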

Fig. 2. The Bi-LSTM network’s structure.
../../Resources/ieie/IEIESPC.2023.12.5.404/fig2.png
Fig. 3. Weighted calculation of the self-attention mechanism.
../../Resources/ieie/IEIESPC.2023.12.5.404/fig3.png
Fig. 4. Schematic of the performance prediction process.
../../Resources/ieie/IEIESPC.2023.12.5.404/fig4.png

4. Results and Analysis

4.1 Results from Learning Modes

This study takes advanced mathematics and college physics as examples to analyze students' learning-behavior patterns. Based on the results of the cluster analysis, the students' learning patterns are discussed in terms of average learning efficiency, homework practice time, video learning time, and unit test scores. Fig. 5 shows the results of learning-behavior pattern recognition.

In Fig. 5(a), the GMM-based learning-behavior pattern recognition model divides the potential learning groups into four categories, each corresponding to one learning-behavior mode. The distributions of the four learning-behavior patterns are clearly separated, indicating that the clustering method performs well. Fig. 5(b) shows the change in learning-effort intensity corresponding to the four learning modes. The learning effort of students who adopt learning mode 1 stays above 0.25, the curve trend is stable, and the fluctuation is small: the minimum and maximum learning effort are 0.275 and 0.47, respectively (a difference of less than 0.3). This shows that such students maintain a good and sustainable learning state, so this study names their learning-behavior mode planned. Students who adopt learning mode 2 show only temporary enthusiasm at the beginning and end of the course; their overall learning enthusiasm is low, and their total learning effort is less than 0.15. This kind of student is therefore classified as focused (i.e., requiring focused attention) so that timely supervision can be provided. During the course, the effort of a student in learning mode 3 slowly climbs to 0.225, briefly drops to 0.128, and finally climbs quickly to 0.381. This shows that such students have a certain degree of autonomous learning ability but lack planning, exhibiting the characteristics of cramming; this mode is therefore named catch-up. Over the learning cycle of an entire course, students who adopt learning mode 4 put in essentially no effort; they are unwilling to learn and show no enthusiasm, so this research classifies them as stagnant.

Fig. 6(a) shows the mean learning efficiency curves of the four types of students in advanced mathematics. The fluctuation in average learning efficiency is smallest for planned-mode students, with a difference of 0.88 between the minimum and maximum. The average learning efficiency of catch-up-mode students fluctuates greatly, with a maximum difference of 1.4. The overall learning efficiency of focused-mode students declines, and their learning state is poor. Stagnant students hardly study at all, and their average learning efficiency is 0. Fig. 6(b) shows the mean learning efficiency curves of the four kinds of students in the college physics course. Throughout the course, planned-mode students maintain the highest average learning efficiency, and the efficiency curves of the other three types of students are consistent with the characteristics of their learning modes.

As shown in Fig. 7, the durations of homework practice and video learning for stagnant students are almost 0, with only a short period of video learning in the sixth week. The homework practice duration of focused-mode students shows a slow downward trend, starting at around 0.025 and then decreasing to 0, while their video learning duration first increases and then decreases. The learning activities of catch-up-mode students are concentrated mainly in the middle and late stages of the course, with the maximum video learning duration occurring in the seventh week. The video learning and homework practice of planned-mode students change synchronously, which indicates that these students complete their learning tasks on time.

Figs. 8(a) and (b) show the mean curves for unit test scores of students in advanced mathematics and college physics courses, respectively. Stagnant students have a unit test score of almost 0 because their effort in course study is zero. The weekly unit test scores of focused students generally showed a decreasing trend, which is consistent with their homework practice curve and learning efficiency curve. In the first two weeks of the course, these students scored well on unit tests. However, due to less effort in follow-up study (homework was not completed on time, and they did not watch videos on time), the unit test scores fell below 0.2 after the fourth week. The mathematics unit test scores of catch-up-mode students were higher than 0.4, and the highest value appeared in the seventh week (0.91). Physics test scores were not less than 0.2, and the highest appeared in the tenth week (0.74). This shows that these students have the ability to learn quickly, but are poor at continuous learning. Planned-mode students showed stable and excellent results in advanced mathematics courses, and most of the weekly test scores were higher than 0.9. Although this type of student did not perform well on the initial unit tests for college physics, as the courses progressed, their learning plan was clear, and test scores were not less than 0.65 after the third week.

Fig. 5. Learning-behavior pattern recognition.
../../Resources/ieie/IEIESPC.2023.12.5.404/fig5.png
Fig. 6. Mean curves for learning efficiency in students of different learning types.
../../Resources/ieie/IEIESPC.2023.12.5.404/fig6.png
Fig. 7. Mean curves for learning activity duration in the different types of student.
../../Resources/ieie/IEIESPC.2023.12.5.404/fig7.png
Fig. 8. Unit test score mean curves.
../../Resources/ieie/IEIESPC.2023.12.5.404/fig8.png

4.2 Analysis of GMM, Bi-LSTM, and SAM Learning-achievement Prediction

To verify the effectiveness of the student achievement prediction model based on GMM, Bi-LSTM, and SAM, this study uses learning behavior log data from six online public courses (advanced mathematics, college physics, English listening and speaking, computer fundamentals, college Chinese, and sports). The samples were split into training and test sets at a ratio of 7:3. The improved K-means algorithm and the LadFG model were used as comparison methods: the former is a widely used unsupervised clustering algorithm, while the latter has the advantage of taking the temporal characteristics of sequence features into account. The evaluation indicators are weighted accuracy and weighted F1-score; the higher these two indicators, the better the model's prediction. A sketch of this evaluation protocol is given below, and Table 2 shows the performance index results of the three prediction models.
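For reference, the evaluation protocol can be approximated with scikit-learn as in the sketch below. The data are random placeholders; plain accuracy is used as a stand-in because the paper does not define the weighting it applies to accuracy, while the weighted F1-score matches scikit-learn's "weighted" average.

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, f1_score

rng = np.random.default_rng(0)
X, y = rng.random((1000, 20)), rng.integers(0, 5, 1000)     # placeholder features and grades
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)  # 7:3 split

y_pred = rng.integers(0, 5, len(y_test))                    # stand-in for model predictions
acc = accuracy_score(y_test, y_pred)                        # accuracy (weighting scheme assumed)
f1 = f1_score(y_test, y_pred, average="weighted")           # weighted F1-score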

As shown in Table 2, the model proposed in this paper achieved the highest weighted accuracy and weighted F1-score for performance prediction in all six public courses. Taking advanced mathematics as an example, the weighted accuracy and weighted F1-score of the proposed model were 0.865 and 0.869, respectively, which are 11.9% and 13.1% better than the improved K-means algorithm and 20.8% and 21.2% higher than the LadFG algorithm. This is because the improved K-means algorithm assigns the learning behavior sequence data directly to different clusters, whereas the GMM calculates the probability of the sequence data falling into each cluster, so the latter carries more feature information into the performance prediction. The LadFG algorithm's lower prediction performance stems from its focus on the temporal and density similarity between learning-behavior sequences without taking learning effort and learning harvest into account, which easily leads to misclassification of learning modes. In addition, the LadFG algorithm processes the sequence data in only one direction, which hinders capturing the mutual influences between sequence features; the Bi-LSTM network adopted in this study therefore has an advantage in improving prediction accuracy. Table 3 shows the ablation results of the proposed model with respect to learning-pattern recognition.

As shown in Table 3, the performance prediction model with GMM obtained higher weighted accuracy and weighted F1-score. For example, in advanced mathematics, these two indicators were 0.887 and 0.891, respectively, 13.8% and 14.1% higher than the prediction model without GMM. For English listening and speaking, they were 0.924 and 0.926, 5% and 5.3% higher than the model without GMM. Therefore, learning-mode recognition plays a significant role in improving performance prediction and is an influencing factor that should be considered. Fig. 9 shows the variation in weighted accuracy and weighted F1-score versus the number of iterations.

In Fig. 9(a), as the number of iterations increased, the weighted accuracy curves of the six public courses first rose rapidly, then slowed, and finally leveled off, showing that the number of iterations has a significant impact on weighted accuracy. The six curves converged after eight iterations. In Fig. 9(b), the weighted F1-score curves of the six courses vary in the same way as those in Fig. 9(a) and also converge after eight iterations. Therefore, when training and testing the model, the number of iterations can be set to eight or more.

Fig. 9. Weighted accuracy and weighted F1-score versus the number of iterations.
../../Resources/ieie/IEIESPC.2023.12.5.404/fig9.png
Table 2. Performance Index Results.

Curriculum | This paper (weighted accuracy / weighted F1-score) | Improved K-means (weighted accuracy / weighted F1-score) | LadFG (weighted accuracy / weighted F1-score)
Advanced Mathematics | 0.865 / 0.869 | 0.773 / 0.768 | 0.716 / 0.717
College Physics | 0.921 / 0.918 | 0.869 / 0.865 | 0.748 / 0.744
English Listening and Speaking | 0.902 / 0.904 | 0.888 / 0.885 | 0.879 / 0.886
Computer Fundamentals | 0.914 / 0.905 | 0.810 / 0.788 | 0.796 / 0.791
College Chinese | 0.856 / 0.835 | 0.812 / 0.791 | 0.735 / 0.733
Sports | 0.857 / 0.858 | 0.827 / 0.828 | 0.817 / 0.814
Overall | 0.886 / 0.882 | 0.830 / 0.821 | 0.782 / 0.781

Table 3. Experimental Results from Ablation.

Curriculum | GMM-Bi-LSTM-SAM (weighted accuracy / weighted F1-score) | Bi-LSTM-SAM (weighted accuracy / weighted F1-score)
Advanced Mathematics | 0.887 / 0.891 | 0.779 / 0.781
College Physics | 0.943 / 0.940 | 0.927 / 0.927
English Listening and Speaking | 0.924 / 0.926 | 0.880 / 0.879
Computer Fundamentals | 0.936 / 0.927 | 0.905 / 0.895
College Chinese | 0.878 / 0.857 | 0.682 / 0.695
Sports | 0.879 / 0.880 | 0.828 / 0.831
Overall | 0.908 / 0.904 | 0.834 / 0.835

5. Conclusion

The diversity and volume of electronic curriculum resources mean that online learning platforms have gradually become an important way for colleges and universities to cultivate talent. Although online learning broadens students' horizons, its effect is difficult to guarantee because it is largely unsupervised. To make full use of the large amount of learning behavior data students generate on online platforms, a learning-behavior pattern recognition method based on GMM and a performance prediction method based on Bi-LSTM and SAM were designed and studied. The results showed that potential learning groups can be divided into four learning modes: stagnant, focused, catch-up, and planned. Learning intensity, learning efficiency, and activity duration were all zero for stagnant students. Focused students showed little enthusiasm and low efficiency during most of the course cycle. Catch-up students repeatedly resorted to bursts of cramming because they lacked a plan and guidance. Guided by clear learning plans, planned-mode students achieved higher learning efficiency. In addition, the proposed learning performance prediction model outperformed the improved K-means and LadFG algorithms on six public courses, including advanced mathematics and college physics. For example, in advanced mathematics, the weighted accuracy of the proposed model was 11.9% and 20.8% higher than those of the two baseline algorithms, and its weighted F1-score was 13.1% and 21.2% higher. This shows that learning efficiency defined from learning effort and learning harvest is conducive to performance prediction.

ACKNOWLEDGMENTS

REFERENCES

[1] B. B. Lockee, ``Online education in the post-COVID era,'' Nature Electronics, Vol. 4, No. 1, pp. 5-6, Jan. 2021.
[2] V. Singh and A. Thurman, ``How many ways can we define online learning? A systematic literature review of definitions of online learning (1988-2018),'' American Journal of Distance Education, Vol. 33, No. 4, pp. 289-306, Oct. 2019.
[3] P. Paudel, ``Online education: Benefits, challenges and strategies during and after COVID-19 in higher education,'' International Journal on Studies in Education, Vol. 3, No. 2, pp. 70-85, Sep. 2020.
[4] R. E. Mayer, ``Thirty years of research on online learning,'' Applied Cognitive Psychology, Vol. 33, No. 2, pp. 152-159, Oct. 2019.
[5] P. Chakraborty, P. Mittal, M. S. Gupta, S. Yadav, and A. Arora, ``Opinion of students on online education during the COVID-19 pandemic,'' Human Behavior and Emerging Technologies, Vol. 3, No. 3, pp. 357-365, Dec. 2020.
[6] N. R. Putri and F. M. Sari, ``Investigating English teaching strategies to reduce online teaching obstacles in the secondary school,'' Journal of English Language Teaching and Learning, Vol. 2, No. 1, pp. 23-31, June 2021.
[7] L. Sun, Y. Tang, and W. Zuo, ``Coronavirus pushes education online,'' Nature Materials, Vol. 19, No. 6, p. 687, Apr. 2020.
[8] D. Nambiar, ``The impact of online learning during COVID-19: students' and teachers' perspective,'' The International Journal of Indian Psychology, Vol. 8, No. 2, pp. 783-793, July 2020.
[9] M. Adnan and K. Anwar, ``Online learning amid the COVID-19 pandemic: Students' perspectives,'' Journal of Pedagogical Research, Vol. 2, No. 1, pp. 45-51, June 2020.
[10] M. M. Hassan, T. Mirza, and M. W. Hussain, ``A critical review by teachers on the online teaching-learning during the COVID-19,'' International Journal of Education and Management Engineering, Vol. 10, No. 8, pp. 17-27, Oct. 2020.
[11] X. Song, J. Li, S. Sun, H. Yin, and P. Dawson, ``SEPN: A sequential engagement based academic performance prediction model,'' IEEE Intelligent Systems, Vol. 36, No. 1, pp. 46-53, July 2020.
[12] Z. Xu, H. Yuan, and Q. Liu, ``Student performance prediction based on blended learning,'' IEEE Transactions on Education, Vol. 64, No. 1, pp. 66-73, Aug. 2020.
[13] J. Hao, J. Gan, and L. Zhu, ``MOOC performance prediction and personal performance improvement via Bayesian network,'' Education and Information Technologies, Vol. 27, No. 5, pp. 7303-7326, 2022.
[14] D. T. Ha, P. T. T. Loan, and C. N. Giap, ``An empirical study for student academic performance prediction using machine learning techniques,'' International Journal of Computer Science and Information Security, Vol. 18, No. 3, pp. 21-28, Apr. 2020.
[15] P. Nuankaew and W. Sararat, ``Student performance prediction model for predicting academic achievement of high school students,'' European Journal of Educational Research, Vol. 11, No. 2, pp. 949-963, 2022.
[16] Y. Long, ``Research on art innovation teaching platform based on data mining algorithm,'' Cluster Computing, Vol. 22, Suppl. 6, pp. 14943-14949, 2019.
[17] Y. Zeng, ``Evaluation of physical education teaching quality in colleges based on the hybrid technology of data mining and hidden Markov model,'' International Journal of Emerging Technologies in Learning (iJET), Vol. 15, No. 1, pp. 4-15, Jan. 2020.
[18] A. V. Martín-García, F. Martínez-Abad, and D. Reyes-González, ``TAM and stages of adoption of blended learning in higher education by application of data mining techniques,'' British Journal of Educational Technology, Vol. 50, No. 5, pp. 2484-2500, 2019.
[19] S. N. Safitri, H. Setiadi, and E. Suryani, ``Educational data mining using cluster analysis methods and decision trees based on log mining,'' Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi), Vol. 6, No. 3, pp. 448-456, 2022.

Author

Jingjing Yang

Jingjing Yang, born in 1985, received a Ph.D. from Zhongnan University of Economics and Law in December 2018, majoring in economic history. She is a lecturer at Wuhan Textile University. She is interested in education and the economy, as well as comparative education.