
  1. (College of Environment and Health, Yanching Institute of Technology, Langfang, Hebei 065201, China)
  2. (School of Intelligent Engineering, Yanching Institute of Technology, Langfang, Hebei 065201, China)



Keywords: Feature analysis, Depression identification, Classifier algorithm

1. Introduction

Depression is currently the most common psychological disorder in the world [1]. According to the ``2022 National Depression Blue Book,'' based on the Chinese mental health survey, there are currently 95 million people with depression in China. Approximately 280,000 people die by suicide each year, of whom 40% suffer from depression. Patients aged 18-24 years account for 35.32% of all depression patients; of these, 50% are students, and 41% have dropped out of school due to depression. Velichkovsky et al. [2] reported that depression patients took longer to process negative emotional stimuli, whereas healthy participants spent more time processing neutral emotional stimuli. They also found that images of 57 moderately depressed patients showed faces with neutral or angry expressions. Porter-Vignola et al. [3] used linear mixed models for data analysis and reported that adolescents with depression recognized sadness slightly faster than those without depression. Zhou et al. [4] observed significant differences in facial expression between different populations with the same depression rating through a constructed prediction model.

The model showed promising results for identifying depression through facial expressions. Chen et al. [5] extracted facial expression features from face images using histograms of oriented gradients. They used a support vector machine as a classifier and concluded that facial expression features had feasible discriminative ability for depression. Guo et al. [6] produced a dataset by having their subjects perform five emotion-evoking tasks. They used a Deep Belief Network (DBN) model to extract facial expressions and concluded that facial expressions could be used to recognize patients with potential depression risk and that the identification rate for females was generally higher than that for males.

Yang et al. [7] proposed a new interference-free psychological state assessment model that used facial videos collected by 5G terminals to assess the users’ psychological states in real time and found that the depression assessment model was effective. Singh and Goyal [8] used computer vision to decode depression. They reported that computer vision achieved 74% accuracy in identifying depression patients and concluded that the facial expressions of depression patients under any given psychological stimulus differed from those of non-depression patients.

Therefore, this study examined the differences in facial expression between college students with psychological depression and psychologically healthy college students by analyzing facial expression features and then constructed a psychological depression recognition model based on facial feature analysis to test the feasibility of using facial expression features to identify college students with psychological depression. This work can assist college psychological counselors in judging whether students have psychological depression and lays a theoretical foundation for identifying and classifying such students through facial feature analysis in the future.

2. Methods for Depression Recognition and Classification

2.1 Facial Expression Feature Analysis

Facial expressions are the most direct way for people to express emotions. Patients with psychological depression often appear unhappy and dejected, and their facial expressions are typically characterized by downturned mouth corners, frequent crying, and frowning. The most basic unit of a person's facial expression is the facial action unit (AU). After determining the state of a person's facial AUs and the facial expressions represented by their related actions, the person's psychological state can be judged from the facial expressions. In the research literature, facial AUs are used to encode the components of facial expressions [9], as listed in Table 1. These AUs can be combined to express facial expressions such as frowning, pursing the lips, and smiling. Because combinations and intensity variations of AUs compose a wide variety of facial expressions, feature selection over the AUs can eliminate the facial expression features irrelevant to this experiment and retain the depression-related features that are useful for the classification model.

The selected AUs for this experiment involved four parts: eyebrows, eyes, nose, and mouth. The main detection and recognition process includes face detection, face alignment, face recognition, and AU feature extraction. The feature extraction for psychological depression in this paper focused mainly on facial expressions. Therefore, to extract features reliably, face detection must first be performed on each image to locate the face. After face detection, the face in each video frame of the same experimental subject is aligned by scaling and rotating the frame to match the inner eye corner points of the first frame, so that the eye corners of the face are horizontal in all images. After these operations, facial movement recognition is performed to extract the AU features. Finally, the facial AUs used for the classification model were selected by comparing the average AU values of the two groups of experimental subjects.
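To make the alignment step concrete, the following is a minimal sketch, not the authors' implementation, of rotating and scaling a frame so that the inner eye corners lie on a horizontal line at a fixed distance apart. The landmark coordinates and the target distance of 80 pixels are assumptions for illustration; in practice, they come from the face detection step.

```python
import cv2
import numpy as np

def align_face(frame, left_inner, right_inner, target_dist=80.0):
    """Rotate and scale `frame` so the inner eye corners are horizontal
    and a fixed distance apart. `left_inner` / `right_inner` are (x, y)
    landmark coordinates from any face detector (placeholder inputs)."""
    dx = right_inner[0] - left_inner[0]
    dy = right_inner[1] - left_inner[1]
    angle = np.degrees(np.arctan2(dy, dx))   # rotation needed to level the eyes
    scale = target_dist / np.hypot(dx, dy)   # scale to a fixed inter-corner distance
    center = ((left_inner[0] + right_inner[0]) / 2.0,
              (left_inner[1] + right_inner[1]) / 2.0)
    M = cv2.getRotationMatrix2D(center, angle, scale)
    h, w = frame.shape[:2]
    return cv2.warpAffine(frame, M, (w, h))
```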

Table 1. AUs used in this paper and their corresponding facial movements.

AU category | Facial movement
AU1 | Inner eyebrow elevation
AU2 | Raised outer eyebrows
AU4 | Eyebrow down
AU5 | Raise upper eyelids and stare
AU7 | Eyelid tightening
AU9 | Wrinkled nose
AU12 | Stretching the corners of the mouth
AU15 | Pulling down the corners of the mouth
AU18 | Pout
AU19 | Stick out the tongue
AU27 | Mouth wide open

2.2 Classification Methods

2.2.1 K-nearest Neighbor Classifier

The principle of the K-nearest neighbor (KNN) algorithm [10] is to find the K instances in the training data set that are closest to the input data and assign the input to their category. The choice of K therefore significantly impacts the classification results of the KNN algorithm. The KNN algorithm depends on the distance between data samples. This paper used the most common distance function, the Euclidean distance. The distance d(x, y) between samples x and y is defined as

(1)
$\mathrm{d}\left(\mathrm{x},\mathrm{y}\right)=\sqrt{\sum _{\mathrm{i}=1}^{\mathrm{n}}\left(\mathrm{x}_{\mathrm{i}}-\mathrm{y}_{\mathrm{i}}\right)^{2}}$.

The attribute with the largest value range dominates the distance in Eq. (1), so the values must be normalized first. The simplest 0-1 (min-max) normalization is used:

(2)
$\mathrm{x}'=\frac{\mathrm{x}-\min }{\max -\min }$,

where max and min denote the maximum and minimum values of the variable, respectively.
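As a minimal illustration of Eqs. (1) and (2), the following sketch applies 0-1 normalization followed by a Euclidean-distance KNN classifier. It assumes scikit-learn; the feature matrix and labels are placeholders, not the study's AU data.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import MinMaxScaler

# Placeholder AU feature vectors (rows = samples) and labels (1 = depressed).
X = np.array([[0.57, 0.68, 0.81], [0.39, 0.24, 0.53],
              [0.60, 0.70, 0.79], [0.41, 0.26, 0.55]])
y = np.array([1, 0, 1, 0])

# Eq. (2): 0-1 normalization so no single attribute dominates the distance.
X_norm = MinMaxScaler().fit_transform(X)

# Eq. (1): Euclidean distance is the default metric (Minkowski with p=2).
knn = KNeighborsClassifier(n_neighbors=3).fit(X_norm, y)
print(knn.predict(X_norm[:1]))
```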

2.2.2 Logistic Regression Classifier

This study distinguished psychologically healthy college students from those suffering from psychological depression. Because the classification target is binary, the logistic regression (LR) model chosen is a binomial logistic regression model [11], which is expressed as

(3)
$h_{\theta }\left(x\right)=g\left(\theta ^{T}x\right)=\frac{1}{1+e^{-{\theta ^{T}}x}}$,

where x denotes the feature vector, and g denotes the standard S-shaped logistic function $\mathrm{g}\left(\mathrm{z}\right)=\frac{1}{1+\mathrm{e}^{-\mathrm{z}}}$, whose value range is $\left(0,1\right)$. When determining the decision boundary, it was assumed that

(4)
$\mathrm{P}\left(\mathrm{y}=1\left| \mathrm{x};\Theta \right.\right)=\mathrm{h}_{\Theta }\left(\mathrm{x}\right)$,
(5)
$\mathrm{P}\left(\mathrm{y}=0\left| \mathrm{x};\Theta \right.\right)=1-\mathrm{h}_{\Theta }\left(\mathrm{x}\right)$.

When $\mathrm{h}_{\Theta }\left(\mathrm{x}\right)\geq 0.5$, y=1 is predicted, which is considered a positive sample; when $\mathrm{h}_{\Theta }\left(\mathrm{x}\right)<0.5$, y=0 is predicted, which is considered a negative sample.
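The decision rule of Eqs. (3)-(5) can be written out directly. The sketch below uses placeholder parameters θ rather than fitted values, so it illustrates only the form of the prediction.

```python
import numpy as np

def sigmoid(z):
    # g(z) = 1 / (1 + e^(-z)), the logistic function of Eq. (3).
    return 1.0 / (1.0 + np.exp(-z))

def predict(theta, x):
    """Return 1 (positive sample) if h_theta(x) >= 0.5, else 0, per Eqs. (4)-(5)."""
    h = sigmoid(theta @ x)  # h_theta(x) = g(theta^T x)
    return int(h >= 0.5)

theta = np.array([0.8, -1.2, 2.0])  # placeholder weights, not fitted values
x = np.array([1.0, 0.57, 0.68])     # bias term plus two placeholder AU features
print(predict(theta, x))
```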

2.2.3 Support Vector Machine Classifier

The Support Vector Machine (SVM) [12] has the advantages of rapid solving speed and strong generalization ability. The SVM is usually used for binary classification problems, such as whether a person has psychological depression. For linearly separable problems, the SVM's separating hyperplane is given by

(6)
$\mathrm{W}^{\mathrm{T}}\mathrm{X}+\mathrm{b}=0$,

where W denotes the normal vector of the hyperplane, and b is a constant, which is the intercept of the hyperplane.

The SVM algorithm seeks a hyperplane $\mathrm{W}^{\mathrm{T}}\mathrm{X}+\mathrm{b}=0$ that separates the two classes. The binary variable $\mathrm{y}\in \left\{-1,1\right\}$ represents the negative and positive classes. The mathematical expressions are

(7)
when $y_{i}=1$, $W^{T}X_{i}+b\geq 0$,
(8)
when $y_{i}=-1$, $W^{T}X_{i}+b<0$.

The two boundary hyperplanes are translated until they touch the nearest sample vectors in space, and the margin $\mathrm{d}=\frac{2}{\left\| \mathrm{W}\right\| }$ between them is calculated; a larger d value indicates a better classification effect. At this point, the SVM can be transformed into an equivalent convex quadratic optimization problem:

minimize $\frac{1}{2}\left\| W\right\| ^{2}$, subject to $y_{i}\left(W^{T}X_{i}+b\right)\geq 1$, $i=1,2,\ldots ,n$.
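A linear SVM corresponding to Eq. (6) and the optimization problem above can be sketched with scikit-learn; the kernel choice, penalty parameter, and data here are assumptions for illustration, not the paper's configuration.

```python
import numpy as np
from sklearn.svm import SVC

# Placeholder normalized AU features; y in {-1, 1} marks the two classes.
X = np.array([[0.9, 0.8], [0.8, 0.9], [0.2, 0.1], [0.1, 0.2]])
y = np.array([1, 1, -1, -1])

# Linear kernel: solves min (1/2)||W||^2 s.t. y_i(W^T x_i + b) >= 1 (soft margin).
clf = SVC(kernel="linear", C=1.0).fit(X, y)
W, b = clf.coef_[0], clf.intercept_[0]
print("margin width d = 2/||W|| =", 2.0 / np.linalg.norm(W))
print(clf.predict([[0.85, 0.85]]))
```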

Fig. 1. Examples of facial expression image collection.

3. Case Study

3.1 Experimental Subjects

Fifteen college students with healthy psychological conditions and 15 college students with psychological depression were selected as experimental subjects. The students with healthy psychological conditions formed the control group, and the students with psychological depression formed the experimental group. All subjects signed an informed consent form and participated voluntarily. The exclusion criteria for the subjects with psychological depression were as follows: (1) having cardiovascular diseases or other mental illnesses; (2) having severe depression or severe suicidal tendencies; (3) being pregnant or breastfeeding; (4) having consumed stimulants, such as strong tea or coffee, or engaged in vigorous exercise within 24 hours before the test. These criteria were applied because the above conditions can stimulate facial expressions and induce non-depressive expression changes. Selecting the college students with healthy psychological conditions required one additional exclusion criterion beyond those above: a Hamilton Depression Scale (HAMD-17) [13] score of 7 or higher.

3.2 Data Collection and Processing

The data collection of this study included three parts: audio-visual appreciation, video watching, and audio playing [14]. The experimental subjects watched these three materials in a fixed environment and made the corresponding facial expressions. After each part ended, the subjects rested for ten minutes to adjust their emotions and avoid interference between the parts. The experiment captured the facial expression responses of each subject by recording video with a camera, converted the video into video frames, and obtained the input facial image sequence through steps such as face alignment and face scaling. According to the key point positions of the face, two eyebrow points, two eye points, the nasal tip, and two mouth corner points were selected for automatic calibration to extract the AU features of the face in the video. The extracted features were used as experimental data, and the data were normalized to calculate the AU values. The lengths of the recorded videos could be inconsistent because recording was started and stopped manually. To prevent video length from affecting the experimental results, each video was cropped to remove information irrelevant to the experiment, leaving only the important parts for video frame conversion.
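As one way to turn per-frame AU outputs into per-subject features, the sketch below averages AU intensity columns from a CSV. The column naming (AUxx_r) follows OpenFace's convention, and the file path is hypothetical; both are assumptions about the extraction tool's output layout.

```python
import pandas as pd

# Hypothetical per-frame AU intensity file; one row per video frame.
df = pd.read_csv("subject01_aus.csv")

# AU intensity columns of interest (naming follows OpenFace's AUxx_r
# convention; adjust if the extraction tool labels them differently).
au_cols = ["AU01_r", "AU02_r", "AU04_r", "AU05_r", "AU07_r", "AU09_r",
           "AU12_r", "AU15_r", "AU18_r", "AU19_r", "AU27_r"]
au_cols = [c for c in au_cols if c in df.columns]  # keep only available columns

# Per-subject feature vector: the mean intensity of each AU over all frames.
features = df[au_cols].mean()
print(features)
```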

3.3 Evaluation Indicators

Three evaluation indicators, accuracy, mean absolute error (MAE), and root mean square error (RMSE) [15], were used to evaluate the recognition and detection performance of the models. Accuracy is defined as

(9)
$\mathrm{Acc}=\frac{\mathrm{TP}+\mathrm{TN}}{\mathrm{TP}+\mathrm{FP}+\mathrm{TN}+\mathrm{FN}}$,

where TP indicates the number of depressed samples correctly identified as depressed; FP indicates the number of depressed samples incorrectly identified as non-depressed; FN indicates the number of non-depressed samples incorrectly identified as depressed; TN indicates the number of non-depressed samples correctly identified as non-depressed.

The MAE was calculated as follows:

(10)
$\mathrm{MAE}=\frac{\sum \left| \mathrm{x}-\mathrm{y}\right| }{\mathrm{n}}$.

The RMSE was calculated as follows:

(11)
$\text{RMSE}=\sqrt{\frac{1}{\mathrm{n}}\sum \left(\mathrm{x}-\mathrm{y}\right)^{2}}$.

In the above two equations, x denotes the predicted value; y denotes the true value; n denotes the number of identifications.
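The three indicators in Eqs. (9)-(11) can be computed directly; the arrays below are placeholder labels and predictions, not the study's results.

```python
import numpy as np

y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])  # placeholder ground-truth labels
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0])  # placeholder predictions

# Eq. (9): accuracy = (TP + TN) / (TP + FP + TN + FN).
acc = np.mean(y_true == y_pred)

# Eqs. (10)-(11): MAE and RMSE between predicted and true values.
mae = np.mean(np.abs(y_pred - y_true))
rmse = np.sqrt(np.mean((y_pred - y_true) ** 2))
print(acc, mae, rmse)
```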

3.4 Result Analysis

AU1, AU2, AU4, AU5, AU7, AU9, AU12, AU15, AU18, AU19, and AU27 (Table 1) were selected as evaluation criteria for the facial feature analysis. OpenFace was used to extract the AU data, and the average of each AU was calculated separately for each subject in the experimental and control groups, as shown in Fig. 2. A larger AU value indicates a more intense facial movement corresponding to that AU and a more intense expressed emotion. Significant differences were noted in the averages of six facial actions (Fig. 2), indicating significant differences in facial expression between college students with psychological depression and those with healthy psychological conditions. Therefore, these six facial action features were selected as the basis for the subsequent classification models to identify college students with psychological depression.

The AU categories listed in Table 2 showed significant differences in average values between the experimental and control groups. AU4 corresponds to lowering the eyebrows, with the eyebrows as the key facial points. AU7 corresponds to eyelid tightening, with the two eyes as the key points. AU9 corresponds to wrinkling the nose, with the nasal tip as the key point. AU12 corresponds to stretching the corners of the mouth, AU18 to pouting, and AU27 to opening the mouth wide; for these three, the key points are the two mouth corners. The average value of an AU represents the activity range of the AU, and the standard deviation represents the richness of the AU activity (Table 2). The standard deviations and P values were calculated; the P value for each AU category was less than 0.05, indicating a significant difference between the two groups for all six AU categories. Therefore, these features could be used in the subsequent analysis with the classification models.
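The group comparison described above (per-AU means, standard deviations, and P values) corresponds to a standard two-sample test. The sketch below assumes an independent-samples t-test, which the paper does not name explicitly, and uses placeholder AU values.

```python
import numpy as np
from scipy import stats

# Placeholder per-subject mean AU4 intensities for the two groups.
experimental = np.array([0.61, 0.55, 0.58, 0.60, 0.52])
control      = np.array([0.40, 0.38, 0.41, 0.37, 0.43])

print("means:", experimental.mean(), control.mean())
print("SDs:  ", experimental.std(ddof=1), control.std(ddof=1))

# Two-sample t-test; P < 0.05 is taken as a significant group difference.
t, p = stats.ttest_ind(experimental, control)
print("P value:", p)
```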

Three classification models were used to calculate a predicted value for each frame extracted from a video, and the average was taken as the predicted value for the video. The performance of the classification models was measured using the MAE and RMSE [15]. The MAE and RMSE values of the KNN, LR, and SVM models decreased in that order (Table 3). As these are error values, smaller values indicate better recognition and detection performance. Therefore, the three classification models ranked from worst to best as KNN < LR < SVM.
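The frame-to-video aggregation described above reduces to averaging per-frame predictions; a minimal sketch follows, with placeholder scores and an assumed 0.5 decision threshold.

```python
import numpy as np

# Placeholder per-frame depression scores from a classifier for one video.
frame_preds = np.array([0.71, 0.64, 0.80, 0.58, 0.69])

# Video-level prediction: the mean of the frame predictions,
# thresholded at 0.5 for the final depressed / non-depressed label.
video_score = frame_preds.mean()
label = int(video_score >= 0.5)
print(video_score, label)
```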

After determining the facial expression features used for model classification, the three classification models were used to identify and classify the subjects in the experimental group, the control group, and the two groups combined, as listed in Table 4. The recognition and classification accuracies of the KNN, LR, and SVM models exceeded 80%, 90%, and 94%, respectively, again giving the ranking KNN < LR < SVM. Therefore, all three classification models were effective at identifying and classifying college students with psychological depression based on facial expression feature analysis.

Fig. 2. Comparison of the average values of AUs between the two groups.
Table 2. Analysis of AU category values of the two groups.

AU category | AU4 (Eyebrow down) | AU7 (Eyelid tightening) | AU9 (Wrinkled nose) | AU12 (Stretching the corners of the mouth) | AU18 (Pout) | AU27 (Open the mouth wide)
Experimental group, average value | 0.5724 | 0.6812 | 0.8107 | 0.9233 | 0.7720 | 0.3712
Experimental group, standard deviation | 0.3157 | 0.4218 | 0.5034 | 0.5127 | 0.4291 | 0.2167
Control group, average value | 0.3918 | 0.2431 | 0.5301 | 0.5617 | 0.6119 | 0.1109
Control group, standard deviation | 0.1573 | 0.0954 | 0.3451 | 0.3683 | 0.4269 | 0.0726
P value | 0.037 | 0.024 | 0.022 | 0.018 | 0.043 | 0.029

Table 3. Identification results of different classification models.

Group | MAE (KNN) | MAE (LR) | MAE (SVM) | RMSE (KNN) | RMSE (LR) | RMSE (SVM)
Experimental group | 7.03 | 6.89 | 6.17 | 9.21 | 8.91 | 7.83
Control group | 7.24 | 6.77 | 6.29 | 9.53 | 8.66 | 7.69
Two groups combined | 7.14 | 6.83 | 6.23 | 9.37 | 8.79 | 7.76

Table 4. Experimental results of different classification models for identifying and classifying psychological depression in college students.

Group | Accuracy (KNN) | Accuracy (LR) | Accuracy (SVM)
Experimental group | 82.67% | 90.43% | 94.60%
Control group | 84.55% | 93.50% | 95.00%
Two groups combined | 83.61% | 91.97% | 94.80%

4. Discussion

Psychological depression carries high risk among young people, most of whom are in college and face mounting pressures. Being young and lacking social experience, they often do not know how to solve their problems and may belittle themselves and feel worthless, which can lead to psychological depression. In this study, facial expression features were analyzed by recording the subjects' responses to three parts of audio-visual material, i.e., audio-visual appreciation, video watching, and audio playing, with a camera. The feasibility of identifying and classifying college students with psychological depression through facial expression feature analysis was then assessed using three classification models: KNN, LR, and SVM. The results showed large differences in the average values of six facial actions, i.e., AU4, AU7, AU9, AU12, AU18, and AU27; their standard deviations and P values indicated significant differences, so these features were used in the subsequent classification models. After identification and classification, the accuracy of the three classification models ranked from low to high as KNN < LR < SVM. Based on these findings, the following recommendations are proposed for colleges regarding students with psychological depression.

(1) Schools can establish social platforms, similar to online forums, where college students with psychological depression can express and talk about their emotions [16]. Schools can also use such platforms to monitor these students' psychological status, enabling counselors, class teachers, and other staff to better understand the students' conditions and provide timely help.

(2) Schools can regularly organize cultural and recreational activities. College students' after-school cultural and recreational activities are not only entertainment but also have a positive effect on regulating emotions and relaxing the mind and body. Such activities are an indispensable part of student life at the college stage. Regular cultural and recreational activities can help cultivate good psychological qualities in college students, maintain mental health, and prevent the onset of psychological depression.

(3) Schools can promote mental health knowledge. Each university has its own psychological counseling room, but the dissemination of mental health knowledge is not yet universal. With increasingly fierce competition in social life, college students face growing pressure in academics, relationships, employment, and future planning, which can lead to mental health problems. Colleges can establish mental health courses and organize related lectures and other activities to promote mental health knowledge and provide students with professional support when needed.

5. Conclusion

This paper introduced psychological depression, feature analysis, and classifiers. By analyzing the facial expression features of experimental subjects, the differences in facial expression between individuals with psychological depression and those with healthy mental states were explored. The KNN, LR, and SVM algorithms were used to construct psychological depression recognition and classification models based on facial feature analysis, and the feasibility of identifying and classifying individuals with psychological depression through facial expression feature analysis was validated. The experimental results revealed significant differences in the average values of six facial expression features (AU4, AU7, AU9, AU12, AU18, and AU27); the P values were less than 0.05, so these features could be used for subsequent model analysis. The accuracies of the KNN, LR, and SVM classification models were approximately 84%, 92%, and 95%, respectively. This paper demonstrates that a classification model constructed based on facial expression features can be used to identify and classify college students with psychological depression.

A limitation of this paper is that only three classification models (KNN, LR, and SVM) were used for facial expression recognition, and all three require the manual extraction of features from facial expression images before recognition. Manually extracted features make it difficult to describe the rules of expressions in depth. Therefore, future research will increase the number of samples and study neural network algorithms that can automatically extract image features.

REFERENCES

[1] H. Yildirim-Celik, S. Eroglu, K. Oguz, G. Karakoc-Tugru, Y. Erdogan, D. Isman-Haznedaroglu, C. Eker, and A. S. Gonul, ``Emotional context effect on recognition of varying facial emotion expression intensities in depression,'' Journal of Affective Disorders, Vol. 308, pp. 141-146, Jul. 2022.
[2] B. B. Velichkovsky, F. R. Sultanova, and D. V. Tatarinov, ``Explicit and Implicit Processing of Facial Expressions in Depression,'' Experimental Psychology, Vol. 14, No. 2, pp. 24-36, Jan. 2021.
[3] E. Porter-Vignola, L. Booij, G. Bossé-Chartier, P. Garel, and C. M. Herba, ``Emotional Facial Expression Recognition and Depression in Adolescent Girls: Associations with Clinical Features,'' Psychiatry Research, Vol. 298, No. 8, pp. 1-8, Jan. 2021.
[4] X. Zhou, Z. Wei, M. Xu, S. Qu, and G. Guo, ``Facial Depression Recognition by Deep Joint Label Distribution and Metric Learning,'' IEEE Transactions on Affective Computing, Vol. 13, No. 3, pp. 1605-1618, Sep. 2020.
[5] L. Chen, X. Ma, N. Zhu, H. Xue, H. Zeng, H. Chen, X. Wang, and X. Ma, ``Facial Expression Recognition With Machine Learning and Assessment of Distress in Patients With Cancer,'' Oncology Nursing Forum, Vol. 48, No. 1, pp. 81-93, Jan. 2021.
[6] W. Guo, H. Yang, Z. Liu, Y. Xu, and B. Hu, ``Deep Neural Networks for Depression Recognition Based on 2D and 3D Facial Expressions Under Emotional Stimulus Tasks,'' Frontiers in Neuroscience, Vol. 15, pp. 1-19, Apr. 2021.
[7] M. Yang, Y. Ma, Z. Liu, H. Cai, X. Hu, and B. Hu, ``Undisturbed Mental State Assessment in the 5G Era: A Case Study of Depression Detection Based on Facial Expressions,'' IEEE Wireless Communications, Vol. 28, No. 3, pp. 46-53, Jun. 2021.
[8] J. Singh and G. Goyal, ``Decoding depressive disorder using computer vision,'' Multimedia Tools and Applications, Vol. 80, No. 6, pp. 8189-8212, Mar. 2021.
[9] S. Namba, W. Sato, M. Osumi, and K. Shimokawa, ``Assessing Automated Facial Action Unit Detection Systems for Analyzing Cross-Domain Facial Expression Databases,'' Sensors, Vol. 21, No. 12, pp. 1-18, Jun. 2021.
[10] B. S. Hantono, L. E. Nugroho, and P. I. Santosa, ``Mental Stress Detection via Heart Rate Variability using Machine Learning,'' International Journal on Electrical Engineering and Informatics, Vol. 12, No. 3, pp. 431-444, Sep. 2020.
[11] B. Nicholson, S. Morse, T. Lundgren, N. Vadiei, and S. Bhattacharjee, ``Effect of depression on health behavior among myocardial infarction survivors in the United States,'' The Mental Health Clinician, Vol. 10, No. 4, pp. 222-231, Jul. 2020.
[12] K. Srinivasan, N. Mahendran, D. R. Vincent, C. Y. Chang, and S. Syed-Abdul, ``Realizing an Integrated Multistage Support Vector Machine Model for Augmented Recognition of Unipolar Depression,'' Electronics, Vol. 9, No. 4, pp. 1-16, Apr. 2020.
[13] P. Bech, P. Allerup, L. F. Gram, N. Reisby, R. Rosenberg, O. Jacobsen, and A. Nagy, ``The Hamilton Depression Scale,'' Acta Psychiatrica Scandinavica, Vol. 63, No. 3, pp. 290-299, 1981.
[14] Y. Zhang, Y. Zhang, J. Chen, W. Chen, Z. Chen, and J. Wang, ``Magnetoencephalogram analysis of depression based on multivariable sign transfer entropy,'' Journal of Physics: Conference Series, Vol. 1592, No. 1, pp. 1-8, Aug. 2020.
[15] L. Yang, D. Jiang, and H. Sahli, ``Feature Augmenting Networks for Improving Depression Severity Estimation from Speech Signals,'' IEEE Access, Vol. 8, pp. 24033-24045, Jan. 2020.
[16] Y. Ding, X. Chen, Q. Fu, and S. Zhong, ``A Depression Recognition Method for College Students Using Deep Integrated Support Vector Algorithm,'' IEEE Access, Vol. 8, pp. 75616-75629, Apr. 2020.

Author

Yachai Sun

Yachai Sun was born in Hebei, China, in 1987. From 2013 to 2016, she studied at the China University of Mining and Technology and received her master’s degree in 2016. Currently, she works at the Yanching Institute of Technology. She has published seven papers. Her main research fields are mental health education and ideological and political education.

Haixia Zhang

Haixia Zhang was born in Langfang, Hebei, China, in 1982. She received her master’s degree from the Beijing University of Chemical Technology, China. She currently works at the Intelligent Engineering Institute, Yanching Institute of Technology. She has published five papers. Her main research fields are mental health education and ideological and political education.

Jiyu Men

Jiyu Men was born in Hebei, China, in 1988. He received his master's degree from Sehan University. Currently, he works at the Yanching Institute of Technology. He has published five papers. His research interests include the psychology of aesthetic education.