Yachai Sun$^{1}$
Haixia Zhang$^{2}$
Jiyu Men$^{1}$
$^{1}$ (College of Environment and Health, Yanching Institute of Technology, Langfang, Hebei 065201, China)
$^{2}$ (School of Intelligent Engineering, Yanching Institute of Technology, Langfang, Hebei 065201, China)
Copyright © The Institute of Electronics and Information Engineers(IEIE)
Keywords
Feature analysis, Depression identification, Classifier algorithm
1. Introduction
Depression is currently the most common psychological disorder in the world [1]. According to the ``2022 National Depression Blue Book,'' based on the Chinese mental
health survey, there are currently 95 million people with depression in China. Approximately
280,000 people commit suicide each year, of which 40% suffer from depression. Among
depression patients, those aged 18-24 years accounted for 35.32% of the total. Of
these, 50% were students, and 41% dropped out of school due to depression. Velichkovsky
et al. [2] reported that depression patients took longer to process negative emotional stimuli,
and healthy participants spent more time processing neutral emotional stimuli. They
also found that the images of 57 moderately depressed patients showed faces with neutral
or angry expressions. Porter-Vignola et al. [3] used linear mixed models for data analysis and reported that adolescents with depression
recognized sadness slightly faster than those without depression. Zhou et al. [4] observed significant differences in facial expression between different populations
with the same depression rating through a constructed prediction model.
The model showed promising results for identifying depression through facial expressions.
Chen et al. [5] extracted the facial expression features from face images using directional gradient
histograms. They used a support vector machine as a classifier and concluded that
facial expression features had feasible discriminative ability for depression. Guo
et al. [6] produced a dataset by having their subjects perform five emotion-evoking tasks. They
used a Deep Belief Network (DBN) model to extract facial expressions. They concluded
that facial expressions could be used to recognize patients with potential depression
risk and that the identification rate was generally higher for females than for males.
Yang et al. [7] proposed a new interference-free psychological state assessment model that used facial
videos collected by 5G terminals to assess the users’ psychological states in real
time and found that the depression assessment model was effective. Singh and Goyal
[8] used computer vision to decode depression. They reported that computer vision achieved
74% accuracy in identifying depression patients and concluded that the facial expressions
of depression patients under any given psychological stimulus differed from those
of non-depression patients.
Therefore, this study examined the differences in facial expression between college
students with psychological depression and those with a healthy psychology by analyzing
the facial expression features and then constructed a psychological depression recognition
model based on facial feature analysis to test the feasibility of using facial expression
features in identifying college students with psychological depression. This work
assists college psychological counselors in judging whether students have psychological
depression and lays a theoretical foundation for identifying and classifying college
students with psychological depression through facial feature analysis in the future.
2. Methods for Depression Recognition and Classification
2.1 Facial Expression Feature Analysis
Facial expressions are the most direct way for people to express emotions. Patients
with psychological depression often give people a feeling of unhappiness and depression,
and their facial expressions are often characterized by pulling down the corners of
their mouths, crying easily, and frowning. The most basic unit of a person’s facial
expressions is the facial action unit (AU). Once the state of a person’s facial AUs
and the expressions represented by their combined actions are known, the person’s
psychological state can be inferred from facial expressions. In facial expression
research, AUs are the standard units for describing facial expressions [9]; those used
in this paper are listed in Table 1. AUs can be combined to form facial expressions
such as frowning, pursing the lips, and smiling. Because combinations and intensity
variations of AUs compose a wide variety of facial expressions, AU feature selection
can be used to discard the facial expression features irrelevant to this experiment
and retain the depression-related features that are useful for the classification
model.
The selected AUs for this experiment involved four parts: eyebrows, eyes, nose, and
mouth. The main detection and recognition process includes face detection, face alignment,
face recognition, and AU feature extraction. The feature extraction for psychological
depression in this paper focused mainly on facial expressions. Face detection is therefore
performed first on each image to locate the face region. Next, the faces in all video
frames of the same experimental subject are aligned by scaling and rotating them to
match the inner eye-corner points of the first frame, so that the eye corners of the
face are horizontal in every image. Facial movement recognition is then performed on
the aligned faces to extract the AU features. Finally, the facial
AUs used for the classification model were selected by comparing the difference between
the average values of AUs of the two groups of experimental subjects.
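The alignment step described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes the two eye-corner landmarks are available as (x, y) pixel coordinates, and the function names `alignment_params` and `rotate_point` are hypothetical.

```python
import math

def alignment_params(left_corner, right_corner, ref_distance):
    """Return (angle_deg, scale) needed to level the eye corners and to
    match the inter-corner distance measured in the first frame."""
    dx = right_corner[0] - left_corner[0]
    dy = right_corner[1] - left_corner[1]
    angle_deg = math.degrees(math.atan2(dy, dx))  # tilt of the eye line
    scale = ref_distance / math.hypot(dx, dy)     # resize factor
    return angle_deg, scale

def rotate_point(point, center, angle_deg):
    """Rotate `point` about `center` by -angle_deg, undoing the tilt."""
    theta = math.radians(-angle_deg)
    x, y = point[0] - center[0], point[1] - center[1]
    return (center[0] + x * math.cos(theta) - y * math.sin(theta),
            center[1] + x * math.sin(theta) + y * math.cos(theta))

# Example: an eye line tilted by 45 degrees in a later frame.
angle, scale = alignment_params((100, 100), (200, 200), ref_distance=100.0)
leveled_right = rotate_point((200, 200), (100, 100), angle)
```

After the rotation, the right corner has the same vertical coordinate as the left corner, which is the "horizontal eye corners" condition in the text.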
Table 1. AUs used in this paper and their corresponding facial movements.
AU category | Facial movements
AU1 | Inner eyebrow elevation
AU2 | Raised outer eyebrows
AU4 | Eyebrow down
AU5 | Raise upper eyelids and stare
AU7 | Eyelid tightening
AU9 | Wrinkled nose
AU12 | Stretching the corners of the mouth
AU15 | Pulling down the corners of the mouth
AU18 | Pout
AU19 | Stick out the tongue
AU27 | Mouth wide open
2.2 Classification Methods
2.2.1 K-nearest Neighbor Classifier
The principle of the K-nearest neighbor (KNN) algorithm [10] is to find the K instances in the training data set that are closest to the input
sample and to assign the input to the category that is most common among them. The
choice of K therefore has a significant impact on the classification results of the
KNN algorithm. The KNN algorithm depends on the distance between data samples. This
paper used the most common distance function, the Euclidean distance. For two samples
$x=\left(x_{1},\ldots ,x_{n}\right)$ and $y=\left(y_{1},\ldots ,y_{n}\right)$ with n
features, the distance d(x,y) is defined as
$$d\left(x,y\right)=\sqrt{\sum _{i=1}^{n}\left(x_{i}-y_{i}\right)^{2}}$$
The attribute with the largest range of values has the greatest impact on the distance,
so the values need to be normalized. The simplest 0-1 normalization is used as follows:
$$x'=\frac{x-\min }{\max -\min }$$
where max stands for the maximum value of a variable, and min stands for the minimum
value of a variable.
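A minimal sketch of the procedure described above, combining 0-1 normalization, the Euclidean distance, and majority voting among the K nearest neighbors. The toy feature values, the labels, and the choice K=3 are assumptions made for illustration and are not taken from the experiment.

```python
import math
from collections import Counter

def min_max_normalize(rows):
    """Scale every feature column to [0, 1] with (x - min) / (max - min)."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]

def euclidean(x, y):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def knn_predict(train_x, train_y, query, k=3):
    """Return the majority label among the k training samples nearest to query."""
    nearest = sorted(zip(train_x, train_y),
                     key=lambda pair: euclidean(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Toy data: two features on very different scales, label 1 = depressed.
raw = [[0.2, 10], [0.3, 12], [0.8, 30], [0.9, 28], [0.85, 29], [0.25, 11]]
scaled = min_max_normalize(raw)
train_x, query = scaled[:5], scaled[5]
train_y = [0, 0, 1, 1, 1]
predicted = knn_predict(train_x, train_y, query, k=3)
```

Normalizing the training samples and the query together keeps both on the same scale; without it, the second feature would dominate the distance.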
2.2.2 Logistic Regression Classifier
This study examined college students with a healthy psychology and those who suffer
from psychological depression. Therefore, the logistic regression (LR) model chosen
is a binomial logistic regression model [11], which is expressed as
$$\mathrm{h}_{\Theta }\left(\mathrm{x}\right)=\mathrm{g}\left(\Theta ^{T}\mathrm{x}\right)=\frac{1}{1+\mathrm{e}^{-\Theta ^{T}\mathrm{x}}}$$
where x denotes the feature vector, $\Theta$ denotes the parameter vector, and g denotes
the common S-shaped logistic function. The value range of $\mathrm{g}\left(\mathrm{z}\right)=\frac{1}{1+\mathrm{e}^{-\mathrm{z}}}$
is $\left(0,1\right)$. When determining the boundary, a threshold of 0.5 was used.
When $\mathrm{h}_{\Theta }\left(\mathrm{x}\right)$ is larger than or equal to 0.5,
y=1 is predicted, which is considered a positive sample; when $\mathrm{h}_{\Theta
}\left(\mathrm{x}\right)$ < 0.5, y=0 is predicted, which is considered
a negative sample.
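The decision rule above can be illustrated with a short sketch. The parameter values are made up for the example, and the feature vector is assumed to carry a leading 1 so that the first parameter acts as the bias term.

```python
import math

def sigmoid(z):
    """S-shaped logistic function g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(theta, x):
    """h_theta(x) = g(theta^T x); x is assumed to start with 1 for the bias."""
    z = sum(t * xi for t, xi in zip(theta, x))
    return sigmoid(z)

def predict_label(theta, x, threshold=0.5):
    """Return 1 (positive sample) if h_theta(x) >= threshold, else 0."""
    return 1 if predict_proba(theta, x) >= threshold else 0

theta = [0.0, 1.0]                              # hypothetical parameters
label_pos = predict_label(theta, [1.0, 2.0])    # z = 2, so h > 0.5
label_neg = predict_label(theta, [1.0, -2.0])   # z = -2, so h < 0.5
```

Because g(0) = 0.5, the threshold of 0.5 corresponds to the boundary where the linear term equals zero.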
2.2.3 Support Vector Machine Classifier
The Support Vector Machine (SVM) [12] has the advantages of a rapid solving speed and strong generalization ability. The
SVM is usually used to deal with binary classification problems, such as whether one
has psychological depression. For linearly separable problems in SVM, the formula
for the hyperplane is
$$W^{T}X+b=0$$
where W denotes the normal vector of the hyperplane, and b is a constant, which is
the intercept of the hyperplane.
The SVM algorithm is used to find a hyperplane $W^{T}X+b=0$ that separates the two
classes. The binary variable $\mathrm{y}\in \left\{-1,1\right\}$ represents the negative
and positive classes. The mathematical expressions of the two class boundaries are
$$W^{T}X_{i}+b\geq 1\ \text{for}\ y_{i}=1,\qquad W^{T}X_{i}+b\leq -1\ \text{for}\ y_{i}=-1$$
The two boundaries are translated until they touch the nearest sample vectors in space,
and the distance $\mathrm{d}=\frac{2}{\left\| \mathrm{W}\right\| }$ between the two
boundaries is calculated; a larger d value indicates a better classification
effect. Maximizing d can be transformed into an equivalent quadratic convex
optimization problem:
minimize: $\frac{1}{2}\left\| W\right\| ^{2}$
s.t. $y_{i}\left(W^{T}X_{i}+b\right)\geq 1$, i=1,2,3,${\ldots}$,n.
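To make the constraint and the margin concrete, the sketch below checks whether a candidate hyperplane satisfies $y_{i}(W^{T}X_{i}+b)\geq 1$ for every sample and computes the margin $2/\|W\|$. It does not solve the optimization problem; the sample points and the candidate W and b are hypothetical.

```python
import math

def margin_width(w):
    """Distance 2 / ||W|| between the two class boundaries."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))

def satisfies_constraints(w, b, samples, labels):
    """Check y_i * (W^T X_i + b) >= 1 for every training sample."""
    return all(
        y * (sum(wi * xi for wi, xi in zip(w, x)) + b) >= 1
        for x, y in zip(samples, labels)
    )

# Hypothetical linearly separable samples with labels in {-1, 1}.
samples = [(1.0, 0.0), (2.0, 1.0), (-1.0, 0.0), (-3.0, 2.0)]
labels = [1, 1, -1, -1]

feasible = satisfies_constraints([1.0, 0.0], 0.0, samples, labels)
too_flat = satisfies_constraints([0.5, 0.0], 0.0, samples, labels)
width = margin_width([1.0, 0.0])
```

Shrinking W widens the margin, but the second candidate shows that it eventually violates the constraints; the optimization finds the smallest $\|W\|$ that remains feasible.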
Fig. 1. Examples of facial expression image collection.
3. Case Study
3.1 Experimental Subjects
Fifteen college students with healthy psychological conditions and 15 college students
with psychological depression were selected as experimental subjects. In addition,
college students with healthy psychological conditions were selected as the control
group, and college students with psychological depression were selected as the experimental
group. All experimental subjects signed an informed consent form and voluntarily participated.
The exclusion criteria for the experimental subjects with psychological depression
were as follows: (1) having cardiovascular diseases or other mental illnesses;
(2) having severe depression or severe suicidal tendencies; (3) being pregnant
or breastfeeding; (4) having consumed stimulating foods or drinks, such as strong tea
or coffee, or engaged in vigorous exercise within 24 hours before the test. These
criteria were applied because the above conditions and behaviors can affect facial
expressions and induce non-depressive expression changes. Selecting college students
with healthy psychological conditions
required an additional criterion besides the above exclusion criteria: a Hamilton
Depression Scale (HAMD-17) [13] score greater than or equal to 7.
3.2 Data Collection and Processing
The data collection of this study included three parts: audio-visual appreciation,
video watching, and audio playing [14]. The experimental subjects watched these three materials in a fixed environment and
made corresponding facial expressions. The experimental subjects had a ten-minute
rest to adjust their emotions after each part ended to avoid mutual interference between
each part of the material. The facial expression responses of each experimental subject
were recorded with a video camera, the video was converted into video frames, and the
input facial image sequence was obtained through steps such as face alignment and face
scaling. According to the key point positions of the face,
two eyebrow points, two eye points, nasal tip, and two mouth corner points were selected
for automatic calibration to realize the extraction of AU features of the human face
in the video. The extracted features were used as data for the experiment. The data
were normalized to calculate the AU values. The length of the shot video might be
inconsistent because the start and end of the video shooting were manually implemented.
The impact of the video length on the experimental results was avoided by cropping
the length of the shot video and cutting the irrelevant video information about the
experiment, leaving only the important parts for video frame conversion.
3.3 Evaluation Indicators
Three evaluation indicators, accuracy, mean absolute error (MAE), and root mean square
error (RMSE) [15], were used to evaluate the recognition and detection performance of the models. Accuracy
is defined as
$$\mathrm{Accuracy}=\frac{TP+TN}{TP+TN+FP+FN}$$
where TP indicates the number of depressed samples correctly identified as depressed;
FP indicates the number of non-depressed samples incorrectly identified as depressed;
FN indicates the number of depressed samples incorrectly identified as non-depressed;
TN indicates the number of non-depressed samples correctly identified as non-depressed.
The MAE was calculated as follows:
$$\mathrm{MAE}=\frac{1}{n}\sum _{i=1}^{n}\left| x_{i}-y_{i}\right| $$
The RMSE was calculated as follows:
$$\mathrm{RMSE}=\sqrt{\frac{1}{n}\sum _{i=1}^{n}\left(x_{i}-y_{i}\right)^{2}}$$
In the above two equations, $x_{i}$ denotes the predicted value; $y_{i}$ denotes the true
value; n denotes the number of identifications.
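The three indicators can be computed as in the following sketch, which uses the standard definitions of accuracy, MAE, and RMSE. The confusion counts and the predicted and true values in the example are invented for illustration and are not experimental results.

```python
import math

def accuracy(tp, tn, fp, fn):
    """Share of all samples that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)

def mae(predicted, true):
    """Mean absolute error between predicted and true values."""
    return sum(abs(x - y) for x, y in zip(predicted, true)) / len(true)

def rmse(predicted, true):
    """Root mean square error between predicted and true values."""
    return math.sqrt(
        sum((x - y) ** 2 for x, y in zip(predicted, true)) / len(true)
    )

# Hypothetical counts for 30 samples and three hypothetical score pairs.
acc = accuracy(tp=14, tn=13, fp=2, fn=1)
err_mae = mae([1.0, 2.0, 3.0], [2.0, 2.0, 5.0])
err_rmse = rmse([1.0, 2.0, 3.0], [2.0, 2.0, 5.0])
```

RMSE squares each error before averaging, so it penalizes large individual errors more heavily than MAE and is never smaller than MAE on the same data.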
3.4 Result Analysis
AU1, AU2, AU3, AU4, AU5, AU7, AU9, AU12, AU15, AU18, AU19, and AU27 among the AUs
were selected as evaluation criteria for facial feature analysis. OpenFace was used
to extract the AU data, and the averages of the 12 AUs were calculated for each
experimental subject and then for the experimental and control groups separately, as shown
in Fig. 2. A larger AU value indicates a more intense facial movement corresponding
to the AU and a more intense expression of the corresponding emotion. Significant
differences between the two groups were noted in the averages of six facial
actions (Fig. 2), indicating significant differences in facial expressions between college students
with psychological depression and those with healthy psychological conditions. Therefore,
these six facial action features were selected as the basis for subsequent classification
models to identify college students with psychological depression.
The AU categories listed in Table 2 showed a significant difference in the average values between the experimental and
control groups. Among them, AU4 corresponds to the facial action of lowering the eyebrows,
and its key facial points are the eyebrows. AU7 corresponds to the facial
action of eyelid tightening, and its key facial points are the two eyes.
AU9 corresponds to the facial action of wrinkling the nose, and its key facial point
is the nasal tip. AU12 corresponds to stretching the corners of the mouth,
AU18 corresponds to pouting, and AU27 corresponds to opening the mouth wide; the key
facial points of these three AUs are the two mouth corners. The average value of an
AU represents the activity range of the AU (Table 2), and the standard deviation
represents the richness of the AU activity. The standard
deviation and P values were calculated; the P value of each AU category was less than
0.05, indicating a significant difference in these six AU categories between the two
groups of subjects, so they could be used
for the subsequent analysis of classification models.
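The group comparison can be illustrated by ranking the six AUs by the difference between the group averages reported in Table 2. This sketch only reproduces the mean comparison; it does not compute the P values, because the statistical test used is not stated. The function name `mean_differences` is hypothetical.

```python
# Group averages of the six selected AUs, copied from Table 2.
experimental = {"AU4": 0.5724, "AU7": 0.6812, "AU9": 0.8107,
                "AU12": 0.9233, "AU18": 0.7720, "AU27": 0.3712}
control = {"AU4": 0.3918, "AU7": 0.2431, "AU9": 0.5301,
           "AU12": 0.5617, "AU18": 0.6119, "AU27": 0.1109}

def mean_differences(group_a, group_b):
    """Per-AU difference of group averages, rounded to four decimals."""
    return {au: round(group_a[au] - group_b[au], 4) for au in group_a}

diffs = mean_differences(experimental, control)
# AUs ordered from the largest to the smallest between-group gap.
ranked = sorted(diffs, key=diffs.get, reverse=True)
```

On these values, every AU has a higher average in the experimental group, with the largest gap for AU7 (eyelid tightening) and the smallest for AU18 (pout).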
Three different classification models were used to calculate the predicted values
for each frame extracted from the video, and the average value was taken as the predicted
value for the video. The performance of the classification models was measured using
the mean absolute error (MAE) and root-mean-square error (RMSE) [15]. The MAE and RMSE values decreased in the order KNN, LR, and SVM
(Table 3). As these are error values, smaller values indicate better recognition and
detection performance of the models. Therefore, the performance of the three classification
models was ranked from low to high as KNN < LR < SVM.
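The frame-to-video aggregation mentioned above amounts to a simple average, sketched here. The per-frame values are invented, and the function name `video_prediction` is hypothetical.

```python
def video_prediction(frame_predictions):
    """Average the per-frame predicted values into one value per video."""
    if not frame_predictions:
        raise ValueError("at least one frame prediction is required")
    return sum(frame_predictions) / len(frame_predictions)

# Hypothetical per-frame outputs of one classifier for a single video.
video_score = video_prediction([10.0, 12.0, 14.0])
```

Averaging over frames smooths out momentary expressions, so a single atypical frame has limited influence on the video-level value.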
After determining the facial expression features used for model classification, the
three classification models were used to identify and classify the subjects in the
experimental group, control group, and a combination of the two groups, as listed
in Table 4. The recognition and classification accuracy of the KNN, LR, and SVM models was
above 80%, above 90%, and approximately 95%, respectively, giving the ranking KNN < LR < SVM. Therefore, using
these three classification models to identify and classify college students with psychological
depression based on facial expression feature analysis was effective.
Fig. 2. Comparison of the average values of AUs between the two groups.
Table 2. Analysis of AU category values of the two groups.
AU category | AU4 | AU7 | AU9 | AU12 | AU18 | AU27
Facial movement | Eyebrow down | Eyelid tightening | Wrinkled nose | Stretching the corners of the mouth | Pout | Open the mouth wide
Experimental group, average value | 0.5724 | 0.6812 | 0.8107 | 0.9233 | 0.7720 | 0.3712
Experimental group, standard deviation | 0.3157 | 0.4218 | 0.5034 | 0.5127 | 0.4291 | 0.2167
Control group, average value | 0.3918 | 0.2431 | 0.5301 | 0.5617 | 0.6119 | 0.1109
Control group, standard deviation | 0.1573 | 0.0954 | 0.3451 | 0.3683 | 0.4269 | 0.0726
P value | 0.037 | 0.024 | 0.022 | 0.018 | 0.043 | 0.029
Table 3. Identification results of different classification models.
Group | MAE (KNN) | MAE (LR) | MAE (SVM) | RMSE (KNN) | RMSE (LR) | RMSE (SVM)
Experimental group | 7.03 | 6.89 | 6.17 | 9.21 | 8.91 | 7.83
Control group | 7.24 | 6.77 | 6.29 | 9.53 | 8.66 | 7.69
Two groups combined | 7.14 | 6.83 | 6.23 | 9.37 | 8.79 | 7.76
Table 4. Experimental results of different classification models for identifying and classifying psychological depression in college students.
Group | KNN accuracy (%) | LR accuracy (%) | SVM accuracy (%)
Experimental group | 82.67 | 90.43 | 94.60
Control group | 84.55 | 93.50 | 95.00
Two groups combined | 83.61 | 91.97 | 94.8
4. Discussion
Teenagers are at high risk of psychological depression, and many of them are in
college, where they face increasing difficulties. Being young and lacking
social experience, they often do not know how to solve problems and tend to belittle
themselves and feel worthless, which can lead to psychological depression. In this study, the facial
expression features were analyzed using camera recordings of the
experimental subjects’ responses to three kinds of material, i.e., audio-visual
appreciation, video watching, and audio playback. The feasibility of identifying
and classifying college students with psychological depression through facial expression
feature analysis was then assessed using three classification models, namely KNN, LR, and
SVM. The results showed large differences in the average values of
six facial actions, i.e., AU4, AU7, AU9, AU12, AU18, and AU27; their P values
indicated that the differences were significant, so these features
were used in the subsequent classification models. After the models were used for
identification and classification, the accuracy of the three classification models
was ranked from low to high as KNN < LR < SVM. Based on the research findings, the
following recommendations are proposed for colleges regarding college students with
psychological depression.
(1) Schools can establish social platforms, similar to online forums, for college
students with psychological depression to express and talk about their emotions [16]. At the same time, schools can use this social platform to monitor the psychological
status of these students, and enable counselors, class teachers, and other teachers
to understand the psychological status of the students better and provide timely help.
(2) Schools can regularly organize cultural and recreational activities. The after-school
cultural and recreational activities of college students are for entertainment and
have a positive effect on regulating the emotions and relaxing the mind and body.
During the college stage, the cultural and recreational activities are an indispensable
part of student life. Regular cultural and recreational activities can help cultivate
good psychological qualities in college students, maintain better mental health, and
prevent the onset of psychological depression.
(3) Schools can help promote mental health knowledge. Each university has its own psychological
therapy room, but mental health knowledge is not yet widely disseminated. With the
increasingly fierce competition in social life, college students face increasing pressure
in academic, emotional, employment, and future planning aspects, leading to mental
health problems. Colleges can establish mental health courses, organize related knowledge
lectures and other activities to promote mental health knowledge and provide college
students with professional support when needed.
5. Conclusion
This paper introduced psychological depression, feature analysis, and classifiers.
By analyzing the facial expression features of experimental subjects, the differences
in facial expression between individuals with psychological depression and those with
healthy mental states were explored. The KNN, LR, and SVM algorithms were used to
construct psychological depression recognition and classification models based on
facial feature analysis. The feasibility of identifying and classifying individuals
with psychological depression through facial expression feature analysis was validated
using these algorithms. The experimental results revealed significant differences
in the average values of six facial expression features (AU4, AU7, AU9, AU12, AU18,
and AU27); the P values were less than 0.05, so they could be used for subsequent
model analysis. The accuracies of the KNN, LR, and SVM classification models were
approximately 80%, 90%, and 95%, respectively. These results indicate that a classification
model constructed based on facial expression features can be used to identify and
classify college students with psychological depression.
The limitation of this paper was that only three classification models (KNN,
LR, and SVM) were used for facial expression recognition, and these three classifiers
required features to be extracted manually from the facial expression images
before recognition. Manually extracted features make it difficult to describe
the regularities of expressions in depth. Therefore, future research will increase the number
of samples and study neural network algorithms that can automatically extract
image features.
REFERENCES
[1] H. Yildirim-Celik, S. Eroglu, K. Oguz, G. Karakoc-Tugru, Y. Erdogan, D. Isman-Haznedaroglu, C. Eker, A. S. Gonul, ``Emotional context effect on recognition of varying facial emotion expression intensities in depression,'' Journal of Affective Disorders, Vol. 308, No. 308, pp. 141-146, July 2022.
[2] B. B. Velichkovsky, F. R. Sultanova, D. V. Tatarinov, ``Explicit and Implicit Processing of Facial Expressions in Depression,'' Experimental Psychology, Vol. 14, No. 2, pp. 24-36, Jan. 2021.
[3] E. Porter-Vignola, L. Booij, G. Bossé-Chartier, P. Garel, C. M. Herba, ``Emotional Facial Expression Recognition and Depression in Adolescent Girls: Associations with Clinical Features,'' Psychiatry Research, Vol. 298, No. 8, pp. 1-8, Jan. 2021.
[4] X. Zhou, Z. Wei, M. Xu, S. Qu, G. Guo, ``Facial Depression Recognition by Deep Joint Label Distribution and Metric Learning,'' IEEE Transactions on Affective Computing, Vol. 13, No. 3, pp. 1605-1618, Sep. 2020.
[5] L. Chen, X. Ma, N. Zhu, H. Xue, H. Zeng, H. Chen, X. Wang, X. Ma, ``Facial Expression Recognition With Machine Learning and Assessment of Distress in Patients With Cancer,'' Oncology Nursing Forum, Vol. 48, No. 1, pp. 81-93, Jan. 2021.
[6] W. Guo, H. Yang, Z. Liu, Y. Xu, B. Hu, ``Deep Neural Networks for Depression Recognition Based on 2D and 3D Facial Expressions Under Emotional Stimulus Tasks,'' Frontiers in Neuroscience, Vol. 15, pp. 1-19, Apr. 2021.
[7] M. Yang, Y. Ma, Z. Liu, H. Cai, X. Hu, B. Hu, ``Undisturbed Mental State Assessment in the 5G Era: A Case Study of Depression Detection Based on Facial Expressions,'' IEEE Wireless Communications, Vol. 28, No. 3, pp. 46-53, June 2021.
[8] J. Singh, G. Goyal, ``Decoding depressive disorder using computer vision,'' Multimedia Tools and Applications, Vol. 80, No. 6, pp. 8189-8212, Mar. 2021.
[9] S. Namba, W. Sato, M. Osumi, K. Shimokawa, ``Assessing Automated Facial Action Unit Detection Systems for Analyzing Cross-Domain Facial Expression Databases,'' Sensors, Vol. 21, No. 12, pp. 1-18, June 2021.
[10] B. S. Hantono, L. E. Nugroho, P. I. Santosa, ``Mental Stress Detection via Heart Rate Variability using Machine Learning,'' International Journal on Electrical Engineering and Informatics, Vol. 12, No. 3, pp. 431-444, Sep. 2020.
[11] B. Nicholson, S. Morse, T. Lundgren, N. Vadiei, S. Bhattacharjee, ``Effect of depression on health behavior among myocardial infarction survivors in the United States,'' The Mental Health Clinician, Vol. 10, No. 4, pp. 222-231, July 2020.
[12] K. Srinivasan, N. Mahendran, D. R. Vincent, C. Y. Chang, S. Syed-Abdul, ``Realizing an Integrated Multistage Support Vector Machine Model for Augmented Recognition of Unipolar Depression,'' Electronics, Vol. 9, No. 4, pp. 1-16, Apr. 2020.
[13] P. Bech, P. Allerup, L. F. Gram, N. Reisby, R. Rosenberg, O. Jacobsen, A. Nagy, ``The Hamilton Depression Scale,'' Acta Psychiatrica Scandinavica, Vol. 63, No. 3, pp. 290-299, 1981.
[14] Y. Zhang, Y. Zhang, J. Chen, W. Chen, Z. Chen, J. Wang, ``Magnetoencephalogram analysis of depression based on multivariable sign transfer entropy,'' Journal of Physics: Conference Series, Vol. 1592, No. 1, pp. 1-8, Aug. 2020.
[15] L. Yang, D. Jiang, H. Sahli, ``Feature Augmenting Networks for Improving Depression Severity Estimation from Speech Signals,'' IEEE Access, Vol. 8, pp. 24033-24045, Jan. 2020.
[16] Y. Ding, X. Chen, Q. Fu, S. Zhong, ``A Depression Recognition Method for College Students Using Deep Integrated Support Vector Algorithm,'' IEEE Access, Vol. 8, pp. 75616-75629, Apr. 2020.
Author
Yachai Sun was born in Hebei, China, in 1987. From 2013 to 2016, she studied at
the China University of Mining and Technology and received her master’s degree in
2016. Currently, she works at the Yanching Institute of Technology. She has published
seven papers. Her main research fields are mental health education and ideological
and political education.
Haixia Zhang was born in Langfang, Hebei, China, in 1982. She received her master’s
degree from the Beijing University of Chemical Technology, China. She currently works
at the Intelligent Engineering Institute, Yanching Institute of Technology. She has
published five papers. Her main research fields are mental health education and ideological
and political education.
Jiyu Men was born in Hebei, China, in 1988. He received his master’s degree from
Sehan University. Currently, he works in the school of Yanching Institute of Technology.
He has published five papers. His research interests include the psychology of aesthetic
education.