
  1. (Department of Networking and Communications, Faculty of Engineering and Technology, SRM Institute of Science and Technology, Kattankulathur, Tamil Nadu 603203, India {saisantd, suprajap}@srmist.edu.in)



Keywords: Emotion recognition, facial expressions, electroencephalogram, collaborative multimodal emotion recognition, multi-resolution binarized image feature extraction, dwarf mongoose optimization algorithm, improved federated learning generative adversarial network

1. Introduction

Emotion is a process that involves expressions, and it operates in both conscious and unconscious circumstances in humans [1,2]. People communicate with each other through expressions, which convey emotions such as sadness, joy, rage, and fear. In human–computer interaction, considerable research has been devoted to emotion recognition (ER) [3], because the human–computer interaction system is dynamic, complicated, and emotionally connected. Emotions can be categorized at the brain level using electroencephalogram (EEG) signals [4-7]; therefore, EEG signals are a vital part of this research [8-10]. Today's human–computer interaction and multimedia technologies are highly advanced, making automatic emotion recognition feasible [15]. Emotion recognition also allows emotions to be adjusted based on third-party suggestions. Emotional AI is used in many domains, such as health care, entertainment, and education [16], and artificial intelligence research in robotics has grown accordingly. Many multinational companies, such as Microsoft, Google, and Samsung, have invested heavily in emotion recognition systems [17,18]. Nevertheless, considerable time and domain knowledge are needed to implement this technique. Thus, ER identifies the user's emotions and responds with the help of multimedia content [19,20].

The problem addressed by collaborative multimodal emotion recognition is to develop a system that accurately recognizes emotions from multiple modalities, such as facial expressions and EEG signals. This is challenging because each modality has limitations [11-20]. For example, facial expressions can be subtle and difficult to interpret, and EEG signals can be noisy, making it difficult to extract features. The motivation for collaborative multimodal emotion recognition through an improved federated learning generative adversarial network (GAN) for facial expressions and EEG signals is to address the limitations of each modality by combining the strengths of several. GANs are a class of machine learning models that can generate realistic data; here, the GAN can generate synthetic facial expressions or EEG signals similar to real data, which improves the performance of the emotion recognition system by providing more training data and helping to regularize training. Federated learning is a paradigm that allows multiple devices to train a machine-learning model collaboratively without sharing their data. The combination of collaborative multimodal emotion recognition, GANs, and federated learning therefore has the potential to yield more accurate emotion recognition systems: the model improves ER accuracy by combining the strengths of multiple modalities.
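For intuition only, the sketch below illustrates the federated-averaging idea referenced above: each client trains on its private data, and only model parameters, never raw EEG or facial data, are shared for aggregation. It is a minimal Python illustration under assumed settings; the names (`local_update`, `federated_average`) and the least-squares objective are placeholders, not part of the proposed MER-IFLGAN method.

```python
# Minimal federated-averaging sketch (illustrative; not the paper's IFLGAN).
# Each client updates a local copy of the model; only parameters are shared.
import numpy as np

def local_update(weights, private_data, lr=0.01, epochs=1):
    """Hypothetical local step: gradient descent on a least-squares objective."""
    X, y = private_data
    w = weights.copy()
    for _ in range(epochs):
        grad = X.T @ (X @ w - y) / len(y)      # gradient of the mean squared error
        w -= lr * grad
    return w

def federated_average(client_weights, client_sizes):
    """Aggregate client parameters weighted by local dataset sizes."""
    total = sum(client_sizes)
    return sum(w * (n / total) for w, n in zip(client_weights, client_sizes))

# Toy run: three clients whose private (X, y) data never leave the device.
rng = np.random.default_rng(0)
clients = [(rng.normal(size=(50, 4)), rng.normal(size=50)) for _ in range(3)]
global_w = np.zeros(4)
for _ in range(5):                             # five communication rounds
    local_ws = [local_update(global_w, data) for data in clients]
    global_w = federated_average(local_ws, [len(d[1]) for d in clients])
```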

The remainder of this manuscript is organized as follows. Recent research is reviewed in Section 2. Section 3 outlines the proposed technique, and Section 4 reports the results and discussion. Section 5 concludes the manuscript.

2. Literature Survey

Several studies have used electroencephalogram-based ER. Some of them are reviewed here.

Zhang et al. [11] presented a cross-subject and cross-modal model (CSCM) for multimodal, EEG-based emotion recognition. The input data were taken from the SJTU Emotion EEG Dataset (SEED) and SEED-IV. The presented model achieved a better area-under-the-curve value but a lower F-measure.

Zhao et al. [12] presented an attention-based hybrid deep learning method for EEG emotion recognition. The model extracted critical feature details and provided an efficient categorization. Differential entropy characteristics of the EEG data are extracted and arranged by electrode location. An encoder encodes the EEG input and extracts spatial information before a band attention mechanism applies adaptive weights to distinct bands. An LSTM network extracts the temporal features, and a time attention mechanism captures critical temporal information. The input data were taken from two datasets, the Dataset for Emotion Analysis using Physiological Signals (DEAP) and SEED. The presented model achieved better accuracy and a lower average running time.

Wang et al. [13] presented multi-modal emotion recognition using EEG and speech signals. Their Multimodal Emotion Database (MED4) comprises four modalities, simultaneously collected EEG, photoplethysmography, speech, and facial images, recorded while subjects watched video stimuli meant to elicit joyful, sad, angry, or neutral moods. The presented model achieved a lower accuracy.

Wang et al. [14] presented a multi-modal domain adaptation variational autoencoder (MMDA-VAE) for EEG-based ER. The approach learns shared cross-domain latent representations of multimodal data, and a multi-modal variational autoencoder (MVAE) projects information from multiple modalities into a single space. The input data were taken from two datasets, SEED and SEED-IV. The presented model achieved a better F-measure but a lower area under the curve.

3. Proposed Methodology

Emotion identification depends on EEG signals because they can depict variations in human brain states. In addition to EEG signals, facial expression signals are used as an external physiological characterization signal for ER [21,22]. EEG and facial expression signals are combined, linking internal neural patterns with external sub-conscious actions, and an improved federated learning generative adversarial network performs multimodal emotion recognition on the combined representation. The emotional expression ability of facial expressions and EEG signals remains stable over time [23,24], and ER accuracy is enhanced after the fusion of EEG signals and facial expressions. Fig. 1 portrays a flowchart of the MER-IFLGAN method.

Fig. 1. Flowchart of the MER-IFLGAN method.
../../Resources/ieie/IEIESPC.2024.13.1.61/fig1.png
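As a reading aid, the following hypothetical driver sketches the flow of Fig. 1; the three stage functions are trivial stand-ins for Sections 3.2-3.4 (DMO selection, MBIFE extraction, IFLGAN classification), not the actual algorithms.

```python
# High-level sketch of the MER-IFLGAN flow in Fig. 1 (stand-in stage functions only).
import numpy as np

def dmo_select_features(eeg):                 # stand-in for Section 3.2 (DMO selection)
    return eeg[np.argsort(np.abs(eeg))[-8:]]  # keep the 8 largest-magnitude values

def mbife_extract(face):                      # stand-in for Section 3.3 (MBIFE histograms)
    hist, _ = np.histogram(face, bins=8, range=(0, 255), density=True)
    return hist

def iflgan_classify(fused):                   # stand-in for Section 3.4 (IFLGAN classifier)
    labels = ["happy", "sad", "fear", "neutral"]
    return labels[int(abs(fused.sum()) * 100) % 4]

def mer_iflgan_pipeline(eeg, face):
    fused = np.concatenate([dmo_select_features(eeg), mbife_extract(face)])
    return iflgan_classify(fused)

print(mer_iflgan_pipeline(np.random.randn(256), np.random.randint(0, 256, (48, 48))))
```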

3.1 Data Acquisition

In this study, a video library (VL) was first created for video emotion-evoked EEG experiments. The VL contains 90 video clips, compiled from various films and TV shows and converted to WMV format. These 90 clips span three emotional categories, pornographic, neutral, and violent, with 30 clips in each category; the pornographic and violent clips come from two kinds of films, action and drama. Six examiners (three men and three women), each assessing only one emotion type, screened the clips before they were included in the VL; an examiner selected a clip only if they felt it matched the intended emotional type. The video emotion-evoked EEG experiment involved 13 healthy participants, seven men and six women, aged 24–28 years with corrected vision of 1.0.

The 90 video clips served as stimuli to elicit EEG signals corresponding to the various video emotions. While watching the clips, which played continuously on a computer, the participants wore a 64-lead Quik-Cap electrode cap so that EEG signals representing the different video emotions could be recorded. The electrodes of the cap were arranged according to the 10–20 system. The EEG signals generated during the experiments were collected and prepared using the Neuroscan system, and the experiment was designed in E-Prime software from PST. Initially, the computer screen in front of each subject displayed the instructions. After carefully reading the experimental design and overall content, the subject pressed the space bar to begin. For each subject, 30 clips were selected randomly from the 90-clip library, 10 for each emotion type, and these 30 clips were played back in random order to prevent subjects from forming inertial memories. A cross-shaped prompt was shown on the screen before each clip to attract the subject's attention [25], and a short rest period followed each clip, during which the subject remained silent. The experiment ended once all 30 selected clips had been played. EEG signals were recorded at a 1000-Hz sampling rate throughout the experiment, and the procedure was repeated for every subject until EEG signals from all 13 subjects had been obtained. Four emotions (happy, sad, fear, and neutral) were considered to examine how differently EEG and facial expression signals identify various emotional states.
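To make the selection protocol concrete, the short sketch below shows one way a per-subject playlist could be drawn: 10 clips from each of the three categories in the 90-clip library, shuffled so the playback order is random. The file names and the seed are illustrative assumptions, not the actual stimulus set.

```python
# Illustrative per-subject stimulus selection: 10 clips per category, random order.
import random

# Hypothetical file names for the 90-clip library (30 per category).
library = {cat: [f"{cat}_{i:02d}.wmv" for i in range(1, 31)]
           for cat in ("pornographic", "neutral", "violent")}

def build_playlist(library, per_category=10, seed=None):
    rng = random.Random(seed)
    playlist = [clip for clips in library.values()
                for clip in rng.sample(clips, per_category)]
    rng.shuffle(playlist)            # random order to avoid inertial memories
    return playlist

print(build_playlist(library, seed=1)[:5])
```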

3.2 EEG Signal Feature Extraction Utilizing Dwarf Mongoose Optimization Algorithm

The EEG signal is a weak, high-dimensional physiological signal that is non-stationary and non-linear. The benefits of feature selection with the dwarf mongoose optimization (DMO) algorithm include its simplicity and the absence of prerequisite knowledge. As a result, DMO can pick features from complex, high-dimensional EEG signals and yields objectively more accurate classifications. First, dominant and non-dominant features are identified in the input EEG feature vector. DMO replicates the compensatory behavioral response of the dwarf mongoose. After the population is initialized in the alpha group, the efficacy of each solution is calculated, and the alpha female is selected using Eq. (1),

(1)
$ \delta =\frac{fitness_{i}}{{\sum }_{i=1}^{n}fitness_{i}} $

where $fitness_{i}$ represents the dominant features of the EEG signal. The solution-updating mechanism is based on Eq. (2),

(2)
$ \mathrm{M }_{i+1}=\mathrm{M }_{i}+P\ast P_{eep} $

where $P_{eep}$ represents the dominant female's vocalization, which keeps the family on track, and $P$ represents a random number. The sleeping mound is assessed by Eq. (3),

(3)
$ SM_{i}=\frac{fitness_{i+1}-fitness_{i}}{\max \left\{\left| fitness_{i+1},fitness_{i}\right| \right\}} $

The average value of the sleeping mound is determined using Eq. (4),

(4)
$ \varphi =\frac{{\sum }_{i=1}^{n}SM_{i}}{n} $

Once the babysitting exchange criteria are satisfied, the approach progresses to the scouting stage, where the next food source or sleeping mound is considered. If the family forages far away during the scouting phase, it will find a better sleeping mound [26]. The scout mongoose is expressed as Eq. (5),

(5)
$\begin{align} \mathrm{M }_{i+1}=\begin{cases} \mathrm{M }_{i}-Cf\ast P\ast rand\ast \left[\mathrm{M }_{i}-\vec{X}\right]; & if\varphi _{i+1}> \varphi _{i}\\ \mathrm{M }_{i}+Cf\ast P\ast rand\ast \left[\mathrm{M }_{i}-\vec{X}\right]; & else \end{cases} \end{align} $

where $rand$ represents a random number in the range $\left(0,1\right)$; $Cf$ represents the collective-volatile movement control parameter; and $\vec{X}$ represents the movement vector, which is determined by Eq. (6),

(6)
$ \vec{X}={\sum }_{i=1}^{n}\frac{M_{i}\ast SM_{i}}{M_{i}} $

This equation removes the non-dominant features from the feature vector of the input EEG signal. The dominant features are then recombined with the feature vectors of the input EEG signal to produce new feature vectors.
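A compact sketch of how the DMO operators in Eqs. (1)-(6) could drive EEG feature selection is shown below. The fitness function (variance of the currently selected features), population size, iteration count, and the simplified scout rule are illustrative assumptions, so this is a sketch of the update equations rather than the authors' exact implementation.

```python
# Sketch of DMO-based feature selection following Eqs. (1)-(6) (illustrative settings).
import numpy as np

rng = np.random.default_rng(42)
eeg_features = rng.normal(size=128)            # stand-in EEG feature vector
n_pop, n_iter, cf_max = 10, 30, 2.0

def fitness(position):
    mask = position > 0.5                      # entries > 0.5 mark dominant features
    return eeg_features[mask].var() if mask.any() else 0.0

M = rng.uniform(0, 1, size=(n_pop, eeg_features.size))        # mongoose positions
fit = np.array([fitness(m) for m in M])

for t in range(n_iter):
    delta = fit / fit.sum()                                    # Eq. (1): alpha selection probability
    peep = M[rng.choice(n_pop, p=delta)]                       # alpha female's vocalization P_eep
    P = rng.uniform(-1, 1, size=M.shape)
    M_new = np.clip(M + P * peep, 0, 1)                        # Eq. (2): candidate update
    fit_new = np.array([fitness(m) for m in M_new])
    SM = (fit_new - fit) / np.maximum(np.abs(np.maximum(fit_new, fit)), 1e-12)   # Eq. (3)
    phi = SM.mean()                                            # Eq. (4): average sleeping mound
    Cf = cf_max * (1 - t / n_iter)                             # collective-volatile control parameter
    X = (M * SM[:, None]).sum(axis=0) / np.maximum(M.sum(axis=0), 1e-12)         # Eq. (6)-style movement vector
    step = Cf * P * rng.uniform(0, 1) * (M - X)                # scout displacement, Eq. (5)
    M = np.clip(np.where(fit_new[:, None] > fit[:, None],      # keep improvements, else scout move
                         M_new, M - np.sign(phi) * step), 0, 1)
    fit = np.array([fitness(m) for m in M])

dominant = np.where(M[fit.argmax()] > 0.5)[0]                  # indices of selected (dominant) features
```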

3.3 Facial Expression Feature Extraction using Multi-resolution Binarized Image-feature Extraction

Facial expressions are an essential part of expressing emotion: human faces can communicate a wide range of emotional states, and MBIFE is used to identify them. In this stage, the selected features are delivered to the MBIFE module for feature extraction; central and linearly symmetric arrangements are also used. The image histogram is expressed as Eq. (7),

(7)
$ \mathrm{H }_{{f_{1}}}\left(S,R\right)=\left[{h}_{S,R}^{0},{h}_{S,R}^{1},{h}_{S,R}^{2},.....,{h}_{S,R}^{p-1}\right]^{\mathrm{T }} $

where $S$ is the window size, $R$ is the pixel code-word resolution, and $P$ is the pixel intensity. The stacked normalized elements are expressed as Eq. (8),

(8)
$ {h}_{S,R}^{p}=\frac{1}{p}{\sum }_{j=1}^{p}\delta _{p}\left(j\right) $

$\delta _{p}(j)$ is expressed as Eq. (9),

(9)
$\begin{align} \delta _{p}\left(j\right)=\begin{cases} 1\;\;\;\;\;\;\;\;\;\;\;\;if\,V_{p}=p\\ 0\;\;\;\;\;\;\;\;\;\;\;\;otherwise \end{cases} \end{align} $

The final image representation is then built by concatenating the histograms obtained from the application of every filter of the multi-resolution bank, as expressed in Eq. (10),

(10)
$ \mathrm{H }_{m}=\left[\mathrm{H }_{{f_{1}}},\mathrm{H }_{{f_{2}}},........,\mathrm{H }_{{f_{n}}}\right]^{\mathrm{T }} $

where $\mathrm{H }_{{f_{1}}}$ represents the calculated and concatenated histograms of the responses acquired with the applied filter bank. The resulting histograms are collected column-wise in a single representative matrix, and all examined images of all classes are handled in the same manner, as expressed in Eq. (11),

(11)
$\begin{align} \mathrm{H }=\begin{bmatrix} \mathrm{H }_{{f_{1}}1} & \cdots & \mathrm{H }_{{f_{1}}\mathrm{M }}\\ \vdots & \ddots & \vdots\\ \mathrm{H }_{{f_{n}}1} & \cdots & \mathrm{H }_{{f_{n}}\mathrm{M }} \end{bmatrix} \end{align} $

where $\mathrm{M }$ is the number of processed images [27]. The remaining data-reduction and categorization steps take Eq. (11) as their starting point. The covariance matrix of $\mathrm{H}$ is calculated as expressed in Eq. (12),

(12)
$ C_{m}=\varphi .\varphi ^{\mathrm{T }} $

Using the class difference principle, the between-class scatter matrix is expressed as Eq. (13)

(13)
$ B_{S}={\sum }_{i=1}^{C}\left(\varphi _{{c_{i}}}-\varphi \right)\left(\varphi _{{c_{i}}}-\varphi \right)^{\mathrm{T }} $

Eq. (14) expresses the within-class scatter matrix,

(14)
$ W_{S}={\sum }_{i=1}^{C}{\sum }_{\mathrm{K }\in c_{i}}^{Q_{i}}\left(y_{\mathrm{K }}-\varphi _{{c_{i}}}\right)\left(y_{\mathrm{K }}-\varphi _{{c_{i}}}\right)^{\mathrm{T }} $

The ratio between the projections of $B_{S}$ and $W_{S}$ is calculated using the Fisher criterion, as expressed in Eq. (15),

(15)
$ W_{pm}=\frac{W^{\mathrm{T }}{W}_{pca}^{\mathrm{T }}B_{S}W_{pca}W}{W^{\mathrm{T }}{W}_{pca}^{\mathrm{T }}W_{S}W_{pca}W} $

Certain features are extracted from the input signals without transformation using Eq. (15); this incurs less computational cost and is simple to perform. In this way, many effective features are extracted using multi-resolution binarized image feature extraction (MBIFE), and the extracted features are then fed to the emotion recognition stage.
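A small sketch of the multi-resolution binarized histogram idea in Eqs. (7)-(10) is given below. The binarization rule (comparing a few pixels of each window against the window mean) and the window sizes are assumptions made for illustration; the paper's exact filter bank is not reproduced here.

```python
# Sketch of multi-resolution binarized histogram features (cf. Eqs. (7)-(10)).
import numpy as np

def binarized_histogram(img, S, R):
    """Histogram H_f(S, R): code-words from S x S windows at R-bit resolution (Eq. (7))."""
    h = np.zeros(2 ** R)
    rows, cols = img.shape
    for r in range(0, rows - S + 1, S):
        for c in range(0, cols - S + 1, S):
            win = img[r:r + S, c:c + S]
            bits = (win.ravel()[:R] > win.mean()).astype(int)   # binarize R pixels vs. window mean
            code = int("".join(map(str, bits)), 2)              # pixel code-word V_p
            h[code] += 1                                        # accumulate delta_p(j), Eqs. (8)-(9)
    return h / max(h.sum(), 1)                                  # normalized histogram

def mbife(img, scales=((4, 4), (8, 4), (16, 4))):
    """Concatenate histograms over the multi-resolution bank (Eq. (10))."""
    return np.concatenate([binarized_histogram(img, S, R) for S, R in scales])

face = np.random.default_rng(0).integers(0, 256, size=(64, 64)).astype(float)
H_m = mbife(face)        # final image representation H_m
```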

3.4 Multimodal Emotion Recognition based Upon Improved Federated Learning Generative Adversarial Network

The classification procedure is essential for recognizing emotions from facial expressions and EEG signals, and the classification algorithm should be chosen to yield a clear and precise model that forecasts emotions in real time; this determines the efficacy and precision of multimodal emotion recognition. An improved federated learning generative adversarial network with flexible activation functions (IFLGAN) is therefore proposed. The IFLGAN classifier identifies emotions such as sadness, fear, happiness, and neutrality. IFLGAN can adapt to the individual characteristics of emotions and handle the complications of facial expressions by building multiple tree levels over the large training dataset using two scans, giving roughly three times better performance than existing methods. IFLGAN uses few run-time resources, and no storage space is needed to save temporary data. At the categorization unit, emotions are divided into four categories: fear, sadness, happiness, and neutral. The generator $G$ and discriminator $D$ are expressed in Eq. (16),

(16)
$ \begin{array}{l} \min _{G}\max _{D}\,\,\,\,V\left(G,D\right)=\Phi _{1}\left(\mathrm{E }_{X\sim {\Pr _{1}}\left(X\right)}\log D_{1}\left(X\right)+\mathrm{E }_{Z\sim {P_{Z}}\left(Z\right)}\left[\log \left(1-D_{1}\left(G_{1}(Z)\right)\right)\right]\right)\\ +\Phi _{2}\left(\mathrm{E }_{X\sim {\Pr _{2}}\left(X\right)}\log D_{2}\left(X\right)+\mathrm{E }_{Z\sim {P_{Z}}\left(Z\right)}\left[\log \left(1-D_{2}\left(G_{2}(Z)\right)\right)\right]\right)+\cdots \\ +\Phi _{\mathrm{K }}\left(\mathrm{E }_{X\sim {\Pr _{\mathrm{K }}}\left(X\right)}\log D_{\mathrm{K }}\left(X\right)+\mathrm{E }_{Z\sim {P_{Z}}\left(Z\right)}\left[\log \left(1-D_{\mathrm{K }}\left(G_{\mathrm{K }}(Z)\right)\right)\right]\right) \end{array} $

where $X$ represents the training data with mini-batch size, and $G_{1}(Z)$ represents the generated data with mini-batch size [28]. The $soft\max$ operation normalizes the weights so that they satisfy Eq. (17),

(17)
$ {\sum }_{i=1}^{\mathrm{K }}\Phi _{i}=1 $

The maximum mean discrepancy (MMD) score is computed from the predictions and is expressed in Eq. (18):

(18)
$ \mathrm{MMD}_{\mathrm{i}}=\underset{\left|\left|f\right|\right|\leq 1}{\sup }\left|\left|\mathrm{E }\left(f\left(x\right)\right)-\mathrm{E }\left(f(G(Z))\right)\right|\right| $

where $G_{i}$ represents the $i^{th}$ generator and $D_{i}$ represents the $i^{th}$ discriminator. The $soft\max$ operation is then applied as expressed in Eq. (19),

(19)
$ soft\max .\Phi _{i}=\frac{e^{{\mathrm{MMD}_{\mathrm{i}}}}}{{\sum }_{j=1}^{\mathrm{K }}e^{{\mathrm{MMD}_{\mathrm{j}}}}} $

where $\mathrm{MMD}_{\mathrm{i}}$ represents the MMD Score. For each generator, the optimal discriminator is expressed as Eq. (20),

(20)
$ {D}_{i}^{\ast }\left(x\right)=\frac{\Pr _{i}}{\Pr _{i}+PG_{i}} $

If $\Pr _{i}=PG_{i}$, then ${D}_{i}^{\ast }\left(x\right)=\frac{1}{2}$. The global minimum of the virtual training criterion is expressed as Eq. (21),

(21)
$ V\left(G_{i}\right)=-2\log 2+\mathrm{KL}\left(\Pr _{i}\,\middle\|\, \frac{\Pr _{i}+PG_{i}}{2}\right)+\mathrm{KL}\left(PG_{i}\,\middle\|\, \frac{\Pr _{i}+PG_{i}}{2}\right) $

The formula of the global generator is expressed as Eq. (22),

(22)
$ G_{g}\left(X;\vartheta _{{G_{g}}}\right)={\sum }_{i=1}^{n}\Phi _{i}G_{i}\left(X;\vartheta _{{G_{i}}}\right) $

where $\vartheta _{{G_{i}}}$ is the $i^{th}$ generator's parameters and $\vartheta _{{G_{g}}}$ denotes the parameters of the global generator. The parameters of each generator are replaced in Eq. (22) using Eq. (23),

(23)
$ G_{i}\left(X;\vartheta _{{G_{i}}}\right)=G_{g}\left(X;\vartheta _{{G_{g}}}\right) $

where $G_{{g_{a}}}$ and $G_{{g_{MMD}}}$ are expressed in Eq. (24),

(24)
$\begin{align} \begin{cases} G_{{g_{a}}}=\frac{1}{2}G_{1}\left(Z;\vartheta _{{G_{1}}}\right)+\frac{1}{2}G_{2}\left(Z;\vartheta _{{G_{2}}}\right)\\ G_{{g_{MMD}}}=\Phi _{1}\times G_{1}\left(Z;\vartheta _{{G_{1}}}\right)+\Phi _{2}\times G_{2}\left(Z;\vartheta _{{G_{2}}}\right) \end{cases} \end{align} $

Hence, the IFLGAN categorizes the emotions as sad, fear, happy, and neutral.
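The MMD-weighted aggregation step (Eqs. (18), (19), and (22)) can be illustrated with the sketch below, in which each client generator is reduced to a linear map and MMD is estimated with an RBF kernel; both are simplifying assumptions made only to show how the weights $\Phi_i$ combine the local generators into a global one.

```python
# Sketch of MMD-weighted generator aggregation (cf. Eqs. (18), (19), (22)).
import numpy as np

def rbf_mmd(x, y, sigma=1.0):
    """Empirical (biased) MMD estimate between sample sets x and y (Eq. (18))."""
    def k(a, b):
        d = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
        return np.exp(-d / (2 * sigma ** 2))
    return k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean()

rng = np.random.default_rng(0)
real = rng.normal(1.0, 0.5, size=(200, 2))                  # stand-in real feature samples
thetas = [rng.normal(size=(2, 2)) for _ in range(3)]        # K = 3 local generator parameters
Z = rng.normal(size=(200, 2))                               # shared noise batch

mmd = np.array([rbf_mmd(real, Z @ th) for th in thetas])    # MMD_i per generator
phi = np.exp(mmd) / np.exp(mmd).sum()                       # Eq. (19): softmax weights Phi_i
theta_global = sum(p * th for p, th in zip(phi, thetas))    # Eq. (22): weighted global generator
```

Aggregating at the parameter level, as done here, is one simple reading of Eq. (22); because the weights come from a softmax, they sum to one and the constraint in Eq. (17) holds by construction.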

4. Result and Discussion

This section presents the experimental outcomes of the MER-IFLGAN technique. The proposed approach was simulated in MATLAB® version 9.7.0.1190202 (R2019b) (MathWorks Inc.), and the performance metrics were examined. The results of MER-IFLGAN were compared with the existing MER-CSCM [11] and MER-LSTM [12] models.

4.1 Performance Metrics

The performance of the proposed method was evaluated.

4.1.1 F Measure

This is computed by Eq. (25),

(25)
$ F\,measure=\frac{h}{\left(h+\frac{1}{2}\left[i+j\right]\right)} $

where $h$ denotes true positives, $i$ denotes false negatives, and $j$ denotes false positives.

4.1.2 Accuracy

This was determined using Eq. (26),

(26)
$ A=\frac{h+k}{h+i+j+k} $

where $k$ denotes true negatives.
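With this notation ($h$ true positives, $i$ false negatives, $j$ false positives, $k$ true negatives), Eqs. (25) and (26) reduce to a few lines; the counts below are arbitrary example values.

```python
# F-measure (Eq. (25)) and accuracy (Eq. (26)) from confusion-matrix counts.
def f_measure(h, i, j):
    """h = true positives, i = false negatives, j = false positives."""
    return h / (h + 0.5 * (i + j))

def accuracy(h, i, j, k):
    """k = true negatives."""
    return (h + k) / (h + i + j + k)

print(f_measure(90, 5, 5), accuracy(90, 5, 5, 100))   # 0.947..., 0.95
```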

4.2 Performance Analysis

Tables 1-3 list the efficiency of the MER-IFLGAN technique. In these tables, the performance metrics were evaluated. The efficiency was compared to the existing MER-CSCM and MER-LSTM approaches.

Table 1 presents accuracy evaluation. The MER-IFLGAN achieves 31.21% and 34.06% greater accuracy for happy; 26.01% and 27.79% greater accuracy for sad; 45.34% and 22.78% greater accuracy for fear; 46.28% and 34.11% greater accuracy for Neutral compared to the MER-CSCM and MER-LSTM models, respectively.

Table 2 lists the results of the F-measure analysis. The MER-IFLGAN achieved the following compared to the existing MER-CSCM and MER-LSTM models, respectively: 35.67% and 33.54% better F-measure for happy; 36.73% and 34.71% better F-measure for sad; 43.14% and 46.27% higher F-measure for fear; 45.26% and 45.87% higher F-measure for Neutral.

Table 3 presents the Average Running Time (ART) analysis. MER-IFLGAN achieved the following compared to the existing MER-CSCM and MER-LSTM models, respectively: 35.45% and 33.32% shorter ART for happy; 38.77% and 25.89% shorter ART for sad; 35.67% and 45.14% shorter ART for fear; 42.15% and 43.26% shorter ART for Neutral.

Fig. 2 presents the ROC analysis. The MER-IFLGAN achieved a 2.92% and 4.15% higher AUC value than the existing MER-CSCM and MER-LSTM methods, respectively.

Fig. 2. Analysis of ROC.
../../Resources/ieie/IEIESPC.2024.13.1.61/fig2.png
Table 1. Accuracy evaluation.

Methods                    Accuracy (%)
                           Happy     Sad       Fear      Neutral
MER-CSCM                   76.48     86.61     91.78     84.34
MER-LSTM                   83.51     75.31     81.73     81.79
MER-IFLGAN (proposed)      98.55     98.11     98.67     97.77

Table 2. F1 score estimation.

Methods                    F-measure (%)
                           Happy     Sad       Fear      Neutral
MER-CSCM                   75.76     91.67     84.64     85.33
MER-LSTM                   83.45     79.35     78.78     81.71
MER-IFLGAN (proposed)      98.89     98.91     99.78     98.90

Table 3. Average Running Time Analysis.

Methods                    ART (s)
                           Happy     Sad       Fear      Neutral
MER-CSCM                   5.55      6.66      7.56      7.45
MER-LSTM                   8.90      6.66      7.45      9.70
MER-IFLGAN (proposed)      2.36      3.96      4.17      5.61

5. Conclusion

An improved federated learning generative adversarial network-based multimodal emotion recognition method using EEG and facial expressions was implemented and simulated in MATLAB. The MER-IFLGAN method achieved 11.14% and 8.36% higher F-measure than the existing MER-CSCM and MER-LSTM models, respectively. Most studies use two distinct emotion signals as their target objects, but the recognition rate tends to drop when people produce ambiguous emotion signals. Hence, future work will concentrate on structuring a more efficient emotion database and consolidating more emotion details to enhance the ER scheme. In addition, there is a lack of a global public database containing videos and the associated evoked EEG; a future public video-EEG database could be developed by examining ways to optimize the video type, count, and length and by amassing EEG signals from numerous subjects.

REFERENCES

1 
E.S. Salama, R.A. El-Khoribi, M.E. Shoman, M.A.W. Shalaby, ``A 3D-convolutional neural network framework with ensemble learning techniques for multi-modal emotion recognition.'' Egyptian Informatics Journal, vol. 22, no. 2, pp. 167-176. 2021.DOI
2 
Y. Wu, J. Li, ``Multi-modal emotion identification fusing facial expression and EEG.'' Multimedia Tools and Applications, vol. 82, no. 7, pp. 10901-10919. 2023.DOI
3 
L. Fang, S.P. Xing, Z. Ma, Z. Zhang, Y. Long, K.P. Lee, S.J. Wang, ``Emo-MG Framework: LSTM-based Multi-modal Emotion Detection through Electroencephalography Signals and Micro Gestures.'' International Journal of Human-Computer Interaction, vol. 1, no. 1, pp. 1-17. 2023.DOI
4 
F. H. Shajin, B. Aruna Devi, N. B. Prakash, G. R. Sreekanth, P. Rajesh, ``Sailfish optimizer with Levy flight, chaotic and opposition-based multi-level thresholding for medical image segmentation.'' Soft Computing, pp. 1-26. Apr. 2023.DOI
5 
F. H. Shajin, P. Rajesh, M. R. Raja, ``An efficient VLSI architecture for fast motion estimation exploiting zero motion prejudgment technique and a new quadrant-based search algorithm in HEVC.'' Circuits, Systems, and Signal Processing, pp. 1-24. Mar. 2022.DOI
6 
P. Rajesh, F. Shajin, ``A multi-objective hybrid algorithm for planning electrical distribution system.'' European Journal of Electrical Engineering, vol. 22, no. 4-5, pp. 224-509. Jun. 2020.DOI
7 
P. Rajesh, R. Kannan, J. Vishnupriyan, B. Rajani, ``Optimally detecting and classifying the transmission line fault in power system using hybrid technique.'' ISA transactions, vol. 130, pp. 253-264. Nov. 2022.DOI
8 
F.M. Alamgir, M.S. Alam, ``Hybrid multi-modal emotion recognition framework based on Inception V3 DenseNet.'' Multimedia Tools and Applications, vol. 1, no. 1, pp. 1-28. 2023.DOI
9 
S. Dutta, B.K.. Mishra, A. Mitra, A. Chakraborty, ``A Multi-modal Approach for Emotion Recognition Through the Quadrants of Valence-Arousal Plane.'' SN Computer Science, vol. 4, no. 5, pp. 460. 2023.DOI
10 
S. Liu, P. Gao, Y. Li, W. Fu, W. Ding, ``Multi-modal fusion network with complementarity and importance for emotion recognition.'' Information Sciences, vol. 619, no. 1, pp. 679-694. 2023.DOI
11 
J.M. Zhang, X. Yan, Z.Y. Li, L.M. Zhao, Y.Z. Liu, H.L. Li, B.L. Lu. ``A Cross-subject and Cross-modal Model for Multimodal Emotion Recognition''. InNeural Information Processing: 28th International Conference, ICONIP 2021, Sanur, Bali, Indonesia, December 8-12, Proceedings, Part VI 28 2021 (pp. 203-211). Springer International Publishing. 2021.DOI
12 
Z. Zhao, Z. Gong, M. Niu, J. Ma, H. Wang, Z. Zhang, Y. Li. ``Automatic respiratory sound classification via multi-branch temporal convolutional network'' InICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 9102-9106). IEEE. 2022.DOI
13 
Q. Wang, M. Wang, Y. Yang, X. Zhang. ``Multi-modal emotion recognition using EEG and speech signals.'' Computers in Biology and Medicine, vol. 149, p. 105907. 2022.DOI
14 
Y. Wang, S. Qiu, D. Li, C. Du, B.L. Lu, H. He. ``Multi-modal domain adaptation variational autoencoder for EEG-based emotion recognition.'' IEEE/CAA Journal of Automatica Sinica, vol. 9, no. 9, pp. 1612-1626. 2022.DOI
15 
M. Maithri, U. Raghavendra, A. Gudigar, J. Samanth, P.D. Barua, M. Murugappan, Y. Chakole, U.R. Acharya, ``Automated emotion recognition: Current trends and future perspectives.'' Computer methods and programs in biomedicine, vol. 215, no. 1, p. 106646. 2022.DOI
16 
C. Guanghui, Z. Xiaoping, ``Multi-modal emotion recognition by fusing correlation features of speech-visual.'' IEEE Signal Processing Letters, vol. 28, pp. 533-537. 2021.DOI
17 
Y. Hu, F. Wang, ``Multi-Modal Emotion Recognition Combining Face Image and EEG Signal.'' Journal of Circuits, Systems and Computers, vol. 32, no. 07, p. 2350125. 2023.DOI
18 
M. Wang, Z. Huang, Y. Li, L. Dong, H. Pan, ``Maximum weight multi-modal information fusion algorithm of electroencephalographs and face images for emotion recognition.'' Computers & Electrical Engineering, vol. 94, no. 1, p. 107319. 2021.DOI
19 
Y. Zhang, C. Cheng, Y. Zhang, ``Multimodal emotion recognition using a hierarchical fusion convolutional neural network.'' IEEE access, vol. 9, no. 1, pp. 7943-7951. 2021.DOI
20 
D. Liu, L. Chen, Z. Wang, G. Diao, ``Speech expression multimodal emotion recognition based on deep belief network.'' Journal of Grid Computing, vol. 19, no. 2, p. 22. 2021.DOI
21 
H. Zhang, ``Expression-EEG based collaborative multimodal emotion recognition using deep autoencoder.'' IEEE Access, vol. 8, no. 1 pp. 164130-164143, 2020.DOI
22 
F. Aldosari, L. Abualigah, and K.H. Almotairi, ``A normal distributed dwarf mongoose optimization algorithm for global optimization and data clustering applications.'' Symmetry, vol. 14, no. 5, pp. 1021. 2022.DOI
23 
D. Saisanthiya, P. Supraja. "Heterogeneous Convolutional Neural Networks for Emotion Recognition Combined with Multimodal Factorised Bilinear Pooling and Mobile Application Recommendation", International Journal of Interactive Mobile Technologies (iJIM), 2023.DOI
24 
M. Park and S. Chai, ``BTIMFL: A Blockchain-Based Trust Incentive Mechanism in Federated Learning.'' In International Conference on Computational Science and Its Applications (pp. 175-185). Cham: Springer Nature Switzerland, June 2023.DOI
25 
H. Zhang, ``Expression-EEG based collaborative multimodal emotion recognition using deep autoencoder.'' IEEE Access, vol. 8, no. 1, pp. 164130-164143. 2020.DOI
26 
J.O. Agushaka, A.E. Ezugwu, L. Abualigah, ``Dwarf mongoose optimization algorithm.'' Computer methods in applied mechanics and engineering, vol. 391, no. 1, p. 114570. 2022.DOI
27 
L.K. Pavithra, T. Sree Sharmila, P. Subbulakshmi, ``Texture image classification and retrieval using multi-resolution radial gradient binary pattern.'' Applied Artificial Intelligence, vol. 35, no. 15, pp. 2298-2326. 2021.DOI
28 
W. Li, J. Chen, Z. Wang, Z. Shen, C. Ma, X. Cui, ``IFL-GAN: Improved federated learning generative adversarial network with maximum mean discrepancy model aggregation.'' IEEE Transactions on Neural Networks and Learning Systems, vol. 1, no. 1, pp. 1-12. 2022.DOI
D. Saisanthiya
../../Resources/ieie/IEIESPC.2024.13.1.61/au1.png

D. Saisanthiya received the B.Tech degree in CSE from Arulmigu Meenakshi Amman College of Engineering, Thiruvannamalai, affiliated to Anna University, Tamil Nadu, in 2009, and the M.Tech degree in CSE from Sastha Institute of Science and Technology, Chembarambakkam, affiliated to Anna University, Tamil Nadu, in 2011. She is currently working towards the Ph.D. degree at the School of Computing, Faculty of Engineering and Technology, SRM Institute of Science and Technology, India. Her research interests include deep learning and machine learning algorithms.

P. Supraja
../../Resources/ieie/IEIESPC.2024.13.1.61/au2.png

P. Supraja is currently working as an Associate Professor in the School of Computing, Faculty of Engineering and Technology, SRM Institute of Science and Technology, Chennai, Tamil Nadu, India. She was a recipient of the AICTE Visvesvaraya Best Teacher Award 2020. Previously, she completed the Indo-US WISTEMM research fellowship at the University of Southern California, Los Angeles, USA, funded by IUSSTF and DST, Govt. of India, served as a Post-Doctoral Research Associate at Northumbria University, Newcastle, UK, and completed her PhD at Anna University in 2017. She has published more than 50 research papers in reputed national and international journals and conferences. She received her university-level Best Research Paper Award in 2022 and 2019 and has received funding from AICTE for conducting an STTP. Her research interests include cognitive computing, optimization algorithms, machine learning, deep learning, wireless communication, and IoT. She is a reviewer for IEEE, Inderscience, Elsevier, and Springer journals and a member of several national and international professional bodies, including IEEE, ACM, and ISTE. In addition, she has received the Young Women in Engineering Award and the Distinguished Young Researcher Award from various international organizations.