Applying AR Technology Integrating Unity3D with the Vuforia SDK for Oral English Teaching

(Wei Huang) ; (Haiyan Zhang)

With the rapid progress of information technology, its application to teaching has gradually become a hot topic in the education field. Augmented reality (AR) combines virtual and real characteristics that can improve comprehension in a virtual environment, bringing new development opportunities to oral English teaching. Based on integration of the Vuforia SDK in the Unity3D augmented reality engine, this research applies AR technology to spoken English teaching, improves a convolutional neural network (CNN), and proposes an English speech recognition system based on a connectionist temporal classification (CTC)-CNN (maxout). In experiments on the number of iterations and the loss value, the proposed model converges after 80 iterations with strong performance. In recognizing spoken English with and without noise, the highest accuracy of this method was 0.957 and 0.894, respectively, which is better than the CTC-CNN (sigmoid) model. In recognizing six kinds of spoken English, the accuracy of the CTC-CNN (maxout) model stabilizes at about 95%, with the highest accuracy at 97%. These accuracy rates show that the method can be effectively applied to oral English teaching and can provide a new reference method for innovations in oral English teaching and the improvement of teaching efficiency.
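To illustrate the activation that separates the CTC-CNN (maxout) model from its sigmoid counterpart, here is a minimal sketch of a single maxout unit, which outputs the maximum over several affine pieces of its input. The abstract does not give layer sizes or weights, so the function name `maxout` and the toy parameters `W` and `b` are assumptions chosen only for illustration.

```python
from typing import Sequence


def maxout(x: Sequence[float],
           weights: Sequence[Sequence[float]],
           biases: Sequence[float]) -> float:
    """Single maxout unit: the maximum over k affine pieces w_j . x + b_j."""
    pieces = [
        sum(w_i * x_i for w_i, x_i in zip(w, x)) + b_j
        for w, b_j in zip(weights, biases)
    ]
    return max(pieces)


# Hypothetical toy parameters: k = 2 pieces over a 2-dimensional input.
# With these two pieces the unit computes the absolute value of x[0].
W = [[1.0, 0.0], [-1.0, 0.0]]
b = [0.0, 0.0]

value = maxout([-3.0, 5.0], W, b)
```

Unlike a sigmoid, the unit is piecewise linear and does not saturate, which is the usual motivation for swapping it in.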

Automatic Chinese-English Translation Algorithm based on Out-of-vocabulary Words in the Context of Cross-cultural Communication

(Jiayan Duan) ; (Hongwei Ma) ; (Junxia Wang)

In the context of cross-cultural communication, translation between languages has become increasingly important. Based on automatic Chinese-English translation, this study examined the processing of out-of-vocabulary (OOV) words. First, this paper briefly introduces two basic translation models: seq2seq and Transformer. Second, we propose a semantic-based OOV processing method, which replaces OOV words with the most similar words by calculating the semantic similarity of word vectors and then uses the source-language sentences with the replaced words to train a translation model. Compared to the seq2seq model, the Bilingual Evaluation Understudy (BLEU) values of the Transformer model were higher (37.26 for the NIST06 dataset and 30.75 for the NIST08 dataset). After OOV processing, retaining low-frequency OOV words was conducive to the improvement of BLEU scores, which were increased by 0.63 and 0.09 for NIST06 and NIST08 for the Transformer model, respectively. This shows the effectiveness of the OOV processing method. The OOV processing method could be applied to automatic Chinese-English translation.
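The replacement step described above can be sketched as follows, assuming cosine similarity as the semantic similarity measure and assuming that a word vector is available for each OOV word (for example, from embeddings trained on a larger corpus). The abstract specifies neither, and the function names and toy vectors below are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Set


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def replace_oov(tokens: List[str],
                vocab: Set[str],
                vectors: Dict[str, List[float]]) -> List[str]:
    """Replace each OOV token with its most similar in-vocabulary word."""
    # Only in-vocabulary words that have a vector can serve as replacements.
    candidates = sorted(w for w in vocab if w in vectors)
    result = []
    for tok in tokens:
        if tok in vocab or tok not in vectors or not candidates:
            result.append(tok)  # in vocabulary, or nothing to compare against
            continue
        best = max(candidates, key=lambda w: cosine(vectors[tok], vectors[w]))
        result.append(best)
    return result


# Hypothetical toy embeddings (not from the paper).
toy_vectors = {
    "cat": [1.0, 0.1],
    "dog": [0.9, 0.3],
    "car": [0.0, 1.0],
    "kitten": [0.95, 0.12],  # treated as OOV below
}
toy_vocab = {"the", "sat", "cat", "dog", "car"}

replaced = replace_oov(["the", "kitten", "sat"], toy_vocab, toy_vectors)
```

The rewritten source sentences would then be used to train the translation model, as the abstract describes.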

Research on Key Technologies of End to side Computing Network based on Field Level AI Reasoning for Terminal Equipment

(Mao Ni) ; (Ting Zhou) ; (Hengjiang Wang) ; (Fang Cu)

This study analyzes field-level AI reasoning technology for terminal equipment and applies it to optimizing computing resources under the cooperation of multiple UAVs. The experimental results indicate that the performance of the ICEM method is superior to the other three methods, and the maximum value is 15 × 10^8 bits/Joule. An increase of 10 thresholds can reduce the number of iterations 50-fold when the amount of CDF iterations is 0.9. This can reach more than 20 times if the number of UAVs and mobile terminal devices doubles. The UAV will run at the maximum speed in most time slots, and the maximum computation of the terminal equipment can reach 15 Mbits when the entire working cycle is 30 seconds. Therefore, the algorithm proposed in this study achieves higher computing energy efficiency based on less convergence time and is effective in end-to-end computing power network computing resource scheduling.

Research on Intelligent Decision-making Method of Investment Scheme of High-speed Railway Construction Project

(Xiaochen Duan) ; (Jingjing Hao) ; (Yanliang Niu)

Based on mining historical data on full life cycle cost, this paper proposes non-linear models with high fitting degree and accuracy, including a particle swarm optimization algorithm, an error back propagation neural network, and a fuzzy inference system, to support scientific and accurate decision-making on investment plans for international high-speed railway projects. The models address the problems of lag, linearity, and simplicity in current investment forecasting and decision-making methods and aim to maximize the economic and social benefits of investment. These methods are suited to the randomness, complexity, and non-linearity present in the full life cycle of international high-speed rail investment. The models were applied to investment plan decision-making for international high-speed railway construction projects. The investment error of the selected line construction stage was 1.85%, and the operating investment error from November 14, 2007, to now was 0.64%, which is within the allowable range of ±3%. The ratio of operating investment in the next 80 years to that in the first 20 years was five. The built model was applied to three alternative routes, and the predicted results according to the model were the same as the selected routes in the actual project.

Extraction of Key Frames from Dance Videos and Movement Recognition by Multi-feature Fusion

(Jie Yan)

Dance videos contain numerous complex movements that make movement recognition difficult. The multi-feature fusion method proposed in this paper first takes color and texture features of video frames for clustering, and then extracts key frames. Based on the key frames, spatial and temporal features, respectively, are extracted by using long short-term memory and a three-dimensional convolutional neural network. A multi-feature fusion method designed for movement recognition is used in experiments conducted with both the DanceDB dataset and a self-built dataset. The results show that the proposed multi-feature fusion method has high recall and precision ratios for key frame extraction, and achieved recognition accuracy of 42.67% and 50.64% with DanceDB and the self-built dataset, respectively. This paper validates the effectiveness of the proposed method for key frame extraction and movement recognition in dance videos, and suggests potential practical applications. The approach improves the reliability of video processing and provides theoretical support for further research on deep learning methods in the field of video processing.
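The clustering-based key frame step can be sketched as below, assuming k-means as the clustering method and taking the frame nearest each cluster centroid as the key frame. The abstract names neither choice, and the per-frame feature vectors here are hypothetical stand-ins for the color and texture features.

```python
from typing import List, Sequence


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(features: List[List[float]],
           centroids: List[List[float]]) -> List[List[int]]:
    """Group frame indices by their nearest centroid."""
    clusters: List[List[int]] = [[] for _ in centroids]
    for idx, feat in enumerate(features):
        nearest = min(range(len(centroids)),
                      key=lambda c: sq_dist(feat, centroids[c]))
        clusters[nearest].append(idx)
    return clusters


def key_frames(features: List[List[float]], k: int, iters: int = 20) -> List[int]:
    """Return indices of the frames closest to each k-means centroid."""
    n, dim = len(features), len(features[0])
    # Deterministic initialisation: evenly spaced frames (an assumption).
    centroids = [list(features[(i * n) // k]) for i in range(k)]
    for _ in range(iters):
        for j, members in enumerate(assign(features, centroids)):
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [
                    sum(features[m][d] for m in members) / len(members)
                    for d in range(dim)
                ]
    keys = []
    for j, members in enumerate(assign(features, centroids)):
        if members:
            keys.append(min(members,
                            key=lambda m: sq_dist(features[m], centroids[j])))
    return sorted(keys)


# Hypothetical 2-D features for six frames forming two visual groups.
toy_features = [[0.0, 0.0], [0.1, 0.0], [0.2, 0.1],
                [5.0, 5.0], [5.1, 5.0], [5.2, 5.1]]
selected = key_frames(toy_features, k=2)
```

The selected frames would then feed the spatial and temporal feature extractors that the abstract mentions.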

Binarized Spiking Neural Networks Optimized with Color Harmony Algorithm for Liver Cancer Classification

(Pushpa Balakrishnan) ; (B. Baskaran) ; (S. Vivekanandan) ; (P. Gokul)

Binarized spiking neural networks optimized with a color harmony algorithm for liver cancer classification (BSNN-CHA-LCC) are proposed to classify liver cancer as normal or abnormal. Initially, fused images from an MRI dataset and CT-scan datasets of liver cancer were taken as input and given to CWF-based preprocessing to remove noise and increase the quality of the input computed tomography (CT) and magnetic resonance imaging (MRI) images. The preprocessed CT and MRI images were given to the improved non-subsampled shearlet transform (INSST) method for feature extraction. The extracted features were given to the BSNN to classify liver cancer as normal or abnormal. The proposed method was implemented, and the efficiency of the proposed BSNN-CHA-LCC method was evaluated under performance metrics such as precision, sensitivity, F-score, specificity, accuracy, error rate, and computational time. The proposed technique achieved 23.03%, 11.56%, and 21.22% higher accuracy and 36.12%, 15.23%, and 27.11% lower error rates than the existing models, namely hybrid-feature analysis depending on machine learning for liver cancer categorization utilizing fused images (MLP-LCC), deep learning-based classification of liver cancer histopathology images utilizing only global labels (mask-RCNN-LCC), and deep learning-based liver cancer identification utilizing watershed transform and Gaussian mixture method (DNN-GMM-LCC), respectively.

Research on Identification and Classification of Depression in College Students through Feature Analysis

(Yachai Sun) ; (Haixia Zhang) ; (Jiyu Men)

The facial expression features of experimental subjects were analyzed to explore the differences in facial expression between patients with psychological depression and those with a healthy psychology. Three psychological depression identification and classification models, the K-nearest neighbor algorithm, support vector machine, and logistic regression, were constructed to verify the usability of a classification model through facial expression feature analysis and identify college students with psychological depression. The experimental results showed significant differences in the average values of several facial expressions, including AU4, AU7, AU9, AU12, AU18, and AU27. The P values were less than 0.05, so they were used in subsequent model analysis. The results obtained through identification by the classification model showed that the performance of the K-nearest neighbor algorithm in terms of mean absolute error (MAE), root mean square error (RMSE), and accuracy was the poorest, followed by logistic regression and the support vector machine. The MAE, RMSE, and accuracy of the K-nearest neighbor algorithm were 7.14, 9.37, and 83.61%, respectively; the values of the logistic regression algorithm were 6.83, 8.79, and 91.97%, respectively; the values of the support vector machine were 6.23, 7.76, and 94.80%, respectively. This study showed that a classification model constructed using facial expression features could be used to identify and classify college students with psychological depression.
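For reference, the three reported metrics can be computed as in the sketch below. It assumes that MAE and RMSE are taken over numeric scores and accuracy over class labels, which the abstract does not state; the sample values are hypothetical and unrelated to the paper's data.

```python
import math
from typing import Sequence


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error: average of |true - predicted|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean square error: square root of the mean squared difference."""
    mean_sq = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mean_sq)


def accuracy(labels_true: Sequence[int], labels_pred: Sequence[int]) -> float:
    """Fraction of predicted labels that match the true labels."""
    hits = sum(1 for t, p in zip(labels_true, labels_pred) if t == p)
    return hits / len(labels_true)


# Hypothetical scores and labels, for illustration only.
scores_true = [10.0, 20.0, 30.0]
scores_pred = [12.0, 18.0, 33.0]
toy_mae = mae(scores_true, scores_pred)
toy_rmse = rmse(scores_true, scores_pred)
toy_acc = accuracy([1, 0, 1, 1], [1, 0, 0, 1])
```

Lower MAE and RMSE and higher accuracy indicate better performance, which matches the ranking of the three models given above.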

ACDD: Automated COVID Detection using Deep Neural Networks

(Ghulam Musa Raza) ; (Muhammad Shoaib) ; (Byung-Seo Kim)

December 2019 witnessed the outbreak of a novel coronavirus, thought to have started in the Chinese city of Wuhan. The situation worsened owing to its quick spread across the globe, leading to a worldwide pandemic that became known as COVID-19. To suppress the pandemic, early detection of positive COVID-19 patients has become highly important. There is a lack of precise automated tool kits available for use in diagnosing medical conditions, so auxiliary diagnostic tools are in high demand. Important information about this virus can be extracted from X-ray images, which can be used in conjunction with advanced artificial intelligence. This study addresses the unavailability of physicians in remote areas, and the complex algorithm proposed in this study can find potential matches for patients in rural areas who need care. This could help to improve access to care for those who need it most. The purpose of this study is to develop a novel model that can automatically detect COVID-19 by utilizing chest X-ray images. The proposed model incorporates binary and multi-class classification and can be employed by radiologists for timely detection of the COVID-19 virus in an effective manner.

Unveiling the Power of Deep Learning: A Comparative Study of LSTM, BERT, and GRU for Disaster Tweet Classification

(Ihsan Ullah) ; (Anum Jamil) ; (Imtiaz Ul Hassan) ; (Byung-Seo Kim)

Disasters have serious effects on people's lives and buildings. Therefore, social media platforms, such as Twitter, have become more critical. They are crucial tools for responding to and managing disasters effectively. This study examined the effectiveness of various deep learning models, such as bidirectional encoder representations from transformers (BERT), gated recurrent units (GRU), and long short-term memory (LSTM) for classifying disaster-related tweets. Twitter data related to different disasters were collected using hashtags. The data were then cleaned, preprocessed, and manually annotated by a team. The annotated data were divided into training, validation, and testing sets. The data were used to train three models based on BERT, GRU, and LSTM for the categorical classification of disaster tweets. Finally, the three models were evaluated and compared using the test data. BERT achieved an accuracy of 96.2%, making it the most effective model. In contrast, the LSTM and GRU models achieved an accuracy of 93.2% and 88.4%, respectively. These findings underscore the potential effectiveness of deep learning models in classifying disaster-related tweets, offering insights that could enhance disaster management strategies, refine social media monitoring processes, bolster public safety, and provide directions for future research.

Tracking Control of IPMSM based on Disturbance Observer

(Yongho Jeon) ; (Shinwon Lee)

A state observer was designed to estimate the state variables required to control the speed, position, and current of the rotor shaft of a synchronous motor system with a permanent magnet rotor. When designing a state observer, an output equation is constructed with the measurable state variables as output states. The state vector is estimated by designing the observer using the state equation and output equation, which are mathematical models of the motor system, and the Luenberger state observer. A PI controller is configured using the desired reference input, with the estimated state as feedback. Precise control performance can be obtained by tracking the reference speed of the motor. The controllers must be able to compensate for load fluctuations, various parameter errors, and model errors. For this purpose, a state observer was designed to estimate the state, including the disturbance, using the Luenberger observer. After applying the state estimator designed for a one-horsepower class IPMSM to the disturbance-compensated speed controller and current controller, a state estimation error and a speed tracking error within 0.1% were obtained in the steady state.
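The Luenberger observer idea can be illustrated with a generic discrete-time sketch. This is not the authors' IPMSM model: the two-state system matrices `A`, `B`, `C` and the gain `L` below are hypothetical, chosen so that both eigenvalues of A - LC equal 0.5 and the estimation error therefore decays.

```python
from typing import List, Tuple

# Hypothetical discrete-time plant: x[k+1] = A x[k] + B u[k], y[k] = C x[k].
A = [[1.0, 0.1],
     [0.0, 0.9]]
B = [0.0, 0.1]
C = [1.0, 0.0]    # only the first state is measured
L = [0.9, 1.6]    # observer gain placing both poles of A - L*C at 0.5


def plant_step(x: List[float], u: float) -> List[float]:
    """One step of the state equation x[k+1] = A x[k] + B u[k]."""
    return [A[0][0] * x[0] + A[0][1] * x[1] + B[0] * u,
            A[1][0] * x[0] + A[1][1] * x[1] + B[1] * u]


def simulate(steps: int = 60, u: float = 1.0) -> Tuple[List[float], List[float]]:
    """Run the plant and observer together; return true and estimated state."""
    x = [1.0, -2.0]      # true initial state, unknown to the observer
    x_hat = [0.0, 0.0]   # observer starts from a wrong guess
    for _ in range(steps):
        y = C[0] * x[0] + C[1] * x[1]              # measured output
        y_hat = C[0] * x_hat[0] + C[1] * x_hat[1]  # predicted output
        innovation = y - y_hat
        predicted = plant_step(x_hat, u)
        # Luenberger correction: model prediction plus L times output error.
        x_hat = [predicted[0] + L[0] * innovation,
                 predicted[1] + L[1] * innovation]
        x = plant_step(x, u)
    return x, x_hat


true_state, estimated_state = simulate()
```

In the disturbance-observer setting described above, the state vector would additionally include the disturbance as an extra state to be estimated; that extension is omitted here.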