YANG Hang (1), WU Ren (2), NAKATA Mitsuru (3), GE Qi-Wei (4)

(1) The Graduate School of East Asian Studies, Yamaguchi University, 1677-1 Yoshida, Yamaguchi-shi, 753-8514 Japan (a505snu@yamaguchi-u.ac.jp)
(2) Faculty of Information Science, Shunan University, 843-4-2 Gakuendai, Shunan-shi, 745-8566 Japan (renwu@shunan-u.ac.jp)
(3) Faculty of Education, Yamaguchi University, 1677-1 Yoshida, Yamaguchi-shi, 753-8513 Japan (mnakata@yamaguchi-u.ac.jp)
(4) Yamaguchi University, 1677-1 Yoshida, Yamaguchi-shi, 753-8511 Japan (gqw@yamaguchi-u.ac.jp)
Copyright © The Institute of Electronics and Information Engineers (IEIE)
Keywords
Artificial intelligence, Machine learning, Acupuncture and moxibustion, Traditional Chinese medicine
1. Introduction
Acupuncture and moxibustion treatment (AMT for short hereafter) is characterized by wide adaptability, remarkable curative effects, convenient application, low cost, and safety, and has been widely promoted around the world. Clinical practice has proved that it has certain effects on more than 300 kinds of diseases in the fields of internal medicine, surgery, gynecology and pediatrics, and good effects on about 100 kinds of diseases, such as chronic fatigue syndrome and withdrawal symptoms [1]. Acupuncturists gather information about the patient's condition through the four diagnostic methods of traditional Chinese medicine (TCM for short hereafter), namely ``inspection, audio-olfactory examination, interrogation, and palpation'', and provide appropriate AMT prescriptions according to meridian theory and their own clinical experience. Over thousands of years, a vast amount of AMT clinical experience has been recorded in the form of text. If these data could be systematically utilized, they would provide substantial assistance for acupuncturists and, in the foreseeable future, even support patients' self-help treatment. However, the descriptions of symptoms in existing AMT books and clinical data are neither unified nor standardized, and are moreover difficult to quantify, which poses challenges to the global promotion of AMT.
Artificial intelligence (AI for short hereafter) technology has flourished in recent years and has been widely applied in many fields. As one of the core technologies of AI, machine learning has been applied to all walks of life, including the medical field [2]. Machine learning recognizes input data sets through computers, encodes the data into models or algorithms, trains appropriate mathematical models, and tests and verifies the trained models on new data. The ability of machine learning to analyze data related to medical treatments and outcomes is expected to transform medicine into a data-driven, outcome-oriented discipline, which will have a profound impact on the detection, diagnosis and treatment of diseases [3].
In acupuncture clinics, acupoints are selected on the premise that the four diagnostic kinds of information about the patient have been obtained through the doctor's examination and palpation. Existing medical technology can already assist doctors in obtaining this information, such as pulse diagnostic devices that acquire patients' pulse signals and tongue scanners that record the condition of patients' tongues. Nevertheless, the process from gathering patient information through the four diagnostic methods to formulating the final acupoints prescription still relies heavily on the acupuncturist's reasoning and experience. This is exactly where machine learning can offer significant advantages: it can be used to generate acupoints prescriptions by generalizing from successfully treated cases. However, practical application faces challenges in data, medical validation, and ethics, necessitating careful incremental development to ensure patient safety and treatment effectiveness.
At present, the application of machine learning in the field of AMT is still in its infancy. Researchers have applied many traditional machine learning and deep learning algorithms to AMT and achieved some research results [4]. Yang et al. [5] used an artificial neural network to predict the clinical efficacy of AMT in the treatment of depression based on the demographic characteristics of patients and data from disease-related self-assessment scales. Based on demographic data, the four diagnostic kinds of information of TCM, symptom evaluation scales and other parameters, Pei et al. [6] built an artificial neural network model to predict the clinical efficacy of acupuncture in treating heroin dependence. Hao et al. [7] used a fuzzy neural network to predict changes in enkephalin, a pain-related biochemical indicator, in patients receiving electroacupuncture treatment, based on physiological electrical signals such as ECG and EEG. Gan et al. [8] proposed an AMT support system that maps the data of the four diagnostic methods of TCM to AMT prescriptions.
Although the application of machine learning in the field of AMT is promising, its further development is hindered by the large sample sizes that machine learning demands, the TCM theoretical background required for processing AMT data, and the complexity of that processing. As far as we know, there is no research on providing acupoints prescriptions that can cope with multiple diseases from four-diagnostic-method data by machine learning. In this paper, we propose a method that learns the text information of existing AMT books and clinical data through AI algorithms so as to provide acupoints prescriptions for treating patients. We extract the names of symptoms from AMT texts and unify and standardize them, so as to build a database of symptoms and corresponding acupoints prescriptions. We test various algorithms that learn the data in the database to train a model that can provide acupoints prescriptions based on patients' symptoms, and identify that one algorithm, Seq2seq with attention [9], performs significantly better than the others.
The paper is organized as follows. In Section 2, we introduce the basic principles of AMT and give an overview of machine learning. In Section 3, we present our method of deciding acupoints prescriptions through machine learning, specifically describing the processes of database construction, algorithm selection, data preprocessing, and determining the evaluation criteria of the model. In Section 4, we select the best-performing algorithm, namely Seq2seq with attention, by 5-fold cross validation, and further analyze it by ablation experiments. Finally, we conclude the paper in Section 5.
2. AMT and Machine Learning
2.1 Basic Principles of AMT
The basic contents of AMT mainly include AMT theory, AMT technology and the clinical application of AMT. AMT theory mainly covers meridian theory and the rules of acupoints. AMT technology mainly includes acupuncture, moxibustion and other needling methods. The clinical application of AMT is a comprehensive application of AMT theory and technology. Meridian theory is an important part of TCM; it covers the distribution, physiological functions and pathological changes of the human meridian system and its relationship with the internal organs, and it runs through the diagnosis and treatment of AMT [1]. Meridians are the channels through which the human body transports Qi and blood, and they run throughout the body. Here Qi, a special concept in TCM, is the most fundamental substance in the construction of the human body and in the maintenance of its life activities. Acupoints are special parts of the body surface that are infused with Qi from the internal organs and meridians, and they are also the places where acupuncture, moxibustion and other stimuli are applied. Stimulating appropriate acupoints has the effect of dredging the meridians, harmonizing Qi and blood, restoring the balance of yin and yang, and coordinating the viscera, so as to achieve the purpose of disease prevention and treatment. In total, 409 acupoints are identified by the WHO (World Health Organization) [10]. The names and WHO notations of each meridian, as well as the numbers of acupoints, are shown in Table 1. In addition, the parts that have neither a specific name nor a fixed position but are stimulated because of tenderness or other reactions are collectively referred to as ``Ashi points''. Ashi points may be near or far from the lesion; they usually appear with the occurrence of a disease and disappear with its recovery [1].
The decision-making process of traditional AMT is shown in Fig. 1. Acupuncturists obtain the symptoms of patients through the four diagnostic methods, then analyze and summarize these symptoms according to meridian theory, so as to clarify the etiology, location, pathogenesis and urgency of the disease. On this basis, appropriate AMT prescriptions are determined by comprehensively considering meridian theory and the rules of acupoints. An AMT prescription comprises an acupoints prescription and manipulation. The acupoints prescription is the first component of the AMT prescription. Each acupoint in the body has relative specificity and may have the same or different therapeutic functions; selecting acupoints with the same or similar functions can enhance treatment effectiveness by strengthening the synergistic effect between the acupoints. Manipulation is the second component of the AMT prescription, which includes the treatment methods, the specific operation and the timing of treatment [1]. In this paper, we focus our discussion on deciding the acupoints prescription.
Fig. 1. The decision-making process of traditional AMT.
Table 1. Meridians and acupoints in human body.
Meridian name | WHO notation of meridian | WHO notation of acupoints | Number of acupoints
Lung Meridian | LU | LU1∼LU11 | 11
Large Intestine Meridian | LI | LI1∼LI20 | 20
Stomach Meridian | ST | ST1∼ST45 | 45
Spleen Meridian | SP | SP1∼SP21 | 21
Heart Meridian | HT | HT1∼HT9 | 9
Small Intestine Meridian | SI | SI1∼SI19 | 19
Bladder Meridian | BL | BL1∼BL67 | 67
Kidney Meridian | KI | KI1∼KI27 | 27
Pericardium Meridian | PC | PC1∼PC9 | 9
Triple Energizer Meridian | TE | TE1∼TE23 | 23
Gallbladder Meridian | GB | GB1∼GB44 | 44
Liver Meridian | LR | LR1∼LR14 | 14
Conception Vessel | CV | CV1∼CV24 | 24
Governor Vessel | GV | GV1∼GV28 | 28
Extra Points | EX | EX-B1∼EX-B9, EX-UE1∼EX-UE11, EX-LE1∼EX-LE12 | 48
Total | | | 409
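The per-meridian counts in Table 1 can be cross-checked programmatically. The following minimal sketch simply transcribes the table into a dictionary keyed by WHO meridian notation and verifies that the counts sum to the 409 WHO-identified acupoints:

```python
# Number of acupoints per meridian, transcribed from Table 1 (WHO notation).
ACUPOINTS_PER_MERIDIAN = {
    "LU": 11, "LI": 20, "ST": 45, "SP": 21, "HT": 9, "SI": 19,
    "BL": 67, "KI": 27, "PC": 9, "TE": 23, "GB": 44, "LR": 14,
    "CV": 24, "GV": 28, "EX": 48,
}

# The per-meridian counts should sum to the 409 WHO-identified acupoints.
total = sum(ACUPOINTS_PER_MERIDIAN.values())  # 409
```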
2.2 Machine learning
The process of selecting acupoints can be regarded as establishing a correspondence between a combination of symptoms and a combination of acupoints in an acupuncture treatment scheme, rather than between a single symptom and a single acupoint. Since the numbers of possible symptom combinations and acupoint combinations in reality are both very large, it is impossible to cover all possible situations merely by building a database from existing books and clinical data. Machine learning, however, has the ability to generalize and can learn the rules hidden behind the data: a model trained on training data can also provide appropriate output for unseen data that follows the same rules.
Machine learning is the study of how computers simulate human learning behavior to obtain new knowledge or experience, and improve their own performance by restructuring existing knowledge. It is widely used to solve classification, regression, clustering and other problems because it can learn the rules and patterns in massive data and extract latent information. Maron et al. [11] proposed the Naive Bayes (NB) algorithm, which classifies according to probabilities derived from Bayesian theory. Cover et al. [12] proposed the KNN classification algorithm based on distance measurement. Breiman et al. [13] proposed CART, an early Decision Tree (DT) classification algorithm, which uses a tree structure to divide the data into discrete classes. The support vector machine is a two-class classification model: a linear classifier with the largest margin in feature space [14], whose learning strategy is to maximize that margin. Artificial Neural Network (ANN) classification adjusts the parameters of a neural network according to the given training samples so that the network output approaches the known sample class labels [15].
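To make the classical algorithms above concrete, the following sketch fits three of them (NB, DT, and KNN) on a tiny hand-made two-class data set; the data and feature values are invented purely for illustration and have nothing to do with AMT:

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier

# Toy two-class data (illustration only): two well-separated 2-D clusters.
X = np.array([[0., 0.], [0., 1.], [1., 0.], [5., 5.], [5., 6.], [6., 5.]])
y = np.array([0, 0, 0, 1, 1, 1])

preds = {}
for name, clf in [("NB", GaussianNB()),
                  ("DT", DecisionTreeClassifier(random_state=0)),
                  ("KNN", KNeighborsClassifier(n_neighbors=1))]:
    clf.fit(X, y)  # train on the toy samples
    # Classify one point near each cluster.
    preds[name] = clf.predict(np.array([[0.5, 0.5], [5.5, 5.5]]))
```

All three classifiers assign the first query point to class 0 and the second to class 1, since the clusters are cleanly separable.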
Deep learning is a special kind of machine learning and a new research direction in the field. It can be applied in various domains, and the form of the deep neural network differs according to the application. Common deep learning models mainly include the Fully Connected (FC) network, the Convolutional Neural Network (CNN) and the Recurrent Neural Network (RNN). The fully connected layer is the basic deep neural network layer: each of its nodes is connected with all nodes of the previous layer. Because all outputs and inputs of a fully connected layer are connected, it has the most parameters, which requires a considerable amount of storage and computing space. CNN is a neural network for processing data with a grid structure and is often used in computer vision. Unlike in FC networks, neurons in adjacent CNN layers are not connected directly but through a ``convolution kernel'' acting as intermediary, and the parameters of the hidden layers are greatly reduced by sharing the kernel. RNN, also one of the commonly used deep learning models, is a kind of neural network for processing sequence data. It is often used in natural language processing and can also be used in computer vision.
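The parameter savings from kernel sharing can be seen with a bit of arithmetic. The sketch below (toy sizes, weights only, biases ignored) compares a fully connected layer with a 1-D convolution over the same input:

```python
# Weight counts for a toy 1-D input and output of length 100 each.
n_in, n_out = 100, 100

# A fully connected layer links every input node to every output node.
fc_params = n_in * n_out  # 100 * 100 = 10,000 weights

# A 1-D convolution sliding a shared kernel of size 3 over the same input
# reuses the same 3 weights at every position.
conv_params = 3  # the shared kernel
```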
3. Methodology
In this work, we train a model on relevant data through machine learning, and the trained model provides acupoints prescriptions for treating patients based on their symptoms. For this purpose, a database, algorithms, data preprocessing and evaluation metrics are essential components.
1) Database: As far as we know, there is currently no database available for providing acupoints prescriptions based on patients' symptoms, so we need to establish a database of symptoms and acupoints prescriptions.
2) Algorithms: We regard the problem of acupoint selection as a multi-label classification problem and use 11 algorithms for multi-label classification, including 8 classifiers provided by Scikit-learn as well as the Feedforward neural network (FNN) [16], TextCNN [17], and Seq2seq with attention [9].
3) Data preprocessing: Different algorithms have different requirements for data representation, so we preprocess the data differently for different algorithms.
4) Evaluation metrics: Due to the characteristics of AMT prescription data, the traditional evaluation criteria for multi-label classification problems, namely accuracy, precision, and recall, are not suitable for evaluating the model in this study. We take Intersection over Union (IoU) [18] as the main evaluation metric instead.
3.1 Database construction
The text information of AMT collected from books is preprocessed to extract the symptoms used to judge diseases, and these symptoms are then standardized and unified. On this basis, we build a database of symptoms and corresponding acupoints prescriptions. Fig. 2 shows the construction process of disease cases in the database, taking ``chronic fatigue syndrome with stagnation of liver-Qi'' as an example. The text in the green box is the description text of the disease, and the words underlined in green are the corresponding symptoms. The words in the red box are used in TCM to reflect one cause of the sample case, and the circled numbers indicate the corresponding symptoms. The text in the blue box is the acupoints prescription of the disease case, and the words underlined in blue are the corresponding acupoints, which are represented by acupoint numbers. One of the symptom names in the figure, ``rib-side and abdominal distention and pain'', is expressed as a single word in TCM, but it actually consists of four different symptoms. Therefore, this symptom name is divided into four symptom names, numbered (148, 149, 158, 159) in the database. In addition, a word meaning ``or'' often appears in the text describing a case, indicating that the same disease may have multiple symptoms that may or may not occur at the same time. For such cases, we record them in the database as different cases of the same disease name. Although the symptom combinations of these cases are not completely the same, the corresponding acupoints prescriptions are. In this way, we build our database of symptoms and corresponding acupoints prescriptions.
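The handling of ``or'' descriptions can be sketched as a small expansion step: each alternative symptom yields a separate database case, and all resulting cases share one acupoints prescription. All symptom and acupoint numbers below are hypothetical, chosen only for illustration:

```python
def expand_or_case(base_symptoms, or_symptoms, acupoints):
    """Split a textual case containing an 'or' between symptoms into one
    database case per alternative; all cases share the same prescription."""
    return [(sorted(base_symptoms + [alt]), acupoints) for alt in or_symptoms]

# Hypothetical numbers: base symptoms 12 and 37, with symptom 148 OR 158.
cases = expand_or_case(base_symptoms=[12, 37],
                       or_symptoms=[148, 158],
                       acupoints=[101, 205, 300])
# Two cases with different symptom sets but one shared prescription:
# ([12, 37, 148], [101, 205, 300]) and ([12, 37, 158], [101, 205, 300])
```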
Our database includes the data shown in Tables 2 and 3. Table 2 lists symptoms and acupoints with their assigned numbers. Table 3 lists the disease case number, the name of the disease case, the corresponding symptoms represented by symptom numbers, and the corresponding acupoints represented by acupoint numbers. In Table 3, the symptoms and acupoints are represented by their numbers according to Table 2, and the symptom and acupoint numbers appear in the same order as the symptoms and acupoints appear in the texts of the AMT prescriptions. One of the key factors in applying the text data of books and clinical data to actual AMT through machine learning is the symptom. Symptoms are the bridge between the patient's condition and the text data: the patient's condition can be accurately described by giving a combination of symptoms, and these symptoms can in turn be diagnosed through various medical technologies. Different from traditional AMT, our method requires the presence or absence of various symptoms to accurately describe the patient's condition, so our naming of symptoms is not exactly the same as that of TCM. Luo [19] gave the classification basis of 399 TCM symptom names by analyzing the attributes of symptoms. We planned to use these 399 symptom names as the symptom names in our database. However, 399 symptoms are not enough to accurately describe some diseases, so we have expanded the symptom names to 734 so far. As for acupoint names, the 409 acupoints certified by the WHO plus the Ashi point, 410 acupoint names in total, are included in our database. Currently we have collected 3000 disease cases of acupoints prescriptions from books [1,20,21]. We chose these three sources as our data materials because they are currently the only ones we have managed to compile, and we prioritized them because of the credibility of textbooks. Additional high-quality data sources will be essential before proceeding to further clinical trials.
Fig. 2. The construction process of the database, using ``chronic fatigue syndrome'' as an example. The text of this figure is a translation of the Chinese text on page 845 of reference [1].
3.2 Algorithm selection
We regard the problem of acupoint selection as the problem of determining which acupoints should be used for a given series of symptoms. In machine learning, this kind of problem can be classified as a multi-label classification problem and belongs to supervised learning. The combination of symptom numbers is used as the features of the model input, while the combination of acupoint numbers is used as the labels of the model output. Meanwhile, the data type of this study is clearly numeric. Based on the above observations, this study considers 11 different algorithms as potential candidates, which we divide into the following 4 categories.
$\bullet$ Traditional machine learning algorithms: Traditional machine learning algorithms can be used to solve multi-label classification problems by employing problem transformation techniques, including Binary Relevance, Classifier Chains, and Label Powerset [22]. We use 8 classifiers for multi-label classification provided by Scikit-learn, a Python library that provides many unsupervised and supervised learning algorithms: Naive Bayes, DecisionTree classifier, ExtraTree classifier, ExtraTrees classifier, KNeighbors classifier, RandomForest classifier, RidgeClassifierCV, and Neural network. Each classifier is experimented with under the three problem transformation techniques. Although these algorithms are commonly used to solve multi-label classification problems, the experimental results are not satisfactory; the specific results are shown in Table 6 in the next section. Given these suboptimal results, we shift our focus to deep learning.
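Binary Relevance, the simplest of the three transformation techniques, trains one independent binary classifier per label. A minimal sketch using Scikit-learn's `OneVsRestClassifier` (which implements this strategy for multi-label targets) on tiny invented multi-hot data, not our real database:

```python
import numpy as np
from sklearn.multiclass import OneVsRestClassifier
from sklearn.tree import DecisionTreeClassifier

# Toy multi-hot data (illustration only): rows of X are symptom vectors,
# columns of Y are acupoint labels.
X = np.array([[1, 0, 1, 0],
              [1, 0, 1, 1],
              [0, 1, 0, 1],
              [0, 1, 1, 1]])
Y = np.array([[1, 0, 0],
              [1, 1, 0],
              [0, 1, 1],
              [0, 0, 1]])

# Binary Relevance: one independent binary classifier per acupoint label.
model = OneVsRestClassifier(DecisionTreeClassifier(random_state=0)).fit(X, Y)
pred = model.predict(np.array([[1, 0, 1, 0]]))  # predicts a full label vector
```

Classifier Chains and Label Powerset differ only in how labels are coupled: chains feed earlier label predictions into later classifiers, while Label Powerset treats each distinct label combination as one class.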
$\bullet$ FNN: FNN is a unidirectional multilayer network in which information is transmitted layer by layer from the input layer to the output layer. FNN is composed of three parts, namely the input layer, the hidden layer and the output layer; the hidden layer can be one layer or multiple layers. FNN can handle more complex nonlinear relationships in data and has stronger generalization than traditional machine learning algorithms. We hope that these advantages will achieve better results than the traditional algorithms.
$\bullet$ TextCNN: CNN is usually used in computer vision (CV). Yoon [17] made some modifications to the input layer of CNN and proposed the text classification model TextCNN. The core of TextCNN is to convert words into word vectors through word2vec, then turn sentences into matrices composed of word vectors, which are processed through CNN. We hope to enhance the learning of the feature parts of the model through the use of CNN, thereby improving the effectiveness of the model.
$\bullet$ Seq2seq with attention: Seq2seq is an algorithm with an encoding-decoding structure, originally proposed by Google, mainly for solving sequence-to-sequence tasks such as machine translation and text summarization. However, this approach has also been applied to multi-label classification problems, with previous studies [23,24,25] demonstrating the effectiveness of encoder-decoder models and achieving promising results. Seq2seq has two modules, namely the encoder and the decoder: the encoder encodes the input data and the decoder decodes the encoded data. A simple RNN, GRU, LSTM [26], etc. can be used inside the encoder and decoder. The addition of the attention mechanism (hereinafter referred to as attention) can greatly improve the performance of Seq2seq. The fundamental principle of attention lies in simulating human information processing by assigning different weights to different parts of the data, thereby selectively focusing on key information. This mechanism calculates the relevance weights of the input parts, enabling the model to concentrate on crucial information and ignore irrelevant details, thereby improving the accuracy and efficiency of handling complex tasks [27].
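The weight computation at the heart of attention can be sketched in a few lines of NumPy. This is a generic dot-product attention step with toy sizes (4 encoder states of dimension 8), not the exact formulation of [27]:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

rng = np.random.default_rng(1)
enc_states = rng.normal(size=(4, 8))  # one vector per input position (toy)
dec_state = rng.normal(size=8)        # current decoder state (toy)

scores = enc_states @ dec_state       # relevance of each input position
weights = softmax(scores)             # attention weights, sum to 1
context = weights @ enc_states        # weighted sum fed to the decoder
```

Positions with higher scores contribute more to the context vector, which is how the decoder "focuses" on the relevant input symptoms at each output step.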
3.3 Data preprocessing
Different algorithms have different requirements for data representation, so we preprocess the data differently for different algorithms. The 8 traditional machine learning classifiers and FNN share the same preprocessing; for TextCNN and Seq2seq with attention, we use separate preprocessing methods.
For the 8 traditional classifiers and FNN, numerical data can be directly used as the features and labels of machine learning. Therefore, we encode the symptoms of each case as a vector of length 734, the total number of symptoms in our database; each element of the vector corresponds to a symptom name in the database. We encode the acupoints prescription of each case as a vector of length 410, each element of which corresponds to an acupoint name in the database. As shown in Fig. 3, the left part of the table represents symptoms: ``0'' means the disease case does not have the symptom, and ``1'' means it does. The right part of the table shows the acupoints in the acupoints prescription corresponding to the symptoms on the left: ``0'' means that the acupoint is not stimulated, and ``1'' means that it is.
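This multi-hot encoding can be sketched as a small helper that turns a case's number lists into fixed-length 0/1 vectors. The symptom and acupoint numbers below are hypothetical, for illustration only:

```python
def to_multi_hot(indices, length):
    """Turn a list of 1-based symptom/acupoint numbers into a 0/1 vector."""
    vec = [0] * length
    for i in indices:
        vec[i - 1] = 1
    return vec

# Hypothetical case: symptom numbers -> 734-dim feature vector,
# acupoint numbers -> 410-dim label vector.
features = to_multi_hot([12, 37, 148], 734)
labels = to_multi_hot([101, 205], 410)
```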
For TextCNN, the core of the algorithm is to convert words into word vectors through word2vec, then turn sentences into matrices composed of word vectors, which are processed through CNN. Therefore, if the symptoms are coded as grid data, the information in the data can be extracted through CNN. Since the number of symptoms in the database is currently 734, which is not very large, and there is no obvious relationship between the symptoms, we do not need the word2vec step used in TextCNN. As shown in Fig. 4, our approach is to pad the symptom vector of length 734 with ``0'' to a length of 756, and then reshape the vector into a matrix of size $(27, 28)$. In this way, either a one-dimensional or a two-dimensional convolution kernel can be used.
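The pad-and-reshape step described above is a two-liner in NumPy (756 = 27 × 28, so 22 zeros are appended to the 734-dimensional vector):

```python
import numpy as np

symptom_vec = np.zeros(734)                    # the multi-hot symptom vector
padded = np.pad(symptom_vec, (0, 756 - 734))   # append 22 zeros -> length 756
grid = padded.reshape(27, 28)                  # grid input for the convolution
```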
For Seq2seq with attention, the embedding layer (which is already included in its structure) converts the discrete inputs into continuous vector representations, thus obviating the need to manually convert the symptom numbers and acupoint numbers into vectors. Therefore, the symptom numbers and acupoint numbers in the disease cases shown in Table 3 can be directly used as the input features and corresponding labels for training the model.
Fig. 3. Coding method of the traditional machine learning algorithms and FNN.
Fig. 4. Feature encoding of TextCNN.
3.4 Evaluation Metrics
There are 4 possible situations for the result of a classification problem, as shown in Table 4. In multi-label classification problems, the indicators commonly used to evaluate models are accuracy, precision and recall, whose calculation formulas are shown in Eqs. (1)-(3) [22]. Accuracy measures the overall correctness of the model's predictions. Precision measures the proportion of correctly predicted positive labels among all predicted positive labels. Recall measures the proportion of correctly predicted positive labels among all actual positive labels. However, these three criteria are not suitable for evaluating the model in this study because of the characteristics of AMT prescription data.
Table 4. Four possible situations for the results of classification problems.
1 | True Positive (TP) | Number of positive classes predicted as positive classes (1→1)
2 | True Negative (TN) | Number of negative classes predicted as negative classes (0→0)
3 | False Positive (FP) | Number of negative classes predicted as positive classes (0→1)
4 | False Negative (FN) | Number of positive classes predicted as negative classes (1→0)
Table 5. The evaluation results of an acupoints prescription with two acupoints using
different evaluation criteria.
Prediction results | TP / TN / FP / FN | Accuracy | Precision | Recall | IoU
A wrong acupoint | TP=0, TN=407, FP=1, FN=2 | 99.3% | 0% | 0% | 0%
Two wrong acupoints | TP=0, TN=406, FP=2, FN=2 | 99.0% | 0% | 0% | 0%
A correct acupoint and a wrong acupoint | TP=1, TN=407, FP=1, FN=1 | 99.5% | 50% | 50% | 33.3%
A correct acupoint and two wrong acupoints | TP=1, TN=406, FP=2, FN=1 | 99.3% | 33.3% | 50% | 25%
Two correct acupoints and a wrong acupoint | TP=2, TN=407, FP=1, FN=0 | 99.8% | 66.7% | 100% | 66.7%
Two correct acupoints | TP=2, TN=408, FP=0, FN=0 | 100% | 100% | 100% | 100%
IoU [18], also known as the Jaccard index, is a metric used to compare the similarities and differences between finite sample sets; a higher IoU signifies a higher degree of similarity between the sets. As an evaluation metric, IoU not only comprehensively assesses the model's accuracy in pinpointing effective acupoints, but also penalizes predicting non-effective acupoints as effective, thereby effectively mitigating the challenges caused by imbalanced labels. The calculation formula for IoU is shown in Eq. (4). Table 5 shows the results of the different evaluation indicators in several possible situations when predicting an acupoints prescription with two acupoints. When the prediction result is one wrong acupoint, the accuracy is still as high as 99.3%, which is obviously not suitable for this problem. It can also be seen from Table 5 that IoU is stricter than the other evaluation criteria, i.e., accuracy, precision and recall. Hence, IoU is taken as the main evaluation criterion of the model in this study.
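Eq. (4) is not reproduced in this excerpt, but the Jaccard form consistent with every IoU value in Table 5 is TP / (TP + FP + FN): the true negatives that inflate accuracy simply drop out. The sketch below reproduces the IoU column of Table 5:

```python
def iou(tp, fp, fn):
    """Jaccard index over predicted vs. true label sets: |A∩B| / |A∪B|.
    True negatives do not appear, so the 400+ unstimulated acupoints
    cannot inflate the score the way they inflate accuracy."""
    return tp / (tp + fp + fn)

# Cross-check against the IoU column of Table 5 (two-acupoint prescriptions):
assert round(iou(1, 1, 1), 3) == 0.333  # one correct, one wrong  -> 33.3%
assert iou(1, 2, 1) == 0.25             # one correct, two wrong  -> 25%
assert round(iou(2, 1, 0), 3) == 0.667  # two correct, one wrong  -> 66.7%
assert iou(2, 0, 0) == 1.0              # exact prescription      -> 100%
```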
Table 6. Experimental results of different machine learning algorithms.
Evaluating Indicator | Naive Bayes | DecisionTree Classifier | ExtraTree Classifier | ExtraTrees Classifier | KNeighbors Classifier | RandomForest Classifier | RidgeClassifierCV | Neural network | FNN | TextCNN | Seq2seq with attention
Accuracy | 84.00% | 80.22% | 67.11% | 82.66% | 73.33% | 82.88% | 78.90% | 84.44% | 99.82% | 99.81% | 99.91%
Precision | 33.24% | 36.06% | 31.53% | 36.29% | 30.12% | 33.10% | 33.72% | 33.61% | 49.32% | 43.50% | 97.23%
Recall | 32.66% | 35.12% | 31.3% | 32.11% | 23.78% | 32.70% | 29.74% | 33.00% | 49.08% | 39.70% | 96.83%
IoU | 30.13% | 33.19% | 27.84% | 31.77% | 23.11% | 29.78% | 28.62% | 30.17% | 41.62% | 38.80% | 95.72%
4. Experimental results and analysis
In this section, we report the experimental results of the 11 machine learning algorithms on our database and identify the best-performing algorithm, Seq2seq with attention. Additionally, we analyze the impact on the results of factors such as the presence or absence of attention and the order of the data in the Seq2seq with attention algorithm.
4.1 Comparative experiments
In order to reduce the effect of an uneven distribution between training and test data and to evaluate the trained models more objectively, we use 5-fold cross validation and randomly shuffle the data before splitting. We use 90% of the data in the database for 5-fold cross validation to select the optimal algorithm, and the remaining 10% as test data to evaluate the performance of the final model.
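The shuffle-then-split scheme above can be sketched directly with NumPy indices (3000 cases, a 10% held-out test set, and the remaining 90% cut into five folds); the seed is an arbitrary choice for reproducibility:

```python
import numpy as np

rng = np.random.default_rng(0)
n_cases = 3000                      # total disease cases in the database

idx = rng.permutation(n_cases)      # random shuffle before splitting
test_idx = idx[:n_cases // 10]      # 10% (300 cases) held out for final test
cv_idx = idx[n_cases // 10:]        # 90% (2700 cases) for cross validation

folds = np.array_split(cv_idx, 5)   # five validation folds of 540 cases each
```

Each cross-validation round trains on four folds and validates on the fifth; the 300 test cases never touch model selection.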
Table 6 shows the average results of the 5-fold cross validation experiments for the 11 machine learning algorithms, obtained by tuning their respective hyperparameters for optimal performance. For Seq2seq with attention, we set vocab_size to 735, wordvec_size to 128, hidden_size to 256, and batch_size to 64; the other hyperparameter settings are consistent with those documented in reference [27]. The results show that the IoU of the Seq2seq with attention algorithm is the highest, at 95.72%, while the IoUs of all other algorithms are at most 41.62%.
Table 7 provides a detailed breakdown of the results and average values obtained from the 5-fold cross validation of Seq2seq with attention. We retain the best model from the training process as the trained model and evaluate its effectiveness on the test data: we input the symptom numbers of the 300 disease cases that did not participate in training into the trained model, and the model outputs the corresponding acupoint numbers. The IoU on the test data is 95.33%.
Table 7. Experimental results of the 5-fold cross validation of Seq2seq with attention.
Evaluating Indicator | First | Second | Third | Fourth | Fifth | Average value
Accuracy | 99.82% | 99.96% | 99.98% | 99.97% | 99.83% | 99.91%
Precision | 94.17% | 98.69% | 99.22% | 99.02% | 95.04% | 97.23%
Recall | 93.03% | 98.83% | 99.29% | 98.81% | 94.20% | 96.83%
IoU | 90.76% | 98.30% | 98.89% | 98.50% | 92.14% | 95.72%
4.2 Further Experimental Analysis of Seq2seq with attention
To investigate the impact of the order of the symptom and acupoint numbers on the experimental results, we conduct experiments with the symptom and acupoint numbers of the original data arranged in ascending and in descending order. Table 8 shows the results of 5-fold cross validation for both orders. Comparing Tables 7 and 8 reveals that label ordering has little effect on the experimental outcomes, which is consistent with the findings reported in [23]. Specifically, the results suggest that if the models are trained with appropriate regularization techniques, the order of the symptom and acupoint numbers is not a major factor influencing final performance.
Table 8. Experimental results of the 5-fold cross validation of Seq2seq with attention
on ascending and descending data.
| Order of data   | Evaluating Indicator | First  | Second | Third  | Fourth | Fifth  | Average value |
|-----------------|----------------------|--------|--------|--------|--------|--------|---------------|
| Ascending data  | Accuracy             | 99.70% | 99.93% | 99.93% | 99.96% | 99.73% | 99.85%        |
|                 | Precision            | 90.94% | 98.06% | 98.66% | 99.17% | 92.22% | 95.81%        |
|                 | Recall               | 90.94% | 97.94% | 98.26% | 98.70% | 91.78% | 95.52%        |
|                 | IoU                  | 87.98% | 97.20% | 97.79% | 98.39% | 89.38% | 94.15%        |
| Descending data | Accuracy             | 99.68% | 99.82% | 99.98% | 99.97% | 99.83% | 99.85%        |
|                 | Precision            | 90.70% | 95.46% | 99.22% | 99.02% | 95.04% | 95.89%        |
|                 | Recall               | 89.91% | 95.28% | 99.29% | 98.81% | 94.20% | 95.50%        |
|                 | IoU                  | 86.73% | 93.36% | 98.89% | 98.50% | 92.14% | 93.92%        |
To analyze the impact of attention on model performance, we conduct an ablation
experiment. Table 9 shows the 5-fold cross validation results of Seq2seq without attention. Comparing
Tables 7 and 9 reveals that the IoU of Seq2seq without attention is significantly lower than that
of Seq2seq with attention. In addition, the IoUs of the third and fourth folds exceed
95%, while those of the first, second, and fifth folds are all below 70%, indicating
that the generalization ability of the model without attention is poor. The experimental
results show that attention not only improves the IoU of the model but also enhances
its robustness.
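The ablation result can be understood through what attention adds: instead of compressing the whole symptom sequence into one fixed vector, the decoder re-weights the encoder states at every output step. Below is a minimal dot-product attention sketch; the scoring function actually used in the model follows [27] and may differ in detail.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_context(decoder_state, encoder_states):
    """Dot-product attention: score each encoder step by its similarity to
    the current decoder state, then form a weighted context vector."""
    scores = encoder_states @ decoder_state   # one score per input symptom, shape (T,)
    weights = softmax(scores)                 # attention distribution, sums to 1
    context = weights @ encoder_states        # weighted sum of encoder states, shape (hidden,)
    return context, weights

rng = np.random.default_rng(0)
enc = rng.normal(size=(10, 256))   # 10 input symptoms, hidden size 256 as in our setup
dec = rng.normal(size=256)         # current decoder hidden state
context, weights = attention_context(dec, enc)
```

Without this mechanism, the decoder must rely on a single fixed-length encoding of all symptoms, which is consistent with the weaker and less stable fold results in Table 9.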
Table 9. Experimental results of the 5-fold cross validation of Seq2seq without attention.
| Evaluating Indicator | First  | Second | Third  | Fourth | Fifth  | Average value |
|----------------------|--------|--------|--------|--------|--------|---------------|
| Accuracy             | 99.06% | 99.12% | 99.93% | 99.95% | 99.07% | 99.42%        |
| Precision            | 72.77% | 75.34% | 98.13% | 98.45% | 73.56% | 83.65%        |
| Recall               | 76.56% | 80.21% | 98.30% | 98.54% | 78.06% | 86.34%        |
| IoU                  | 63.54% | 67.34% | 97.18% | 97.93% | 64.58% | 78.11%        |
To evaluate the generalization and stability of the algorithm under increased data
volume, we conducted experiments on Seq2seq with attention using augmented data. Some
of the symptoms of a case are related by an ``or'' relationship rather than an ``and''
relationship, so for cases with more than 10 symptoms, one or two secondary symptoms
can be removed. In this way, the number of disease cases can be increased and the influence
of the main symptoms in the model can be enhanced. We expanded the 3000 cases to 6000
in this way. Table 10 provides a detailed breakdown of the per-fold results and average values of the 5-fold
cross validation for Seq2seq with attention on the 6000 augmented cases.
Comparing Tables 7 and 10 reveals a slight improvement in the model's IoU on the dataset of 6000 cases, indicating
that the algorithm has good scalability. It should be noted that AMT, as a medical
practice, demands exceptionally high data quality standards; the augmented data in
this paper are used solely for testing the algorithm's scalability, not for training
the final model.
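The augmentation step can be sketched as follows. How a symptom is judged "secondary" is a modeling decision not fully specified here, so in this sketch the secondary symptoms are passed in explicitly; the acupoints prescription (label) of each generated variant stays identical to that of the original case.

```python
import itertools

def augment_case(symptoms, secondary, min_len=10):
    """Create variants of a case by removing 1 or 2 of its secondary symptoms.

    Only cases with more than `min_len` symptoms are augmented.
    """
    if len(symptoms) <= min_len:
        return []
    variants = []
    for k in (1, 2):
        for drop in itertools.combinations(secondary, k):
            variants.append([s for s in symptoms if s not in drop])
    return variants

# A case with 11 symptom numbers, 2 of them secondary:
# 2 one-removal variants + 1 two-removal variant = 3 new cases.
case = list(range(11))
print(len(augment_case(case, secondary=[9, 10])))  # 3
```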
Table 10. Experimental results of the 5-fold cross validation of Seq2seq with attention
on the 6000 augmented cases.
| Evaluating Indicator | First  | Second | Third  | Fourth | Fifth  | Average value |
|----------------------|--------|--------|--------|--------|--------|---------------|
| Accuracy             | 99.85% | 99.97% | 99.98% | 99.96% | 99.86% | 99.92%        |
| Precision            | 95.60% | 99.10% | 99.23% | 98.98% | 96.04% | 97.79%        |
| Recall               | 95.75% | 99.30% | 99.26% | 98.72% | 96.21% | 97.85%        |
| IoU                  | 93.93% | 98.77% | 98.92% | 98.37% | 94.84% | 96.97%        |
4.3 Discussion
For the traditional machine learning algorithms, we attribute the poor performance to
the following three reasons. (1) Traditional machine learning algorithms such as
Decision Trees, Random Forests, and Naive Bayes become ineffective at capturing
higher-order correlations, as they can only capture first- or second-order correlations
[25]. (2) For multi-label classification problems with very large numbers of labels (e.g.,
the 410 labels in this study), transformation techniques such as Binary Relevance,
Classifier Chains, and Label Powerset are ineffective. (3) The label imbalance
in the AMT data is also an important reason for the poor performance of traditional
machine learning algorithms: it is difficult to ensure that the positive and negative
sample sizes of each label are balanced in the training data. For FNN
and TextCNN, although higher-order correlations among features can be captured, their
generalization ability is limited because they ignore the correlations between labels.
For Seq2seq with attention, the superior performance in deciding acupoints compared
to the other algorithms can be attributed to three key advantages. Firstly, Seq2seq can capture
higher-order relationships among features. Secondly, owing to its encoder-decoder
structure, Seq2seq can accurately capture and predict the relationships between
multiple labels while learning high-order feature correlations, thereby enhancing
the model's generalization ability and overall performance. Furthermore, the addition
of attention further improves the performance and generalization ability of the model.
Based on the experimental results and the analysis above, Seq2seq with attention
is sufficiently applicable to determining acupoints.
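Point (2) above can be illustrated with Binary Relevance, which decomposes a multi-label problem into one independent binary classifier per label. With 410 labels that means 410 models, none of which ever sees how acupoints co-occur in a prescription. The stub classifier below is a toy stand-in for illustration, not any setup used in this paper.

```python
class BinaryRelevance:
    """Train one independent binary classifier per label.

    Each per-label model sees only its own 0/1 column of the label matrix,
    so correlations between labels are discarded by construction.
    """
    def __init__(self, n_labels, make_classifier):
        self.models = [make_classifier() for _ in range(n_labels)]

    def fit(self, X, Y):
        # Y is a list of rows, each a 0/1 vector of length n_labels.
        for j, model in enumerate(self.models):
            model.fit(X, [row[j] for row in Y])
        return self

    def predict(self, x):
        return [m.predict(x) for m in self.models]

class MajorityStub:
    """Toy per-label classifier: always predicts the label's majority class
    (ties break toward 1), ignoring the input features entirely."""
    def fit(self, X, y):
        self.out = int(sum(y) * 2 >= len(y))
    def predict(self, x):
        return self.out

br = BinaryRelevance(3, MajorityStub)
br.fit([[0], [1]], [[1, 0, 1], [1, 0, 0]])
print(br.predict([0]))  # [1, 0, 1]
```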
4.4 Case Study
We showcase our model's practical performance through an example. We input a test
case's symptom combination into our model, and it provides an acupoints prescription,
as illustrated in Fig. 5. The model's prescription corresponds to the acupoints prescription described in
reference [21] for Yin-Yang deficiency hypertension. Although this particular combination of symptoms
was not included in the model's training data, the model provides an accurate acupoints
prescription, which shows that our model has good generalization ability.
Fig. 5. Exemplary acupoints prescription by Seq2seq with attention.
5. Conclusion
This paper proposed a methodology for applying machine learning to provide acupoints
prescriptions for treating patients based on their symptoms. Firstly, a database of symptoms
and corresponding acupoints prescriptions was built by extracting the names of symptoms
from AMT texts and unifying and standardizing them. Secondly, 11 machine learning
algorithms were applied to learn from the data in the database, and the trained models
were used to provide acupoints prescriptions for treating patients. In the computational
experiments, 90% of the data in the database of 3000 disease cases were used for 5-fold
cross validation to select the algorithm with the best performance, and 10% were used
as test data to evaluate the generalization ability of the final model. From the
experimental results, we find that Seq2seq with attention is the best among the 11
algorithms and is sufficiently applicable to determining acupoints.
As future work, we plan to (1) collect more data to expand the application scope of
our model; (2) add the other part of an AMT prescription besides the acupoints
prescription, namely manipulation, so as to provide more complete AMT prescriptions
for treating patients; and (3) leverage the complementary strengths of TCM and Western
medicine to further enhance our model.
ACKNOWLEDGMENTS
This work was supported by JSPS KAKENHI Grant Number 20H04284 (Grant-in-Aid for
Scientific Research (B)) and JST SPRING, Grant Number JPMJSP2111.
REFERENCES
B. L. Zhang, X. M. Shi, and F. R. Liang, Theory and Practice of Acupuncture & Moxibustion
(in Chinese), China Press of Traditional Chinese Medicine, 2019.

R. C. Deo, ``Machine learning in medicine,'' Circulation, vol. 132, no. 20, pp. 1920-1930,
2015.

J. Goecks, V. Jalili, L. M. Heiser, and J. W. Gray, ``How machine learning will transform
biomedicine,'' Cell, vol. 181, no. 1, pp. 92-101, 2020.

J. Liang, M. Y. Ming, C. B. Wang, X. L. Lv, Z. R. Sun, and H. N. Yin, ``Research progress
in the integration of machine learning and the science of acupuncture and moxibustion,''
Acupuncture Research, vol. 46, no. 6, pp. 460-463, 2021.

X. Y. Yang, Y. Tu, and D. M. Duan, ``Application of curative effect prediction method
in acupuncture treatment of depression,'' Journal of Beijing University of traditional
Chinese Medicine (in Chinese), vol. 31, no. 5, pp. 355-357, 2008.

M. Fei and P. Xv, ``Estimation of the curative effects of acupuncture on heroin dependence
by neural networks,'' Lishizhen Medicine and Materia Medica Research (in Chinese),
vol. 19, no. 12, pp. 2974-2975, 2008.

W. S. Hao, X. S. Zhu, X. R. Wang, H. Y. Yang, Z. H. Wang, and Y. J. Zhang, ``Biochemical
index variation prediction during electroacupuncture analgesia using ANFIS method,''
Journal of Shanghai Jiaotong University (in Chinese), vol. 42, no. 2, pp. 177-180,
2008.

Q. Gan, R. Wu, M. Nakata, and Q. W. Ge, ``A proposal of support system for acupuncture
and moxibustion treatment in traditional Chinese medicine,'' IEICE Transactions on
Information and Systems, vol. 120, no. 245, pp. 40-43, 2020.

I. Sutskever, O. Vinyals, and Q. V. Le, ``Sequence to sequence learning with neural
networks,'' Advances in Neural Information Processing Systems, pp. 1-9, 2014.

A. Hyodo, Traditional Chinese Medicine Meridians and Acupoint Textbooks (in Japanese),
Shinsei Publishing, 2012.

M. E. Maron and J. L. Kuhns, ``On relevance, probabilistic indexing and information
retrieval,'' Journal of the ACM (JACM), vol. 7, no. 3, pp. 216-244, 1960.

T. Cover and P. Hart, ``Nearest neighbor pattern classification,'' IEEE Transactions
on Information Theory, vol. 13, no. 1, pp. 21-27, 1967.

L. Breiman and J. H. Friedman, Classification and Regression Trees, Routledge, 2017.

T. M. Cover, ``Geometrical and statistical properties of systems of linear inequalities
with applications in pattern recognition,'' IEEE Transactions on Electronic Computers,
vol. 14, no. 3, pp. 326-334, 1965.

W. S. McCulloch and W. Pitts, ``A logical calculus of the ideas immanent in nervous
activity,'' The Bulletin of Mathematical Biophysics, vol. 5, no. 4, pp. 115-133, 1943.

M. Frean, ``The upstart algorithm: A method for constructing and training feedforward
neural networks,'' Neural Computation, vol. 2, no. 2, pp. 198-209, 2014.

Y. Kim, ``Convolutional neural networks for sentence classification,'' Proc. of the
2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 1746-1751,
2014.

H. Rezatofighi, N. Tsoi, J. Y. Gwak, A. Sadeghian, I. Reid, and S. Savarese, ``Generalized
intersection over union: A metric and a loss for bounding box regression,'' Proceedings
of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 658-666,
2019.

Z. Luo, The Classification Research of the Symptomatic Units of TCM (in Chinese),
M.S. Thesis, Shandong University of Traditional Chinese Medicine, 2012.

R. M. Yan, Essence of Yan Runming's 60 Years of Clinical Experience in Acupuncture
and Moxibustion (in Chinese), China Press of Traditional Chinese Medicine, 2013.

S. Z. Gao and J. Yang, Therapeutics of Acupuncture and Moxibustion (in Chinese), China
Press of Traditional Chinese Medicine, 2016.

M. L. Zhang and Z. H. Zhou, ``A review on multilabel learning algorithms,'' IEEE Transactions
on Knowledge and Data Engineering, vol. 26, no. 8, pp. 1819-1837, 2013.

J. Nam, E. L. Mencía, H. J. Kim, and J. Fürnkranz, ``Maximizing subset accuracy with
recurrent neural networks in multi-label classification,'' Advances in Neural Information
Processing Systems, vol. 30, pp. 5413-5423, 2017.

P. C. Yang, X. Sun, W. Li, S. Ma, W. Wu, and H. F. Wang, ``SGM: Sequence generation
model for multi-label classification,'' Proc. of the 27th International Conference
on Computational Linguistics, pp. 3915-3926, 2018.

W. Liao, Y. Wang, Y. Yin, X. Zhang, and P. Ma, ``Improved sequence generation model
for multilabel classification via CNN and initialized fully connection,'' Neurocomputing,
vol. 382, pp. 188-195, 2020.

K. Greff, R. K. Srivastava, J. Koutník, B. R. Steunebrink, and J. Schmidhuber, ``LSTM:
A search space Odyssey,'' IEEE Transactions on Neural Networks and Learning Systems,
vol. 28, no. 10, pp. 2222-2232, 2016.

K. Saito, Deep Learning from Scratch 2 (in Japanese), O'Reilly Japan, Inc., 2018.

Author
Hang Yang received his B.E. degree from Shanghai University of Engineering Science,
China, in 2014, and an M.E. degree from Zhejiang Sci-Tech University, China, in 2020.
He is currently a Ph.D. candidate at the Graduate School of East Asian Studies, Yamaguchi
University, Japan. His research interests include artificial intelligence, system
modeling and modeling of acupuncture and moxibustion treatment in traditional Chinese
medicine.
Ren Wu received her B.E. and M.E. degrees from Hiroshima University, Japan, in
1988 and 1990, respectively, and a Ph.D. from Yamaguchi University, Japan, in 2013.
She was with Fujitsu Ten Ltd., West Japan Information Systems Co., Ltd. and Yamaguchi
Junior College from 1991 to March 2024. Since April 2024, she has been an Associate
Professor at Shunan University, Japan. Her research interests include information
processing systems, linguistic information processing and system modeling. She is
a member of the Institute of Electronics, Information and Communication Engineers
(IEICE) and the Institute of Information Processing Society of Japan (IPSJ).
Mitsuru Nakata received his B.E., M.E., and Ph.D. degrees from Fukui University,
Japan, in 1992, 1994 and 1998, respectively. He was a Lecturer from 1998 to 2004 and
an Associate Professor from 2004 to 2014 both at Yamaguchi University, Japan. Since
October 2014, he has been a Professor at Yamaguchi University. His research interests
include database systems, text processing, program net theory, and information education.
He is a member of the Institute of Electronics, Information and Communication Engineers
(IEICE), the Institute of Information Processing Society of Japan (IPSJ) and the Institute
of Electrical and Electronics Engineers (IEEE).
Qi-Wei Ge received his B.E. degree from Fudan University, China, in 1983, his M.E.
and Ph.D. degrees from Hiroshima University, Japan, in 1987 and 1991, respectively.
He was with Fujitsu Ten Limited from 1991 to 1993. He was an Associate Professor at
Yamaguchi University, Japan, from 1993 to 2004. Since April 2004, he has been a Professor
at Yamaguchi University, Japan. He is currently a Trustee at Yamaguchi University,
Japan. His research interests include Petri nets, program net theory, and combinatorics.
He is a member of the Institute of Electronics, Information and Communication Engineers
(IEICE), the Institute of Information Processing Society of Japan (IPSJ) and the Institute
of Electrical and Electronics Engineers (IEEE).