In this study, a model for pattern element feature extraction and matching and an image perceptual hash model are constructed. On this basis, a perceptual hash model combined with the SURF algorithm is introduced. To enhance rotation and translation invariance, a Siamese network model with an improved SURF algorithm is then proposed.
3.1 Pattern element feature extraction and matching and image perceptual hash model construction
When studying the feature extraction of Mazu patterns in the design of homestay spaces, the SURF algorithm was used to extract Mazu features. Compared with other feature extraction algorithms, SURF reduces computational complexity while retaining the stable performance of the SIFT algorithm. At the same time, SURF is robust and achieves a higher feature point recognition rate than SIFT; it is generally superior to SIFT under viewpoint, lighting, and scale changes. The research first focuses on identifying and classifying key areas in the image, especially when objects are moved, distorted, or occluded. Compared with global features, local features adapt better to these changes and effectively reduce mismatches. Local feature extraction involves key point detection, feature description, matching, and classification. Key point detection relies on the definition of salient points, and the descriptors are built from the characteristics of neighboring pixels. In addition, the difference-of-Gaussians operator is introduced, which is more efficient than the Hessian-Laplace operator in processing the local structure of the image, as shown in Eq. (1).
In Eq. (1), the expression on the left can be approximated as shown in Eq. (2).
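As a concrete illustration (not the paper's own implementation), a difference-of-Gaussians response of the kind referenced by Eqs. (1)-(2) can be sketched in Python with SciPy; the scale `sigma` and the ratio `k` are illustrative assumptions:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def difference_of_gaussians(image: np.ndarray, sigma: float, k: float = 1.6) -> np.ndarray:
    """Difference of two Gaussian blurs; approximates the scale-normalized Laplacian."""
    img = image.astype(np.float64)  # avoid uint8 wrap-around on subtraction
    return gaussian_filter(img, k * sigma) - gaussian_filter(img, sigma)

# Candidate key points are the local extrema of this response across space and scale.
```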
In the field of feature point detection, both the Harris-Affine and Hessian-Affine operators are derived from the Harris-Laplace algorithm. These two operators relax the isotropy limitation of the original algorithm and introduce invariance to affine transformations. They enhance image recognition and retrieval by iteratively adjusting the position and scale of feature points. In contrast, the MSER operator achieves affine invariance through binarization [16]. The stable extremal regions identified by MSER are defined in Eq. (3).
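For reference, OpenCV ships an MSER detector; the following hedged sketch (the file name is hypothetical) extracts stable extremal regions of the kind described by Eq. (3):

```python
import cv2

# "mazu_pattern.png" is a hypothetical file name; any grayscale image works.
gray = cv2.imread("mazu_pattern.png", cv2.IMREAD_GRAYSCALE)

mser = cv2.MSER_create()                   # sweeps binarization thresholds internally
regions, boxes = mser.detectRegions(gray)  # each region is an array of pixel coordinates
print(f"{len(regions)} maximally stable extremal regions found")
```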
In the face of huge image datasets, traditional data processing methods are no longer suitable. In this study, a Perceptual Hashing (PH) model is proposed that converts key image content into unique binary sequences, thereby simplifying the storage and management of images; in image search in particular, it significantly improves query efficiency [17,18]. The perceptual hash model is shown in Fig. 1.
Fig. 1 shows that the perceptual hash model resembles the digitization of multimedia content: it creates a one-way mapping that serves as a unique fingerprint of the content and ensures the security and robustness of the technique. The feature volume of datasets A1, A2, and A3 is significantly reduced after perceptual hashing, while the processed features still carry the key information of the original data. The perceptual hashing algorithm effectively converts large-scale data objects into compact binary codes, and this conversion preserves a degree of consistency among similar data objects. The core of perceptual hashing technology, i.e., its mapping mechanism, is defined in Eq. (4).
In Eq. (4), $I$ represents the input data, $h$ represents the result of the mapping, and $PH$ stands for the perceptual hash model. In image retrieval, the hash code of each image is stored in the database, and similar images are found by comparing the hash value of the query image with the values in the database. Hash comparison typically uses the Hamming distance, one of the standard measures in machine learning for the similarity of two pieces of data.
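A minimal sketch of this retrieval step, using a simple average hash as a stand-in for the paper's $PH$ mapping (the actual mapping is developed through Eqs. (4)-(12)); an 8-bit grayscale input is assumed:

```python
import cv2
import numpy as np

def average_hash(image: np.ndarray, size: int = 8) -> np.ndarray:
    """Stand-in for PH in Eq. (4): downscale, then threshold at the mean gray level."""
    small = cv2.resize(image, (size, size), interpolation=cv2.INTER_AREA)
    return (small > small.mean()).astype(np.uint8).ravel()  # h: a 64-bit binary sequence

def hamming_distance(h1: np.ndarray, h2: np.ndarray) -> int:
    """Number of differing bits; a small distance means perceptually similar images."""
    return int(np.count_nonzero(h1 != h2))
```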
Fig. 1. PH mapping model structure diagram.
3.2 Feature extraction of pattern elements based on SURF algorithm
In the design of homestay (B&B) spaces, the key to feature extraction of Mazu patterns lies in effective image analysis methods. Traditionally, image retrieval has relied on perceptual hashing techniques, including grayscale-based thresholds, frequency thresholds based on the discrete cosine transform, and multi-dimensional global feature methods. These focus mainly on global properties of the image and have low sensitivity to local details. By contrast, the SURF algorithm uses the Hessian matrix to identify local extrema, which improves the accuracy of feature extraction. Although SURF falls short of real-time performance, its accuracy is remarkable. In this study, building on the feature extraction and matching of pattern elements and the image perceptual hash model, a perceptual hash model combined with the SURF algorithm is further proposed to improve retrieval efficiency and accuracy; its core principle is shown in Eq. (5).
In Eq. (5), $L_{xx} (\varepsilon ,\partial )$ denotes the convolution of the image at point $\varepsilon $ with the second-order Gaussian derivative at scale $\partial $. In practice, this expression is approximated because the Gaussian kernel must be discretized, as shown in Eq. (6).
In Eq. (6), $D_{xx} $ and $D_{yy} $ are box-filter approximations of $L_{xx} $ and $L_{yy} $, respectively. This approximation makes the identification of point $\varepsilon $ as a local extremum more efficient, especially when the computation is accelerated with integral images. The SURF algorithm builds the scale space by adjusting the filter size and selects the extrema as feature points. In addition, SURF refines the accuracy of feature points through interpolation, using the Hessian matrix and an interpolation algorithm to locate them precisely. The final selection of feature points depends on their stability, and unstable points are discarded, as shown in Eq. (7) [19].
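To make the box-filter approximation concrete, the hedged sketch below scores candidate points with the determinant form commonly used in the SURF literature, $D_{xx}D_{yy} - (0.9\,D_{xy})^2$; the 0.9 weight and the threshold value are assumptions, not values taken from this paper:

```python
import numpy as np

def hessian_response(Dxx: np.ndarray, Dyy: np.ndarray, Dxy: np.ndarray) -> np.ndarray:
    """Approximate Hessian determinant used to score candidate points.

    Dxx, Dyy, Dxy are box-filter responses (integral-image accelerated in
    practice); 0.9 compensates for approximating Gaussians with box filters.
    """
    return Dxx * Dyy - (0.9 * Dxy) ** 2

def is_stable(response: np.ndarray, y: int, x: int, threshold: float = 0.001) -> bool:
    """Keep a candidate only if it is a thresholded local maximum of its 3x3 neighborhood."""
    patch = response[y - 1:y + 2, x - 1:x + 2]
    return response[y, x] >= threshold and response[y, x] == patch.max()
```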
The circular area around each feature point is analyzed, and the Haar wavelet responses along the $x$ and $y$ axes are calculated. These responses are accumulated with a Gaussian-weighted sliding sector window to determine the main direction of the feature point, as shown in Fig. 2.
Fig. 2. Main direction diagram of feature points.
Next, for each feature point, a square region aligned with its main direction is selected and divided into 16 sub-regions. The Haar wavelet response is calculated for each sub-region and Gaussian-weighted. The resulting 4-dimensional vectors are assembled into a 64-dimensional feature vector. These vectors are normalized to ensure rotation, illumination, and scale invariance, as shown in Fig. 3.
Fig. 3. SURF feature point description substructure diagram.
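In practice, OpenCV's contrib build exposes this detect-and-describe pipeline directly. The sketch below assumes opencv-contrib-python compiled with the nonfree modules enabled (SURF is patented and disabled in the default wheels); the Hessian threshold and file name are illustrative:

```python
import cv2

gray = cv2.imread("mazu_pattern.png", cv2.IMREAD_GRAYSCALE)  # hypothetical input

surf = cv2.xfeatures2d.SURF_create(hessianThreshold=400)  # threshold is illustrative
keypoints, descriptors = surf.detectAndCompute(gray, None)
print(descriptors.shape)  # (k, 64): one normalized 64-dimensional vector per point
```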
The direct application of the SURF algorithm to image retrieval is limited by its high time complexity. It consists of two steps, feature detection and descriptor construction, which ensure scale and rotation invariance, respectively. Building the scale pyramid is the time-consuming step, and although it improves scale adaptability, the neighborhood differences between feature points remain significant. To reduce this effect, this study fuses scale transformation with perceptual hash encoding to create a hashing algorithm resistant to rotation and scale changes. The SURF algorithm is first used to locate the image feature points, as shown in Eq. (8).
In Eq. (8), the total number of identified feature points is denoted as $k$. Subsequently, the K-means algorithm determines the center point of the set $P$, as defined in Eq. (9). Next, the Euclidean distances from all elements of the set $P$ to the pixel $(x_{z} ,y_{z} )$ are calculated and arranged in ascending order. The sorted result is shown in Eq. (10).
Next, $R$ is set to $10/k$, and a series of concentric circles is drawn with $(x_{z} ,y_{z} )$ as the center and radii $R/64$, $R/32$, ..., $R$ in turn. The number of feature points within each ring is counted, as shown in Eq. (11). To strengthen the encoding against rotation and scale changes, the image is scaled by factors from 0.5 to 4. The transformed image is processed according to Eqs. (8) to (11) to calculate the $N_{i} $ values, forming the set $K=\{ N_{1} , N_{2} , N_{3} , ..., N_{8} \} $. $K$ is then hashed to obtain the code $h_{2} $, and the final code is synthesized as $h=[h_{1} \; h_{2} ]$, as shown in Eq. (12).
3.3 Grayscale histogram and Siamese network feature extraction based on improved SURF
algorithm
To improve the accuracy of the retrieval algorithm, the global features of the image are incorporated into the coding results. A vector representation of the image is generated from its grayscale histogram, and cosine similarity measures the similarity between vectors. The grayscale histogram counts the frequency of each grayscale value in the image and reflects the global intensity distribution. Combined with the SURF-based perceptual hashing algorithm above, this method effectively improves sensitivity to global features [20]. First, the pixels of the grayscale image are counted and the histogram is divided into 64 bins to generate the vectors; the global grayscale feature similarity is then compared via the cosine distance, as shown in Eq. (13).
In Eq. (13), $\theta $ represents the angle between the vectors, and $a$ and $b$ represent the histogram vectors of the two images, respectively.
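As a concrete illustration of this global-feature comparison, a minimal sketch (assuming an 8-bit grayscale input):

```python
import numpy as np

def gray_histogram(image: np.ndarray, bins: int = 64) -> np.ndarray:
    """64-bin grayscale histogram used as the global feature vector."""
    hist, _ = np.histogram(image, bins=bins, range=(0, 256))
    return hist.astype(np.float64)

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """cos(theta) of Eq. (13); 1.0 means identical gray distributions."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
```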
Because the traditional SURF algorithm mainly recognizes low-level features such as edge and brightness changes, it is insufficient for revealing high-level semantics, which limits how accurately the search results reflect user intent. To make up for this shortcoming, a
Siamese network model with an improved SURF algorithm is proposed. The model transforms the input into a target space via a mapping function, where Euclidean distance is used for similarity comparison. The training phase aims to minimize the distance between samples of the same class and maximize the distance between samples of different classes. Convolutional neural networks process images through local feature abstraction, but the extracted features differ significantly under large rotations or translations. To enhance rotation and translation invariance, a module is integrated in front of the Siamese network: it first extracts SURF features, then matches them with a nearest-neighbor algorithm, and finally estimates the parameters of the associated affine transformation model, as shown in Fig. 4.
Fig. 4. Anti-rotation and anti-translation conversion module.
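A hedged sketch of this correction module. Because SURF requires a nonfree OpenCV build, the freely licensed ORB detector is substituted here (an assumption); the match-then-estimate-affine structure follows Fig. 4:

```python
import cv2
import numpy as np

def rectify_rotation_translation(img: np.ndarray, ref: np.ndarray) -> np.ndarray:
    """Align img to ref before the Siamese branches, as sketched in Fig. 4."""
    detector = cv2.ORB_create()
    kp1, des1 = detector.detectAndCompute(img, None)
    kp2, des2 = detector.detectAndCompute(ref, None)
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(des1, des2)

    src = np.float32([kp1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kp2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    # Rotation + translation (+ uniform scale) model, robust to mismatches via RANSAC.
    M, _ = cv2.estimateAffinePartial2D(src, dst, method=cv2.RANSAC)
    return cv2.warpAffine(img, M, (ref.shape[1], ref.shape[0]))
```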
Because the fully connected layer fixes the input image size, it is recommended to add a spatial pyramid pooling (SPP) layer before the fully connected layer of the network. This enables the network to process input images of any size, enhancing its scale invariance. SPP extracts fixed-dimensional feature vectors from feature maps: for example, with a simple pyramid of 4x4, 2x2, and 1x1 grids, a fixed 21-dimensional vector (16+4+1 bins) can be extracted per feature map regardless of the input image size, as shown in Fig. 5.
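A minimal SPP sketch in PyTorch, assuming the 4x4/2x2/1x1 pyramid described above; adaptive pooling is what makes the output size independent of the input resolution:

```python
import torch
import torch.nn.functional as F

def spatial_pyramid_pool(feature_map: torch.Tensor, levels=(4, 2, 1)) -> torch.Tensor:
    """Pool a (batch, channels, H, W) map into fixed 16+4+1 = 21 bins per channel."""
    pooled = [F.adaptive_max_pool2d(feature_map, n).flatten(start_dim=1) for n in levels]
    return torch.cat(pooled, dim=1)  # shape: (batch, channels * 21)

x = torch.randn(1, 8, 37, 53)         # arbitrary spatial size
print(spatial_pyramid_pool(x).shape)  # torch.Size([1, 168]) regardless of H and W
```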
Each image in the initial image and its scale-pyramid group is used as a matching pair. Unlike the earlier Siamese network, this network adds a regularization term to the objective function to enhance scale invariance, so that images from the same scale group produce more consistent features. The parameters of the scale-invariant layer are denoted as $(W_{a} ,W_{b} )$, and its output is shown in Eq. (14).
In Eq. (14), $\kappa $ denotes the activation $\max (x,0)$; $O_{m} $ represents the input of the scale-invariant layer; $B_{a} $ represents the regularization term. In this study, the Siamese network model based on the SURF algorithm is improved, and its invariance to rotation, scale, and translation is enhanced. The network can process images of any size and consists of two branches that share parameters. Each image is corrected by the rotation and translation module and processed by a convolutional neural network. The SPP layer extracts fixed-dimensional features from the feature map and passes them to the fully connected layer. Finally, the network outputs feature vectors, with the training objective of minimizing the loss between similar images and keeping their feature vectors consistent, as shown in Fig. 6.
Fig. 6. Siamese network model structure diagram.
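For completeness, a standard contrastive objective is sketched below as a stand-in for the paper's training loss; the margin value and the pair-label convention are assumptions:

```python
import torch
import torch.nn.functional as F

def contrastive_loss(f1: torch.Tensor, f2: torch.Tensor,
                     same: torch.Tensor, margin: float = 1.0) -> torch.Tensor:
    """Pull features of similar pairs together; push dissimilar pairs beyond the margin.

    f1, f2: feature vectors from the two weight-sharing branches;
    same:   1 for image pairs of the same class/scale group, 0 otherwise.
    """
    d = F.pairwise_distance(f1, f2)
    return (same * d.pow(2) + (1 - same) * F.relu(margin - d).pow(2)).mean()
```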