
Qiufen Yu (Physical Education College, Qiqihar University, Qiqihar 161006, China)



Keywords: Sports public service, 3D scene reconstruction, Color clustering, Multi-target detection

1. Introduction

How to make better use of competitive sports and ``feed back'' sports public services through their advanced development has become an urgent problem [1]. Improving the development level of competitive sports, rationally allocating Sports Public Service (SPS) resources, and realizing the linked development of competitive sports and SPS are of great practical significance for the sustainable development of competitive sports [2]. In recent years, 3D reconstruction techniques have appeared one after another, their accuracy has improved continuously, and they have been applied in film, games, sports, urban construction, and other fields [3]. Among them, the 3D reconstruction of planar scenes has received increasing attention. Reconstructing the pitch plane can reduce controversy and misjudgment during matches and thus help ensure fair refereeing decisions. It can also serve as a coaching system that provides data support for players' daily training [4]. In SPS, 3D scene reconstruction can further promote public participation in sports competitions and provide more advanced technical support for people who enjoy competitive sports, for example in the construction of intelligent public sports venues [5]. In this study, the scene to be reconstructed is preprocessed first. Then scene-plane extraction and multi-object detection are carried out. Finally, an optimized dynamic camera calibration technique is used to recover the targets' positions in 3D coordinates and reconstruct the scene. The research aims to promote public sports through the development of competitive sports and to increase the enthusiasm of the whole population for exercise. The innovations of the research are twofold: first, the application of 3D reconstruction technology to the development of public sports; second, a planar scene extraction method combining nonparametric color clustering and image local entropy. The research comprises three parts:

First, the literature review, which analyzes and summarizes the current state of related research at home and abroad.

Second, the methods, which describe the technology used in the 3D reconstruction model and the related optimizations.

Third, the result analysis, which evaluates the performance of the proposed model.

2. Related Works

Sports play an increasingly important role in human society, and governments of various countries have begun to devote themselves to research on sports serving the people. By examining the share of sports service facilities in universities, Niu H et al. developed a data envelopment analysis (DEA) model to improve the physical education of college students. The matching efficiency of university public service facilities was assessed using a model that builds input and output indicators for such facilities and chooses 20 institutions as decision-making units [6]. Yue Z selected 150 women who practiced dancing at different ages and compared them with 150 ordinary people; based on the psychological characteristics of different ages, the study discussed the design of public service products for sports art construction [7]. To cope with the rising demand for sports services in Guangdong Province, Chaoqun H et al. adopted literature study, comparative research, case analysis, and inductive analysis to examine the actual demand, concluding that differing economic benefits lead to an unbalanced distribution of sports service supply [8]. Zhou L et al. used a structural equation model to investigate the impact of community sports services on participants' satisfaction; community sports services had a strong impact on satisfaction with sports facilities, grassroots sports organizations, and sports activities [9].

Pyramid Lucas-Kanade optical flow (PLKOF) is an optical flow estimation algorithm based on a pyramid structure, which processes images at different scales and thereby improves the accuracy and stability of optical flow estimation. Hambali R et al. discussed the application of the PLKOF method to rainfall motion tracking using $150\times150$ m resolution radar images; PLKOF proved well suited to predicting the displacement of rain cells [10]. Zhao D et al. discovered that a moving object on stationary ground produces an interfering optical flow field, decreasing the accuracy of measuring a vehicle's speed relative to the ground. To reduce this influence, they combined the PLKOF algorithm with a gray-consistency method, making full use of the color image information; CNNs are then used to identify objects in relative motion and cover them with a mask [11]. Song T et al. proposed an accurate image feature matching algorithm combining PLKOF and corner features. First, the corner features are applied to extract the image feature points easily, and the displacement vector is calculated using a local feature window; finally, PLKOF is applied to track the feature points [12]. To construct dynamic descriptors that recognize gesture characteristics, Suni S S et al. merged the pyramid histogram of gradients in three orthogonal planes with dense optical flow [13].

The random sample consensus algorithm (RANSAC) estimates model parameters from a sample set containing anomalous data and then identifies the correct samples. Canaz Sevgen S et al. suggested an enhanced RANSAC and extracted building roof planes from LiDAR data. The model first randomly selects a point in the radar image, then searches its neighbors within a given threshold range, and identifies and removes outliers [14]. Hossein-Nejad Z et al. used the RANSAC algorithm to match objects in a template with images from a test library; a region-growing algorithm then determines the boundaries of objects in the image to realize object recognition [15]. Hossein-Nejad Z et al. also suggested a new four-step method for image mosaicking to address the heavy computation and time required by traditional approaches. First, key points are identified by the redundant keypoint elimination scale-invariant feature transform algorithm; then a small window descriptor reduces the running time, and an adaptive threshold is determined by the RANSAC algorithm; finally, the images are blended [16]. Afsal S et al. proposed a hybrid of the RANSAC algorithm and the M-estimator sample consensus method to reliably separate inliers and outliers in video captured by cameras. With this method, the motion between frames is estimated by fitting the matched point pairs to a transformation model [17].

Based on the above literature, there have been few attempts to apply computer technology to public sports construction services, and few works combine competitive sports with public sports services. To this end, this study uses the pyramid LK optical flow method and the RANSAC algorithm to achieve 3D scene reconstruction of competitive sports. The objective is to provide more technical support for athletes' daily training, and then to use competitive sports to ``feed back'' SPS, thereby promoting the development of SPS.

3. 3D Scene Reconstruction Technology of Competitive Sports Based on Pyramid LK Optical Flow Method and RANSAC Algorithm

Competitive sports play a ``vanguard'' role in the development of sport as a whole. Competition venues and facilities are becoming increasingly complete, which provides a facility foundation for SPS and draws more and more people into physical exercise. This research takes planar scenes in sports fields as the object of 3D reconstruction, combines different match scenes such as football and tennis, and studies object extraction, camera self-calibration, and feature matching.

3.1. 3D Scene Reconstruction Information Preprocessing

Generally, the camera has a wide field of view when shooting scene video, and the data to be processed include not only the scene plane area to be reconstructed but also the surrounding complex environment [18-20]. This extra information has no analytical significance and interferes with the reconstruction process to some extent. Therefore, a detection method combining nonparametric color clustering and image local entropy is proposed to extract the planar scene. At the same time, using the extracted results, targets on the scene surface are detected through background elimination and target saliency. The YCbCr color space is used to estimate the color probability density from sample data. First, the two-channel color difference signal (Cb, Cr) is extracted from each pixel of the frame image and its two-dimensional histogram is calculated. The resulting two-dimensional probability density estimate is shown in Fig. 1.
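As an illustration of this preprocessing step, the following minimal Python sketch extracts the (Cb, Cr) color difference signals and accumulates their two-dimensional histogram; OpenCV and NumPy are assumed, and the bin count is an illustrative parameter rather than a value from the paper.

```python
import cv2
import numpy as np

def cbcr_histogram(frame_bgr, bins=64):
    """Normalized 2D (Cb, Cr) histogram of one frame (empirical density)."""
    ycrcb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2YCrCb)  # OpenCV channel order: Y, Cr, Cb
    cb = ycrcb[..., 2].ravel()
    cr = ycrcb[..., 1].ravel()
    hist, _, _ = np.histogram2d(cb, cr, bins=bins, range=[[0, 256], [0, 256]])
    return hist / hist.sum()
```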

Fig. 1. 2D probability density estimation results.


In this study, a Gaussian window function is applied to estimate the probability density function; the probability density of (Cb, Cr) is obtained by convolving the Gaussian window with the two-dimensional histogram, as shown in Eq. (1).

(1)
$ \left\{\begin{aligned} & \hat{p}(x,y)=\sum _{k,l}h(x-k,y-l)w(k,l) ,\\ & -\frac{N-1}{2} \le k,~l\le \frac{N-1}{2},\\ & w(k,l)=\exp \left(-\frac{1}{2} \left(\frac{\alpha }{N/2} \right)(k^{2} +l^{2} )\right). \end{aligned}\right. $

In Eq. (1), $w(k,l)$ is the Gaussian window function; $N$ and $\alpha $ are parameters determining the width of the Gaussian window; $h(x,y)$ is the two-dimensional histogram; $\alpha $ is the inverse of the standard deviation. To improve the algorithm speed, the two-dimensional probability density function is decomposed into two one-dimensional probability density functions. The local maxima of the probability density function are searched by hill climbing. To speed up the hill climbing, a descending horizontal standard line is set to reduce the number of points visited during the climb; this standard line is horizontally tangent to the probability density curve at a local minimum. The hill-climbing algorithm flow after setting the standard line is shown in Fig. 2.
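A compact sketch of this clustering step follows: the histogram is smoothed with a Gaussian window (scipy's gaussian_filter standing in for the explicit convolution of Eq. (1)), bins below the standard line are skipped, and each remaining bin climbs to its local maximum, which serves as its cluster mode. The smoothing scale and standard-line level are assumed values.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def cluster_modes(hist2d, sigma=2.0, level=1e-4):
    """Label histogram bins by the local density maximum they climb to."""
    p = gaussian_filter(hist2d, sigma=sigma)         # Gaussian-window density estimate
    labels = np.full(p.shape, -1, dtype=int)
    modes = {}
    for start in zip(*np.nonzero(p > level)):        # ignore bins under the standard line
        path, cur = [], start
        while True:
            path.append(cur)
            i, j = cur
            neighbors = [(a, b)
                         for a in range(max(i - 1, 0), min(i + 2, p.shape[0]))
                         for b in range(max(j - 1, 0), min(j + 2, p.shape[1]))]
            best = max(neighbors, key=lambda t: p[t])
            if p[best] <= p[cur]:                    # local maximum reached
                break
            cur = best
        mode_id = modes.setdefault(cur, len(modes))  # one id per distinct mode
        for q in path:
            labels[q] = mode_id
    return labels, modes
```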

Fig. 2. Hill climbing process after setting the standard line.


After the hill-climbing search, preliminary clustering results are obtained. In this study, each cluster is assigned to one of four categories: for each cluster, the number of pixels falling into the four regions is counted, clusters belonging to the same class are merged, and each class is marked with a different gray value. If only the color clustering method is used to detect the planar scene area, some interfering pixels remain, and when these interference factors are similar in color to the field area, false detections easily occur. The complexity of local image pixels can be represented by the image local entropy. Because the field plane area has low complexity, its image entropy is also smaller than that of other areas in the scene. The calculation of local entropy is shown in Eq. (2).

(2)
$ H=-\sum _{i=1}^{m}\sum _{j=1}^{n}f(i,j)\log _{2} p_{ij}. $

In Eq. (2), $f(i,j)$ is the gray value of pixel $(i,j)$; $p_{ij} $ is the probability of that gray value occurring in the neighborhood. The common way to select a local entropy threshold is to manually extract a template from the field plane region, calculate the local entropy of the template, and use it as the threshold dividing the two regions. To improve the adaptability of the threshold, this study uses thresholding based on the maximum inter-class variance (Otsu) method. To select the adaptive entropy threshold, the probability distribution of the local entropy of the frame image is first calculated, as shown in Eq. (3).

(3)
$ p_{i} =\frac{n_{i} }{N'} ,~p_{i} \ge 0,~\sum _{i=1}^{L}p_{i} =1 . $

In Eq. (3), $N'$ is the total number of pixels and $n_{i} $ is the number of pixels whose local entropy is $i$. The Otsu method is used to choose the threshold, dividing the local entropy image into two category regions. As described above, the study improves the accuracy of planar scene extraction by combining nonparametric color clustering with image local entropy detection to identify and mitigate interfering pixels of similar color. The color clustering method extracts the color information in the scene by using the two-channel color difference signal (Cb, Cr) in the YCbCr color space to estimate the two-dimensional probability density; the hill-climbing algorithm then searches the local maxima of the probability density function to obtain an initial clustering of pixels. The clustering results are assigned to four categories, and the pixel counts of each cluster are analyzed statistically. Relying on color clustering alone does not eliminate interference satisfactorily, so the concept of image local entropy is introduced to represent the complexity of local image pixels. By calculating the local entropy of each pixel and applying Otsu-based thresholding, the local entropy image is classified automatically. In this way, the planar scene area can be distinguished from other, more complex areas, reducing the influence of interfering pixels on the reconstruction process.

The target positions on the scene plane are important information for 3D reconstruction. Common background-difference methods are not suitable for detecting objects in sports scenes, because the camera moves with the targets. In view of this, background elimination and FT saliency detection are used to achieve complete multi-object extraction. Based on the above operations, object detection on the planar scene is realized; Fig. 3 illustrates the specific process.
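The local-entropy thresholding step described above can be sketched as follows, assuming an 8-bit grayscale frame; the neighborhood radius is an assumed parameter, and scikit-image's rank entropy stands in for Eq. (2).

```python
import cv2
import numpy as np
from skimage.filters.rank import entropy
from skimage.morphology import disk

def plane_mask_from_entropy(gray_u8, radius=5):
    """White pixels = low-entropy (candidate field plane) region."""
    ent = entropy(gray_u8, disk(radius))                       # local entropy map
    ent_u8 = cv2.normalize(ent, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
    _, mask = cv2.threshold(ent_u8, 0, 255,
                            cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)  # Otsu split
    return mask
```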

Fig. 3. Target detection process in planar scenes.


3.2. Camera Parameter Calibration Based on Bottom-up and Top-down Thinking

Camera calibration is an important part of the 3D reconstruction of a planar scene [21-23]. Following bottom-up and top-down ideas, the camera calibration process is realized using the geometric features of the planar scene. In sports scenes, most field lines are marked in white, and these lines are used as criteria for determining match results. After detection of the scene plane area, if there is no obvious field edge line, the white lines can be used as the geometric features of the planar scene for calibration. Top-hat transformation is used to extract white lines in the scene, as shown in Eq. (4).

(4)
$ T_{hat} (f)=f-(f\circ b) . $

In Eq. (4), $b$ is a structural element in the image; $f$ is the source image; $f\circ b$ is the morphological opening of the two. It is first necessary to determine whether candidate pixels belong to a field line; the study constrains the candidate pixels in the horizontal and vertical directions respectively, as shown in Eq. (5).

(5)
$ l(i,j)=\left\{\begin{aligned} & 1,&&g(i,j)\ge \sigma _{l} \wedge g(i,j)-g(i\pm \tau ,j)>\sigma _{d},\\ & 1,&&g(i,j)\ge \sigma _{l} \wedge g(i,j)-g(i,j\pm \tau )>\sigma _{d},\\ & 0,&& \text{else}. \end{aligned}\right. $

In Eq. (5), $g(i,j)$ is the brightness component of the pixel; $\tau $ is the maximum pixel width of a field line; $\sigma _{l} $ is set to 128; $\sigma _{d} $ is set to 20. The above method excludes large areas of white pixels. For some textured areas, the structure tensor can be used to filter out white pixels that do not belong to field lines, as shown in Eq. (6).

(6)
$ J_{\rho } =G_{\rho } *\left(\nabla f\nabla f^{{\rm T}} \right). $

In Eq. (6), $G_{\rho } $ is a Gaussian kernel with smoothing scale $\rho $. Through this equation, the neighborhood gradient information of the candidate pixels can be calculated. The field lines are then thinned with the Sobel edge detection operator. The Hough transform yields many spurious line candidates, so the best-fitting line is computed with the RANSAC algorithm. To improve the precision of the line parameters, the least squares method is introduced: a new line model is re-fitted using all inliers of the RANSAC model. In the camera model, calibration involves the interconversion of the world coordinate system, camera coordinate system, image physical coordinate system, and image pixel coordinate system. The conversion process is shown in Fig. 4.
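A minimal sketch of this RANSAC line fit with least-squares refinement follows; the iteration count and inlier tolerance are assumed values, and the refit uses total least squares via SVD.

```python
import numpy as np

def ransac_line(points, n_iter=500, tol=2.0, seed=0):
    """points: (N, 2) array of candidate white-line pixel coordinates."""
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(points), dtype=bool)
    for _ in range(n_iter):
        p1, p2 = points[rng.choice(len(points), 2, replace=False)]
        d = p2 - p1
        norm = np.hypot(d[0], d[1])
        if norm < 1e-9:
            continue
        n = np.array([-d[1], d[0]]) / norm       # unit normal of the sampled line
        dist = np.abs((points - p1) @ n)         # point-to-line distances
        inliers = dist < tol
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    pts = points[best_inliers]                   # least-squares refit on all inliers
    centroid = pts.mean(axis=0)
    _, _, vt = np.linalg.svd(pts - centroid)
    return centroid, vt[0], best_inliers         # point on line, direction, inlier mask
```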

Fig. 4. Schematic diagram of coordinate system conversion process.


The pixel coordinates of a point in the image and its real-world coordinates satisfy Eq. (7).

(7)
$ \lambda \left[\begin{array}{c} {u} \\ {v} \\ {1} \end{array}\right]=\left[\!\!\begin{array}{cccc} {\alpha } & {0} & {u_{0} } &0\\ {0} & {\beta } & {v_{0} } & 0 \\ {0} & {0} & {1}& 0 \end{array}\!\!\right]\left[\!\!\begin{array}{cc} {R} & {t} \\ {0} & {1} \end{array}\!\!\right]\left[\begin{array}{c} {X_{w} } \\ {Y_{w} } \\ {Z_{w} } \\ {1} \end{array}\right]=H\left[\begin{array}{c} {X_{w} } \\ {Y_{w} } \\ {Z_{w} } \\ {1} \end{array}\right] . $

In Eq. (7), $(X_{c} ,Y_{c} ,Z_{c} )$ denotes the camera coordinate system; $(x,y)$ is the image physical coordinate system; $(X_{w} ,Y_{w} ,Z_{w} )$ denotes the world coordinate system; $(u,v)$ is the pixel coordinate system; $u_{0} $ and $v_{0} $ are offsets; $\alpha $ and $\beta $ are the scale factors of pixels along the image-plane $x$ and $y$ axes; $\lambda $ is a scale factor. In this paper, the bottom-up camera calibration method obtains the geometric features of the planar scene in the image coordinate system captured by the camera, and intersections of field lines are used as the camera's calibration points. The length and width of the field area in the scene are generally smaller than those of the real court, but must be greater than a quarter of the real court's proportions. According to this geometric constraint, calibration-point combinations whose field dimensions deviate too far are excluded, so that calibration-point matching can be carried out quickly and accurately. The correspondence of the marked points in the scene is shown in Fig. 5.

Fig. 5. Schematic representation of calibration points in a football scene.


Then, the corresponding features are found and matched in the real scene model coordinate system. After the projection matrix is computed from the mapping relationship between them, the minimal solution of the projection matrix is obtained using singular value decomposition (SVD), which gives the camera parameters. After the bottom-up camera calibration is completed, projection matrices corresponding to multiple feature matchings are obtained. However, owing to various factors, these matrices may contain errors or inaccuracies. To further improve the accuracy and stability of the camera parameters, a top-down approach is introduced to evaluate and optimize them. First, the problem is modeled with prior probabilities: according to the geometric features of the football scene and prior knowledge of camera calibration, different projection matrices are assigned different weights. Next, the study reverse-maps the estimated parameter state to the current frame; by observing how the parameter estimate performs on the image, the degree of matching with the actual scene is judged. In evaluating the matching results, an anisotropic scaling parameter is introduced to eliminate the error of the estimated camera parameters and reduce the time spent finding the optimal projection matrix. The matching degree is scored by comparing each projection matrix's coverage of the white pixels. After all calibration parameters are estimated, the projection matrix with the highest matching score is selected as the candidate optimal solution. However, because of noise and mismatched points in scene images, the highest-scoring matrix may not be the global optimum. The Levenberg-Marquardt nonlinear least squares method is therefore used to minimize the distance between projected 3D points and their correspondences in the real scene's 3D coordinates; the minimization is posed as a solvable system of equations that further reduces the global reprojection error. Through the combination of bottom-up and top-down methods above, the optimization and correction of camera parameters is realized.
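For the pitch plane ($Z_{w} =0$), the projection in Eq. (7) reduces to a $3\times3$ homography, and the minimal solution via SVD corresponds to the classical DLT estimate. The following is a sketch under that assumption (point ordering and normalization details omitted):

```python
import numpy as np

def dlt_homography(world_xy, image_uv):
    """world_xy, image_uv: (N, 2) arrays of N >= 4 matched calibration points."""
    rows = []
    for (X, Y), (u, v) in zip(world_xy, image_uv):
        rows.append([-X, -Y, -1, 0, 0, 0, u * X, u * Y, u])
        rows.append([0, 0, 0, -X, -Y, -1, v * X, v * Y, v])
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    H = vt[-1].reshape(3, 3)        # right singular vector of the smallest singular value
    return H / H[2, 2]              # fix the scale factor lambda
```

In practice the candidate matrix would then be refined as described above, for example with scipy.optimize.least_squares using the Levenberg-Marquardt method to minimize the reprojection error over all calibration points.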

3.3. Three-dimensional Scene Reconstruction Based on Pyramid LK Optical Flow Method

After camera parameter calibration and feature extraction, the research uses the planar scene extraction results and the camera calibration parameters to determine the 3D position and color information of objects on the plane, realizing scene visualization. The transformation matrix between two consecutive scene images can be calculated from the relationship of feature points in the images. The Shi-Tomasi algorithm is used to detect feature points. After suitable feature points are extracted, feature matching between scene images is a necessary prerequisite for global motion parameter estimation. Since consecutive frames satisfy constant brightness, temporal continuity, and spatial consistency, LK optical flow is used to track the feature points, based on the constraint in Eq. (8).

(8)
$ I_{x} u+I_{y} v+I_{t} =0 . $

In Eq. (8), $I(x,y,t)$ is the brightness of a pixel at time $t$, and $(u,v)$ is its optical flow. Assuming motion consistency in the local pixel region, the motion of the central point is solved using the system of equations over the $5\times5$ neighborhood of the current point, as shown in Eq. (9).

(9)
$ \left[\!\!\begin{array}{cc} I_{x} (p''_{1} ) & I_{y} (p''_{1} )\\ I_{x} (p''_{2} ) & I_{y} (p''_{2} )\\ {...} & {...}\\ I_{x} (p''_{25} ) & I_{y} (p''_{25} ) \end{array}\!\!\right]\left[\begin{array}{c} {u} \\ {v} \end{array}\right]=\left[\begin{array}{c} {I_{t} (p''_{1} )} \\ {I_{t} (p''_{2} )} \\ {...} \\ {I_{t} (p''_{25} )} \end{array}\right] . $

The least squares method is used to solve this system, giving the optical flow vector of pixel $p''$, as shown in Eq. (10).

(10)
$ d=\left(A^{T} A\right)^{-1} A^{T} b . $
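Eqs. (9) and (10) amount to a 25-equation least-squares solve per feature point. A direct transcription might look as follows; the gradient images are assumed precomputed, and the sign of $b$ follows the usual LK convention:

```python
import numpy as np

def lk_flow_at(Ix, Iy, It, x, y, half=2):
    """Solve Eqs. (9)-(10) for the flow (u, v) at integer pixel (x, y)."""
    win = np.s_[y - half:y + half + 1, x - half:x + half + 1]   # 5x5 neighborhood
    A = np.stack([Ix[win].ravel(), Iy[win].ravel()], axis=1)    # 25 x 2 gradient matrix
    b = -It[win].ravel()                                        # from I_x u + I_y v + I_t = 0
    d, *_ = np.linalg.lstsq(A, b, rcond=None)                   # d = (A^T A)^{-1} A^T b
    return d
```

For the pyramidal iteration of Eq. (11), an implementation would typically call OpenCV's cv2.calcOpticalFlowPyrLK on the Shi-Tomasi corners rather than hand-rolling the loop.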

To resolve the ambiguity in optical flow tracking, a multi-layer pyramid is generated from the image. Optical flow is computed with the same window size at each layer to correct the initial velocity estimate. The optical flow after iteration is shown in Eq. (11).

(11)
$ d'=\sum _{L=0}^{L_{m} }2^{L} d^{L} . $

In Eq. (11), $d^{L} $ is the optical flow vector at pyramid layer $L$ and $L_{m} $ is the number of pyramid layers. Mismatched pairs are inevitable in global motion parameter estimation, so the study combines RANSAC to eliminate wrong matches and find the optimal parameter model for the feature point pairs. After the above operations, the scene targets have been fully extracted, but the locations of candidate targets still need to be obtained. The study locates candidate targets by labeling the connected regions inside the field. To save storage and reduce recursive calls, a scan-line seed filling algorithm is used to extract connected regions, scanning the white foreground pixel regions continuously in raster order. The connected regions in the binary image are not all single targets; large regions may be formed by multiple mutually occluding targets. Since foreground target regions have distinctive proportions, shape features are introduced to separate multiple targets within a candidate region. First, the minimum bounding box of the candidate region is computed and tested, as shown in Eq. (12).

(12)
$ \left\{\begin{aligned} & h=|R_{i} (y\_ top)-R_{i} (y\_ bottom)|, \\ & w=|R_{i} (x\_ left)-R_{i} (x\_ right)|,\\ & h>T_{h} ,~w>T_{w},\\ & w/h<T_{asp},\\ & num/(h\times w)>T_{ratio},\\ & T_{\min } <num<T_{\max }, \end{aligned}\right. $

In Eq. (12), $h$ and $w$ denote the height and width of the bounding rectangle; $T_{h} $ and $T_{w} $ are thresholds on the height and width of the rectangle; $T_{\max } $ and $T_{\min } $ are the maximum and minimum numbers of foreground pixels; $T_{asp} $ is the aspect-ratio threshold of the target region; $T_{ratio} $ is the threshold on the proportion of the bounding rectangle occupied by foreground pixels; $num$ is the number of foreground pixels. This paper uses geometric moments (GM) to compute the centroid of the target region and then back-projects it to world coordinates, as shown in Eq. (13).

(13)
$ \left[\begin{array}{c} {X_{w} } \\ {Y_{w} } \\ {1} \end{array}\right]=H^{-1} \times \left[\begin{array}{c} {u} \\ {v} \\ {1} \end{array}\right] . $

In Eq. (13), $H$ is the projection matrix from Eq. (7). After obtaining the 3D world coordinates of the moving targets, the research judges the category of each target through color feature training; the process is shown in Fig. 6. Finally, the model is reconstructed in Visual Studio 2010 with OpenGL.

Fig. 6. Process for determining the category of a target.


As shown in Fig. 6, the research first realizes target recognition by extracting the color center of the target area, exploiting the distinct color features of targets in the scene. RGB is converted to HSV color space, and to improve image processing efficiency the dimensionality of the HSV space is reduced. According to the differences in how the human eye perceives hue, saturation, and brightness, the three HSV components are quantized non-uniformly into 9, 4, and 4 parts respectively, giving 144 color channels in total. Finally, a normalized statistical histogram is calculated over the quantized HSV components and taken as the color feature of the target. Based on the above, a 3D modeling method for sports scenes based on the pyramid LK optical flow method is established; the system flow is shown in Fig. 7.
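A sketch of the $9\times4\times4$ non-uniform quantization into 144 channels follows; the exact bin boundaries are assumptions, since the paper specifies only the number of parts per component.

```python
import cv2
import numpy as np

def hsv_color_feature(patch_bgr):
    """144-bin normalized HSV histogram used as a target's color feature."""
    hsv = cv2.cvtColor(patch_bgr, cv2.COLOR_BGR2HSV)
    h, s, v = hsv[..., 0], hsv[..., 1], hsv[..., 2]
    hq = np.minimum(h // 20, 8)      # 9 hue parts (OpenCV hue range is 0..179)
    sq = s // 64                     # 4 saturation parts (0..255)
    vq = v // 64                     # 4 value parts (0..255)
    code = hq * 16 + sq * 4 + vq     # 9 * 4 * 4 = 144 channels
    hist = np.bincount(code.ravel().astype(int), minlength=144).astype(float)
    return hist / hist.sum()
```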

Fig. 7. Process of the 3D modeling method for sports scenes based on the pyramid LK optical flow method.


In practical applications, insufficient calibration points still occur in complex and changing scenes. To address this, the research incorporates deep learning methods, using a large amount of training data to optimize the calibration process. In addition, optical and infrared sensors are deployed in the scene, and the data obtained from these sensors provide additional information to compensate for the lack of calibration points.

4. Performance Analysis of 3D Reconstruction Model of Plane Competitive Scene Based on LKOF and RANSAC Algorithm

The study used sports match scenes downloaded from video sites as the experimental subjects. The experimental video data included: an international friendly football match between France and England in 2017 (AVI format, 480×360 resolution, 15 fps); the 2015 China Open men's singles semi-final between Lin Dan and Lee Chong Wei (RMVB format, 640×360 resolution, 15 fps); and the October 2023 Asian Games women's table tennis singles final between China's Sun Yingsha and Japan's Hina Hayata (AVI format, 1280×720 resolution, 25 fps). In this study, 20% of the video data set is used as a training set to evaluate the optimal parameter ranges, and the remaining data are used as a test set for verification and comparison.

To test the detection effect of the proposed method combining color and texture features, the experiment compares it with the histogram main color detection (HDCD) method of literature [24] and the adaptive mixed Gaussian model (AMG) of literature [25] across the various scenes. Fig. 8 illustrates the test results.

Fig. 8. Comparison of detection accuracy of various extraction methods in different scenarios.


In Fig. 8, the average detection precision of the proposed method is 93.88%, which is 11.96% and 7.58% higher than the HDCD and AMG algorithms, respectively. The HDCD method is insensitive to photometric changes, which leads to misjudgments. AMG achieves good accuracy in scenes 1 and 4 but cannot distinguish scenes with similar colors. The proposed algorithm performs better across all four scenes, indicating that combining color and texture features effectively improves detection accuracy.

To quantitatively compare the comprehensive extraction performance of the algorithms, the accuracy, recall, and F1 values of the three extraction methods were compared on the football scene test video. In Table 1, the accuracy and recall of the proposed algorithm are 93.56% and 92.45%, respectively, and its number of false detections is significantly lower than the other two algorithms. Its F1 value is 92.47%, more than 5% higher than the other two algorithms, so the proposed detection algorithm detects targets more accurately. To test the precision of camera parameter calibration, the proposed calibration algorithm (Method 1) and a camera calibration method based on a directional hierarchical target (Method 2) were used for parameter calibration, and 300 feature points were randomly sampled in the image to compare the calibration errors, as shown in Fig. 9.

Table 1. Performance comparison of various extraction methods.

| Method | Target number | False detections | Accuracy (%) | Recall (%) | F1 (%) |
|---|---|---|---|---|---|
| Proposed algorithm | 986 | 63 | 93.56 | 92.45 | 92.47 |
| HDCD | 986 | 183 | 81.44 | 80.13 | 80.62 |
| AMG | 986 | 138 | 86.00 | 85.37 | 85.42 |

In Fig. 9, the parameter calibration of Method 1 is more accurate than that of Method 2; the parameter errors of Method 1 are 15.35%, 32.15%, and 29.45% lower than those of Method 2. This shows that Method 1 has a good calibration effect: it dynamically calibrates the internal and external camera parameters and updates the parameter matrix for each frame, further improving the 3D modeling accuracy. To verify the reliability of the parameter estimation method while the camera is constantly moving, the experiment examines changes in the camera's field of view. Fig. 10 illustrates the plane range obtained by reverse-mapping the edges of the planar scene through the camera projection matrix.

In Figs. 10(a)-10(c), the camera's field of view gradually moves from the right half of the field toward the center circle, consistent with the camera movement in the scene sequence. Computing the reprojection error over a 200-frame image sequence, the maximum error is less than 0.56 pixels, so the proposed global motion estimation method is reliable. To verify the motion recovery of planar targets in the scene sequence, the proposed global motion recovery method was used to extract the actual position-change information of the targets, and the results were compared with the actual motion trajectories, as shown in Fig. 11.

Fig. 9. Scatter plots of errors between two methods.


Fig. 10. Camera global motion estimation effect.


Fig. 11. Position change information obtained from global motion recovery method.


In Fig. 11, the changes in the $X$ and $Y$ coordinates of the target obtained by the motion recovery method are basically consistent with the actual trajectory, and the relative errors of the two coordinates are less than 1.5% and 1.3%, respectively. The proposed method recovers target information in scene sequences well.

To demonstrate the reconstruction effect of the proposed 3D reconstruction model (Model 1) in practical applications, the study carried out a complete 3D reconstruction of the football scene in the data set. Ten target players were marked in the original image with rectangular boxes colored by team. The camera parameter matrix was then obtained with the proposed calibration method, and the players of each team were represented by cuboid models of different colors. Because the camera moves continuously during scene sequence reconstruction, the experiment verified reconstruction accuracy by comparing the absolute error (AE) and relative error (RE) between the modeled and actual coordinates of each player in different frames. For a more comprehensive analysis, commonly used 3D scene reconstruction methods were compared: planar 3D reconstruction based on dilated-convolution multi-scale feature fusion (Model 2), scene 3D reconstruction based on fused channel and spatial attention (Model 3), and point cloud matching (Model 4). The reconstruction accuracy of the player models at three key positions is compared in Table 2.

Table 2. Comparison of 3D scene reconstruction performance of various models.

| Frame | Model | Player 1 AE (mm) | Player 1 RE (%) | Player 2 AE (mm) | Player 2 RE (%) | Player 3 AE (mm) | Player 3 RE (%) |
|---|---|---|---|---|---|---|---|
| 20 | Model 1 | 2.11 | 0.42 | 2.45 | 0.51 | 2.47 | 0.48 |
| 20 | Model 2 | 3.54 | 0.92 | 3.87 | 0.94 | 3.65 | 0.93 |
| 20 | Model 3 | 4.12 | 1.13 | 4.08 | 1.08 | 4.10 | 1.12 |
| 20 | Model 4 | 3.88 | 0.98 | 3.89 | 0.97 | 3.92 | 0.99 |
| 80 | Model 1 | 2.15 | 0.44 | 2.10 | 0.47 | 2.32 | 0.38 |
| 80 | Model 2 | 3.72 | 3.94 | 3.48 | 3.28 | 3.29 | 2.78 |
| 80 | Model 3 | 4.34 | 4.23 | 4.35 | 4.00 | 4.27 | 4.02 |
| 80 | Model 4 | 4.10 | 3.00 | 4.08 | 3.02 | 4.12 | 3.00 |

As can be seen from Table 2, the AE and RE values of each model increase as the frame number changes from 20 to 80. This is due to error accumulation in the 3D reconstruction models as the number of video frames increases, which degrades modeling accuracy. Model 1 shows the smallest degradation: at frame 80 its RE increases by only about 0.02 percentage points, a negligible error magnitude. This is because Model 1 uses the globally optimized projection matrix solution and the bottom-up parameter estimation method to realize camera self-calibration in the planar scene, maintaining high calibration accuracy throughout the video. The RE values of the other three models increase by two to three percentage points between frame 20 and frame 80, and both indices of Model 1 are clearly better than those of the other three models. Based on the table, the proposed 3D reconstruction model has good applicability and stability.

The important contributions and novelty of the proposed 3D reconstruction method for sports scenes are as follows: the proposed global motion estimation method effectively handles camera parameter changes during motion, which both improves 3D modeling accuracy and adapts the model to complex dynamic scenes; and the bottom-up parameter estimation method realizes camera self-calibration, simplifying operation and improving the flexibility and accuracy of calibration. The experimental results show that the proposed model achieves high precision and robustness in 3D scene reconstruction.

5. Conclusion

Improving the development level of competitive sports, rationally allocating SPS resources, and realizing the linked development of competitive sports and SPS are of great practical significance for the sustainable development of competitive sports. In this study, the scene to be reconstructed was preprocessed first; then scene-plane extraction and multi-object detection were carried out; finally, the optimized dynamic camera calibration technique was used to recover target positions in 3D coordinates and reconstruct the scene. According to the experimental analysis, the average detection precision of the designed scene detection method is 93.88%, which is 11.96% and 7.58% higher than the HDCD and AMG algorithms, respectively. The accuracy and recall of the algorithm are 93.56% and 92.45%, respectively, its number of false detections is significantly lower than the other two algorithms, and its F1 value of 92.47% is more than 5% higher than theirs. In summary, the proposed detection method achieves more accurate detection. The parameter errors of Method 1 are 15.35%, 32.15%, and 29.45% lower than those of Method 2, giving a good calibration effect. The average AE and RE values of Model 1 are 2.11 mm and 0.42%, respectively; at frame 80, after the camera parameters change, RE increases by only about 0.02 percentage points, so the model has good applicability and stability. Considering the rapid development and change of competitive sports, the reconstruction method needs continuous optimization and updating in future studies. For example, virtual reality (VR) or augmented reality (AR) technology could be introduced into the 3D reconstruction of sports scenes to provide a more immersive viewing experience or training assistance.

Funding

The research is supported by: 1) The basic scientific research business project of Heilongjiang Provincial universities in 2022, ``Research on the Intelligent Development Strategy of Heilongjiang Ice and Snow Sports Tourism Public Service in the Era of Digital Economy''. (No. 145209162); 2) 2022 Qiqihar University Degree and postgraduate Education and Teaching reform Research project ``Innovation Research on the Focus and Practice Path of Ideological and Political Construction of Professional Courses for Master of Physical Education''. (No. JGXM_QUG2022006). 3) The General Project of the 2023 Higher Education Research Project of Heilongjiang Provincial Higher Education Society: ``Research on the Focus, Main Problems and Promotion Strategies of Ideological and Political Education in Physical Education Courses in Colleges and Universities''. (No. 23GJYBB205).

REFERENCES

[1] R. Liang, ``Urban sports service structure from the public health context,'' Revista Brasileira de Medicina do Esporte, vol. 27, pp. 108-110, 2021.
[2] M. Zi and Z. Tang, ``Construction of community sports service system for the aged under the background of ``Healthy HeiLongjiang'','' International Journal of Social Science and Education Research, vol. 3, no. 5, pp. 26-29, 2020.
[3] W. L. Nowinski, ``Bridging neuroradiology and neuroanatomy: NOW in BRAIN—a repository with sequences of correlated and labeled planar-surface neuroimages,'' The Neuroradiology Journal, vol. 36, no. 1, pp. 94-103, 2023.
[4] X. Pan and T. Y. Yang, ``3D vision-based out-of-plane displacement quantification for steel plate structures using structure-from-motion, deep learning, and point-cloud processing,'' Computer-Aided Civil and Infrastructure Engineering, vol. 38, no. 5, pp. 547-561, 2023.
[5] E. Liscio, P. Bozek, H. Guryn, and Q. Le, ``Observations and 3D analysis of controlled cast-off stains,'' Journal of Forensic Sciences, vol. 65, no. 4, pp. 1128-1140, 2020.
[6] H. Niu and Y. Zhang, ``The proportion of sports public service facilities based on the DEA model in colleges and universities,'' Revista Brasileira de Medicina do Esporte, vol. 27, pp. 97-100, 2021.
[7] Z. Yue, ``Design of public service products of sports art fitness under the psychological characteristics of different ages,'' Revista de Psicología del Deporte (Journal of Sport Psychology), vol. 30, no. 4, pp. 183-189, 2021.
[8] C. Huang and N. Peng, ``Research on the innovation of multiple participation mechanism of public sports service in Guangdong Province,'' Academic Journal of Humanities & Social Sciences, vol. 3, no. 11, pp. 145-155, 2020.
[9] L. Zhou, J. J. Wang, X. Chen, B. Cianfrone, and N. D. Pifer, ``Community-sport service provision, participant satisfaction, and participation: Experience and perspective of Guangdong, China,'' International Journal of Sports Marketing and Sponsorship, vol. 21, no. 1, pp. 127-147, 2020.
[10] R. Hambali, D. Legono, and R. Jayadi, ``The application of pyramid Lucas-Kanade optical flow method for tracking rain motion using high-resolution radar images,'' Jurnal Teknologi, vol. 83, no. 1, pp. 105-115, 2020.
[11] D. Zhao, Y. Wu, C. Wang, C. Shen, J. Tang, J. Liu, and Z. Lu, ``Gray consistency optical flow algorithm based on mask-R-CNN and a spatial filter for velocity calculation,'' Applied Optics, vol. 60, no. 34, pp. 10600-10609, 2021.
[12] T. Song, B. Chen, F. M. Zhao, Z. Huang, Z. Huang, and M. J. Huang, ``Research on image feature matching algorithm based on feature optical flow and corner feature,'' The Journal of Engineering, vol. 2020, no. 13, pp. 529-534, 2020.
[13] S. S. Suni and K. Gopakumar, ``Fusing pyramid histogram of gradients and optical flow for hand gesture recognition,'' International Journal of Computational Vision and Robotics, vol. 10, no. 5, pp. 449-464, 2020.
[14] S. C. Sevgen and F. Karsli, ``An improved RANSAC algorithm for extracting roof planes from airborne lidar data,'' The Photogrammetric Record, vol. 35, no. 169, pp. 40-57, 2020.
[15] Z. Hossein-Nejad and M. Nasri, ``Adaptive RANSAC and extended region-growing algorithm for object recognition over remote-sensing images,'' Multimedia Tools and Applications, vol. 81, no. 22, pp. 31685-31708, 2022.
[16] Z. Hossein-Nejad and M. Nasri, ``Natural image mosaicing based on redundant keypoint elimination method in SIFT algorithm and adaptive RANSAC method,'' Signal and Data Processing, vol. 18, no. 2, pp. 147-162, 2021.
[17] S. Afsal and A. Linsely, ``Optimal process of video stabilization using hybrid RANSAC-MSAC algorithm,'' International Journal of Intelligent Systems and Applications in Engineering, vol. 11, no. 2, pp. 564-571, 2023.
[18] Y. Guo, Z. Mustafaoglu, and D. Koundal, ``Spam detection using bidirectional transformers and machine learning classifier algorithms,'' Journal of Computational and Cognitive Engineering, vol. 2, no. 1, pp. 5-9, 2022.
[19] F. Masood, J. Masood, H. Zahir, K. Driss, N. Mehmood, and H. Farooq, ``Novel approach to evaluate classification algorithms and feature selection filter algorithms using medical data,'' Journal of Computational and Cognitive Engineering, vol. 2, no. 1, pp. 57-67, 2023.
[20] F. Smarandache, ``Plithogeny, plithogenic set, logic, probability and statistics: A short review,'' Journal of Computational and Cognitive Engineering, vol. 1, no. 2, pp. 47-50, 2022.
[21] G. J. Yoon, J. Song, Y. J. Hong, and S. M. Yoon, ``Single image based three-dimensional scene reconstruction using semantic and geometric priors,'' Neural Processing Letters, vol. 54, no. 5, pp. 3679-3694, 2022.
[22] A. Maccarone, K. Drummond, A. McCarthy, U. K. Steinlehner, J. Tachella, D. A. Garcia, A. Pawlikowska, R. A. Lamb, R. K. Henderson, S. McLaughlin, Y. Altmann, and G. S. Buller, ``Submerged single-photon LiDAR imaging sensor used for real-time 3D scene reconstruction in scattering underwater environments,'' Optics Express, vol. 31, no. 10, pp. 16690-16708, 2023.
[23] X. Liu, J. D. Rego, S. Jayasuriya, and S. J. Koppal, ``Event-based dual photography for transparent scene reconstruction,'' Optics Letters, vol. 48, no. 5, pp. 1304-1307, 2023.
[24] P. Łabędź, K. Skabek, P. Ozimek, and M. Nytko, ``Histogram adjustment of images for improving photogrammetric reconstruction,'' Sensors, vol. 21, no. 14, pp. 4654-4658, 2021.
[25] X. Nie, Y. Hu, X. Shen, and Z. Su, ``Reconstructing and editing fluids using the adaptive multilayer external force guiding model,'' Science China Information Sciences, vol. 65, no. 11, pp. 212102-212108, 2022.

Author

Qiufen Yu

Qiufen Yu obtained her master's degree from Qiqihar University in 2015. Currently, she is an associate professor and the head of the Department of Physical Education at the School of Physical Education at Qiqihar University. During her tenure as a teacher, she has published over 30 academic papers, mainly focusing on research in the fields of physical education teaching and humanities and social sciences in sports.