3.1 Design of iOS painting image style classification model for painting teaching
Beginners often lack a solid foundation in drawing and painting techniques, so their
works tend to lack a distinct style. In online modern art teaching, it is also difficult
to assess students' progress objectively. As a result, students struggle to receive
specific guidance and training, which hampers teaching interaction and may impede the
development of their art skills [20,21,22]. To address this issue, a painting style
classification model and a painting description generation model are designed on the
portable iOS platform. They are used to evaluate students' works during painting
teaching and to provide targeted support for teaching interaction during the learning
process. The style classification model is used to identify the direction of students'
future painting practice. The painting description generation model can group students
whose works have similar descriptions, making it convenient for students to communicate
with each other during the learning process.
The painting image style classification model mainly uses image information entropy
to classify the art style of an image. The color entropy and block entropy of the
participants' works are analyzed to characterize the distribution of painting colors
and lines. Based on these image information entropy characteristics, the study
classifies the participants' painting images into four basic categories: the simple
line drawing category, the complex line drawing category, the conservative style
category, and the exaggerated style category. The simple line drawing category refers
to paintings by students who lack a coloring foundation and use simple line
compositions. The complex line drawing category refers to paintings that also lack a
coloring foundation but have complex and dense line compositions. The conservative
style category refers to paintings that use monotonous colors, with even, cooler-toned
transitions between color blocks. The exaggerated style category refers to paintings
whose colors are more exaggerated, with relatively few color blocks and an overall
color scheme that tends to be warm, as shown in Fig. 1.
Fig. 1. Classification of painting styles.
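The four categories can be read as a two-stage decision: whether the work is colored at all, and then either how dense its lines are or how warm its colors are. A minimal Python sketch under that reading is given here. The feature names, the threshold values, and the function `classify_style` are hypothetical illustrations and are not taken from the study, which derives its categories from color entropy and block entropy.

```python
from dataclasses import dataclass

SIMPLE_LINE = "simple line drawing"
COMPLEX_LINE = "complex line drawing"
CONSERVATIVE = "conservative style"
EXAGGERATED = "exaggerated style"


@dataclass
class StyleFeatures:
    """Hypothetical per-image features, each scaled to [0, 1]."""
    mean_saturation: float  # average S value of the image
    line_density: float     # assumed line-complexity score
    warm_fraction: float    # assumed share of colored pixels with warm hues


def classify_style(features, sat_threshold=0.15,
                   density_threshold=0.5, warm_threshold=0.5):
    """Map features to one of the four categories (thresholds are assumptions)."""
    # Works with almost no color are treated as line drawings.
    if features.mean_saturation < sat_threshold:
        if features.line_density >= density_threshold:
            return COMPLEX_LINE
        return SIMPLE_LINE
    # Colored works are split by how warm the overall palette is.
    if features.warm_fraction >= warm_threshold:
        return EXAGGERATED
    return CONSERVATIVE
```

In practice the thresholds would have to be fitted to labeled student works rather than fixed by hand.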
It is worth noting that students often find it difficult to notice the objective
characteristics of their own works during the drawing process. It is therefore
necessary to classify the students' drawings and to conduct targeted teaching interaction.
The study uses iOS programming to build a resource library of students'
drawings. The specific construction is shown in Fig. 2.
Fig. 2. Resource library construction.
The iOS programming in this study focuses on the Objective-C programming
language. Objective-C extends the C language and has four key features:
the message mechanism, protocols, categories, and memory management. The message
mechanism determines the method to be called at runtime, whereas an ordinary C
function call is bound at compile time. A protocol is a set of method
declarations that provides no implementation itself; the implementation is supplied
by the classes that adopt the protocol. The delegate pattern built on protocols
also makes the model code more convenient to maintain. A category is a way to add
methods directly to an existing class without inheritance. This coding method
allows developers to add custom methods to existing classes. Memory management
falls into two main types, i.e. manual reference counting
and automatic reference counting. Both mechanisms are built on
reference counting [23,24,25]. The specific programming language features are shown in Fig. 3.
Fig. 3. Programming language features.
The study mainly uses the LAB color model when performing the color entropy calculation
of the students' works. L denotes the luminance information of the painting. A represents
the position of a color on the axis that runs from green to magenta. B represents
the position of a color on the axis that runs from blue to yellow. Color entropy
describes the color of an image accurately and captures its color features well.
Therefore, it has a good classification effect in painting style classification.
Since the degree of color saturation is an important criterion in the evaluation of
paintings, the calculation is mainly conducted in the HSV color mode, in which the S
channel can be used exclusively for analyzing color saturation. First, RGB images are
converted into HSV images. In this image mode, all channels can be normalized to the
range [0,1], as shown in Equation (1).
In (1), $i$ denotes the hue index and $P$ denotes the overall probability of that hue appearing
in the student's painting image. A weighted gray scale function is introduced
as shown in (2).
In turn, (3) can be obtained.
In (3), $H$ represents the hue channel. If the color entropy calculation considers only
the hue element, the participants' painting styles are judged to be highly similar.
This is mainly because saturation is not taken into account, so the same hue is
always interpreted as the same color. To address this limitation, the study analyzes the
S channel, which is exclusively dedicated to color saturation. Equation (4) can be used to calculate the average color saturation of the students' painting images.
In (4), $x$ and $y$ denote the pixel indexes. $S_{x,y} $ denotes the saturation value at
pixel $\left(x,y\right)$, which can be used as a weighting function. The
color entropy calculation formulae in HSV mode can be combined in the form of (5).
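The color entropy steps described above can be illustrated with a minimal Python sketch. It assumes that the hue entropy is the Shannon entropy of a binned hue histogram and that saturation enters as a per-pixel weight on each hue bin; the bin count and the function names are illustrative assumptions, not the study's exact formulation of (1) to (5).

```python
import colorsys
import math


def to_hsv(pixels):
    """Convert 8-bit RGB tuples to HSV tuples with every channel in [0, 1]."""
    return [colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
            for r, g, b in pixels]


def _hue_bin(hue, bins):
    # A hue of exactly 1.0 would fall outside the last bin, so clamp it.
    return min(int(hue * bins), bins - 1)


def hue_entropy(hsv_pixels, bins=16):
    """Shannon entropy (bits) of the hue histogram: -sum(P_i * log2(P_i))."""
    counts = [0] * bins
    for h, _, _ in hsv_pixels:
        counts[_hue_bin(h, bins)] += 1
    total = len(hsv_pixels)
    return -sum((c / total) * math.log2(c / total) for c in counts if c)


def mean_saturation(hsv_pixels):
    """Average of S_{x,y} over all pixels."""
    return sum(s for _, s, _ in hsv_pixels) / len(hsv_pixels)


def saturation_weighted_hue_entropy(hsv_pixels, bins=16):
    """Hue entropy in which each pixel contributes its saturation as weight."""
    weights = [0.0] * bins
    for h, s, _ in hsv_pixels:
        weights[_hue_bin(h, bins)] += s
    total = sum(weights)
    if total == 0:
        return 0.0  # a fully unsaturated image carries no hue information
    return -sum((w / total) * math.log2(w / total) for w in weights if w > 0)
```

For example, a saturated red pixel and a pale pink pixel share the same hue, so `hue_entropy` alone cannot separate them, while `mean_saturation` and the saturation weights can.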
Since a participant's drawing comprises not only color but also composition,
evaluating and categorizing the drawing based solely on color entropy would not be
accurate. Therefore, additional analysis of drawing lines and composition information
is required. In this part of the study, the block entropy technique is used for model
construction. The two-dimensional entropy model first obtains the basic information
entropy of the image through a blocking strategy and then constructs the
two-dimensional image information entropy. By minimizing the interference of color
information, it describes the student's drawing in terms of its two-dimensional
structure. Suppose that the potential segmentation boundary point of the image is $S$
and that the probability of each image gray level is represented by $pL$. The gray
levels can then be divided into two probability distributions. The
first probability distribution is shown in (6).
The second probability distribution is shown in (7).
The $l$ in (7) indicates the number of gray levels. The entropy value after image segmentation is
shown in (8).
$E_{A} $ in (8) can be calculated in the form of (9).
$E_{B} $ in (8) can be calculated in the form of (10).
Then the optimal threshold of the segmented image can be expressed in the form of
(11).
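The segmentation step described in (6) to (11) resembles the standard maximum-entropy thresholding procedure, in which the gray-level histogram is split at a candidate level and the threshold that maximizes the sum of the two class entropies is kept. The following Python sketch shows that standard procedure under this assumption; the function names are illustrative, and the study's exact notation may differ.

```python
import math


def _class_entropy(counts, class_total):
    """Entropy of one class after renormalizing its gray-level counts."""
    return -sum((c / class_total) * math.log(c / class_total)
                for c in counts if c)


def max_entropy_threshold(histogram):
    """Return the gray level s that maximizes E_A(s) + E_B(s).

    histogram[k] is the number of pixels with gray level k. Class A holds
    levels 0..s and class B holds levels s+1..l-1.
    """
    total = sum(histogram)
    best_s, best_entropy = 0, float("-inf")
    count_a = 0
    for s in range(len(histogram) - 1):
        count_a += histogram[s]
        count_b = total - count_a
        if count_a == 0 or count_b == 0:
            continue  # one class is empty, so the split is not meaningful
        e_a = _class_entropy(histogram[: s + 1], count_a)
        e_b = _class_entropy(histogram[s + 1:], count_b)
        if e_a + e_b > best_entropy:
            best_s, best_entropy = s, e_a + e_b
    return best_s
```

When several thresholds give the same total entropy, this sketch keeps the lowest one.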
In addition, in the block entropy section, the study mainly uses variance to characterize
the spatial distribution of images and distinguishes students' painting images based
on specific rules. The image is divided into small blocks, and the information entropy
of each individual block is calculated. Finally, the variance of the information
entropy across all blocks and the block grayscale distribution are calculated to
analyze the connections between the blocks. The overall architecture of the image
entropy style classification model is shown in Fig. 4.
Fig. 4. Overall architecture of image entropy style classification model.
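The block entropy step can be sketched in Python as follows, assuming a grayscale image stored as a list of rows, non-overlapping square blocks, and the population variance of the per-block entropies. The block size and the function names are illustrative assumptions rather than the study's settings.

```python
import math


def gray_entropy(values):
    """Shannon entropy (bits) of the gray levels in one block."""
    counts = {}
    for value in values:
        counts[value] = counts.get(value, 0) + 1
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def block_entropies(image, block_size):
    """Entropy of every non-overlapping block_size x block_size block.

    Rows and columns that do not fill a whole block are ignored.
    """
    height, width = len(image), len(image[0])
    entropies = []
    for top in range(0, height - block_size + 1, block_size):
        for left in range(0, width - block_size + 1, block_size):
            block = [image[y][x]
                     for y in range(top, top + block_size)
                     for x in range(left, left + block_size)]
            entropies.append(gray_entropy(block))
    return entropies


def entropy_variance(entropies):
    """Population variance of the per-block entropies."""
    mean = sum(entropies) / len(entropies)
    return sum((e - mean) ** 2 for e in entropies) / len(entropies)
```

A uniform image gives a variance of zero, while an image whose detail is concentrated in a few blocks gives a larger value, which is one way to separate sparse line work from dense line work.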
3.2 Painting description generation model design
The painting description generation model in this study generates image descriptions
with a multi-attention design, specifically a triple-attention
mechanism. The specific structure is shown in Fig. 5.
Fig. 5. Structure of triple attention mechanism.
As shown in Fig. 5, ATT1 represents the attention mechanism for the spatial information of the image.
V represents the attention context vector of the previous time step. ATT2 represents
the attention mechanism for the hidden unit of the previous time step. ATT3 represents
the attention mechanism between the image information and the hidden unit of the LSTM
model at the current time step. The study uses the LSTM model to generate descriptive
sentences for the participant's painting. In the input stage, the weight matrix can
be described as (12).
In (12), $Z_{it} $ denotes the weight matrix. $W_{iv} $, $w_{ih} $, and $W_{th} $ denote
parameters learned during training. $V$ denotes the feature
elements of the image space. $h_{t-1} $ denotes the hidden state of the LSTM model
at the previous time step. The mapping of the attention model onto the image feature
space can then be expressed in the form of (13).
The attention mechanism can be expressed in the form of (14).
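The spatial attention step can be illustrated with a minimal Python sketch of additive soft attention, a common formulation in which each image region is scored against the previous hidden state, the scores are normalized with softmax, and the regions are averaged with those weights. The parameter names `w_v`, `w_h`, and `w_out` and the exact scoring form are assumptions for illustration and are not claimed to match (12) to (14) term by term.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def spatial_attention(regions, hidden, w_v, w_h, w_out):
    """Score each region feature against the previous hidden state.

    regions: list of image region feature vectors
    hidden:  previous LSTM hidden state h_{t-1}
    Returns the attention weights and the weighted context vector.
    """
    projected_hidden = matvec(w_h, hidden)
    scores = []
    for region in regions:
        projected_region = matvec(w_v, region)
        activated = [math.tanh(a + b)
                     for a, b in zip(projected_region, projected_hidden)]
        scores.append(sum(w * a for w, a in zip(w_out, activated)))
    weights = softmax(scores)
    dim = len(regions[0])
    context = [sum(weight * region[d]
                   for weight, region in zip(weights, regions))
               for d in range(dim)]
    return weights, context
```

In a trained model the three parameter arrays would be learned jointly with the LSTM; here they are plain nested lists so that the computation is easy to follow.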
The model first extracts words that can describe students' paintings and compiles
them into a database of painting description words. Next, through indexing,
the word vector of each word is set to correspond to its position in the word
matrix. The attention mechanism uses feature vector matching
to match the text feature vectors of the participant's painting with the visual features
of the painting image, forming a correlation matrix whose elements are strongly
correlated with each other. The study uses the text feature vectors
and the visual feature vectors, respectively, as input for the attention module. The final
attention vector is then formed through a weighted sum, as shown
in (15).
In (15), $w$ represents the output obtained by passing the parameter matrix through the softmax
layer. $v$ represents the weights of the visual feature vector.
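The word indexing and weighted-sum steps can be sketched in Python as follows. The sketch assumes whitespace tokenization, dot-product matching between a text feature vector and each visual feature vector, and softmax-normalized weights; these choices and all function names are illustrative assumptions, not the study's implementation.

```python
import math


def build_vocabulary(descriptions):
    """Map each distinct word to its row index in the word matrix."""
    vocabulary = {}
    for sentence in descriptions:
        for word in sentence.lower().split():
            vocabulary.setdefault(word, len(vocabulary))
    return vocabulary


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_vector(text_vector, visual_vectors):
    """Weighted sum of visual vectors, weighted by their match to the text."""
    # Dot-product scores play the role of one row of the correlation matrix.
    scores = [sum(t * v for t, v in zip(text_vector, visual))
              for visual in visual_vectors]
    weights = softmax(scores)
    dim = len(visual_vectors[0])
    return [sum(w * visual[d] for w, visual in zip(weights, visual_vectors))
            for d in range(dim)]
```

Visual features that match the text feature more closely receive larger weights and therefore dominate the resulting attention vector.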