2. Logistics Center Site Selection Model
Decision variables: Let $x_{i}$ be the binary variable indicating whether a cold chain logistics center is built at the $i$-th candidate location, $y_{ij}$ be the binary variable indicating whether the $j$-th demand point is served by the $i$-th cold chain logistics center, and $z_{ijs}$ be the binary variable indicating whether the $i$-th cold chain logistics center serves the $j$-th demand point in the $s$-th planning period [8,9].
Objective function: Let $P_{j}$ be the unit price of the product at the $j$-th demand point, $T$ be the upper limit of the transportation time of the cold chain logistics center, $S$ be the total number of planning periods, $V$ be the variance of the demand quantity, and $W$ be the weight coefficient of the demand quantity. The objective function is then given in Eq. (1) [10,11].
This objective function is a minimization problem with four parts:
1. The first part $\sum\limits _{i=1}^{n}C_{i} x_{i}$ denotes the total cost of the selected candidate locations, where $C_{i}$ is the cost of the $i$-th candidate location and $x_{i}$ is the decision variable for selecting that location;
2. The second part $\sum\limits _{i=1}^{n}\sum\limits _{j=1}^{m}D_{ij} Q_{j} y_{ij}$ denotes the total transportation cost from the candidate locations to the demand points, where $D_{ij}$ is the distance from the $i$-th candidate location to the $j$-th demand point, $Q_{j}$ is the quantity demanded at the $j$-th demand point, and $y_{ij}$ is the allocation decision variable from the $i$-th candidate location to the $j$-th demand point;
3. The third part $\sum\limits _{i=1}^{n}\sum\limits _{j=1}^{m}R_{j}P_{j} Q_{j} D_{ij} y_{ij}$ denotes the cargo damage cost incurred in transporting from the candidate locations to the demand points, where $R_{j}$ is the cargo damage rate at the $j$-th demand point, $P_{j}$ is the unit price of the product at the $j$-th demand point, $D_{ij}$ is the distance from the $i$-th candidate location to the $j$-th demand point, and $y_{ij}$ is the allocation decision variable from the $i$-th candidate location to the $j$-th demand point;
4. The fourth part $\sum\limits _{i=1}^{n}\sum\limits _{j=1}^{m}\sum\limits _{s=1}^{S}W V_{ijs} z_{ijs}$ denotes the weighted transportation time from the candidate locations to the demand points over all planning periods, where $W$ is the weighting factor, $V_{ijs}$ is the transportation time from the $i$-th candidate location to the $j$-th demand point in the $s$-th planning period, and $z_{ijs}$ is the allocation decision variable from the $i$-th candidate location to the $j$-th demand point in the $s$-th planning period. (A minimal code sketch of evaluating this objective is given after this list.)
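To make the structure of the objective concrete, the following is a minimal evaluation sketch in Python; the array names and shapes ($n$ candidate locations, $m$ demand points, $S$ periods) are illustrative assumptions, since Eq. (1) itself is only referenced above.

```python
# Minimal sketch of evaluating the four cost terms described above; array
# names and shapes are illustrative assumptions, not the paper's notation.
import numpy as np

def objective(x, y, z, C, D, Q, R, P, V, W):
    # x: (n,), y: (n, m), z: (n, m, S), C: (n,), D: (n, m),
    # Q, R, P: (m,), V: (n, m, S), W: scalar
    fixed_cost     = np.sum(C * x)                                  # part 1: sum_i C_i x_i
    transport_cost = np.sum(D * Q[None, :] * y)                     # part 2: sum_ij D_ij Q_j y_ij
    damage_cost    = np.sum(R[None, :] * P[None, :] * Q[None, :] * D * y)  # part 3: sum_ij R_j P_j Q_j D_ij y_ij
    time_cost      = W * np.sum(V * z)                              # part 4: W * sum_ijs V_ijs z_ijs
    return fixed_cost + transport_cost + damage_cost + time_cost
```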
Constraints: Let $n$ be the number of candidate locations, $m$ be the number of demand
points, and $p$ be the upper limit of the number of cold chain logistics centers.
Then the constraints are Eqs. (2)-(8):
$x_{i}$ is a binary ($0$-$1$) decision variable indicating whether the $i$-th candidate location is selected as a cold chain logistics center: $x_{i} = 1$ means the location is selected, and $x_{i} = 0$ means it is not. It appears in the first part of the objective function and in several constraints, and is directly related to the cost of establishing a cold chain logistics center and to the feasibility of providing services to demand points. $y_{ij}$ is also a binary ($0$-$1$) decision variable, but it describes the assignment relationship from the $i$-th candidate location to the $j$-th demand point: $y_{ij} = 1$ means that there is a distribution or supply relationship from location $i$ to demand point $j$, and $y_{ij} = 0$ means there is no such relationship. It appears in the second and third parts of the objective function and in many constraints, and is closely related to the transportation cost and to whether the demand at each demand point can be met. In short, $x$ determines which candidate locations become actual cold chain logistics centers, and $y$ determines how these centers allocate services or resources to the demand points. These two variables work together to optimize the cost and efficiency of the entire cold chain logistics system.
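Eqs. (2)-(8) are referenced but not reproduced in this section. As an illustration only, the sketch below checks three constraints that are typical for this class of facility-location model (an upper limit of $p$ opened centers, single assignment of each demand point, and assignment only to opened centers); it is an assumption about their content, not a transcription of Eqs. (2)-(8).

```python
# Hedged sketch of a feasibility check; the three conditions below are assumed,
# typical facility-location constraints, not a transcription of Eqs. (2)-(8).
import numpy as np

def is_feasible(x, y, p):
    opened_limit  = np.sum(x) <= p                   # at most p centers are opened
    single_assign = np.all(np.sum(y, axis=0) == 1)   # each demand point served by exactly one center
    open_coupling = np.all(y <= x[:, None])          # a point can only be served by an opened center
    return bool(opened_limit and single_assign and open_coupling)
```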
The model innovatively takes period variables and demand variables into account, aiming
to comprehensively address the dynamics and uncertainties in the problem of cold chain
logistics center location. It is able to flexibly adapt to changes in market demand,
fluctuations in the supply chain, and the effects of climate, thus providing a more
accurate and practical solution [12,13]. Through the use of AI-based prediction and optimization methods, the model is able
to accurately predict the demand in different periods and further optimize the siting
decision by setting the weight coefficients and variances of the demand. This advanced
algorithm not only improves the solution efficiency, but also ensures the quality
of the site selection results, making the layout of the cold chain logistics center
more scientific and reasonable, and the operation more efficient. Overall, with its
in-depth understanding and effective handling of dynamics and uncertainty, as well
as its powerful prediction and optimization capabilities supported by AI technology, the model provides a new approach and practical tool for the cold chain logistics center location problem and is expected to deliver significant economic and social value in practice. The specific site selection process is shown in Fig. 3.
Fig. 3. Site selection process.
However, the model is a multi-objective, multi-variable problem whose solution is NP-hard, so designing an AI-based solution framework for it is a focus and difficulty of this paper. This paper adopts the following methods: (1) artificial intelligence techniques are used to analyze and model historical data in order to forecast the demand in different periods; (2) artificial intelligence techniques are used to solve the mathematical model of the cold chain logistics center location problem in order to generate optimal or near-optimal solutions [14,15].
3. Data-based Demand Forecasting
An important input to the cold chain logistics center location problem is the demand volume in different periods, i.e., the consumption of agricultural products
at each demand point in each period. The forecast of demand not only affects the location
and service scope of the cold chain logistics center, but also affects the transportation
cost and cargo damage cost of the cold chain logistics center. Therefore, demand forecasting
is a key step in the problem of cold chain logistics center location. However, demand
forecasting is not easy because demand is affected by many factors, such as changes
in market demand, fluctuations in the supply chain, and the influence of climate.
To address this problem, this paper uses artificial intelligence techniques to analyze and model the historical data so as to forecast the demand volume in different periods.
In this paper, historical data were collected from multiple data sources, including
production, price, sales, inventory, variety, and quality of agricultural products,
as well as population, income, consumption habits, and preferences at the point of
demand, and external factors such as climate, holidays, and policies. To analyze and model the preprocessed data with a temporal convolutional network (TCN) and extract the features and patterns of the demand volume, this paper carries out the following process:
First, let the input sequence be $\mathbf x=(x_{1}, x_{2}, x_{3}, \ldots, x_{T})$ and the output sequence be $\mathbf y=(y_{1}, y_{2}, y_{3}, \ldots, y_{T})$, where $T$ is the length of the sequence, and $x_{t}$ and $y_{t}$ are the historical and forecast demand at time $t$, $t=1, 2, \ldots, T$ [16,17].
Second, this paper constructs multiple TCN models based on different demand points,
where the input of each model is the historical demand sequence of a demand point
and the output is the future demand sequence of that demand point. In this paper,
we use a TCN architecture with residual connections, consisting of multiple convolutional and activation layers, in which the dilation factor of each convolutional layer grows exponentially to enlarge the receptive field. This paper also uses dropout and batch normalization layers to prevent overfitting and accelerate convergence.
Let the input of the $l$-th layer be $\mathbf h^{(l-1)} =(h_{1}^{(l-1)}, h_{2}^{(l-1)}, \ldots, h_{T}^{(l-1)})$ and its output be $\mathbf h^{(l)} =(h_{1}^{(l)}, h_{2}^{(l)}, \ldots, h_{T}^{(l)})$, where $l=1, 2, \ldots, L$, $L$ is the number of layers, $\mathbf h^{(0)}=\mathbf x$, and $\mathbf h^{(L)}=\mathbf y$. The formula for the $l$-th layer is shown in Eq. (9), where $\mathbf W^{(l)}$ is the convolution kernel parameter of the $l$-th layer and $F$ is the nonlinear transformation of the $l$-th layer, comprising convolution, activation, dropout, and batch normalization operations. Specifically, for each time step $t$, the computation is as shown in Eq. (10) [18].
where $k$ is the convolutional kernel size, $d_{l}$ is the dilation factor of layer $l$, $w_{i}^{(l)}$ is the $i$-th convolutional kernel weight of layer $l$, and $\sigma$ is the activation function.
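As an illustration of how the dilated causal convolution, residual connection, dropout, and batch normalization in Eqs. (9)-(10) can be realized, the following is a minimal sketch assuming PyTorch; the class name `TCNBlock`, channel count, kernel size, and dropout rate are illustrative assumptions rather than the paper's settings.

```python
# Minimal sketch of one dilated causal convolution block with a residual
# connection, assuming PyTorch; sizes and rates are illustrative assumptions.
import torch
import torch.nn as nn

class TCNBlock(nn.Module):
    def __init__(self, channels: int, kernel_size: int = 3, dilation: int = 1, dropout: float = 0.2):
        super().__init__()
        # Left-padding of (k - 1) * d keeps the convolution causal: the output
        # at time t depends only on inputs at times <= t.
        self.pad = (kernel_size - 1) * dilation
        self.conv = nn.Conv1d(channels, channels, kernel_size, dilation=dilation)
        self.norm = nn.BatchNorm1d(channels)
        self.act = nn.ReLU()
        self.drop = nn.Dropout(dropout)

    def forward(self, h):                              # h: (batch, channels, T)
        out = nn.functional.pad(h, (self.pad, 0))      # pad only on the left (past)
        out = self.drop(self.act(self.norm(self.conv(out))))
        return h + out                                 # residual: h^(l) = h^(l-1) + F(h^(l-1))

# Stack several blocks with exponentially growing dilation (1, 2, 4, 8)
# to enlarge the receptive field, as described above; input: (batch, 16, T).
tcn = nn.Sequential(*[TCNBlock(channels=16, dilation=2 ** l) for l in range(4)])
```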
Third, let the real demand sequence on the training set be $\mathbf y^{(train)}=(y_{1}^{(train)}, y_{2}^{(train)}, \ldots, y_{T}^{(train)})$ and the predicted demand sequence be $\hat{\mathbf y}^{(train)} =(\hat{y}_{1}^{(train)}, \hat{y}_{2}^{(train)}, \ldots, \hat{y}_{T}^{(train)})$. Then the MSE loss function is shown in Eq. (11).
where $\mathbf W=(\mathbf W^{(1)}$, $\mathbf W^{(2)}$, $\ldots$, $\mathbf W^{(L)}
)$ is the convolution kernel parameter for all layers. $\mathbf W$ is updated using
the Adam optimizer to minimize $\mathcal L(\mathbf W)$.
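A minimal sketch of the corresponding training step, assuming PyTorch, the MSE loss of Eq. (11), the Adam optimizer, and the `tcn` stack from the previous sketch; tensor shapes and the learning rate are illustrative assumptions.

```python
# Minimal sketch of one training step with an MSE loss and the Adam optimizer.
import torch

model = tcn                                    # the block stack sketched above
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = torch.nn.MSELoss()

def train_step(x_train, y_train):
    """x_train, y_train: tensors of shape (batch, channels, T)."""
    optimizer.zero_grad()
    y_pred = model(x_train)
    loss = loss_fn(y_pred, y_train)            # L(W): mean squared error, Eq. (11)
    loss.backward()                            # gradients w.r.t. all kernel parameters W
    optimizer.step()                           # Adam update of W
    return loss.item()
```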
Fourth, this paper also calculates confidence intervals for the predicted values, which indicate the uncertainty of the prediction, using a confidence level of 0.95; the upper and lower bounds of the intervals are based on the variance of the predicted values and a normality assumption. Let the true demand sequence on the test set be $\mathbf y^{(test)} =(y_{1}^{(test)}, y_{2}^{(test)}, \ldots, y_{T}^{(test)})$, the forecast demand sequence be $\hat{\mathbf y}^{(test)} =(\hat{y}_{1}^{(test)}, \hat{y}_{2}^{(test)}, \ldots, \hat{y}_{T}^{(test)})$, and the sequence of forecast variances be $\mathbf s^{(test)} =(s_{1}^{(test)}, s_{2}^{(test)}, \ldots, s_{T}^{(test)})$. Then the RMSE and MAPE are calculated as in Eqs. (12) and (13), and the confidence interval at each time step is $\hat{y}_{t}^{(test)} \pm z_{\alpha /2} \sqrt{s_{t}^{(test)}}$ [19].
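A minimal sketch of the evaluation metrics and the normal-approximation confidence interval described above, assuming NumPy arrays of equal length; the function name and interface are illustrative.

```python
# Minimal sketch of RMSE, MAPE, and the 0.95 confidence interval described above.
import numpy as np

def evaluate(y_true, y_pred, var_pred):
    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))               # Eq. (12)
    mape = np.mean(np.abs((y_true - y_pred) / y_true)) * 100.0    # Eq. (13), y_true != 0 assumed
    z = 1.96                                   # z_{alpha/2} for a 0.95 confidence level
    lower = y_pred - z * np.sqrt(var_pred)     # y_hat - z * sqrt(s)
    upper = y_pred + z * np.sqrt(var_pred)     # y_hat + z * sqrt(s)
    return rmse, mape, lower, upper
```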
The data-based demand forecast can accurately and effectively reflect the real demand of agricultural cold chain logistics, and modeling the data with the TCN can fully exploit the demand information contained in the data. In this paper, we first analyze the demand based on the data and the TCN model, and then use an intelligent algorithm for the subsequent site selection [20].
4. Intelligent Algorithm-based Model Solving
In this paper, a data-driven ant colony optimization (ACO) algorithm is used to solve the logistics center location problem for agricultural logistics transportation, and the model in this paper improves the traditional ACO algorithm in the following ways. (1) A data-driven idea is introduced: historical data and real-time data are used to dynamically adjust the parameters of the ACO algorithm, such as the pheromone volatilization coefficient and the heuristic factor, so that the algorithm can better adapt to changes in demand and environment, improving its efficiency and stability. (2) A multi-objective optimization approach is adopted, which comprehensively considers multiple objectives of logistics center location, such as minimizing the total transportation cost, maximizing the service level, and minimizing the environmental impact [21].
The basic principle of the ant colony algorithm is as follows. The probability of ant $k$ moving from city $i$ to city $j$ is given in Eq. (14), where $\eta _{ij}$ is the heuristic function, defined as the inverse of the distance between cities $i$ and $j$; $\tau _{ij}$ is the pheromone concentration; $\alpha$ and $\beta$ are the pheromone factor and the heuristic function factor; and $allow_{k}$ is the set of cities still to be visited by ant $k$. The amount of pheromone released by ant $k$ on the path between cities $i$ and $j$ is given in Eq. (15), where $Q$ is the pheromone constant and $L$ is the length of the path traversed by ant $k$. The total amount of pheromone released by all ants on the path between cities $i$ and $j$ is given in Eq. (16). The pheromone updating formula, in which $\rho$ is the pheromone volatilization coefficient, is given in Eq. (17). The specific process is shown in Fig. 4 [22].
Fig. 4. Flow of ACO algorithm.
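A minimal sketch of the transition probability of Eq. (14) and the pheromone update of Eqs. (15)-(17), assuming NumPy matrices for $\tau$ and $\eta$; parameter values and function names are illustrative assumptions.

```python
# Minimal sketch of the ACO transition probability (Eq. (14)) and pheromone
# update (Eqs. (15)-(17)); parameter values are illustrative assumptions.
import numpy as np

def transition_probabilities(tau, eta, allowed, i, alpha=1.0, beta=2.0):
    """Probability of ant k moving from node i to each node in `allowed` (Eq. (14))."""
    weights = np.array([(tau[i, j] ** alpha) * (eta[i, j] ** beta) for j in allowed])
    return weights / weights.sum()

def update_pheromone(tau, ant_paths, ant_lengths, rho=0.1, Q=1.0):
    """Evaporate pheromone and deposit Q / L_k on every edge used by ant k (Eqs. (15)-(17))."""
    delta = np.zeros_like(tau)
    for path, length in zip(ant_paths, ant_lengths):
        for i, j in zip(path[:-1], path[1:]):
            delta[i, j] += Q / length          # Eq. (15), summed over ants (Eq. (16))
    return (1.0 - rho) * tau + delta           # Eq. (17)
```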
In order to use the ant colony algorithm to solve the cold chain logistics center location problem, we need to design an appropriate encoding scheme, fitness function, neighborhood structure, and parameter settings, and carry out the initialization, iteration, and termination of the algorithm. The specific steps are as follows:
Neighborhood structure: We use a single-point variation to define the neighborhood
structure, i.e., we randomly choose a position in the encoding of a solution, invert
it, and obtain a new solution as a neighborhood solution of the original solution
[23].
Parameter setting: We set the number of ants to $N$ , the pheromone volatilization
coefficient to $\rho $, the pheromone intensity to $\tau $, the heuristic factor to
$\eta $ , the maximum number of iterations to $T$, and other related parameters.
Algorithm initialization: We randomly generate $N$ feasible solutions as the initial solution set, calculate the objective function value and fitness function value of each solution, initialize the pheromone intensity as $\tau _{0}$, and record the current optimal solution and optimal objective function value. In this paper, the pheromone volatilization coefficient is treated as a variable rather than a constant; its value is determined from historical data and real-time data, the specific calculation formula is shown in Eq. (18), and the heuristic factor $\beta$ is shown in Eq. (19) [24].
where $f_{best}$ denotes the objective function value of the optimal solution in the current iteration, $f_{avg}$ denotes the average objective function value of all solutions in the current iteration, and $e$ denotes the base of the natural logarithm. When the gap between the objective function value of the optimal solution and the average value is larger, the pheromone volatilization coefficient is smaller, which retains more pheromone and enhances the positive feedback effect; when the gap is smaller, the pheromone volatilization coefficient is larger, which reduces pheromone accumulation and avoids premature convergence [25].
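Since Eq. (18) is only referenced above, the following is a purely hypothetical illustration of the qualitative behavior just described (a larger gap between $f_{best}$ and $f_{avg}$ yields a smaller volatilization coefficient); the functional form and bounds are assumptions, not the paper's formula.

```python
# Hypothetical illustration only: Eq. (18) is not reproduced in this section,
# so this sketch encodes just the qualitative behavior described above.
# The functional form and bounds are assumptions, not the paper's formula.
import math

def adaptive_rho(f_best, f_avg, rho_min=0.1, rho_max=0.9):
    gap = abs(f_avg - f_best) / (abs(f_avg) + 1e-12)        # relative gap between best and average
    return rho_min + (rho_max - rho_min) * math.exp(-gap)   # larger gap -> rho closer to rho_min
```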
Algorithm iteration: The implication of Eq. (20) is that the non-dominated solutions at level $i$ are those that are not dominated by any other solution, i.e., no other solution is at least as good on every objective and strictly better on at least one. The crowding degree is then computed as shown in Eq. (21) [26], where $d_{i}$ is the crowding degree of the $i$-th solution, $m$ is the number of objective functions, $f_{j}^{(i+1)}$ and $f_{j}^{(i-1)}$ are the values of the $j$-th objective function for the two solutions adjacent to the $i$-th solution in the sorted order, and $f_{j}^{max}$ and $f_{j}^{min}$ are the maximum and minimum values of the $j$-th objective function. The meaning of this formula is that the crowding degree of the $i$-th solution is the normalized sum of the differences between its neighboring solutions on each objective: the larger the crowding degree, the better the diversity of the solution, i.e., the fewer neighbors it has in the objective space.
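A minimal sketch of the crowding degree of Eq. (21), following the standard NSGA-II crowding distance; the input `front` is assumed to be a list of objective vectors from one non-dominated level, and the boundary handling is an implementation assumption.

```python
# Minimal sketch of the crowding degree in Eq. (21) for one non-dominated front.
import numpy as np

def crowding_degree(front):
    front = np.asarray(front, dtype=float)     # shape: (num_solutions, m objectives)
    n, m = front.shape
    d = np.zeros(n)
    for j in range(m):
        order = np.argsort(front[:, j])        # sort solutions by objective j
        f_min, f_max = front[order[0], j], front[order[-1], j]
        d[order[0]] = d[order[-1]] = np.inf    # boundary solutions kept for diversity
        span = (f_max - f_min) or 1.0          # avoid division by zero
        for k in range(1, n - 1):
            i = order[k]
            d[i] += (front[order[k + 1], j] - front[order[k - 1], j]) / span
    return d
```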
We use four kinds of neighborhood actions: flipping one bit, swapping two bits, swapping two adjacent bits, and flipping two bits. They are defined as follows (a combined code sketch of the four operators is given after these definitions):
Neighborhood action 1: Flip one bit, i.e., for a solution $s=(s_{1}, s_{2}, \ldots, s_{n})$, randomly choose a position $i\in \{1, 2, \ldots, n\}$ and invert $s_{i}$ to get a new solution $s^{'}=(s_{1}, s_{2}, \ldots, s_{i-1}, 1-s_{i}, s_{i+1}, \ldots, s_{n})$, which serves as a neighborhood solution of the original solution. The neighborhood is the set of all solutions obtained by flipping one bit, i.e., $N_{1}(s)=\{s^{'}\mid s^{'}=(s_{1}, s_{2}, \ldots, s_{i-1}, 1-s_{i}, s_{i+1}, \ldots, s_{n}), i\in \{1, 2, \ldots, n\} \}$.
Neighborhood action 2: Swap two bits, i.e., for a solution $s=(s_{1}, s_{2}, \ldots, s_{n})$, randomly choose two positions $i, j\in \{1, 2, \ldots, n\}$ and swap $s_{i}$ and $s_{j}$ to get a new solution $s^{'}=(s_{1}, s_{2}, \ldots, s_{i-1}, s_{j}, s_{i+1}, \ldots, s_{j-1}, s_{i}, s_{j+1}, \ldots, s_{n})$, which serves as a neighborhood solution of the original solution. The neighborhood is the set of all solutions obtained by swapping two bits, i.e., $N_{2}(s)=\{s^{'}\mid s^{'}=(s_{1}, s_{2}, \ldots, s_{i-1}, s_{j}, s_{i+1}, \ldots, s_{j-1}, s_{i}, s_{j+1}, \ldots, s_{n}), i, j\in \{1, 2, \ldots, n\} \}$.
Neighborhood action 3: Swap two adjacent bits, i.e., for a solution $s=(s_{1}, s_{2}, \ldots, s_{n})$, randomly choose a position $i\in \{1, 2, \ldots, n-1\}$ and swap $s_{i}$ and $s_{i+1}$ to get a new solution $s^{'}=(s_{1}, s_{2}, \ldots, s_{i-1}, s_{i+1}, s_{i}, s_{i+2}, \ldots, s_{n})$, which serves as a neighborhood solution of the original solution. The neighborhood is the set of all solutions obtained by swapping two adjacent bits, i.e., $N_{3}(s)=\{s^{'}\mid s^{'}=(s_{1}, s_{2}, \ldots, s_{i-1}, s_{i+1}, s_{i}, s_{i+2}, \ldots, s_{n}), i\in \{1, 2, \ldots, n-1\} \}$ [27,28].
Neighborhood action 4: Flip two bits, i.e., for a solution $s=(s_{1}, s_{2}, \ldots, s_{n})$, randomly choose two positions $i, j\in \{1, 2, \ldots, n\}$ and invert $s_{i}$ and $s_{j}$ to obtain a new solution $s^{'}=(s_{1}, s_{2}, \ldots, s_{i-1}, 1-s_{i}, s_{i+1}, \ldots, s_{j-1}, 1-s_{j}, s_{j+1}, \ldots, s_{n})$, which serves as a neighborhood solution of the original solution. The neighborhood is the set of all solutions obtained by flipping two bits, i.e., $N_{4}(s)=\{s^{'}\mid s^{'}=(s_{1}, s_{2}, \ldots, s_{i-1}, 1-s_{i}, s_{i+1}, \ldots, s_{j-1}, 1-s_{j}, s_{j+1}, \ldots, s_{n}), i, j\in \{1, 2, \ldots, n\} \}$ [3].
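A minimal combined sketch of the four neighborhood actions on a binary (list) encoding; function names and the use of Python's `random` module are illustrative assumptions.

```python
# Minimal sketch of the four neighborhood actions on a binary encoding.
import random

def flip_one(s):                       # action 1: flip one bit
    i = random.randrange(len(s))
    t = s[:]; t[i] = 1 - t[i]
    return t

def swap_two(s):                       # action 2: swap two bits
    i, j = random.sample(range(len(s)), 2)
    t = s[:]; t[i], t[j] = t[j], t[i]
    return t

def swap_adjacent(s):                  # action 3: swap two adjacent bits
    i = random.randrange(len(s) - 1)
    t = s[:]; t[i], t[i + 1] = t[i + 1], t[i]
    return t

def flip_two(s):                       # action 4: flip two bits
    i, j = random.sample(range(len(s)), 2)
    t = s[:]; t[i] = 1 - t[i]; t[j] = 1 - t[j]
    return t
```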
Then, the processes of neighborhood search and variable neighborhood search can be described by the following operations:
Crossover operation: For two solutions $s_{1}$ and $s_{2}$, randomly choose a cut point $i\in \{1, 2, \ldots, n\}$ in their encodings and exchange the parts after the cut point, obtaining two new solutions $s_{1}^{'} =(s_{1}^{1}, s_{1}^{2}, \ldots, s_{1}^{i}, s_{2}^{i+1}, s_{2}^{i+2}, \ldots, s_{2}^{n})$ and $s_{2}^{'} =(s_{2}^{1}, s_{2}^{2}, \ldots, s_{2}^{i}, s_{1}^{i+1}, s_{1}^{i+2}, \ldots, s_{1}^{n})$, which are used as crossover solutions of the original solutions. If the objective function value of a crossover solution is less than that of the original solution, the original solution is replaced by the crossover solution; otherwise the original solution is kept. In this way, the crossover information increases the diversity of the solutions and allows more of the solution space to be explored, i.e., $s_1\leftarrow s_1^{'}$ and $s_2\leftarrow s_2^{'}$ if $f(s_{1}^{'})<f(s_{1})$ and $f(s_{2}^{'})<f(s_{2})$ [29,30].
Mutation operation: For a solution $s$, randomly select several positions $i_{1}, i_{2}, \ldots, i_{k} \in \{1, 2, \ldots, n\}$ in its encoding and invert them to obtain a new solution $s^{'}=(s_{1}, s_{2}, \ldots, s_{i_{1} -1}, 1-s_{i_{1}}, s_{i_{1} +1}, \ldots, s_{i_{2} -1}, 1-s_{i_{2}}, s_{i_{2} +1}, \ldots, s_{i_{k} -1}, 1-s_{i_{k}}, s_{i_{k} +1}, \ldots, s_{n})$, which serves as a mutation solution of the original solution. If the objective function value of the mutated solution is less than that of the original solution, the original solution is replaced by the mutated solution; otherwise the original solution is kept.
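A minimal sketch of the crossover and mutation operations with the improvement-only acceptance rule described above; the objective function `f` is assumed to be minimized, and names and interfaces are illustrative.

```python
# Minimal sketch of single-point crossover and multi-point mutation with
# improvement-only acceptance; f is the objective to be minimized.
import random

def crossover(s1, s2, f):
    i = random.randrange(1, len(s1))           # cut point
    c1, c2 = s1[:i] + s2[i:], s2[:i] + s1[i:]  # exchange the tails after the cut point
    s1 = c1 if f(c1) < f(s1) else s1           # accept only improving crossover solutions
    s2 = c2 if f(c2) < f(s2) else s2
    return s1, s2

def mutate(s, f, k=2):
    idx = random.sample(range(len(s)), k)      # k positions to invert
    t = s[:]
    for i in idx:
        t[i] = 1 - t[i]
    return t if f(t) < f(s) else s             # keep the mutation only if it improves f
```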
The epsilon-constraint method is an important multi-objective optimization strategy for complex decision-making problems with multiple conflicting objectives, and is especially suitable for selecting balanced solutions from the Pareto frontier. By fixing the lower bound (usually called the $\varepsilon$-level) of one objective function and then optimizing the remaining objective functions, the method gradually explores and refines the solution space to meet the optimal selection requirements under different preferences.
The specific operation steps are as follows: First, select an objective function $f_{k}$ as the constrained objective and set its lower bound to $\varepsilon$, i.e., $f_{k} \ge \varepsilon$. The problem is then transformed into maximizing or minimizing another objective function $f_{i}$ ($i\ne k$) while keeping $f_{k}$ not lower than $\varepsilon$. This process can be expressed as a series of single-objective optimization problems [31,32]:
where $x$ represents the set of decision variables and $X$ is the set of constraints
defining the feasible region of the problem [33,34].
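A minimal sketch of the epsilon-constraint loop described above: each $\varepsilon$ level yields one single-objective subproblem. `solve_subproblem` is a hypothetical placeholder for the single-objective solver (for example, the data-driven ACO described earlier); all names and interfaces are assumptions.

```python
# Minimal sketch of the epsilon-constraint sweep; solve_subproblem is a
# hypothetical placeholder for a single-objective solver over the feasible set X.
def epsilon_constraint(f_main, f_constrained, solve_subproblem, eps_values):
    """For each epsilon, optimize f_main subject to f_constrained(x) >= epsilon."""
    pareto_candidates = []
    for eps in eps_values:
        # The subproblem keeps all original constraints plus f_constrained(x) >= eps.
        x_opt = solve_subproblem(objective=f_main,
                                 extra_constraint=lambda x, eps=eps: f_constrained(x) >= eps)
        if x_opt is not None:
            pareto_candidates.append((eps, x_opt, f_main(x_opt)))
    return pareto_candidates

# Illustrative sweep over epsilon levels between the observed extremes of f_k:
# eps_grid = [f_k_min + t * (f_k_max - f_k_min) / 9 for t in range(10)]
```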
The advantage of this approach is that it systematically explores all possible combinations of objectives; by gradually changing the value of $\varepsilon$, the decision maker can obtain a series of optimal solutions, each representing the optimal configuration at a different objective level. This approach not only reveals the trade-offs between objectives, but also provides intuitive decision support for decision makers, enabling them to flexibly adjust objective priorities according to changes in the organization's strategic priorities or the external environment [35].