1. Introduction
In the rapidly evolving AEC (Architecture, Engineering, and Construction) industry,
innovative technologies are continually reshaping how projects are designed, managed,
and executed. One such technology, Building Information Modelling (BIM), has emerged
as a game-changing tool by enabling comprehensive digital representation of the physical
and functional characteristics of construction projects. At the same time, Decision
Support Systems (DSS) are increasingly utilized to assist architects and engineers
during the design stage by processing vast amounts of data and providing actionable
insights. The convergence of DSS and BIM, a fusion of cutting-edge technology with
construction practices, presents a transformative force that could redefine project
design, management, and cost effectiveness in an industry where sustainability, safety,
and accuracy are paramount (Ayman et al., 2022). This combination also offers remarkable opportunities to enhance the evaluation
and selection of BIM models.
A DSS is a technological tool designed to assist individuals and organizations in
making informed and effective decisions (Whyte, 1986). It analyzes large amounts of data to offer actionable insights that can enhance
decision making processes. By integrating various data sources, a DSS helps users
evaluate complex scenarios and make more accurate, data driven choices. An evaluation
metric, a core component of any DSS, is a procedure used to rank items in an extensive
database based on user specifications. It plays a central role in a DSS due to its
ability to sort items according to user preferences. Several evaluation metrics exist
to rank lists of items based on specific user needs. Concisely, scoring functions
are measures that condense multidimensional records into a single value, simplifying
complex data for recommendation ranking (Skiena, 2017).
The combination of DSS and BIM presents a remarkable opportunity to enhance DSS, optimize
workflows, and ultimately transform the construction sector (Asudeh et al., 2019). To ensure that the recommendations provided by these systems are accurate and aligned
with user preferences and project objectives, it is essential to have robust assessment
metrics in place. This is particularly important when there are multiple options,
each with varying degrees of significance, to choose from.
Among the well-known metrics, the frequently used Euclidean scoring has several shortcomings:
it suffers from the curse of dimensionality, it is sensitive to outliers, and it assumes
orthogonal features (Yang & Alessandrini, 2019). The second frequently used metric, Cosine similarity, is
insensitive to vector magnitude (Xia et al., 2015).
This study tackles a significant research challenge by exploring the integration of
BIM and DSS, both of which have demonstrated considerable potential in their respective
domains. As these methodologies have rapidly evolved, there is an increasing demand
for their convergence to address complex applications requiring joint outputs. However,
the key challenge lies in optimizing their combined efficacy (Tan et al., 2021). The objective of this research is to investigate the optimal use of Euclidean and
Cosine similarity measures within a BIM-centric DSS, particularly for managing high-dimensional
BIM datasets. The study aims to elucidate how these computational metrics can enhance
DSS recommendations, particularly in the construction industry.
2. Literature Review
2.1 Integration of DSS and BIM
The integration of DSS and BIM has attracted attention due to its potential to enhance
recommendation systems, particularly in the context of the AEC industry. DSS provides analytical
techniques to process complex data, while BIM offers detailed digital representation
of construction projects. Combining these two paradigms enables refined decision making
regarding design, cost estimation and project management.
For instance, Fazeli et al. (2020) proposed an integrated BIM-DSS framework for cost estimation, enabling real-time
budgeting and financial recommendations. This semi-automated, BIM-based cost estimation
offered a flexible and adaptive framework for assessing the financial impact of various
design scenarios. The BIM-centric approach demonstrated superior performance compared
to traditional methods, particularly in terms of efficiency and accuracy.
A critical aspect of a DSS's effectiveness in processing large datasets is its evaluation
metric or scoring function (Boukhayma et al., 2020). These metrics, which are quantifiable measures, play a pivotal role in assessing
the efficiency, utility, and output quality of a system within a DSS framework (Hossin & Sulaiman, 2015). By delivering objective, numerical insights, evaluation metrics provide a clear
understanding of how well the system achieves its intended goals and requirements.
Moreover, they form a crucial link between theoretical models and real-world applications
(M. Zhang, 2015; Hastie, 2012). These metrics facilitate a deeper quantitative analysis of the DSS's internal mechanisms,
including its accuracy, computational efficiency, and user satisfaction (ENSIAS, Mohammed
V, Morocco et al., 2020). Such comprehensive evaluations enable continuous improvement
of DSS, ensuring it remains an indispensable tool for decision-making in increasingly
complex environments.
Another application is sustainability and energy efficiency in building
design (Fenz et al., 2023; Shen et al., 2023), where BIM models are linked with DSS to optimize energy performance. The system
simulates various design options, such as HVAC systems or material choices, and provides
a decision matrix to help engineers choose the most energy efficient solution. Common
tools for this include Green Building Studio and EnergyPlus, with decision evaluation
indicators such as energy consumption (kWh/m²/year), CO2 emissions, and LEED certification
levels guiding the decision process.
Yet another case of BIM-DSS integration is construction scheduling and cost estimation
(Kubba et al., 2012). BIM models that incorporate time (4D) and cost (5D) allow project
managers to evaluate the effects of schedule changes, cost fluctuations, and resource
allocation. DSS tools like Primavera and Synchro help project managers analyze project timelines,
labor costs, and material expenses to develop alternative project scenarios. Decision
indicators in this context often include total project cost, project completion time,
labor hours, and budget variance. Furthermore, BIM-DSS integration proves useful in
disaster management and emergency planning by simulating scenarios such as earthquakes
or fires, which helps design effective response strategies (Alavi et al., 2022). Systems like ALOHA and Simudyne use BIM data to provide real-time feedback on evacuation
routes and structural integrity, with indicators such as time to evacuate, structural
safety index, and risk mitigation score.
In facility management and asset maintenance, BIM models linked with DSS are used
to predict and plan the maintenance needs of building systems like HVAC or electrical
networks. Tools like IBM Maximo and Archibus help managers make data driven decisions
on equipment repairs, replacements, and upgrades by evaluating equipment lifespan,
failure rates, and maintenance costs (Dashti et al., 2021). Key indicators include equipment downtime, maintenance costs, and life cycle cost.
Lastly, urban planning and infrastructure development are increasingly benefiting
from BIM-DSS linkages. City planners use these systems to evaluate transportation
systems, public facilities, and utilities, allowing for more informed long-term planning.
Tools such as InfraWorks and ArcGIS help assess transportation flow, population growth,
and environmental impacts, with decision indicators including traffic density, infrastructure
cost, environmental impact score, and population growth rate.
The integration of BIM with DSS offers a powerful way to optimize decision making
in AEC projects. By linking comprehensive digital models with sophisticated analytical
tools, stakeholders can evaluate design alternatives, costs, risks, and sustainability
metrics, leading to better-informed, data-driven decisions across various domains
of the construction industry.
2.2 Evaluation Metrics in DSS for BIM Assessment
In the context of BIM centered DSS, evaluation metrics are not only essential for
processing and interpreting BIM data but also play a pivotal role in ensuring the
system provides accurate and meaningful recommendations. The selection of appropriate
metrics is critical, as it directly influences the ability of recommendation systems
(RS) to effectively rank and compare BIM models based on user specifications, design
constraints, and project requirements. A well-chosen metric enables the system to
capture the nuances of the data, thereby improving the precision of recommendations
and aligning them more closely with user needs.
Although a variety of metrics have been investigated for such applications, Euclidean
distance and Cosine similarity remain two of the most frequently employed due to their
computational efficiency and effectiveness in handling low dimensional datasets. Euclidean
distance calculates the direct, straight line distance between two points in multidimensional
space, making it intuitive for spatial comparisons. In contrast, Cosine similarity
measures the cosine of the angle between two vectors, which makes it particularly
suitable for evaluating the orientation of data points, regardless of their magnitude.
This makes Cosine similarity useful in applications where the relative positioning
of data, rather than their absolute values, is important.
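The contrast between the two measures can be illustrated with a minimal Python sketch. The two feature vectors are hypothetical and chosen so that one is a scaled copy of the other; the function names are illustrative and are not taken from the study.

```python
import math

def euclidean_distance(a, b):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical feature vectors: b points the same way as a but is twice as long.
a = [3.0, 4.0]
b = [6.0, 8.0]

print(euclidean_distance(a, b))  # 5.0: the magnitudes differ
print(cosine_similarity(a, b))   # 1.0 up to rounding: the directions coincide
```

Under these assumed values, the Euclidean distance reports a clear difference while the cosine similarity treats the two vectors as equivalent, which is the behaviour described above.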
However, when these metrics are applied to high dimensional BIM datasets, they encounter
notable limitations. Euclidean distance, for instance, suffers from what is commonly
known as the "curse of dimensionality," where its ability to differentiate between
data points diminishes as the number of dimensions increases. This phenomenon causes
distances between points to converge, reducing the metric's effectiveness in distinguishing
between BIM models with complex, high dimensional attributes. Similarly, Cosine similarity,
while proficient at identifying directional relationships in low dimensional spaces,
may fail to provide significant differentiation between vectors in high dimensional
BIM data, where many points may appear to have similar orientations despite their
distinct attributes. These challenges underscore the need for advanced or hybrid metrics
to address the intricacies of high dimensional BIM models (Yang & Alessandrini, 2019). Candidate remedies include dimensionality reduction methods (e.g., Principal
Component Analysis, t-SNE) and hybrid metrics that combine multiple measures.
These approaches can better capture the intricate relationships within the
data.
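The convergence of distances in high dimensions can be observed with a small experiment. The following is a minimal sketch on uniformly random synthetic points, not on BIM data; the function name, the sample size, and the two dimensionalities are assumptions made for illustration.

```python
import math
import random

def relative_contrast(dim, n_points=200, seed=0):
    """(max - min) / min of the distances from one random query to random points."""
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        dists.append(math.dist(query, point))
    return (max(dists) - min(dists)) / min(dists)

low_dim_contrast = relative_contrast(dim=2)
high_dim_contrast = relative_contrast(dim=500)

# A smaller contrast means the nearest and farthest points are harder to tell apart.
print(low_dim_contrast, high_dim_contrast)
```

In this setup the contrast in 500 dimensions is far smaller than in 2 dimensions, which is the sense in which Euclidean distance loses discriminative power as dimensionality grows.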
The concept of "garbage in, garbage out" (GIGO) is particularly pertinent when discussing
evaluation metrics for data driven systems, including artificial intelligence and
machine learning models (Geiger et al., 2020; Canbek et al., 2022). Inadequate or improperly chosen evaluation metrics can degrade the quality and reliability
of a system’s output, potentially leading to flawed decision making. Without appropriate
metrics to regulate the system's performance, several critical issues may arise, such
as misleading performance assessments or failure to detect input biases (Kilkenny et al., 2018). To mitigate these risks, conducting a thorough sensitivity analysis of the evaluation
metrics used in DSS is essential. This involves selecting metrics from a diverse set
that captures different dimensions of model performance, depending on the specific
problem and domain context. Generic metrics may fail to robustly capture the unique
output requirements of complex systems like BIM and DSS, underscoring the need for
developing domain specific metrics tailored to the particular nuances of these systems.
In a digital landscape where information is generated at an unprecedented volume and
velocity, scoring functions play a crucial role in extracting meaningful insights,
thereby facilitating informed, fact based decision making (Asudeh et al., 2019; Stojkovic et al., 2017). These metrics are instrumental in enabling more accurate and effective decisions
that lead to improved outcomes (Ayman et al., 2022). Additionally, scoring functions promote the paradigm of continuous improvement by
systematically evaluating the system's performance, allowing for ongoing enhancements
and adaptability in dynamic environments.
The scope of this study focuses on a comprehensive examination of Minkowski like distance
functions, specifically Euclidean and soft cosine distance metrics, which are fundamental
evaluation measures widely used in fields such as machine learning, data analysis,
and RS. These metrics play a critical role in real world applications like clustering
and text document analysis, where they facilitate the measurement of similarity or
dissimilarity between data points, contributing to more accurate and efficient model
performance (Korenius et al., 2007). Therefore, refining these metrics for domain specific applications will further
enhance their effectiveness in complex data environments.
In the context of the modern AEC industry, such a study holds undeniable significance.
The integration of BIM and DSS is driving a paradigm shift within the construction
sector (Nursal et al., 2015a). The fusion of these two technologies offers substantial potential for optimizing
workflows across the industry. However, to fully realize these benefits, the establishment
of a suitable evaluation criterion is essential, ensuring that the system generates
accurate recommendations based on user input. Furthermore, this need aligns with broader
technological advancements in BIM, AI, and the construction industry, addressing critical
concerns such as safety and sustainability (Boukhayma et al., 2020). BIM based DSS has the potential to dramatically enhance project design and management,
ultimately reducing costs and minimizing delays in the high stakes construction environment
(Geil, 2011). The integration, supported by robust evaluation metrics, represents a transformative
approach that can significantly elevate the efficiency and sustainability of the construction
industry.
2.3 Practical Implications of Metric Selection
Similarity metrics are extensively utilized in recommendation systems, especially
for items with multiple features. These metrics can be optimized and tailored to enhance
the performance of DSS in various scenarios. By effectively applying Euclidean and
Cosine similarity, the system can better manage complex, multidimensional data and
deliver more precise and relevant recommendations.
From a practical standpoint, the selection of evaluation metrics within a DSS has
significant implications for the outcomes and overall success of a project. Accurate
ranking and selection of BIM models based on user preferences can lead to improved
design decisions, directly influencing the quality and efficiency of AEC projects,
which are inherently complex and resource-intensive. Consequently, investigating the
impact of various evaluation metrics on the performance of BIM-centric DSS is crucial
to optimizing decision-making processes in these demanding environments.
In a recommendation system used for suggesting BIM models, user satisfaction is heavily
influenced by the accuracy of the system’s recommendations. Therefore, it is crucial
to understand the specific conditions under which the aforementioned metrics perform
well and identify instances where they may fall short. The literature review examines
scenarios where these two metrics demonstrate clear advantages, while also highlighting
their potential limitations and drawbacks. This enables professionals to make informed
decisions about when and how to apply these measures effectively.
A thorough examination of Euclidean and Cosine similarity metrics across various applications,
including BIM models, reveals a significant gap in the literature: there is no comprehensive
analysis of how these metrics can be used together within BIM-based
DSS. This omission is crucial, as it overlooks the potential interactions that may
arise from combining the two metrics, especially in handling high dimensional data
typical of BIM systems. A deeper understanding of the complementary strengths and
potential enhancements offered by integrating these metrics could lead to more effective
decision support and recommendation systems in BIM applications (Omar, 2014). This makes the exploration of their combined use a highly valuable area for further
research.
Focusing on their effectiveness in handling multifeatured user input, this study aims
to clarify how Euclidean and Cosine similarity metrics can be optimized for maximum
impact in BIM model DSS. Given that user satisfaction relies on recommendation accuracy,
it is crucial to identify the conditions where these metrics perform well and where
they fall short (Inhan et al., 2024). A deeper understanding of these metrics will provide valuable insights into their
strengths and limitations in complex data environments.
A DSS is a sophisticated information system engineered to assist in data-driven
decision making. DSS leverages advanced algorithms, data analytics, and modeling techniques
to streamline decision making processes. By integrating and synthesizing data from
multiple sources, DSS processes complex datasets to generate insights, offering clear
and actionable information. These systems enhance decision making efficacy by providing
an intuitive interface that translates raw data into comprehensive, accessible formats
tailored to strategic planning and operational needs. By delivering critical, context-specific
knowledge, they play a pivotal role in recommendations. This contributes to
improved overall organizational reputation (Yasnoff & Miller, 2014; W. Holsapple, 2008). Additionally, DSS employs evaluation metrics and scoring functions to systematically
assess alternatives, allowing for more informed and quantitatively-backed decision
outcomes.
2.4 Theoretical Review
Each evaluation metric offers distinct advantages and limitations depending on the
specific context (Linnet, 1988). Drawing from the extensive body of literature on cluster analysis, Euclidean similarity
is frequently employed as a reliable method. The effectiveness of Euclidean distance,
which underpins Euclidean similarity, is consistently highlighted for its robustness
and accuracy in pattern recognition and data clustering applications (Dhawan et al., 2015).
Due to arbitrary scaling caused by small errors in input vectors, Euclidean similarity
can inaccurately reflect dissimilarity between vectors. This makes Euclidean distance
impractical for use in high dimensional spaces or with large datasets without appropriate
adjustments. When dealing with multidimensional vectors, it is crucial to normalize
each vector component to mitigate the influence of one component over another, as
Euclidean similarity is sensitive to differences in scale.
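One common way to carry out this normalization is min-max scaling of each feature. The sketch below is an illustration under that assumption; the records and the function name are hypothetical, and the study does not state which normalization it applies at this step.

```python
def min_max_normalize(rows):
    """Rescale each column of a list of equal-length rows to the range [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    normalized = []
    for row in rows:
        normalized.append([
            (v - lo) / (hi - lo) if hi > lo else 0.0  # constant columns map to 0
            for v, lo, hi in zip(row, lows, highs)
        ])
    return normalized

# Hypothetical records: (number of floors, floor area in square meters).
models = [[2, 90.0], [6, 200.0], [10, 300.0]]
scaled = min_max_normalize(models)
print(scaled)
```

Without such scaling, the area column (tens to hundreds) would dominate the floor count (single digits) in any Euclidean comparison.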
Cosine similarity is another widely used scoring metric, which ranks items by measuring
the angle between their vectors (J. Zhang et al., 2022). Unlike Euclidean distance, this metric is scale-invariant and insensitive to the
magnitude of the vectors, making it particularly useful for comparing data regardless
of size. Due to its focus on direction rather than magnitude, cosine similarity is
considered an unbiased and effective method for analyzing multidimensional data.
The combined Cosine-Euclidean similarity distance measure is a hybrid metric that
integrates both cosine similarity and Euclidean distance. This measure captures both
angular relationships and magnitude differences between vectors. By balancing the
direction sensitive nature of cosine similarity with the magnitude aware Euclidean
distance, this metric provides a more comprehensive assessment of similarity. It is
particularly useful in situations where both vector orientation and distance contribute
to meaningful comparisons, enhancing accuracy in multidimensional or complex datasets.
The combined Cosine-Euclidean similarity distance measure is given by
The equation combines both distance measures, effectively balancing their respective
influences on the overall similarity score. This hybrid metric enables the consideration
of both the position (from Euclidean distance) and direction (from cosine similarity)
of data points within a unified distance framework. Consequently, the resulting similarity
score is more refined and comprehensive than those derived from either individual
metric alone (Mohd & Abdullah, 2018). Alternatively, a parallel approach can be applied, where both similarity metrics
are calculated independently, and N items are selected from each ranked list after
the ranking process.
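A hybrid of this kind can be sketched as follows. The convex combination with weight alpha and the mapping 1 / (1 + d) from distance to similarity are assumptions made for illustration and may differ from the equation used in the study.

```python
import math

def hybrid_similarity(a, b, alpha=0.5):
    """Blend cosine similarity with a distance-based similarity.

    ASSUMPTION: the convex combination and the 1 / (1 + d) mapping are
    illustrative choices, not the formula used in the study.
    """
    dot = sum(x * y for x, y in zip(a, b))
    cos_sim = dot / (math.hypot(*a) * math.hypot(*b))
    euclid_sim = 1.0 / (1.0 + math.dist(a, b))
    return alpha * cos_sim + (1.0 - alpha) * euclid_sim

query = [3.0, 4.0]
identical = [3.0, 4.0]            # same direction, distance 0
same_direction_far = [6.0, 8.0]   # same direction, distance 5

print(hybrid_similarity(query, identical))
print(hybrid_similarity(query, same_direction_far))
```

Cosine similarity alone cannot separate these two candidates, whereas the blended score ranks the identical vector higher because the distance term penalizes the difference in magnitude.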
Despite advancements in BIM and DSS technologies, there remains a significant gap
in comprehensive studies comparing different scoring methods used within DSS for BIM
model assessment. Existing research tends to focus either on the implementation of
BIM or the independent development of DSS, overlooking the critical influence that
various scoring methodologies have on the evaluation processes (Mattiussi et al., 2014; Liu et al., 2019). This gap underscores the need for a systematic study to evaluate and compare these
scoring methods, assessing their effectiveness and suitability across diverse application
scenarios. In particular, few studies utilize a combination
of the previously discussed scoring metrics, and the potential advantages of integrating
Euclidean distance and cosine similarity for item ranking in DSS remain largely unexplored.
Existing literature indicates that, when used independently, both metrics often provide
inadequate recommendations within DSS frameworks. Despite the scaling sensitivity
of Euclidean distance and the magnitude insensitivity of cosine similarity, the two
are seldom combined. However, studies suggest that a hybrid scoring function could
offer improved similarity scores and more accurate recommendations.
3. Methodology
This study explores the application, benefits, and limitations of evaluation metrics
within a DSS for BIM models, focusing on scenarios where one metric may outperform
the other. It provides an in-depth analysis of recent advancements and modifications
in these distance metrics. Alongside highlighting the strengths of Euclidean distance
and cosine similarity, the study also addresses their limitations, particularly in
handling high-dimensional data, where challenges related to scaling, normalization,
and preprocessing significantly impact their effectiveness.
Fig. 1. Methodology for the study
The methodology (as shown in <Fig. 1>) begins with a review of DSS and relevant literature to establish evaluation criteria.
Next, evaluation metrics are selected, and the system is assessed based on these metrics.
The results are analyzed using ANOVA to identify significant differences. Finally,
the metrics are ranked, prioritizing key factors for system performance. This approach
ensures systematic evaluation and refinement of the metrics used to assess DSS or
similar systems.
Evaluation metrics alone can help identify areas of performance, but without statistical
analysis, it is difficult to say whether the differences in performance are significant.
ANOVA, on the other hand, tests whether the differences are statistically
meaningful. Combining these tools allows for a comprehensive assessment: performance
is measured first, and the observed differences are then checked for statistical significance.
This helps to identify which metrics show variations that are statistically meaningful.
Finally, in the last stage, the ranking of evaluation metrics is performed. This ranking
allows prioritization of the most important metrics, providing insight into which
metrics are most influential in the system's performance. The methodology, as outlined
in <Fig. 1>, ensures a systematic approach to evaluating and refining the metrics used to assess
DSS.
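The ANOVA step can be illustrated with a minimal sketch of the one-way F statistic, computed as the ratio of between-group to within-group mean squares. The three score lists are hypothetical values invented for the example and are not results from the study.

```python
def one_way_anova_f(groups):
    """Return the F statistic of a one-way ANOVA over lists of scores."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within

# Hypothetical scores produced by three scoring methods on the same queries.
euclidean_scores = [1.0, 2.0, 3.0]
cosine_scores = [2.0, 3.0, 4.0]
hybrid_scores = [5.0, 6.0, 7.0]

f_stat = one_way_anova_f([euclidean_scores, cosine_scores, hybrid_scores])
print(f_stat)  # approximately 13.0 for these made-up values
```

A large F value indicates that the group means differ by more than the within-group scatter would explain. Turning F into a p-value requires the F distribution; if SciPy is available, `scipy.stats.f_oneway` returns both the statistic and the p-value.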
To explore the potential advantages of the hybrid methodology, this study aims to
demonstrate how combining Euclidean and cosine similarity metrics can lead to more
accurate recommendations within a BIM recommendation system. While the actual implementation
of the recommendation system is beyond the scope of this study, the research focuses
on establishing a scoring framework that can be easily integrated into such systems.
By comparing the performance of individual scoring methods, such as Euclidean distance
and cosine similarity, against the combined scoring function, the study evaluates
the effectiveness of the hybrid approach. This comparative analysis offers valuable
insights into the flexibility of the hybrid scoring algorithm, particularly in enhancing
context-awareness and improving suggestion accuracy in recommendation systems.
Fig. 2. A list of scoring measures used in the study
<Fig. 2> illustrates the evaluation process for the methodology that leverages multiple similarity
scoring methods for data analysis. Data from a central database is processed through
different similarity metrics, including Euclidean similarity, cosine similarity, and
a combined similarity approach that integrates both. Each of these scoring methods
generates results that are then fed into an evaluation framework, where a final score
is produced. This score represents a comprehensive assessment, taking into account
the quantitative similarities between data points.
The selected metrics were chosen due to their distinct advantages. Euclidean distance
is highly sensitive to feature magnitudes, making it suitable for cases where absolute
differences between features are critical. In contrast, cosine similarity focuses
on the directional relationship between vectors, which is particularly effective when
feature magnitudes vary, but the overall direction is the key to similarity. By integrating
these two metrics, the hybrid model aims to capitalize on their complementary strengths:
Euclidean distance distinguishes items based on feature magnitude, while cosine similarity
adds context by emphasizing vector orientation.
The hybrid recommendation approach integrates Euclidean distance and cosine similarity
to improve both accuracy and contextual sensitivity. Euclidean distance captures variations
in feature magnitudes, whereas cosine similarity emphasizes the orientation of vectors,
offering a balance between absolute differences and proportional relationships. This
is intended to yield more accurate recommendations. Alternative metrics, such as Jaccard similarity,
Manhattan distance, and Pearson correlation, were excluded due to their limitations
in handling complex, multidimensional data or their lack of the versatility and contextual
depth afforded by the hybrid model.
3.1 User Input Requirements
The input for the DSS consists of nine choices with multiple sub-options. The menu items
include number of floors, minimum area, maximum area, balcony, toilets per floor,
number of basements, parking type, façade type, and roof type. The range of
each input is constrained. First, the number of floors varies from two to ten, and the
area varies from 90㎡ to 300㎡. Second, the balcony option has only two choices,
yes or no, indicating the presence or absence of a balcony; the toilets per floor range from
3 to 5, whereas the number of basements ranges from none to 4. Third, there are two
available parking types, namely ground and basement, and a total of 8 façade types,
with only two roof types, flat and sloped.
The data types for the DSS inputs are as follows. The building's available space
can be anywhere from 90 square meters to 300 square meters, and the option for a balcony
is a straightforward "Yes" or "No," indicating whether one is present or not. Users
can choose between 3 and 5 restrooms per floor. There can
be as few as no basements and as many as four. There are two distinct roof types,
"flat" and "sloped," and a total of eight façade types to choose from when customizing
the building design; parking options include "ground" and "basement." This combination
of choices leads to a total of 21 × 2 × 3 × 5 × 8 × 2 = 10,080 unique design choices.
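The size of the design space can be checked directly. The factors below are copied as printed; the text does not state which menu item each factor corresponds to, so they are left unlabeled in this sketch.

```python
import math

# Factors exactly as printed in the text; which menu item each factor
# corresponds to is not stated there, so they are left unlabeled.
option_counts = [21, 2, 3, 5, 8, 2]

total_designs = math.prod(option_counts)
print(total_designs)  # 10080
```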
The system is designed to recommend BIM models for office and residential building
design based on the input parameters described above. The system links user defined
preferences with the BIM model, enabling precise and context aware design recommendations.
The system guides the final recommendation by presenting the most suitable design options
that align with user requirements.
Two scoring functions are used by the DSS to make design recommendations. First, it
scores and ranks the options in the database using the weighted norm (weighted Euclidean
distance), resulting in the top two design options. Second, the RS uses cosine similarity
to rank the database entries and present a further design option.
These two scoring methods were selected to make up for each other's shortcomings.
When the user input vector is far from the target, the weighted Euclidean distance
frequently performs well compared to cosine similarity. The weighted Euclidean distance is given
by:
where x_u is the user input vector, x_i is the input vector of the i-th model file in the
database, and w_i is the weight of a particular input feature, applied to the distance between
input features. This can be seen in <Fig. 3>, where the distance to all points is calculated for a given point.
Fig. 3. Euclidean similarity score in a feature cluster
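A weighted Euclidean ranking of this kind can be sketched in Python. The sketch assumes the standard form in which each squared feature difference is multiplied by its weight before summation; the model names, feature values, and weights are hypothetical and are not taken from the study.

```python
import math

def weighted_euclidean(user, model, weights):
    """sqrt(sum_j w_j * (user_j - model_j)^2) over the feature index j."""
    return math.sqrt(sum(w * (u - m) ** 2 for u, m, w in zip(user, model, weights)))

def rank_by_distance(user, database, weights):
    """Return model names sorted from nearest to farthest."""
    return sorted(database, key=lambda name: weighted_euclidean(user, database[name], weights))

# Hypothetical, already-normalized features: (floors, area, toilets per floor).
user = [0.5, 0.5, 0.5]
database = {
    "model_a": [0.5, 0.5, 0.5],
    "model_b": [0.5, 0.6, 0.5],
    "model_c": [1.0, 0.0, 0.0],
}
weights = [1.0, 2.0, 1.0]  # assumed weights, not taken from the study

ranking = rank_by_distance(user, database, weights)
print(ranking[:2])  # ['model_a', 'model_b']
```

Raising the weight of a feature makes mismatches in that feature more costly, so the weights express which inputs the user cares about most.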
For the weighted Euclidean distance, a score normalization procedure is established
to ensure consistency between the scores produced by these two metrics. The scores
from various metrics are made compatible with one another by using the following normalization
procedure, which can be universally applied to other metrics if necessary.
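One simple normalization of this kind divides each distance by the largest distance and inverts the result, so that 1 denotes the closest entry. This is an assumed, commonly used choice shown only as a sketch; it is not necessarily the procedure adopted in the study.

```python
def distances_to_scores(distances):
    """Map distances to [0, 1] scores where 1 means closest.

    ASSUMPTION: dividing by the largest distance is one common choice;
    the study's own normalization formula is not reproduced here.
    """
    d_max = max(distances)
    if d_max == 0:
        return [1.0 for _ in distances]
    return [1.0 - d / d_max for d in distances]

# Hypothetical weighted Euclidean distances for four database entries.
scores = distances_to_scores([0.0, 1.0, 2.0, 4.0])
print(scores)  # [1.0, 0.75, 0.5, 0.0]
```

After this step the distance-based scores share the "higher is better" orientation of cosine similarity, which makes the two directly comparable.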
The scoring criterion uses Eq. (3) to rank the database in order and yields the first three results to the DSS. An alternative
method for the recommendation is to utilize both scoring metrics independently.
The pseudocode for such scoring is as follows.
Similarly, the cosine similarity is given by the following equations.
Both similarity scores are used to generate two rankings of the database. The evaluation
criteria return a total of three designs. The first two are returned from the Euclidean
distance ranking, whereas the last one is selected from the cosine similarity ranking.
This is done to compensate for the flaws of the individual metrics, as the Euclidean
scoring is good at differentiating between spaced clusters of data, whereas cosine
similarity gives better results for points inside a cluster. Overall, the result
from Eq. (3) gives better results and is much simpler to implement.
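The independent-ranking scheme (two designs from the Euclidean ranking and one from the cosine ranking) can be sketched as follows. For brevity the sketch uses unweighted distance, and it skips models that were already selected so that the three results differ; both are assumptions, and all names and values are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def recommend(user, database, n_euclid=2, n_cosine=1):
    """Pick the n_euclid nearest models, then the n_cosine most aligned others."""
    by_distance = sorted(database, key=lambda k: math.dist(user, database[k]))
    picks = by_distance[:n_euclid]
    by_cosine = sorted(database, key=lambda k: cosine_similarity(user, database[k]),
                       reverse=True)
    # ASSUMPTION: skip models already chosen so the three results differ.
    picks += [k for k in by_cosine if k not in picks][:n_cosine]
    return picks

# Hypothetical two-feature database.
user = [2.0, 2.0]
database = {
    "model_a": [2.0, 2.0],
    "model_b": [2.0, 3.0],
    "model_c": [8.0, 8.0],
    "model_d": [0.0, 5.0],
}
print(recommend(user, database))  # ['model_a', 'model_b', 'model_c']
```

In this example the third pick, model_c, is far from the user input in absolute terms but has the same proportions, which is the kind of alternative the cosine ranking is meant to surface.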
Fig. 5. Cosine similarity score in a feature cluster
If a user's input vector is [6 floors, 200㎡, Balcony: Yes, 4 toilets per floor, 2
basements, Parking Type: Basement, Façade Type 5, Roof Type: Flat], the DSS might
identify BIM Model A as an exact match with a distance of zero, BIM Model B as a close
second with a minimal distance, and BIM Model C as a highly similar alternative based
on cosine similarity.
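Before any distance can be computed, such an input has to be turned into a numeric vector. The following sketch shows one possible encoding of the example above; the field names, the binary coding of the two-valued options, and the one-hot coding of the façade type are all assumptions, as the encoding scheme is not described in the text.

```python
# Hypothetical encoding of the example input; the study's actual
# encoding scheme is not described in the text.
FACADE_TYPES = 8

def encode(spec):
    """Turn a specification dict into a flat numeric feature vector."""
    vector = [
        float(spec["floors"]),
        float(spec["area_m2"]),
        1.0 if spec["balcony"] else 0.0,
        float(spec["toilets_per_floor"]),
        float(spec["basements"]),
        1.0 if spec["parking"] == "basement" else 0.0,
        1.0 if spec["roof"] == "sloped" else 0.0,
    ]
    # One-hot encode the facade type so that type 5 is not treated as
    # numerically closer to type 4 than to type 1.
    facade = [0.0] * FACADE_TYPES
    facade[spec["facade_type"] - 1] = 1.0
    return vector + facade

user_spec = {
    "floors": 6, "area_m2": 200, "balcony": True, "toilets_per_floor": 4,
    "basements": 2, "parking": "basement", "facade_type": 5, "roof": "flat",
}
user_vector = encode(user_spec)
print(len(user_vector))  # 15
```

A database model encoded from the same specification yields an identical vector and therefore a distance of zero, which corresponds to the exact match described for BIM Model A.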
Fig. 6(a). A DSS Utilizing the Scoring Metric
<Fig. 6(a)> depicts a user interface for an office space recommendation system. The interface
allows users to specify various preferences related to office building design. The
tool is intended to help users generate or select office designs based on their individual
needs. In the central section of the interface, users can input specific requirements,
such as the number of floors, the minimum and maximum area in square meters, whether
the office should include a balcony, and the number of toilets per floor. Additionally,
there is an option to specify the location of the core, referring to where essential
building elements like stairs and elevators should be placed.
The final recommendations presented to the user include three BIM model alternatives:
two from the Weighted Euclidean Distance ranking and one from the Cosine Similarity
ranking. This ensures that the user receives a comprehensive set of options that cater
to both precise and nuanced design requirements. For instance, Design Option 1 would
perfectly align with the user's specifications, while Design Option 2 offers a slight
variation in area without compromising other key parameters. Design Option 3 provides
an alternative that maintains high contextual similarity, beneficial in scenarios
where clustered design options offer additional benefits not captured by an exact
match.
Fig. 6(b). An NLP based DSS Utilizing the Scoring Metric for Recommending BIM
<Fig. 6(b)> continues the theme of the office space recommendation system and features another
DSS interface. This segment is designed for clients to enter specific details
or specifications regarding the office they wish to design. A prompt instructs
users to "Enter specifications of office," indicating that this section is focused
on textual input. Nevertheless, the evaluation metrics are utilized in this system
to achieve the same result as the DSS shown in <Fig. 6(c)>.
Fig. 6(c). Result of the DSS from hybrid scoring metric
4. Results and Discussion
This study presents an evaluation metric designed specifically for BIM models, using
a combination of both Euclidean distance and Cosine similarity metrics to deliver
a precise design recommendation. A carefully constructed, uniformly sampled dataset
is established to improve the accuracy of the DSS.
The third image, <Fig. 6(c)>, represents the output of a DSS in the office space recommendation
application. It displays the result in the form of a modern office building with sleek
glass facades, accompanied by a description of the design. A detailed floor plan outlines
the interior layout, illustrating the arrangement of spaces. The recommendation is
generated using a hybrid scoring function that assesses user inputs against predefined
criteria, ensuring alignment with their preferences. Overall, the evaluation metrics
effectively combine client design requirements with a structured decision making
approach.
When considering client requirements for a DSS for BIM, several key evaluation indicators
must be addressed to ensure that the system meets their expectations and project needs.
First, system performance is crucial, as clients will expect the DSS to process and
analyze large volumes of BIM data efficiently, providing timely and accurate results
to support client needs. The usability of the system is another critical factor, as
clients require an intuitive and user friendly interface that allows various stakeholders,
including those with limited technical expertise, to navigate and utilize the system
easily. This also extends to collaborative features that enhance team communication.
There are some constraints and challenges to consider while creating a DSS for BIM.
One limitation is that the effectiveness of the RS is heavily dependent on the database
quality and comprehensiveness. If the database is not diverse or is not updated on
a regular basis, the result of the system may become less accurate over time. Furthermore,
while the system takes into account a wide range of input parameters, it may not capture
highly specific or niche design requirements, which may necessitate manual adjustments.
ANOVA was employed to determine whether the choice of distance metric significantly affects
the DSS's ability to recommend BIM models that meet user specifications. The test
compares the means of three independent distance metrics, namely Euclidean distance,
Cosine similarity, and a combined distance function, computed for the design options
generated by the DSS, to assess whether one metric performs better than the others in
terms of accuracy and consistency. The analysis aimed to validate the effectiveness of
the hybrid scoring method in offering consistent and reliable recommendations. The null
hypothesis (H₀) assumed that the population means of the three distance functions were
the same, with the means of the Euclidean distance, Cosine similarity, and the combined
distance function denoted by μ₀, μ₁, and μ₂, respectively. The p-value of the test was
compared against a significance level (α) of 0.05.
Stated mathematically, the null hypothesis is H₀: μ₀ = μ₁ = μ₂, where μ₀, μ₁, and μ₂
represent the means of the Euclidean distance, Cosine similarity, and the combined
distance function, respectively. A simple ANOVA test of H₀ at α = 0.05 yielded an
F-statistic of 23.28 and a p-value of 7.12×10⁻⁹. Such a large F-statistic and small
p-value provide strong evidence against the null hypothesis, indicating that the
differences in the means are statistically significant and highly unlikely to be due
to random chance. The data for the three metrics showed that Cosine similarity had a
mean of 0.1711 with a standard deviation of 0.1962, Euclidean distance had a mean of
0.5871 with a standard deviation of 0.2257, and the combined distance function had a
mean of 0.3790 with a standard deviation of 0.2892.
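For readers who wish to reproduce this kind of test, the F-statistic can be computed directly from its definition. The sketch below assumes a one-way design with one sample of distances per metric; the function name and the sample values are hypothetical and are not the study's data. The p-value is then read from the F distribution with (k − 1, N − k) degrees of freedom, for example with `scipy.stats.f_oneway`, which returns both quantities.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric samples.

    F is the ratio of the between-group mean square to the within-group
    mean square. Assumes at least two groups and non-zero within-group variance.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of each observation around its own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical normalized distances for three metrics (not the paper's data).
euclidean = [0.55, 0.62, 0.48, 0.71]
cosine = [0.12, 0.20, 0.15, 0.25]
combined = [0.34, 0.41, 0.31, 0.48]

f_stat, df_b, df_w = one_way_anova_f([euclidean, cosine, combined])
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F value means that the spread between the group means is large relative to the spread inside each group, which is the situation reported above.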
The scoring process ultimately provided a reference to BIM models. The combination
of Euclidean distance and Cosine similarity offered a balanced method for evaluation,
merging the strengths of both metrics. While Cosine similarity excels at capturing
semantic relationships, Euclidean distance is effective at recognizing proximity in
feature space. This combined approach resulted in more refined recommendations that
could better accommodate user preferences. Therefore, the hybrid method proved to
be a better evaluation metric as validated by the ANOVA analysis.
Table 1. Distance metric statistics
Variables | Mean | Std. Dev
Normalized Cosine Distance | 0.1711 | 0.1962
Normalized Euclidean Distance | 0.5871 | 0.2257
Normalized Combined Distance | 0.3790 | 0.2892
The result of the scoring process is a reference to BIM models that can be used as
a starting point for creating a detailed site-specific BIM model. For the evaluation criteria,
combining Euclidean distance and cosine similarity offers a fair method that balances
accuracy with variety. While cosine similarity is good at capturing semantic similarities,
Euclidean distance is excellent at recognizing designs that are close together in feature
space. Combining these metrics makes the results more comprehensive and better able to
accommodate user preferences. Overall, the combination supports better decisions by
utilizing the advantages of both measurements.
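One way such a combination could be formed is sketched below. The combination rule here is an assumption, not a reproduction of Eq. (3): each distance is min-max normalized to [0, 1] and the two are averaged with a hypothetical weight `alpha`. An equal-weight average is at least consistent with the means reported in Table 1, since (0.1711 + 0.5871) / 2 ≈ 0.3791. Cosine distance is taken as 1 minus cosine similarity so that smaller is better for both inputs.

```python
def min_max(values):
    """Rescale a list of numbers to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def combined_distance(euclidean, cosine_dist, alpha=0.5):
    """Weighted average of the two normalized distances, one value per model.

    alpha is a hypothetical mixing weight; 0.5 gives an equal-weight average.
    """
    e = min_max(euclidean)
    c = min_max(cosine_dist)
    return [alpha * a + (1 - alpha) * b for a, b in zip(e, c)]


# Hypothetical raw distances from one query to three candidate models.
euclidean = [0.0, 5.0, 10.0]
cosine_dist = [0.4, 0.0, 0.2]

scores = combined_distance(euclidean, cosine_dist)
# Rank candidates from best (smallest combined distance) to worst.
ranking = sorted(range(len(scores)), key=lambda i: scores[i])
print(scores, ranking)
```

Normalizing before averaging keeps either metric from dominating purely because of its numeric range.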
Fig. 7. Box plot for distance metrics
4.1 Implications of Findings
The metrics for the BIM-centered DSS have demonstrated promising accuracy and effectiveness
in aligning with user preferences and requirements. The system caters to the diverse
needs and preferences of users. The effective use of both weighted Euclidean distance
and cosine similarity algorithms ensures that the recommendations are not only varied
but also closely tailored to the specific user input.
However, it is critical to recognize that the effectiveness of the DSS is heavily
dependent on the scoring algorithm, database quality and diversity. To maintain accuracy
with the chosen scoring algorithms, the database must be updated and expanded on a
regular basis. Furthermore, the DSS may encounter limitations in situations where
input features are less distinguishable, necessitating additional research to address
these challenges. The findings demonstrate that this dual-scoring method is highly
effective in generating design recommendations that closely align with user preferences,
particularly in the context of BIM.
5. Conclusion
This work explores a hybrid methodology for a BIM DSS that combines Euclidean and
Cosine similarity metrics, with the goal of improving accuracy and context-awareness.
The methodology defines scoring criteria for user input, employs both Euclidean distance
and cosine similarity scoring, and assesses their utility. An ANOVA test is set up to
test the hypothesis, and it reveals significant differences in the means of the distance
functions, with the hybrid approach showing promise. The study emphasizes the significance
of scoring metrics in providing decision assistance and addressing potential limitations.
The implemented DSS offers an innovative approach to early design choices for BIM
models. This aligns with the concept of informed decision making, which has been largely
overlooked in the construction industry. Because of its adaptability and integration
with various 3D design tools, it is a valuable asset for intelligent design scenarios.
This study represents an important step toward improving modular BIM design by incorporating
DSS. It emphasizes the potential of DSS to improve the architecture and construction
industries by streamlining the design process, promoting sustainability, and ensuring
user-centered design practices.
In conclusion, the dual scoring approach, which combines weighted Euclidean distance
and cosine similarity, has proven effective in providing a range of design options
that align with user preferences in terms of accuracy and effectiveness. The success
of DSS in matching client requirements with appropriate designs demonstrates its potential
to transform the BIM design process, making it more efficient, user-centric, and regulatory-compliant.
However, continuous monitoring and updating of the database and algorithms are required
to maintain and improve its accuracy over time.