Title |
A Study on the Classification of Theft using K-modes Clustering |
Authors |
권나연(Kwon, Na-Yeon) ; 권은서(Kwon, Eunseo) ; 정성원(Jung, Sungwon) |
DOI |
https://doi.org/10.5659/JAIK.2020.36.8.81 |
Keywords |
Data Mining; K-modes Clustering; Crime Prevention; Theft; Smart city |
Abstract |
Data mining is receiving attention as a way to derive useful knowledge and patterns from crime data. Among data mining techniques,
clustering is used in crime analysis mainly to identify hot spots or occurrence patterns. However, most studies stop at the
clustering stage, so few examine the relationship between the derived clusters and the surrounding environment.
Because previous research has shown that the occurrence of crime is attributable not only to individual characteristics but also
to the environmental characteristics of an area, research needs to go beyond simply deriving clusters and analyze the
relationships between clusters and environmental factors. Of these environmental factors, land usage is both a basic tool and a result of
urban planning. Therefore, clarifying the relationship between land usage and crime could provide basic data for crime prevention through the
improvement and management of urban spaces from an urban planning perspective. This research uses k-modes clustering to categorize
incidents of theft and then analyzes the spatial and temporal distribution patterns of the resulting crime types by land usage using a geographic
information system (GIS). Dongjak-gu, which has a relatively low safety level among the districts of Seoul, was selected as the study
area, and data on thefts from 2004 to 2015 were used. Repeating the k-modes clustering 1,000 times for each value of k
showed that there were four types of theft cluster in Dongjak-gu. To analyze the correlation between each cluster and land usage, a
regression analysis was conducted on the land usage variables in Dongjak-gu and the clustering data. The four clusters were
characterized as follows. Cluster 1 contained miscellaneous thefts that mainly occurred in commercial
facilities at night and targeted men, and it had the most significant relationship with commercial land. Cluster 2 consisted of housebreaking
thefts that mainly occurred in the morning and targeted women, and it had the most significant relationship with type 2 general residential
areas. Cluster 3 contained street thefts, mainly involving automobiles, that occurred in the early morning and targeted men, and it had the most significant
relationship with commercial land. Cluster 4 was made up of miscellaneous thefts in the afternoon that mainly targeted men, and it was the only
cluster to have a significant relationship with school and gas station land. These results can help strengthen crime
prevention measures so that they respond adequately to the various types of theft, and can serve as theoretical grounds for identifying areas vulnerable to crime and
setting protection spaces in the urban planning and design stages. |
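
The k-modes procedure named in the abstract can be illustrated with a minimal sketch. It assumes the usual formulation of k-modes (simple matching dissimilarity between categorical records, cluster centres updated to the per-attribute mode) and keeps the lowest-cost result over several random restarts, loosely mirroring the repeated runs described above. The attribute names and records are invented for illustration and are not the study's data, and the function names `k_modes` and `best_of` are hypothetical; the abstract does not state how the authors initialised the clusters or compared runs.

```python
import random
from collections import Counter


def hamming(a, b):
    """Simple matching dissimilarity: number of attributes that differ."""
    return sum(x != y for x, y in zip(a, b))


def column_modes(rows):
    """Most frequent category in each attribute column."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*rows))


def k_modes(records, k, max_iter=100, seed=0):
    """One run of k-modes on a list of equal-length tuples of categories."""
    rng = random.Random(seed)
    # Assumption: start from k distinct records chosen at random.
    modes = rng.sample(sorted(set(records)), k)
    labels = None
    for _ in range(max_iter):
        # Assign each record to its nearest mode.
        new_labels = [
            min(range(k), key=lambda j: hamming(r, modes[j])) for r in records
        ]
        if new_labels == labels:
            break  # assignments are stable, so the modes are too
        labels = new_labels
        # Recompute each cluster's mode; an empty cluster keeps its old mode.
        for j in range(k):
            members = [r for r, lab in zip(records, labels) if lab == j]
            if members:
                modes[j] = column_modes(members)
    cost = sum(hamming(r, modes[lab]) for r, lab in zip(records, labels))
    return labels, modes, cost


def best_of(records, k, n_init=10):
    """Repeat k-modes with different seeds and keep the lowest-cost run."""
    return min((k_modes(records, k, seed=s) for s in range(n_init)),
               key=lambda result: result[2])


# Hypothetical records: (time of day, place, victim sex).
records = [
    ("night", "shop", "male"),
    ("night", "shop", "male"),
    ("night", "street", "male"),
    ("morning", "house", "female"),
    ("morning", "house", "female"),
    ("afternoon", "house", "female"),
]
labels, modes, cost = best_of(records, k=2, n_init=20)
print(labels, modes, cost)
```

Comparing the lowest cost obtained for each candidate k is one common way to choose the number of clusters; whether the study used this criterion is not stated in the abstract.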