Title A Study on the Classification of Theft using K-modes Clustering
Authors 권나연(Kwon, Na-Yeon) ; 권은서(Kwon, Eunseo) ; 정성원(Jung, Sungwon)
DOI https://doi.org/10.5659/JAIK.2020.36.8.81
Page pp.81-90
ISSN 2733-6247
Keywords Data Mining; K-modes Clustering; Crime Prevention; Theft; Smart city
Abstract Data mining is receiving attention as a way to derive useful knowledge and patterns from crime data. Among data mining techniques, clustering is used in criminology mainly to analyze hot spots or occurrence patterns. However, most research stops at the clustering stage, so few works examine the relationship between the derived clusters and the surrounding environment. Previous research has shown that crime is attributable not only to individual characteristics but also to the environmental characteristics of an area, so research needs to go beyond deriving clusters and analyze the relationships between clusters and environmental factors. Of these environmental factors, land use is both a basic tool and an outcome of urban planning. Clarifying the relationship between land use and crime could therefore provide basic data for crime prevention through the improvement and management of urban spaces from an urban planning perspective. This research uses k-modes clustering to categorize incidents of theft and then uses a geographic information system (GIS) to analyze the spatial and temporal distribution patterns of the derived crime types by land use. Dongjak-gu, which has a relatively low safety level among the districts of Seoul, was selected as the study area, and data on thefts from 2004 to 2015 were used. Repeating the k-modes analysis 1,000 times for each value of k showed that there were four types of theft cluster in Dongjak-gu. To analyze the correlation between each cluster and land use, a regression analysis was conducted on the land use variables in Dongjak-gu and the clustering data. The four types of theft were as follows.
Cluster 1 contained miscellaneous thefts that mainly occurred in commercial facilities at night and targeted men, and it had the most significant relationship with commercial land. Cluster 2 consisted of housebreaking thefts that mainly occurred in the morning and targeted women, and it had the most significant relationship with type 2 general residential areas. Cluster 3 contained street thefts, mainly related to automobiles, that occurred in the early morning and targeted men, and it had the most significant relationship with commercial land. Cluster 4 was made up of miscellaneous thefts in the afternoon that mainly targeted men, and it was the only cluster to have a significant relationship with school and gas station land. These results can help strengthen crime prevention measures so that they respond adequately to various types of theft, and they can serve as theoretical grounds for identifying areas vulnerable to crime and setting protection spaces in the urban planning and design stages.
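For readers unfamiliar with k-modes, the following is a minimal sketch of the general technique, not the authors' implementation. It assumes a batch variant of k-modes: each record is assigned to the nearest mode by counting mismatched categorical attributes, each mode is then recomputed as the most frequent category per attribute, and the run with the lowest total mismatch cost over several random restarts is kept. The attribute names (place, time of day, victim sex), the six toy records, and all function names are hypothetical and chosen only for illustration; they are not taken from the Dongjak-gu data.

```python
import random
from collections import Counter

def mismatches(a, b):
    """Simple matching dissimilarity: number of attributes that differ."""
    return sum(x != y for x, y in zip(a, b))

def column_modes(rows):
    """Most frequent category in each attribute column of a cluster."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*rows))

def assign(records, modes):
    """Index of the nearest mode for every record."""
    return [min(range(len(modes)), key=lambda j: mismatches(r, modes[j]))
            for r in records]

def k_modes(records, k, n_init=10, max_iter=100, seed=0):
    """Batch k-modes with random restarts; returns (cost, labels, modes)."""
    rng = random.Random(seed)
    candidates = sorted(set(records))  # distinct records as initial modes
    best = None
    for _ in range(n_init):
        modes = rng.sample(candidates, k)
        for _ in range(max_iter):
            labels = assign(records, modes)
            new_modes = []
            for j in range(k):
                members = [r for r, lab in zip(records, labels) if lab == j]
                # Keep the old mode if a cluster ends up empty.
                new_modes.append(column_modes(members) if members else modes[j])
            if new_modes == modes:  # converged
                break
            modes = new_modes
        labels = assign(records, modes)
        cost = sum(mismatches(r, modes[lab]) for r, lab in zip(records, labels))
        if best is None or cost < best[0]:
            best = (cost, labels, modes)
    return best

# Hypothetical toy records: (place, time of day, victim sex).
records = [
    ("shop", "night", "male"),
    ("shop", "night", "male"),
    ("shop", "night", "female"),
    ("home", "morning", "female"),
    ("home", "morning", "female"),
    ("home", "afternoon", "female"),
]
cost, labels, modes = k_modes(records, k=2)
print(cost, labels, modes)
```

On this toy input the two recovered modes are ("shop", "night", "male") and ("home", "morning", "female"), with a total cost of 2 mismatches. Comparing the best cost across several values of k, each run with many restarts, is one common way to choose the number of clusters, which is similar in spirit to the repeated analysis per k value described above.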