Min Tae Kim$^{1}$ and Byung Wook Kim$^{2}$
$^{1}$(Department of Smart Manufacturing Engineering, Changwon National University, Changwon-si, Gyeongsangnam-do, 51140, Korea; kkimlee1217@gmail.com)
$^{2}$(Department of Information and Communication Engineering, Changwon National University, Changwon-si, Gyeongsangnam-do, 51140, Korea; bwkim@changwon.ac.kr)
Copyright © The Institute of Electronics and Information Engineers(IEIE)
Keywords
Display-to-camera communication, Complementary color barcode-based optical camera communications, Deep neural network
1. Introduction
The limited available spectrum for radio frequency (RF) communications will not be
able to meet the exponentially increasing demand for wireless Internet access in the
near future, leading to a major spectrum crisis. Owing to this undeniable situation,
the introduction of new spectrum for better energy efficiency, greater connection
bandwidth, and lower usage costs for 6$^{\mathrm{th}}$ generation (6G) communications
and future Internet connections becomes very important. Optical wireless communications
(OWC) [1,2] has recently been explored as a complementary alternative to RF links for development
of heterogeneous networks that enable peer-to-peer connectivity in a cost-effective
and reliable way. OWC uses a very wide spectrum of optical domains, including the
infrared, visible light, and ultraviolet bands, to establish communication links [2].
Recently, a variety of research into display-to-camera (D2C) communications has been conducted. Wang et
al. [7,8] proposed InFrame and InFrame++, which provide full-frame communication with imperceptible
video artifacts. For this, complementary frame composition and a hierarchical frame
structure were designed. HiLight [9] uses the orthogonal transparency (alpha) channel to transmit data without the need
for coded images. DisCo [10], based on a rolling-shutter camera, executes decoding by translating a temporal sequence
into a spatial pattern. Secure barcode-based visible light communication (SBVLC) [11] considers physical security issues and the design principles of 2D barcodes to add
security features. RainBar [12] and RainBar+ [13] were designed with a high-capacity barcode layout to enable flexible frame synchronization
and code extraction in VLC systems. In [14] and [15], the SoftLight scheme, presented over screen-camera links, uses color modulation
schemes with channel coding and a $\textit{soft hint}$ for data decoding through the
barcode layout. In [16], complementary color barcode-based optical camera communications (CCB-OCC) was proposed,
where symbols are sent with carefully designed complementary color pairs that are
perceived by the human eye as a white bar, but the color pattern is detectable by
a camera. Although the CCB-OCC method was first presented in [16], it is necessary to improve the detection performance of color barcodes for practical
use.
In this paper, we propose a new CCB-OCC system to improve data rate performance using
deep neural network (DNN)-based barcode detection and adaptive color-value extraction.
The contributions of this study are twofold. First, a novel DNN model is designed
for robust and seamless detection of barcode image regions in D2C links. Second, value
extraction from color histograms is refined to mitigate the effects of D2C noise and
synchronization jitter. In the conventional CCB-OCC technique, an image-processing-based
approach was used to detect the color barcode region, but this could result in barcode
detection failure or loss of information from the barcode. In addition, the fixed
peak-position-based signal extraction used in the conventional technique cannot solve the
problem caused by synchronization jitter between display and camera. In contrast, the
proposed method uses a DNN-based model with real-time barcode detection and adaptive
color-value extraction, which can provide a robust D2C link for data transmission.
Experimental results validated the proposed CCB-OCC system’s significant improvement
in data rate performance, and the proposed scheme can be regarded as a potential candidate
for next-generation short-distance machine-to-machine (M2M) communications.
2. System Model of CCB-OCC
In the CCB-OCC scheme, the transmitting side encodes data into color barcode sequences
and displays a consecutive packet of color barcodes on its electronic display. When
the refresh rate of the display is 120Hz or higher, these barcode images are perceived
by the human eye as a white bar. When captured with a camera, however, they appear
as a color barcode owing to the rolling-shutter mechanism of the CMOS sensor in the camera
device. The receiving side uses a camera to capture the display and acquires consecutive
images, including the color barcodes. This visible spatial pattern in the received
image represents the data. After detecting a color barcode, packet synchronization
is performed by checking the location of pilot symbols. After that, channel estimation
obtains a D2C channel based on the color space, and the transmitted data are decoded
by using the obtained channel information. The CCB-OCC system model is presented in
Fig. 1.
Fig. 1. The system model for complementary color barcode-based OCC.
For the D2C communications scenario, the main objective of the transmitter is to encode
the transmit data into color barcodes in complementary pairs and to display them on
the monitor without noticeable artifacts visible to the human eye. For this, the display
monitor’s refresh rate should be 120Hz or higher. In addition, a bit-to-color-mapping
process converts the input bit stream into a specific color. Here, binary bits are
mapped to specific colors according to well-designed symbolic constellation structures.
Each color-mapped symbol consists of a pair of complementary colors, constructing
pilot and data symbols. Images captured by the camera of the receiving device are
sequentially stored in a memory buffer to detect continuous bit signals. To extract
the signal from the received image, it is necessary to detect the pure color barcode
area, excluding the background. Packets are composed of pilot and data symbols in
red-green-blue (RGB) colors, and the pilot symbols are retrieved through color barcode
pixel value extraction for packet synchronization and channel estimation. The wireless
optical channel between the electronic display and the camera is estimated by obtaining
each component through histogram analysis of the RGB channel, and the remaining data
symbols are decoded using the predicted channel information.
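The bit-to-color mapping step described above can be sketched as follows. The 2-bit gray-coded constellation and the 255-complement rule used here are illustrative assumptions for a minimal sketch, not the exact mapping defined in the paper.

```python
# Sketch of the transmitter's bit-to-color mapping. The 2-bit gray-coded
# constellation below is an illustrative assumption: adjacent symbols
# differ by one bit, so a nearest-neighbor decoding error flips one bit.
CONSTELLATION = {
    (0, 0): (255, 0, 0),    # red
    (0, 1): (0, 255, 0),    # green
    (1, 1): (0, 0, 255),    # blue
    (1, 0): (255, 255, 0),  # yellow
}

def complement(color):
    """Complementary color: the RGB value that sums with `color` to white."""
    return tuple(255 - c for c in color)

def bits_to_symbols(bits):
    """Map a bit stream to (color, complementary color) symbol pairs."""
    assert len(bits) % 2 == 0, "this sketch assumes 2 bits per symbol"
    symbols = []
    for i in range(0, len(bits), 2):
        color = CONSTELLATION[(bits[i], bits[i + 1])]
        symbols.append((color, complement(color)))
    return symbols
```

Because each symbol is displayed together with its complement, consecutive frames average to white and the pattern stays invisible to the eye.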
CCB-OCC uses a unique color barcode design that makes the continuous information transmitted
from the display invisible to the human eye but detectable by devices equipped with
a rolling-shutter-based camera. This is done by sequentially presenting on the display
device the complementary colors opposite each other in the hue circle structure [16]. The packet structure of the CCB-OCC scheme is defined with pilot color symbols and
data symbols with complementary color pairs. These transmit symbols are encoded within
the color barcode area of each image frame that appears through the display device.
As shown in Fig. 2, the pilot symbol consists of six consecutive frames in the RR$^{\prime}$GG$^{\prime}$BB$^{\prime}$
color pattern sequence. The remaining data symbols are transmitted as packets consisting
of complementary color pairs of data. The image frames containing this encoded data
are displayed on the screen based on the refresh rate, and the data can be decoded
by capturing successive images with the camera or receiving device. When a pair of
complementary colors appearing in a continuous image is rapidly scanned, the human
eye does not recognize the original color but perceives it as a white bar representing
the sum of the complementary colors. Using these complementary colors, the proposed
color barcode pattern is visually unobtrusive and does not interfere with the overall
quality of the display content. To minimize the effect of bit errors when converting
from decoded symbols to binary numbers, we use a gray-coding-based code constellation.
Fig. 2. Packet structure of CCB-OCC.
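The packet structure of Fig. 2 can be sketched as a frame color sequence; the data payload length and the 255-complement rule are illustrative assumptions.

```python
# Sketch of packet assembly under the RR'GG'BB' pilot structure from Fig. 2.
RED, GREEN, BLUE = (255, 0, 0), (0, 255, 0), (0, 0, 255)

def complement(color):
    return tuple(255 - c for c in color)

def build_packet(data_colors):
    """Prepend the six-frame RR'GG'BB' pilot, then append each data color
    followed by its complement so every pair averages to white."""
    frames = []
    for c in (RED, GREEN, BLUE):   # pilot: R R' G G' B B'
        frames.extend([c, complement(c)])
    for c in data_colors:          # data: complementary pairs
        frames.extend([c, complement(c)])
    return frames
```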
Fig. 3 shows the complementary color pairs in a series of images from the display. When
a color barcode on a monitor with a refresh rate of 60 Hz is received by a CMOS camera
sensor at 30$\textit{fps}$, at least two colors appear in every frame. As seen in
the figure, the RR'GG'BB' color pattern appears in which R, G, and B indicate the
primary red, green, and blue, with R$^{\prime}$, G$^{\prime}$, and B$^{\prime}$ indicating
their complementary colors. Theoretically, the number of colors observed in the color
barcode area of one received image is the ratio of the transmission rate to the capture
rate, i.e., the ratio of the display refresh rate to the camera capture rate. As shown
in Fig. 3, the six RR'GG'BB' colors are obtained across three consecutive image frames received
by the camera, since the camera's capture rate is half the display refresh rate.
However, if the camera capture rate and refresh rate are out of sync, three colors
may appear simultaneously in the captured image. Considering both of these cases,
it is necessary to estimate the channel and detect the signal.
Fig. 3. An example of received images when the display’s refresh rate is 60$\textit{Hz}$ and the camera’s capture rate is 30$\textit{fps.}$
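The rate relationship above can be sketched in a few lines; the integer-ratio assumption mirrors the theoretical case described in the text.

```python
# Sketch of the color-count relationship: with the display refresh rate an
# integer multiple of the camera capture rate, each captured frame shows
# refresh/capture consecutive colors, so the six-color RR'GG'BB' pilot
# spans several captured frames.
def colors_per_frame(refresh_hz, capture_fps):
    """Number of barcode colors visible in one captured frame."""
    assert refresh_hz % capture_fps == 0, "assumes an integer rate ratio"
    return refresh_hz // capture_fps

def split_into_frames(color_seq, refresh_hz, capture_fps):
    """Group a transmitted color sequence into the captured frames."""
    n = colors_per_frame(refresh_hz, capture_fps)
    return [color_seq[i:i + n] for i in range(0, len(color_seq), n)]
```

For a 60 Hz display captured at 30 fps, this yields two colors per frame, matching the three-frame pilot of Fig. 3.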
3. DL-based Color Barcode Detection
D2C communications requires accurate barcode detection to ensure successful decoding
from consecutive captured images. Since multimedia information other than the color
barcode is included in the captured image, along with the external part of the display
device, it is important to accurately detect only the barcode area for accurate signal
detection. In this section, we present an approach that can reliably detect barcode
regions using deep learning technology.
3.1 Barcode Detection
There are many methods available to detect an object with bounding boxes. Among those
methods, for real-time applications, the faster region-based convolutional neural
network (Faster R-CNN) [17] and You Only Look Once (YOLO) [18] are the most popular in various research fields. Faster R-CNN first
extracts a single feature map from the entire input image through a learned CNN.
Instead of the selective search technique, a separate region proposal network is
applied to this feature map to generate candidate regions, and the features pooled
from each candidate region are then used to recognize the object.
YOLO is an algorithm that can predict the objects existing in an image, and their
locations, by looking at the image only once. Rather than treating detection as a
separate region-proposal and classification pipeline, it approaches it as a single
regression problem: the input image is divided into a grid through the CNN, and each
grid cell predicts bounding boxes and class probabilities, from which objects in the
corresponding areas are recognized. Because YOLO does not apply a separate network
for extracting candidate regions, its processing time is superior to that of Faster R-CNN.
Based on this, we used a YOLO v3 model to detect color barcode regions from received
images. To train the YOLO model, a data set with labels for the color barcode area
was built using software called YOLO Mark, as shown in Fig. 4.
Fig. 4. Color barcode labeling in the dataset used to train the YOLO model.
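YOLO-style detectors emit several overlapping candidate boxes per object and keep one of them via non-maximum suppression. A minimal sketch of that suppression step follows; the `(x1, y1, x2, y2, score)` box format is an assumption for illustration, not YOLO v3's raw output format.

```python
# Sketch of IoU-based non-maximum suppression, the step that keeps a
# single bounding box per detected barcode region.
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, iou_thresh=0.5):
    """Keep the highest-scoring box, drop boxes overlapping it, repeat."""
    boxes = sorted(boxes, key=lambda b: b[4], reverse=True)
    kept = []
    while boxes:
        best = boxes.pop(0)
        kept.append(best)
        boxes = [b for b in boxes if iou(best[:4], b[:4]) < iou_thresh]
    return kept
```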
To extract the signal contained in the color barcode detected using YOLO, an additional
post-processing step is required. First, an image-difference operation between consecutive
frames is performed in order to identify the pure barcode area: the area within the
bounding box detected through YOLO contains not only the color barcode but also background
that includes the display's external parts. If this background is not removed, noise
will be introduced when color values are extracted through histogram analysis of each
color channel at a later stage. However, the image-difference operation alone is not
sufficient, because the difference value can also be large in areas other than the
color barcode due to jitter in the camera capture process or relative movement between
the display and the camera. The difference image is therefore thresholded into a binary
image that isolates the barcode region. Then, by using this binary image as a mask
filter, only the color channel values within the barcode image area are obtained from
the original image to be analyzed as a histogram. This series of processes is shown in Fig. 5.
Fig. 5. Processing detected color barcodes into histograms of color channels.
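The post-processing chain above can be sketched with numpy: difference consecutive frames, threshold into a binary mask, and histogram only the masked pixels. The fixed threshold value is an illustrative assumption.

```python
import numpy as np

def barcode_mask(frame_a, frame_b, thresh=40):
    """Binary mask where consecutive frames differ strongly: the flashing
    complementary colors change every frame; static background does not."""
    diff = np.abs(frame_a.astype(int) - frame_b.astype(int)).sum(axis=2)
    return diff > thresh

def masked_histograms(frame, mask, bins=256):
    """Per-channel histograms computed only over pixels inside the mask."""
    return [np.histogram(frame[..., ch][mask], bins=bins, range=(0, 256))[0]
            for ch in range(3)]
```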
Fig. 6. Varying the peak position when obtaining a representative value of the color barcode.
3.2 Color-value Extraction
In the D2C communications link, a rolling-shutter effect is observed in the received
image due to the difference between the refresh rate of the display device and the
capture rate of the camera. As a result, a color barcode is generated in which multiple
colors appear continuously along the horizontal direction of the captured image. Even
though a single color barcode therefore contains multiple color patterns, the proposed
technique can obtain a representative value of the color barcode, i.e., the symbol
value closest to the transmitted signal, through histogram analysis of this continuous
color change. For each RGB channel, we compare the histogram component showing the
maximum value in the color area with that of the complementary color area, take the
larger of the two, and select it as the representative color value of the corresponding
color barcode.
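One simple interpretation of this representative-value rule, sketched per channel, is to compare the strongest peak on the color side of the histogram against the strongest peak on the complementary side and keep the larger one. The midpoint split used here is an illustrative assumption.

```python
import numpy as np

def representative_value(hist, split=128):
    """Pick the stronger of the low-side (complementary) and high-side
    (color) histogram peaks for one channel; returns the bin index."""
    low, high = hist[:split], hist[split:]
    if high.max() >= low.max():
        return split + int(np.argmax(high))
    return int(np.argmax(low))
```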
In addition, when extracting color barcode values using histogram information, the
peak position used for obtaining a representative value of the color barcode is gradually
changed. In the conventional CCB-OCC scheme [16], the peak position is set at a fixed value in an individual packet. Although, in
theory, the refresh rate of the display device should be an integer multiple of the
camera frame rate in order to obtain the rolling shutter effect, an exact relationship
is not established in practice. As shown in Fig. 6, the peak position for pilot signal
detection was 192 in the n-th packet, whereas a peak position of 231 may appear in
the next packet. In this case, since synchronization jitter occurs between the two
consecutive packets, the peak position from which a color barcode representative value
is extracted is allowed to change during the period when multiple symbols are received.
Through this, it is possible to extract the value of the color barcode more accurately
through histogram analysis, and ultimately to improve signal detection accuracy.
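The adaptive behavior can be sketched as a peak tracker over a 1-D profile: rather than reading a fixed position for every packet, re-search for the peak inside a window around the previous position. The window size is an illustrative assumption.

```python
import numpy as np

def track_peak(profile, prev_pos, window=50):
    """Return the peak position searched within +/- `window` samples of
    the previous packet's position, following synchronization jitter."""
    lo = max(0, prev_pos - window)
    hi = min(len(profile), prev_pos + window + 1)
    return lo + int(np.argmax(profile[lo:hi]))
```

With the example from Fig. 6, a peak found at 192 in one packet lets the tracker find the drifted peak at 231 in the next packet instead of reading a stale fixed position.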
3.3 Optical Channel Estimation and Data Decoding
The start position of the packet can be determined by finding the combination of RR$^{\prime}$GG$^{\prime}$BB$^{\prime}$
in successive frames stored in the camera receiver buffer, and the pilot and data
symbols within the packet can be obtained according to the determined packet length.
Unlike the other frames, the color barcode area of the pilot frame is dominated by
pure red, green, and blue colors. Therefore, the value of the RGB component in the
barcode area of the pilot symbol packet is determined through
histogram analysis and is used as the value of the pilot symbol for channel estimation.
In terms of transmitting information, different electronic display types have different
color appearance and brightness distributions. Receivers with different lenses and
camera sensors can have different structures for processing color features. The reason
that the pilot frame is used in all packets is to obtain channel information between
the display and the camera by understanding the relationship between the transmitted
RGB value and the received RGB value. A D2C channel matrix can be obtained from an
inverse matrix operation by transmitting a predefined signal color through a pilot
frame. Using the estimated channel obtained from the received color barcode area and
the pixel values obtained from the data packet frame, the red, green, and blue values
affected by the wireless optical link between the display device and the camera are
corrected, and we then move on to the data decoding procedure.
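The pilot-based channel estimation described above can be sketched as a 3x3 matrix recovery: stack the three transmitted pilot colors (pure R, G, B) as columns of X and the received colors as columns of Y, then solve Y = HX by inverting X. The example channel values are illustrative assumptions.

```python
import numpy as np

def estimate_channel(tx_pilots, rx_pilots):
    """Solve Y = H X for the 3x3 D2C channel matrix H, given three
    transmitted/received pilot color vectors (used as matrix columns)."""
    X = np.column_stack(tx_pilots).astype(float)
    Y = np.column_stack(rx_pilots).astype(float)
    return Y @ np.linalg.inv(X)
```

Because the pilots are pure primaries, X is a scaled identity matrix and the inversion is trivially well-conditioned, which is one reason the RR'GG'BB' pilot is convenient for channel estimation.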
By analyzing the histogram for RGB components in the data frame of the packet, the
peak position of each RGB component can be observed. By multiplying the observed
color vector by the inverse of the channel matrix H, the value of the intended color
symbol sent by the display device can be estimated. The resulting barcode value composed
of RGB components is mapped to a bit-color constellation, and data decoding is performed
by calculating the minimum Euclidean distance from a symbol predefined in the constellation.
In this way, consecutive data symbols within that packet can be decoded using the
estimated channel matrix.
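The decoding step can be sketched as equalization followed by minimum-distance demapping; the 2-bit gray-coded constellation below is an illustrative assumption, not the exact constellation defined in the paper.

```python
import numpy as np

# Hypothetical gray-coded bit-to-color constellation for illustration.
CONSTELLATION = {
    (0, 0): np.array([255.0, 0.0, 0.0]),
    (0, 1): np.array([0.0, 255.0, 0.0]),
    (1, 1): np.array([0.0, 0.0, 255.0]),
    (1, 0): np.array([255.0, 255.0, 0.0]),
}

def decode_symbol(observed, H):
    """Equalize with H^-1, then pick the bits whose constellation point
    is at minimum Euclidean distance from the equalized color."""
    x = np.linalg.inv(H) @ np.asarray(observed, dtype=float)
    return min(CONSTELLATION, key=lambda b: np.linalg.norm(x - CONSTELLATION[b]))
```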
4. Experimental Result
In this section, we discuss extensive experiments with YOLO-based barcode detection.
The representative-value position-correction technique proposed in this study can
detect a color barcode in the captured image more accurately. Note that a distance
of more than 110cm results in distortion and blur effects, making it more difficult
to extract the precise points for channel estimation and decoding from the RGB histogram.
As seen in Fig. 7, we investigated the impact of view angle on the achievable data rate (ADR) of the
proposed CCB-OCC scheme. Here, the distance parameter was set to 90cm. Table 1 presents the ADR based on the yaw angle. As can be seen, when the yaw angle between
the display and the camera lies between -20 and 20 degrees, the proposed scheme presents
no significant difference in the data rate. This contrasts with the existing technique,
whose performance varies depending on the angle. Because the conventional
technique extracts a color barcode using the image processing method, if the yaw angle
is large, the barcode image region becomes too thin, causing color information loss.
But the proposed YOLO-based technique reliably detects the color barcode region through
bounding box detection even at a large yaw angle, so it can extract representative
color values more accurately. Therefore, the performance of the proposed method is
not significantly different, depending on the view angle, but presents a better data
rate than the existing method.
Fig. 7. Experimental environment with varying distances and angle values.
Fig. 8. Performance from the achievable data rate according to D2C distance.
We evaluated the performance of the CCB-OCC scheme using YOLO-based barcode detection.
Our receiving device was a Samsung Galaxy S9, which is a common Android smartphone
equipped with a standard camera featuring 1920${\times}$1080 resolution at a 30fps
video capture rate. At the transmitter, the resolution of the electronic display was
1920${\times}$1080@60Hz, and experiments were conducted indoors under normal lighting
conditions. In the experimental environment, as shown in Fig. 7, the performance was measured by changing the distance between the display and
the camera (from 90cm to 110cm) and the view angle (from -20 to 20 degrees). To verify
the performance of the proposed method, it was compared with results from the conventional
CCB-OCC scheme [16].
Fig. 8 shows the achievable data rate from the proposed CCB-OCC scheme based on the distance
from display to camera. As the distance increases, the resolution of the captured
color barcode image decreases, the noise effect increases, and the channel estimation
and data decoding accuracy decrease. As can be seen, the proposed scheme outperformed
the conventional scheme when the distance was between 90cm and 110cm. By automatically
extracting the color barcode area through the YOLO model and with image post-processing,
it is possible to acquire pilot and data signals more precisely than with manual barcode
detection used in the existing technique. The proposed scheme provided a maximum data
rate of 79.7bps and a minimum rate of 76.2bps in the 90-110cm range.
In Table 1, we can see that the proposed CCB-OCC scheme showed better achievable data rates
than the conventional scheme at various angles. Even when the yaw angle between the
camera and display was as large as 20 degrees, the proposed scheme provided a data
rate of 80.7bps. The YOLO model, which guarantees real-time object detection,
can guarantee robust data rate performance by stably detecting color barcode images
taken from various angles.
Table 1. Achievable data rate based on yaw angle.
Yaw angle [degrees] | ADR of conv. CCB-OCC [bps] | ADR of prop. CCB-OCC [bps]
-20                 | 67.2                       | 80.9
-10                 | 72.8                       | 78.0
0                   | 77.6                       | 79.75
10                  | 73.0                       | 79.0
20                  | 71.3                       | 80.7
5. Conclusion
In this paper, we designed and implemented a new CCB-OCC system for data rate improvement
using DNN-based barcode detection and adaptive color extraction. We introduced a DNN-based
barcode detection concept where YOLO v3 was used to detect a color barcode without
losing information from within the barcode region. In addition, a color-value extraction
scheme obtained symbol values from color histograms in an adaptive manner to compensate
for synchronization jitter. The experimental results proved that the proposed CCB-OCC
scheme outperforms the existing CCB-OCC scheme for various distances and angles in
D2C communications links.
ACKNOWLEDGMENTS
This research was supported by a National Research Foundation of Korea (NRF) grant
funded by the Korean government (NRF-2022R1A2B5B01001543).
REFERENCES
Rahaim M., Little T.D.C., Sept. 2017, Interference in IM/DD optical wireless communication
networks, IEEE/OSA Journal of Optical Communications and Networking, Vol. 9, pp. D51-D63
Al-Kinani A., Wang C., Zhou L., Zhang W., thirdquarter 2018, Optical Wireless Communication
Channel Measurements and Models, IEEE Communications Surveys & Tutorials, Vol. 20,
No. 3, pp. 1939-1962
Chen C., Zhong W., Yang H., Du P., Feb. 2018, On the Performance of MIMO-NOMA-Based
Visible Light Communication Systems, IEEE Photonics Technology Letters, Vol. 30, No.
4, pp. 307-310
Memedi A., Dressler F., Firstquarter 2021, Vehicular Visible Light Communications:
A Survey, IEEE Communications Surveys Tutorials, Vol. 23, No. 1, pp. 161-181
Luo P., et al., Oct. 2015, Experimental Demonstration of RGB LED-Based Optical Camera
Communications, IEEE Photonics Journal, Vol. 7, No. 5, pp. 1-12
Kim S. J., Lee J. W., Kwon D. -H., Han S. -K., Oct. 2018, Gamma Function Based Signal
Compensation for Transmission Distance Tolerant Multi-level Modulation in Optical
Camera Communication, IEEE Photonics Journal, Vol. 10, No. 5, pp. 1-7
Wang A., Peng C., Zhang O., Shen G., Zeng B., 2014, InFrame: Multiplexing full-frame
visible communication channel for humans and devices, in Proceedings of the 13th ACM
Workshop on Hot Topics in Networks, Los Angeles, USA
Wang A., Li Z., Peng C., Shen G., Fang G., Zeng B., 2015, InFrame++: Achieve simultaneous
screen-human viewing and hidden screen-camera communication, in Proceedings of the
13th International Conference on Mobile Systems, Applications and Service, Florence,
Italy
Li T., An C., Campbell A. T., Zhou X., 2014, HiLight: Hiding bits in pixel translucency
changes, ACM Workshop on Visible Light Communication Systems, Maui, Hawaii, USA
Jo K., Gupta M., Nayar S. K., 2016, DisCo: Display-Camera Communication Using Rolling
Shutter Sensors, ACM Transactions on Graphics, Vol. 35, No. 5, pp. 1-13
Zhang B., Ren K., Xing G., Fu X., Wang C., 2016, SBVLC: Secure barcode-based visible
light communication for smartphones, IEEE Transactions on Mobile Computing, Vol. 15,
No. 2, pp. 432-446
Wang Q., Zhou M., Ren K., Lei T., Li J., Wang Z., 2015, RainBar: Robust Application-Driven
Visual Communication Using Color Barcodes, in Proceedings of IEEE 35th International
Conference on Distributed Computing Systems, Columbus, OH, USA, pp. 537-546
Zhou M., Wang Q., Lei T., Wang Z., 2018, Enabling Online Robust Barcode-Based Visible
Light Communication with Realtime Feedback, IEEE Transactions on Wireless Communications,
Vol. 17, No. 12, pp. 8063-8076
Du W., Liando J. C., Li M., 2016, SoftLight: Adaptive visible light communication
over screen-camera links, in IEEE INFOCOM, San Francisco, CA, USA, pp. 1620-1628
Du W., Liando J. C., Li M., 2017, Soft Hint Enabled Adaptive Visible Light Communication
over Screen-Camera Links, IEEE Transactions on Mobile Computing, Vol. 16, No. 2, pp.
527-537
Jung S.-Y., Lee J.-H., Nam W., Kim B. W., 2020, Complementary Color Barcode-Based
Optical Camera Communications, Wireless Communications and Mobile Computing, Volume
2020, Article ID 3898427
Yang Y., Gong H., Wang X., Sun P., 2017, Aerial Target Tracking Algorithm Based on
Faster RCNN Combined with Frame Differencing, Aerospace, Vol. 4, No. 32
Redmon J., Divvala S., Girshick R., Farhadi A., 2016, You Only Look Once: Unified,
Real-Time Object Detection, IEEE Conference on Computer Vision and Pattern Recognition,
pp. 779-788
Author
Min Tae Kim received a B.S. from the Department of Information and Communication
Engineering, Changwon National University, Changwon, South Korea, in 2021, and is
currently pursuing an M.S. in the Department of Smart Manufacturing Engineering,
Changwon National University. His research interests include visible light communications,
machine learning, and deep learning.
Byung Wook Kim received a B.S. from the School of Electrical Engineering, Pusan
National University, Pusan, South Korea, in 2005, and an M.S. and a Ph.D. from the
Department of Electrical Engineering, KAIST, Daejeon, South Korea, in 2007 and 2012,
respectively. He was a Senior Engineer with KERI, Changwon-si, South Korea, from 2012
to 2013. He was an Assistant Professor with the School of Electrical Engineering,
Kyungil University, Gyeongsan-si, South Korea, from 2013 to 2016. He was an Assistant
Professor with the Department of ICT Automotive Engineering, Hoseo University, from
2016 to 2019. He is currently an Assistant Professor with the Department of Information
and Communication Engineering, Changwon National University, Changwon-si, South Korea.
His research interests include visible light communications, machine learning, and
deep learning.