1 Department of Software and Information Systems Engineering, Ben-Gurion University, Beer-Sheva, Israel
2 Singapore University of Technology and Design, Singapore
Device Type       Type      Specific Device Make and Model   Number of TCP Sessions
Baby Monitor      IoT       Beseye Baby Monitor Pro          2,072
Motion Sensor     IoT       Wemo F7C028uk                    254
Printer           IoT       HP Officejet Pro 6830            70
Refrigerator      IoT       Samsung RF30HSMRTSL              7,008
Security Camera   IoT       Withings WBP02/WT9510            980
Socket            IoT       Efergy Ego                       342
Thermostat        IoT       Nest Learning Thermostat 3       6,353
TV                IoT       Samsung UA55J5500AKXXS           4,854
Smartwatch        IoT       LG Urban                         687
PC                Non-IoT   Dell Optiplex 9020               3,138
Laptop            Non-IoT   Lenovo X260                      4,907
Smartphone        Non-IoT   LG G2                            2,178
Smartphone        Non-IoT   Galaxy S4                        643

and port numbers, from SYN to FIN). Then, each session was represented by a vector of features from the network, transport, and application layers, and enriched with publicly available data such as Alexa Rank [1] and GeoIP [2].

Data Partitioning. After constructing the labeled dataset, we chronologically divided it into three mutually exclusive sets. The first set, denoted DS_s, is used for inducing a set of single-session classifiers. The second set, denoted DS_m, is used for optimizing the parameters of a multi-session classifier. As practiced in machine learning research, the third set was used as a test set (denoted DS_test) for evaluating our proposed method and deriving performance measures.

3. PROPOSED METHOD FOR IOT DEVICE IDENTIFICATION
We propose a multi-stage process in which a set of machine learning based classifiers are applied to a stream of sessions that originate from a specific device (i.e., a specific IP address). The goal is to determine whether the traffic belongs to a PC, a smartphone or a specific (known) IoT device.

3.1 Notation
The notation we use to describe our method and the means of evaluating it are summarized below.

D: Set {d_1, ..., d_n} of known IoT devices.
DS_s: Dataset for inducing single-session classifiers.
C_i: Single-session classifier for d_i, induced from DS_s.
tr_i: Optimal classification threshold for C_i.
DS_m: Dataset, sorted in chronological order, for inducing multi-session based classifiers.
DS_m^i: Subset of sessions in DS_m, origin device d_i.
DS_m^i[a]: The a-th session, originating from d_i in DS_m^i.
|DS_m^i|: The number of sessions in DS_m^i.
p_i^s: Posterior probability of a session s to originate from d_i; derived by applying C_i to session s.
s_i: The optimal (minimal) size of a sequence of sessions for which C_i classifies correctly most sessions in any sequence of sessions of size s_i in DS_m.
S^d: Sequence of sessions originating from device d.
C: Set {(C_1, tr_1, s_1), ..., (C_n, tr_n, s_n)} of single-session classifiers for devices in D with optimal thresholds tr_i and sequence sizes s_i.
DS_test: Dataset used for evaluating the proposed method (sorted in chronological order).
DS_test^i: Subset of DS_test, originating from device d_i.

3.2 Model Training
Let D be the set of known devices (i.e., devices that we want to be able to identify based on their traffic). Deriving the device identification model consists of the following steps.

Induce single-session binary classifier. For each d_i ∈ D we induce a single-session binary classifier, denoted by C_i, that given a feature vector of a single session (denoted by s), outputs a posterior probability p_i^s that the session was generated by device d_i. The single-session classifiers C are obtained using the DS_s dataset. For training C_i for device d_i we derive binary labels, such that all feature vectors of sessions that belong to d_i are labeled as d_i, and feature vectors of sessions that do not belong to d_i are labeled as other. Thus, given an unlabeled feature vector extracted from a session s, we can apply all single-session classifiers C to obtain a vector of posterior probabilities (p_1^s, ..., p_n^s).

Determine optimal thresholds for single-session classifiers. For each classifier C_i we determined the optimal classification threshold (cutoff value), denoted by tr_i, for labeling a given session s with probability p_i^s as d_i or other. We used the DS_m dataset to evaluate the performance of C, the set of single-session classifiers, and for setting the threshold values of the classifiers. Each optimal threshold tr_i was selected such that it maximizes the accuracy of classifier C_i.

Determine optimal session sequence size s_i for each classifier. Here we derive the optimal session sequence size s_i for each classifier C_i that is used for defining the multi-session classifier. First, for each IP (device) in DS_m we apply the set C of single-session classifiers to all session feature vectors for obtaining classifications. Then we utilize tr_i and DS_m to analyze the classification results of each optimized classifier. Afterwards we look for the minimal number of consecutive session classifications, based on which a majority vote will provide zero false positives and zero false negatives on the entire DS_m. We denote this number by s_i and refer to it as the optimal size of the moving window. The lower s_i is for a given d_i, the smaller number of consecutive sessions we need to accurately determine whether the sessions that emanated from an IP were generated by d_i or not. Algorithm 1 describes how s_i is calculated.

To conclude, for every device d_i we have a classifier C_i with threshold tr_i, and upon a majority voting on its s_i consecutive classifications we can determine whether sessions that emanated from a given IP were generated by d_i with 100% accuracy. Note that it was easy to differentiate between IoT devices and PCs and smartphones based on a single session (see Section 4), so the rest of the discussion in this section will focus on identifying the specific IoT device. Table 2 presents the performance of the single-session classifiers after being optimized with tr_i and their optimal s_i.

3.3 Application for Device Identification
Algorithm 2, our final classification algorithm, is based on C: the trained classifiers and their corresponding parameters (C_1, tr_1, s_1), ..., (C_n, tr_n, s_n). The classification algorithm receives a stream of session vectors that emanated from an IP and were generated by an unknown device. It checks if
the stream of sessions was generated by device d_i by classifying using C_i with s_i consecutive sessions, and checking whether most of the s_i sessions were classified as d_i. In order to optimize the search for the device, the device inspection order is determined by s_i, so the algorithm starts to inspect devices with the lowest s_i, and continues with ascending order of s_i. A possible modification is to take into account also the prior probability of a device being observed.

Algorithm 1: Calculating s_i
procedure findSiStar(D, DS_m, C_i)
    s_i ← 1
    for d_j in D do
        DS_m^j ← subset of DS_m with origin d_j
        a ← 1
        s ← 1
        while a + s − 1 <= |DS_m^j| do
            n ← 0
            for sess in {DS_m^j[a], ..., DS_m^j[a + s − 1]} do
                p_i^s ← classify(C_i, sess)
                if p_i^s > tr_i then
                    n ← n + 1
            if i = j and n > s/2 then
                a ← a + 1
            else
                a ← 1
                s ← s + 2
        if s_i < s then
            s_i ← s
    return s_i

Algorithm 2: IoT device classification
procedure classifyDevice(C, S^d)
    Sort C by ascending s_i
    for (C_i, tr_i, s_i) in C do
        a ← 1
        n ← 0
        while a + s_i − 1 <= |S^d| do
            for sess in {S^d[a], ..., S^d[a + s_i − 1]} do
                p_i^s ← classify(C_i, sess)
                if p_i^s ≥ tr_i then
                    n ← n + 1
            if n > s_i/2 then
                return d_i
            else
                a ← a + 1
    return unknown

Table 2: Single-session based classifier performance

IoT Device      tr     Method          FNR     FPR     s
Printer         0.35   GBM             0.3     0       11
Sec. Camera     0.5    Random Forest   0       0       1
Refrigerator    0.2    XGBoost         0.001   0.001   3
Motion Sensor   0.2    XGBoost         0.012   0       3
Baby Monitor    0.3    XGBoost         0.006   0       9
Thermostat      0.2    Random Forest   0.011   0.004   45
TV              0.1    GBM             0.026   0.001   23
Smartwatch      0.8    XGBoost         0.184   0       77
Socket          0.25   Random Forest   0       0       1

4. EVALUATION
We evaluate our method using the third dataset, DS_test. The results indicate that by analyzing network traffic we can distinguish between IPs that belong to IoT devices and IPs that belong to PCs and smartphones. Smartphones were classified by analyzing the user agent HTTP property, and thus the classification accuracy for smartphones was 100%. The classification of PCs was performed by classifying a session by a single-session classifier. The performance for PCs was almost perfect (a very low false positive rate of 0.003 and a very low false negative rate of 0.003).

Having accurately classified smartphones and PCs, we applied Algorithm 2 (IoT device classification) on DS_test for evaluating the performance for IoT device identification. We note that Algorithm 2 is optimized to derive the type of an IoT device by analyzing a minimal number of consecutive sessions. In a worst case scenario it needs to analyze max(s_i) consecutive sessions. To properly evaluate the performance of our method we reran Algorithm 2 multiple times, and each time we omitted the first session of the sequence from the previous run. This was done to compensate for a possible bias that may occur when the sequence begins with different sessions. Given dataset DS_test sorted in chronological order, let DS_test^i be a subset of sessions in DS_test originating from d_i, and let DS_test^i[a] be the a-th session originating from d_i in DS_test^i. We used DS_test for evaluating the proposed method, and for each device d_i ∈ D we repeated the evaluation by applying Algorithm 2 (i.e., the trained model) on all of the subsequences of the sessions in DS_test^i, starting from session a ∈ {1, ..., |DS_test^i| − s_i + 1} and ending at a + s_i − 1 (with maximal value a + s_i − 1 = |DS_test^i|). So, for each device d_i ∈ D we repeated the evaluation as follows:

for a in {1, ..., |DS_test^i| − s_i + 1} do
    s^d ← {DS_test^i[a], ..., DS_test^i[a + s_i − 1]}
    classifyDevice(C, s^d)

As seen in Table 3, the classification accuracy on DS_test was high. Out of 7,376 test cases (each defined by the first session in the sequence) 19 cases were misclassified and 34 were unclassified, thus the total accuracy was 99.281%. Note that the classification accuracy on DS_test was not 100%, so we executed Algorithm 1 (previously run on DS_m) once again, this time on DS_test. We then compared the s_i values obtained from DS_m to the s_i values obtained from DS_test. Classification accuracy measures on DS_test, plus the recalculated s_i values, are presented in Table 4. We note that the required s_i for perfect results for all devices in DS_test should be higher. For perfect results on DS_test we recommend using an s_i which is 4.333 times higher than the ones computed by Algorithm 1 on DS_m.

5. RELATED WORK
ergy consumption) and provide a very preliminary proof of concept of their algorithm based on simulations.

Table 3: Accuracy (Algorithm 2) on DS_test