A wide range of traffic classification approaches has been proposed in the last few years by the scientific community. However, complete classification architectures that operate in real time directly on high-capacity links remain scarce.

On this page you can download the implementation of the classification procedure of two machine-learning techniques (Naive Bayes with kernel density estimation and single-class SVM) and of a port-based classifier that uses the port-application mapping file of CoralReef (developed by CAIDA). The two machine-learning approaches were chosen to represent two extremes in terms of computational complexity: Naive Bayes adopts simple protocol models and has a low computational cost, while the SVM-based classifier employs more accurate models and has higher computational requirements. The classifiers are built on the CoMo project infrastructure.
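To make the low-complexity end of this comparison concrete, here is a minimal sketch of Naive Bayes classification with a Gaussian kernel density estimate per feature. It is not the code distributed on this page: the function names, the in-memory model layout (`model[cls][i]` as a list of training samples), the priors and the fixed bandwidth are all assumptions made for illustration.

```python
import math

def kde_log_density(x, samples, bandwidth):
    """Log of a Gaussian kernel density estimate at x from 1-D training samples."""
    norm = len(samples) * bandwidth * math.sqrt(2.0 * math.pi)
    total = sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)
    # A small floor avoids log(0) when x is far from every training sample.
    return math.log(max(total / norm, 1e-300))

def naive_bayes_kde_classify(features, model, priors, bandwidth=20.0):
    """Return the class with the highest log-posterior score.

    model[cls][i] holds the training samples of feature i for class cls
    (hypothetical layout, chosen only for this sketch).
    """
    best_cls, best_score = None, -math.inf
    for cls, per_feature_samples in model.items():
        # Naive Bayes: features are treated as independent given the class,
        # so the per-feature log-densities are simply added to the log-prior.
        score = math.log(priors[cls])
        for x, samples in zip(features, per_feature_samples):
            score += kde_log_density(x, samples, bandwidth)
        if score > best_score:
            best_cls, best_score = cls, score
    return best_cls
```

The cost per flow grows with the number of training samples kept per feature, which is one reason a real implementation would likely precompute or compress the density model rather than store raw samples as this sketch does.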

The sources available for download on this page are presented in A. Este, F. Gringoli and L. Salgarelli, "On-line SVM traffic classification", TRAC 2011, 2nd International Workshop on TRaffic Analysis and Classification, Jul. 5-8, 2011 (to appear).

For an overview of the CoMo project, please refer to the CoMo web page.

Download

CoMo: como-1.5
Modification to CoMo: como.patch.

Feature module to extract features from flows (packets are grouped according to the uni-directional definition of flow).
Port-based module to classify flows with the port-based procedure.
Bayes classification module to classify flows with the Naive Bayes algorithm.
SVM classification module to classify flows with the SVM algorithm.
Example of classifier showing the functions to be implemented to add a new classifier.
README file with notes for compiling and using the files.

The port-based classifier requires the Application_ports_Master.txt port-application mapping file of the CoralReef tool.
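As an illustration of what a port-based verdict amounts to, the sketch below looks up a flow's transport protocol and ports in a mapping table. The table here is a tiny hand-written stand-in using well-known port assignments; it does not reproduce the contents or the format of Application_ports_Master.txt, and the function name and the choice of checking the destination port first are assumptions.

```python
# Hypothetical (protocol, port) -> application table. The real mapping comes
# from Application_ports_Master.txt, whose format is not reproduced here.
PORT_MAP = {
    ("tcp", 80): "http",
    ("tcp", 25): "smtp",
    ("udp", 53): "dns",
}

def classify_by_port(protocol, src_port, dst_port, port_map=PORT_MAP):
    """Return the application mapped to either port, or "unknown"."""
    # Assumption: the destination port is checked first, then the source port.
    for port in (dst_port, src_port):
        app = port_map.get((protocol, port))
        if app is not None:
            return app
    return "unknown"
```

Because the flows are uni-directional, the server port may appear as either the source or the destination port, which is why the sketch checks both.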
The Bayesian and SVM classifiers require models obtained during a training phase. An example of the model file format for the Bayesian classifier and an example of a model file for the SVM classifier are available; we obtained them after a training phase on the features of flows extracted from CAIDA traces (year 2002, pre-classified by ports). The set of features includes the packet size (IP-level) of the first six packets, the transport protocol and the transport ports.
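The feature set described above can be sketched as follows: packets are grouped into uni-directional flows by their 5-tuple, and for each flow the IP-level sizes of the first six packets are kept together with the transport protocol and ports. This is only an illustration, not the CoMo feature module; the packet dictionary keys (`src_ip`, `dst_ip`, `src_port`, `dst_port`, `proto`, `ip_len`) and the output layout are hypothetical.

```python
from collections import defaultdict

N_PACKETS = 6  # the feature set uses the first six packets of each flow

def extract_flow_features(packets):
    """Group packets into uni-directional flows and build their feature records.

    packets: iterable of dicts in capture order, with the hypothetical keys
    src_ip, dst_ip, src_port, dst_port, proto, ip_len.
    """
    sizes = defaultdict(list)
    for p in packets:
        # Uni-directional flow: the two directions of a connection get
        # different keys because source and destination are not swapped.
        key = (p["src_ip"], p["dst_ip"], p["src_port"], p["dst_port"], p["proto"])
        if len(sizes[key]) < N_PACKETS:
            sizes[key].append(p["ip_len"])
    return {
        key: {"proto": key[4], "src_port": key[2], "dst_port": key[3], "pkt_sizes": s}
        for key, s in sizes.items()
    }
```

A flow with fewer than six packets simply yields a shorter `pkt_sizes` list here; how the actual module handles short flows is not stated on this page.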

Example

The following figure shows an example of the output. Each line corresponds to a flow, identified by its starting timestamp, source IP, destination IP, source port and destination port. We also report the duration and the number of packets and bytes that the system had received up to the time the flow was stored with the classification verdicts. The last three columns report the classification output of the port-based, single-class SVM and Naive Bayes classifiers. The evaluated trace is an anonymized CAIDA trace (year 2002), so the source and destination IP addresses shown in the figure are not the original values.

The SVM and Bayes classifiers create one thread for each protocol, so that they can exploit multi-core architectures efficiently.
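The one-thread-per-protocol structure can be sketched as below, using single-class SVM models as the example. Each protocol has its own model, and a flow is scored against all of them concurrently. The decision value is the standard one-class SVM form (a weighted sum of kernel evaluations against the support vectors, minus an offset); the model layout, the RBF kernel choice, the rule of picking the highest non-negative score and the "unknown" label are assumptions, not details taken from the distributed code. In standard CPython, threads would not speed up this CPU-bound computation, so the sketch only illustrates the structure.

```python
import math
from concurrent.futures import ThreadPoolExecutor

def rbf_kernel(a, b, gamma):
    """Gaussian (RBF) kernel between two feature vectors."""
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))

def one_class_decision(features, model):
    """One-class SVM decision value: sum_i alpha_i * K(sv_i, x) - rho."""
    total = sum(
        alpha * rbf_kernel(sv, features, model["gamma"])
        for alpha, sv in zip(model["alphas"], model["support_vectors"])
    )
    return total - model["rho"]

def classify_flow(features, models):
    """Score a flow against every protocol model, one worker thread per model.

    models: dict mapping protocol name -> hypothetical model dict with keys
    gamma, rho, alphas, support_vectors. Must not be empty.
    """
    names = list(models)
    with ThreadPoolExecutor(max_workers=len(names)) as pool:
        scores = list(
            pool.map(lambda n: one_class_decision(features, models[n]), names)
        )
    best_score, best_name = max(zip(scores, names))
    # Assumption: a flow rejected by every single-class model stays unclassified.
    return best_name if best_score >= 0.0 else "unknown"
```

This sketch creates the worker pool for every flow to keep the example short; long-lived per-protocol threads would avoid that overhead.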