Distributed Anomaly Detection using 1-class SVM for Vertically Partitioned Data

Shared by Kanishka Bhaduri, updated on May 05, 2011


Author(s): Kamalika Das, Kanishka Bhaduri, Petr Votava

There has been a tremendous increase in the volume of sensor data collected over the last decade
for different monitoring tasks. For example, petabytes of earth science data are collected from modern
satellites, in-situ sensors, and different climate models. Similarly, huge amounts of flight operational data
are downloaded for different commercial airlines. These different types of datasets need to be analyzed
for finding outliers. Information extraction from such rich data sources using advanced data mining
methodologies is a challenging task not only due to the massive volume of data, but also because these
datasets are physically stored at different geographical locations with only a subset of features available
at any location. Moving these petabytes of data to a single location can consume a large amount of bandwidth.
To solve this problem, we present in this paper a novel algorithm that can identify outliers in the
entire data without moving all the data to a single location. The method we propose only centralizes
a very small sample from the different data subsets at different locations. We analytically prove and
experimentally verify that the algorithm offers high accuracy compared to complete centralization with
only a fraction of the communication cost. We show that our algorithm is highly relevant to both earth
sciences and aeronautics by describing applications in these domains. The performance of the algorithm
is demonstrated on two large publicly available datasets: (1) the NASA MODIS satellite images and (2) a
simulated aviation dataset generated by the ‘Commercial Modular Aero-Propulsion System Simulation’ (CMAPSS).
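
The idea of centralizing only a small sample from vertically partitioned data can be illustrated with a minimal sketch. It is not the paper's algorithm: a simple z-score detector stands in for the 1-class SVM, and the function names (`local_candidates`, `distributed_outlier_candidates`), the per-site budget `k`, and the synthetic data are all hypothetical. Each site scores rows using only its own features and reports a handful of candidate row indices; only the union of those rows would need to be moved.

```python
import random
import statistics


def local_candidates(columns, k):
    """Return indices of the k most unusual rows at one site.

    Stand-in detector (assumption): largest absolute z-score over the
    site's own feature columns. The paper uses a 1-class SVM instead.
    """
    n = len(columns[0])
    scores = [0.0] * n
    for col in columns:
        mu = statistics.fmean(col)
        sd = statistics.pstdev(col) or 1.0  # guard against constant columns
        for i, x in enumerate(col):
            scores[i] = max(scores[i], abs(x - mu) / sd)
    return sorted(range(n), key=lambda i: scores[i], reverse=True)[:k]


def distributed_outlier_candidates(sites, k):
    """Union the per-site candidates; only these rows would be centralized."""
    union = set()
    for columns in sites:
        union.update(local_candidates(columns, k))
    return sorted(union)


# Synthetic, hypothetical data: 1000 rows, two sites with two features each.
rng = random.Random(0)
n_rows = 1000
sites = [
    [[rng.gauss(0, 1) for _ in range(n_rows)] for _ in range(2)]
    for _ in range(2)
]
sites[0][0][17] = 12.0    # planted anomaly visible only at site 0
sites[1][1][402] = -15.0  # planted anomaly visible only at site 1

candidates = distributed_outlier_candidates(sites, k=5)
fraction_moved = len(candidates) / n_rows  # share of rows sent to one place
```

In this toy setting at most 10 of the 1000 rows are candidates for centralization, which mirrors the abstract's point about communication cost without reproducing its accuracy guarantees.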

Publication Name: Distributed Anomaly Detection using 1-class SVM for Vertically Partitioned Data
Publication Location: Statistical Analysis and Data Mining Journal
Year Published

