Efficient Cyber Attack Detection in Industrial Control Systems Using Lightweight Neural Networks and PCA
Paper link: Efficient Cyber Attack Detection in Industrial
Abstract
Industrial control systems (ICSs) are widely used and vital to industry and society. Their failure can have severe impact on both the economy and human life. Hence, these systems have become an attractive target for physical and cyber attacks alike. In this paper, we examine an attack detection method based on simple and lightweight neural networks, namely, 1D convolutional neural networks and autoencoders. We apply these networks to both the time and frequency domains of the data and discuss the pros and cons of each representation approach. The suggested method is evaluated on three popular public datasets, and detection rates matching or exceeding previously published detection results are achieved, while demonstrating a small footprint, short training and detection times, and generality. We also show the effectiveness of PCA, which, given proper data preprocessing and feature selection, can provide high attack detection rates in many settings. Finally, we study the proposed method’s robustness against adversarial attacks that exploit inherent blind spots of neural networks to evade detection while achieving their intended physical effect. Our results show that the proposed method is robust to such evasion attacks: in order to evade detection, the attacker is forced to sacrifice the desired physical impact on the system.
Introduction
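Industrial control systems (ICSs), also known as supervisory control and data acquisition (SCADA) systems, are both critical and highly vulnerable to attack. As remote connectivity has spread, cyber threats have multiplied. Modeling system behavior with ML is a fairly general approach, and it currently comes in three flavors: supervised, semi-supervised, and unsupervised. Supervised methods require labeling, which is time-consuming, labor-intensive, and subject to various limitations, so semi-supervised and unsupervised methods have attracted much of the research.
However, this prior work has several problems. First, it is typically evaluated on a single dataset. Second, it almost never addresses proper preprocessing of the input data or feature selection: some studies pick a subset of features, others apply a different detection mechanism to each feature, and there is no unified, systematic, quantitative criterion for feature selection. Third, it is not generic: it includes no frequency-domain analysis and does not consider adversarial attacks.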
Problem Description
Threat model: an attacker that can only falsify some of the information (forging sensory data, sending commands to actuators, controlling network traffic). The attacker also knows that our attack detection (AD) mechanism exists and wants to evade it.
The paper poses the following research questions:
- Can we propose an effective and accurate ICS anomaly detection method based on lightweight neural networks or PCA? (based on lightweight neural networks or PCA)
- Is the proposed method generic and effective across multiple environments and datasets? (generality: multiple datasets, multiple environments)
- What quantitative criteria should be used for anomaly detection feature selection? (choice of quantitative criteria)
- Does detection in the frequency domain provide any benefits (i.e., a better detection rate with fewer false alarms) compared to detection in the time domain? (whether the frequency domain outperforms the time domain)
- How robust are the proposed NN architectures to adversarial machine learning attacks? (robustness, adversarial attacks)
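Kolmogorov–Smirnov test (K-S test)
See this blog post: Kolmogorov–Smirnov test (K-S test)
Broadly, the test is used to tell whether the test data and the training data differ significantly in distribution.
If $D_{n}$ is close to 0, the two distributions closely agree.
My note: put simply, for a given sample, pick a reference value and count what fraction of the observations are less than or equal to it; treating that reference value as x turns this fraction into a function, the empirical distribution function. $D_{n}$ then measures how far apart two such distributions are.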
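The note above corresponds to the standard empirical distribution function. The symbols $x_{i}$ (the observations), $n$ (the sample size), and the indicator $\mathbf{1}[\cdot]$ are notation introduced here for illustration, not taken from the paper:

$$F_{n}(x)=\frac{1}{n}\sum_{i=1}^{n}\mathbf{1}[x_{i}\le x]$$

For example, for the sample 1, 3, 5, 7 we get $F_{4}(4)=2/4=0.5$, because two of the four observations are at most 4.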
This paper uses a modified variant, K-S*. It replaces $D_{n}=\sup\limits_{x}|F_{n}(x)-F(x)|$ with:
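Validation on real datasets shows that this scales down the statistic for features whose distributions deviate only slightly, which makes the result more stable.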
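In Python, the ks_2samp function in scipy.stats runs this test.
In its result, statistic is the $D_{n}$ described below, and pvalue is the probability, under the null hypothesis, of observing a statistic at least that large. Note that pvalue is not the critical value $D_{crit,0.05}$ itself: comparing pvalue with 0.05 plays the same role as comparing $D_{n}$ with $D_{crit,0.05}$ below.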
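A minimal sketch of calling `ks_2samp`, assuming SciPy is installed. The synthetic `train`, `test_same`, and `test_shifted` samples and the 0.05 threshold are illustrative assumptions, not data or settings from the paper:

```python
import random

from scipy.stats import ks_2samp

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical values of a single feature (synthetic, for illustration only).
train = [rng.gauss(0.0, 1.0) for _ in range(500)]
test_same = [rng.gauss(0.0, 1.0) for _ in range(500)]     # same distribution as train
test_shifted = [rng.gauss(2.0, 1.0) for _ in range(500)]  # mean shifted by 2

same = ks_2samp(train, test_same)
shifted = ks_2samp(train, test_shifted)

for name, res in (("same", same), ("shifted", shifted)):
    # A small p-value means the two samples are unlikely to share a distribution.
    verdict = "reject H0" if res.pvalue < 0.05 else "do not reject H0"
    print(f"{name}: statistic={res.statistic:.3f}, pvalue={res.pvalue:.3g} -> {verdict}")
```

Here H0 is the hypothesis that both samples come from the same distribution. The shifted sample should give a much larger statistic and a much smaller p-value than the unshifted one.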
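The mathematical explanation is:
Two-sample KS test
Given two samples, test whether they follow the same distribution.
H0: the two samples have the same distribution
H1: the two samples have different distributions
Again, an example helps. Suppose we have the following two samples:
X: 1.2, 1.4, 1.9, 3.7, 4.4, 4.8, 9.7, 17.3, 21.1, 28.4
Y: 5.6, 6.5, 6.6, 6.9, 9.2, 10.4, 10.6, 19.3
First pool the two samples and sort them, then compute the cumulative empirical CDF of each sample.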
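The K-S statistic (test statistic) is still the largest gap between two cumulative distributions, here the empirical CDFs of X and Y: $D_{n}=\max\limits_{x}|F_{X}(x)-F_{Y}(x)|$
Here $D_{n}=0.6$, attained at $x=4.8$, where $F_{X}(x)=6/10$ and $F_{Y}(x)=0$.
For two samples, the critical value at the 95% confidence level is:
$D_{crit,0.05}=1.36\sqrt{\frac{1}{n_{x}}+\frac{1}{n_{y}}}=1.36\sqrt{\frac{1}{10}+\frac{1}{8}}=0.645$
Since 0.6 < 0.645, we do not reject the null hypothesis.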
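The worked example can be checked by hand with a short standard-library sketch. The helper names `ecdf` and `ks_two_sample` are hypothetical, introduced only for this illustration:

```python
import math
from bisect import bisect_right

X = [1.2, 1.4, 1.9, 3.7, 4.4, 4.8, 9.7, 17.3, 21.1, 28.4]
Y = [5.6, 6.5, 6.6, 6.9, 9.2, 10.4, 10.6, 19.3]


def ecdf(sorted_sample, x):
    """Fraction of observations in sorted_sample that are <= x."""
    return bisect_right(sorted_sample, x) / len(sorted_sample)


def ks_two_sample(a, b):
    """Largest absolute gap between the empirical CDFs of a and b."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    # Both ECDFs are step functions, so the maximum gap occurs at a sample point.
    return max(
        abs(ecdf(a_sorted, x) - ecdf(b_sorted, x)) for x in a_sorted + b_sorted
    )


d_n = ks_two_sample(X, Y)
d_crit = 1.36 * math.sqrt(1 / len(X) + 1 / len(Y))
reject_h0 = d_n > d_crit

print(f"D_n = {d_n:.3f}, D_crit = {d_crit:.3f}, reject H0: {reject_h0}")
```

The constant 1.36 is the 0.05-level coefficient used in the formula above. A different significance level would need a different coefficient.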