Huadian Technology, 2021, Vol. 43, Issue (2): 1-8. doi: 10.3969/j.issn.1674-1951.2021.02.001

• Power Data Security •

Research on AMI communication intrusion detection combining KNN and optimized feature engineering

LU Guanyu, TIAN Xiuxia*, ZHANG Yue

  1. School of Computer Science and Technology, Shanghai University of Electric Power, Shanghai 200090, China
  • Received: 2020-08-04 Revised: 2021-01-23 Online: 2021-02-25 Published: 2021-03-05
  • Corresponding author: TIAN Xiuxia
  • About the authors: LU Guanyu (born 1997), male, from Dezhou, Shandong, is a master's student working on power terminal security and machine-learning-based detection of anomalous power grid traffic (E-mail: lgy18521035808@163.com). ZHANG Yue (born 1998), female, from Suzhou, Anhui, is a master's student working on information security and access control (E-mail: zy17621817238@163.com).
  • Funding:
    General Program of the National Natural Science Foundation of China (61772327); Key Program of the National Natural Science Foundation of China (61532021)

Abstract:

As internet technologies are widely adopted in smart grids, identifying intrusion attacks on power systems has become particularly important. Based on the communication network architecture of the Advanced Metering Infrastructure (AMI) and the intrusion detection requirements of smart grids, an AMI communication intrusion detection scheme is proposed that combines the K-nearest neighbor (KNN) algorithm with optimized feature engineering. The scheme identifies intrusion attack traffic through four modules: data collection, data preprocessing, feature engineering, and model training. In the feature engineering module, text feature extraction methods are used to optimize the features fed to the KNN model, and redundant feature vectors are removed based on their information gain values. In the model training module, the type of each sample under test is determined from the labels of its k nearest training instances. The scheme was tested on the public intrusion detection dataset ADFA-LD, and the detection accuracy for each class of intrusion attack was obtained. The experimental results show that the scheme significantly outperforms traditional intrusion detection schemes, with the model's classification accuracy improving by 21.96% under the best feature extraction setting.

Key words: smart grid, intrusion detection, Advanced Metering Infrastructure, KNN, feature engineering, ADFA-LD, Energy Internet, power information security
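The abstract names the feature engineering and model training steps but does not give their implementation. The following is a minimal sketch of how such a pipeline might look, under several assumptions that are not taken from the paper: each sample is a system-call trace (a list of integer call IDs, as in ADFA-LD), "text feature extraction" is approximated by counting bigrams, information gain is computed over the presence or absence of each bigram, and KNN uses Euclidean distance with a majority vote. All function names, the toy traces, and the parameter values (`n=2`, `top_k=4`, `k=3`) are illustrative only.

```python
import math
from collections import Counter


def ngram_counts(trace, n=2):
    """Count the n-grams of system-call IDs in one trace (a list of ints)."""
    return Counter(tuple(trace[i:i + n]) for i in range(len(trace) - n + 1))


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def information_gain(present, labels):
    """Information gain of a binary feature (n-gram present or absent)."""
    gain = entropy(labels)
    for value in (True, False):
        subset = [y for p, y in zip(present, labels) if p == value]
        if subset:
            gain -= len(subset) / len(labels) * entropy(subset)
    return gain


def select_features(count_vectors, labels, top_k):
    """Keep the top_k n-grams ranked by information gain (ties by n-gram)."""
    vocab = sorted(set().union(*count_vectors))
    scored = [(information_gain([g in cv for cv in count_vectors], labels), g)
              for g in vocab]
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [g for _, g in scored[:top_k]]


def vectorize(counts, features):
    """Project one trace's n-gram counts onto the selected features."""
    return [counts[g] for g in features]


def knn_predict(train_X, train_y, x, k=3):
    """Label x by majority vote among its k nearest training vectors."""
    nearest = sorted((math.dist(row, x), y) for row, y in zip(train_X, train_y))
    votes = Counter(y for _, y in nearest[:k])
    return votes.most_common(1)[0][0]


# Toy traces (hypothetical system-call IDs, not real ADFA-LD data).
traces = [
    [3, 4, 5, 3, 4, 5], [3, 4, 5, 6, 3, 4], [4, 5, 3, 4, 5, 6],
    [7, 8, 7, 8, 9, 7], [7, 8, 9, 9, 7, 8], [8, 7, 8, 9, 7, 8],
]
labels = ["normal"] * 3 + ["attack"] * 3

counts = [ngram_counts(t) for t in traces]
features = select_features(counts, labels, top_k=4)
train_X = [vectorize(c, features) for c in counts]

unknown = vectorize(ngram_counts([7, 8, 9, 7, 8, 9]), features)
print(features, knn_predict(train_X, labels, unknown))
```

Ranking by information gain before classification keeps the KNN distance computation focused on n-grams that actually separate the classes, which is the role the abstract assigns to removing redundant features.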

CLC number: