Research on AMI communication intrusion detection combining KNN and optimized feature engineering

doi:10.3969/j.issn.1674-1951.2021.02.001

Abstract:

With the widespread use of Internet technology in smart grids, identifying intrusion attacks on power systems has become particularly important. Based on the communication network architecture of the advanced metering infrastructure (AMI) and the intrusion detection requirements of smart grids, this paper proposes an AMI communication intrusion detection scheme that combines the K-nearest neighbor (KNN) algorithm with optimized feature engineering. The scheme identifies intrusion attack traffic through four modules: data collection, data preprocessing, feature engineering and model training. In the feature engineering module, text feature extraction methods are used to optimize the features fed to the KNN model, and redundant feature vectors are removed according to their information gain values. In the model training module, the type of each sample under test is determined from the labels of its k nearest training instances. The scheme was tested on the public intrusion detection dataset ADFA-LD, and the detection accuracy for each type of intrusion attack was obtained. The experimental results show that the scheme significantly outperforms the traditional intrusion detection scheme: with the best-performing feature extraction method, classification accuracy improves by 21.96 percentage points (88.20% with N-gram features versus 66.24% for the traditional AAFID detection).

Key words: smart grid, intrusion detection, advanced metering infrastructure, KNN, feature engineering, ADFA-LD, Energy Internet, power information security
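
The feature engineering and model training steps described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each sample is a system-call trace (a list of integers, as in ADFA-LD), uses bigram counts as the text features, scores each n-gram by the information gain of its presence or absence, and classifies with Euclidean-distance KNN. The function names, the toy traces, `top_k=4` and `k=3` are all hypothetical choices made for the example.

```python
from collections import Counter
from math import log2, sqrt


def ngram_counts(trace, n=2):
    """Bag of n-grams over one system-call trace (a list of ints)."""
    return Counter(tuple(trace[i:i + n]) for i in range(len(trace) - n + 1))


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def information_gain(present, labels):
    """Information gain of a binary 'n-gram present' feature w.r.t. the labels."""
    gain = entropy(labels)
    for value in (True, False):
        subset = [y for p, y in zip(present, labels) if p == value]
        if subset:
            gain -= len(subset) / len(labels) * entropy(subset)
    return gain


def select_features(bags, labels, top_k):
    """Keep the top_k n-grams with the highest information gain."""
    vocab = sorted({gram for bag in bags for gram in bag})
    scored = [(information_gain([g in bag for bag in bags], labels), g) for g in vocab]
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [g for _, g in scored[:top_k]]


def vectorize(bag, features):
    """Count vector restricted to the selected n-grams."""
    return [bag.get(g, 0) for g in features]


def knn_predict(train_vecs, train_labels, query, k=3):
    """Majority vote among the k nearest training vectors (Euclidean distance)."""
    dists = sorted(
        (sqrt(sum((a - b) ** 2 for a, b in zip(vec, query))), label)
        for vec, label in zip(train_vecs, train_labels)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


# Toy traces (made-up system-call numbers, for illustration only).
train_traces = [
    [6, 11, 45, 33, 6, 11],
    [6, 11, 45, 33, 192, 6],
    [6, 11, 33, 45, 6, 11],
    [102, 102, 3, 102, 102, 3],
    [102, 3, 102, 102, 3, 102],
    [3, 102, 102, 3, 102, 102],
]
train_labels = ["normal"] * 3 + ["attack"] * 3

bags = [ngram_counts(t) for t in train_traces]
features = select_features(bags, train_labels, top_k=4)
train_vecs = [vectorize(bag, features) for bag in bags]


def classify(trace, k=3):
    """Classify a new trace with the features selected above."""
    return knn_predict(train_vecs, train_labels, vectorize(ngram_counts(trace), features), k)


print(classify([3, 102, 102, 3, 102, 3]))
print(classify([6, 11, 45, 6, 11, 33]))
```

Swapping `ngram_counts` for a unigram count or a TF-IDF weighting would give the other two text feature extraction variants compared in the paper; the selection and KNN steps would stay the same in this sketch.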

CLC number: TM732

LU Guanyu, TIAN Xiuxia, ZHANG Yue. Research on AMI communication intrusion detection combining KNN and optimized feature engineering[J]. Huadian Technology, 2021, 43(2): 1-8.

Figures and tables (11): Figure 1, Table 1, Figure 2, Table 2, Figure 3, Figure 4, Table 3, Table 4, Figure 5, Figure 6, Table 5


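Table 1
Classification criterion	Intrusion detection system type
Data source	Host-based; network-based; application-based
Architecture	Centralized; hierarchical; cooperative
Usage frequency	Online; offline
Response mechanism	Active response; passive response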

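Table 2
Feature	Description
Src_ip	Source IP address of the transmitted data
Des_ip	Destination IP address of the transmitted data
Src_port	Source port number of the transmitted data
Des_port	Destination port number of the transmitted data
Pack_length	Length of the transmitted data packet
Pack_frequency	Transmission frequency of data packets
Pack_ttl	TTL value of the data packet
Pack_command	Command carried by the transmitted data packet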
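
As an illustration of how the eight features above could be held in code, the following sketch defines a record type and a naive numeric encoding. The field types, the units noted in the comments, the `numeric_vector` method and the `command_codes` mapping are all assumptions made for this example; the source only gives the feature names and descriptions.

```python
from dataclasses import dataclass
from ipaddress import ip_address


@dataclass(frozen=True)
class PacketFeatures:
    """One AMI traffic record with the eight listed features (types assumed)."""
    src_ip: str
    des_ip: str
    src_port: int
    des_port: int
    pack_length: int        # assumed to be in bytes
    pack_frequency: float   # assumed to be packets per second
    pack_ttl: int
    pack_command: str

    def numeric_vector(self, command_codes):
        """Encode the record as floats.

        command_codes is a hypothetical mapping from command name to an
        integer code; IP addresses are encoded as their integer value.
        """
        return [
            float(int(ip_address(self.src_ip))),
            float(int(ip_address(self.des_ip))),
            float(self.src_port),
            float(self.des_port),
            float(self.pack_length),
            float(self.pack_frequency),
            float(self.pack_ttl),
            float(command_codes[self.pack_command]),
        ]


# Hypothetical record and command mapping, for illustration only.
packet = PacketFeatures("10.0.0.1", "10.0.0.2", 1024, 4059, 64, 2.5, 64, "read")
vector = packet.numeric_vector({"read": 1, "write": 2})
print(vector)
```

Because KNN relies on distances, a raw encoding like this would normally be rescaled (or the categorical fields encoded differently) before training, so that large-valued fields such as the IP integers do not dominate the distance.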

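Table 3
Data type	Label type	Number of samples
Training_Data	Training data	833
Validation_Data	Validation data	4 373
Adduser	Attack data	91
Hydra_FTP	Attack data	162
Hydra_SSH	Attack data	148
Java_Meterpreter	Attack data	125
Meterpreter	Attack data	75
Web_Shell	Attack data	118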

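Table 4
Evaluation metric	Meaning
N_TP	Number of intrusion attack traffic samples that the scheme detects as positive
N_TN	Number of normal traffic samples that the scheme detects as negative
N_FP	Number of normal traffic samples that the scheme detects as positive
N_FN	Number of intrusion attack traffic samples that the scheme detects as negative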
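
The four counts above are the entries of a binary confusion matrix. A minimal sketch of how detection accuracy is conventionally computed from them follows; it assumes the paper uses the standard definition, (N_TP + N_TN) divided by the total number of samples, and the example counts are hypothetical.

```python
def accuracy(n_tp, n_tn, n_fp, n_fn):
    """Fraction of all samples classified correctly (standard definition)."""
    total = n_tp + n_tn + n_fp + n_fn
    if total == 0:
        raise ValueError("at least one sample is required")
    return (n_tp + n_tn) / total


# Hypothetical counts, not taken from the paper.
example_accuracy = accuracy(n_tp=90, n_tn=80, n_fp=20, n_fn=10)
print(f"{example_accuracy:.2%}")
```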

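Table 5
Scheme	Feature extraction method	Detection accuracy/%
Proposed scheme	BoW	87.77
Proposed scheme	TF-IDF	85.31
Proposed scheme	N-gram	88.20
Traditional AAFID detection		66.24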
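
The 21.96% improvement quoted in the abstract can be checked against the accuracies in this table. The short sketch below computes each method's gain over the AAFID baseline in percentage points; the variable and function names are illustrative only.

```python
# Detection accuracies in percent, copied from the comparison table.
accuracy_percent = {"BoW": 87.77, "TF-IDF": 85.31, "N-gram": 88.20}
baseline_percent = 66.24  # traditional AAFID detection


def gains_over_baseline(scores, baseline):
    """Absolute gain of each method over the baseline, in percentage points."""
    return {name: round(score - baseline, 2) for name, score in scores.items()}


gains = gains_over_baseline(accuracy_percent, baseline_percent)
best_method = max(accuracy_percent, key=accuracy_percent.get)
print(best_method, gains[best_method])
```

The largest gain belongs to the N-gram variant and matches the figure in the abstract, which indicates that the reported improvement is an absolute difference in accuracy rather than a relative one.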