Integrated Intelligent Energy, 2025, Vol. 47, Issue (3): 62-72. doi: 10.3969/j.issn.2097-0706.2025.03.006

• AI-Based Dispatch of New Power Systems •

Detection and repair of abnormal load data of public buildings based on MDLOF-iForest and M-KNN-Slope

LIU Yining1, CHEN Baian1, DU Pengcheng1, LIN Xiaogang2, JIANG Meihui1,3,*

  1. School of Electrical Engineering, Guangxi University, Nanning 530004, China
  2. Quanzhou Institute of Equipment Manufacturing, Haixi Institute, Chinese Academy of Sciences, Quanzhou 362000, China
  3. School of Renewable Energy, Inner Mongolia University of Technology, Ordos 017010, China
  • Received: 2024-12-31 Revised: 2025-02-11 Accepted: 2025-03-05 Published: 2025-03-25
  • Corresponding author: *JIANG Meihui (born 1994), female, lecturer, Ph.D., researching integrated energy and integrated wind-solar-storage technology, meihuijiang@yeah.net
  • About the authors: LIU Yining (born 2001), male, master's student, researching smart energy systems, 2412392073@st.gxu.edu.cn
    LIN Xiaogang (born 1990), male, teaching assistant, Ph.D., researching integrated energy and automatic control, xg_lin_nuaa@126.com
  • Supported by:
    National Natural Science Foundation of China (52307072)

Abstract:

In research on the energy consumption of public buildings, detecting and repairing abnormal load values is an indispensable data-processing step. To address the limitations of existing methods, a detection and repair method for abnormal load data is proposed, based on the Mahalanobis distance-based local outlier factor-isolation forest (MDLOF-iForest) algorithm and the modified K-nearest neighbors-slope (M-KNN-Slope) algorithm. The MDLOF-iForest algorithm introduces the Mahalanobis distance into the traditional local outlier factor algorithm, which improves the model's ability to perceive correlations between data features, and combines the strengths of the MDLOF and iForest algorithms to detect abnormal data quickly and accurately. The M-KNN-Slope algorithm finds neighbors whose load trend-line characteristics are similar between abnormal and normal data, computes the weighted average of the slopes of these similar trend lines, and uses it to repair the abnormal data, which reduces reliance on sample data. The method was verified on load data from one office building and one commercial building in Nanning, covering August to November 2024. About 90% of the repaired values differed from the correct values by less than 10%, and compared with conventional algorithms, the M-KNN-Slope algorithm yielded more values with errors within 5%. Extreme gradient boosting, long short-term memory network, backpropagation neural network, and support vector machine models were then used for prediction with the data before and after repair. After repair, the root mean square error decreased by 5.02% to 17.83% and the mean absolute error decreased by 2.44% to 13.34%.

Key words: energy consumption of public buildings, load dataset, abnormal data detection, abnormal data repair, Mahalanobis distance-based local outlier factor-isolation forest, modified K-nearest neighbors-slope
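
The detection idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes scikit-learn is available, uses the library's standard local outlier factor with a Mahalanobis metric as a stand-in for MDLOF, and combines it with an isolation forest by flagging only the points that both detectors mark as outliers. The function name `detect_abnormal`, the parameter values, and the intersection rule are all assumptions made for illustration; the abstract does not state how the two algorithms are combined.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor


def detect_abnormal(X, n_neighbors=20, contamination=0.05, seed=0):
    """Return a boolean mask of rows flagged by both detectors.

    X is an (n_samples, n_features) array, for example load plus
    correlated features. All defaults here are illustrative assumptions.
    """
    X = np.asarray(X, dtype=float)

    # Inverse covariance matrix defines the Mahalanobis distance, which
    # accounts for correlation between features (pinv for stability).
    inv_cov = np.linalg.pinv(np.cov(X, rowvar=False))

    # Local outlier factor computed with Mahalanobis distances.
    lof = LocalOutlierFactor(
        n_neighbors=n_neighbors,
        algorithm="brute",
        metric="mahalanobis",
        metric_params={"VI": inv_cov},
        contamination=contamination,
    )
    lof_flag = lof.fit_predict(X) == -1  # -1 marks an outlier

    # Isolation forest on the same features.
    forest = IsolationForest(contamination=contamination, random_state=seed)
    forest_flag = forest.fit_predict(X) == -1

    # Assumed combination rule: a point is abnormal only if both agree.
    return lof_flag & forest_flag
```

Calling `detect_abnormal(X)` on a feature matrix returns one boolean per row. Requiring agreement between the two detectors trades some recall for fewer false alarms; a union or a weighted score would be equally plausible choices under the abstract's description.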

CLC number: