Integrated Intelligent Energy ›› 2025, Vol. 47 ›› Issue (3): 62-72. doi: 10.3969/j.issn.2097-0706.2025.03.006

• New Power System Scheduling based on AI •

Detection and repair of abnormal load data of public buildings based on MDLOF-iForest and M-KNN-Slope

LIU Yining1, CHEN Baian1, DU Pengcheng1, LIN Xiaogang2, JIANG Meihui1,3,*

    1. School of Electrical Engineering, Guangxi University, Nanning 530004, China
    2. Quanzhou Institute of Equipment Manufacturing, Haixi Institute, Chinese Academy of Sciences, Quanzhou 362000, China
    3. School of Renewable Energy, Inner Mongolia University of Technology, Ordos 017010, China
  • Received: 2024-12-31 Revised: 2025-02-11 Accepted: 2025-03-05 Published: 2025-03-25
  • Contact: JIANG Meihui E-mail: 2412392073@st.gxu.edu.cn; xg_lin_nuaa@126.com; meihuijiang@yeah.net
  • Supported by:
    National Natural Science Foundation of China (52307072)

Abstract:

In research on the energy consumption of public buildings, the detection and repair of abnormal load data is an indispensable part of data processing. To address the limitations of existing methods, a method based on the Mahalanobis distance-based local outlier factor-isolation forest (MDLOF-iForest) algorithm and the modified K-nearest neighbors-slope (M-KNN-Slope) algorithm was proposed. The MDLOF-iForest algorithm incorporated the Mahalanobis distance into the traditional local outlier factor algorithm, improving the model's ability to perceive correlations between data features. Meanwhile, by combining the advantages of the MDLOF and iForest algorithms, it enabled rapid and accurate detection of abnormal data. The M-KNN-Slope algorithm identified neighbors for which the load trend lines of abnormal and normal data had similar characteristics, and used the weighted average of the slopes of these similar trend lines to repair the abnormal data, reducing reliance on sample data. Verification was conducted using load data from a public office building and a public commercial building in Nanning, collected from August to November 2024. The results showed that approximately 90% of the repaired data differed from the correct data by less than 10%. Compared with conventional algorithms, the M-KNN-Slope algorithm yielded more data with errors within 5%. Extreme gradient boosting, long short-term memory network, backpropagation neural network, and support vector machine models were used for prediction with the data before and after repair. After repair, the root mean square errors decreased by 5.02% to 17.83%, and the mean absolute errors decreased by 2.44% to 13.34%.
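
To make the detection idea concrete, the following is a minimal sketch of a local outlier factor computed with the Mahalanobis distance, i.e. only the MDLOF half of the detector. It is an illustration under stated assumptions, not the authors' implementation: the two-feature toy data, the helper names (`inverse_covariance_2d`, `mahalanobis`, `lof_scores`), and the choice k = 3 are all hypothetical, and the isolation forest stage and the way the two detectors are combined are not sketched because the abstract does not specify them.

```python
import math

def inverse_covariance_2d(points):
    """Inverse of the 2x2 sample covariance matrix of 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    det = sxx * syy - sxy * sxy  # assumed non-zero (features not perfectly collinear)
    return ((syy / det, -sxy / det), (-sxy / det, sxx / det))

def mahalanobis(a, b, vi):
    """Mahalanobis distance between 2-D points a and b, given inverse covariance vi."""
    dx, dy = a[0] - b[0], a[1] - b[1]
    q = dx * (vi[0][0] * dx + vi[0][1] * dy) + dy * (vi[1][0] * dx + vi[1][1] * dy)
    return math.sqrt(max(q, 0.0))

def lof_scores(points, k, dist):
    """Standard LOF with a pluggable distance; ties in the k-neighborhood are ignored."""
    n = len(points)
    d = [[dist(points[i], points[j]) for j in range(n)] for i in range(n)]
    neighbors, k_dist = [], []
    for i in range(n):
        order = sorted((j for j in range(n) if j != i), key=lambda j: d[i][j])
        neighbors.append(order[:k])
        k_dist.append(d[i][order[k - 1]])
    # Local reachability density: inverse of the mean reachability distance.
    lrd = []
    for i in range(n):
        reach = [max(k_dist[j], d[i][j]) for j in neighbors[i]]
        lrd.append(len(reach) / sum(reach))
    # LOF: mean neighbor density divided by the point's own density.
    return [sum(lrd[j] for j in neighbors[i]) / (k * lrd[i]) for i in range(n)]

# Hypothetical toy data: two strongly correlated features plus one point that
# lies inside both marginal ranges but breaks the correlation.
points = [(float(i), 2.0 * i + ((i * 7) % 5 - 2) * 0.05) for i in range(20)]
points.append((5.0, 30.0))

vi = inverse_covariance_2d(points)
scores = lof_scores(points, k=3, dist=lambda a, b: mahalanobis(a, b, vi))
suspect = max(range(len(points)), key=lambda i: scores[i])
print(suspect, round(scores[suspect], 2))
```

Because the Mahalanobis distance rescales differences by the inverse covariance of the features, a point that violates the correlation between features ends up far from its neighbors even when each of its coordinates looks ordinary on its own, which is the property the abstract attributes to MDLOF.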

Key words: energy consumption of public buildings, load dataset, abnormal data detection, abnormal data repair, Mahalanobis distance-based local outlier factor-isolation forest, modified K-nearest neighbors-slope

CLC Number: