Integrated Intelligent Energy ›› 2022, Vol. 44 ›› Issue (4): 1-11. doi: 10.3969/j.issn.2097-0706.2022.04.001

• Power Generation and Intelligent Control •

Research on improved fuzzy mean curve clustering method based on two-scale measurement

CHEN Tiantian1, GAO Yajing2, LU Zhanhui1,*

  1. School of Mathematics and Physics, North China Electric Power University, Beijing 102206, China
  2. China Huaneng Group Carbon Neutrality Institute, Beijing 100031, China
  • Received: 2021-09-23 Revised: 2022-01-10 Online: 2022-04-25 Published: 2022-05-05
  • Contact: LU Zhanhui
  • About the authors: CHEN Tiantian (1997), female, master's student, working on data-driven methods and power system forecasting, 1174576665@qq.com;
    GAO Yajing (1980), female, associate professor, Ph.D./postdoctoral researcher, working on electricity markets, integrated energy systems, and power system forecasting and optimization, ncepugyj@163.com
  • Supported by:
    National Natural Science Foundation of China (U1866204)

Abstract:

Smart power grids produce large volumes of functional data whose values trace distinct curves over time, and curve clustering can mine the information in these data effectively. To address the difficulty of selecting initial clustering centres in the fuzzy mean clustering algorithm and the inaccurate similarity measures used by curve clustering methods, an improved fuzzy mean curve clustering method based on a two-scale measure is proposed. The Pearson distance measures the longitudinal shape similarity of curves, and the dynamic time warping (DTW) distance measures their horizontal shape similarity; on this basis, a density peak algorithm using the two-scale measure is proposed to determine the initial clustering centres. An improved entropy weight method then fuses the Pearson distance and the DTW distance into the similarity measure used by the clustering algorithm. Clustering validity indexes are used to evaluate the clustering results and the algorithm's performance in two respects: clustering quality and algorithm stability. Finally, one year of measured wind power output data from a region is used as a case study for cluster analysis, and the results verify the correctness and effectiveness of the proposed model and algorithm.

Key words: smart grid, data mining, curve clustering, improved fuzzy mean curve clustering, Pearson distance, dynamic time warping distance, improved entropy weight method, similarity, wind power
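The abstract names the building blocks of the two-scale measure but gives no formulas. The following Python is a minimal sketch of how those blocks might fit together, not the authors' implementation. It assumes that the Pearson distance is one minus the correlation coefficient, that DTW uses an absolute-difference cost, that the standard entropy weight method stands in for the paper's improved variant (whose details are not stated here), and that density peaks use a Gaussian kernel. All function names and the cutoff `dc` are hypothetical.

```python
import math

def pearson_distance(x, y):
    """Assumed form: 1 - Pearson correlation (undefined for constant curves)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return max(0.0, 1.0 - cov / (sx * sy))  # clip tiny negative rounding error

def dtw_distance(x, y):
    """Classic O(len(x) * len(y)) dynamic time warping, absolute-difference cost."""
    n, m = len(x), len(y)
    acc = [[math.inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(x[i - 1] - y[j - 1])
            acc[i][j] = cost + min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
    return acc[n][m]

def pairwise(curves, dist):
    """Symmetric matrix of dist(curve_i, curve_j) with a zero diagonal."""
    k = len(curves)
    d = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1, k):
            d[i][j] = d[j][i] = dist(curves[i], curves[j])
    return d

def entropy_weights(columns):
    """Standard entropy weight method over non-negative indicator columns.
    Each column needs at least two values that are not all equal."""
    diversity = []
    for col in columns:
        total, n = sum(col), len(col)
        entropy = -sum((v / total) * math.log(v / total) for v in col if v > 0)
        diversity.append(1.0 - entropy / math.log(n))
    s = sum(diversity)
    return [g / s for g in diversity]

def fused_distance(curves):
    """Scale both distance matrices to [0, 1], then combine with entropy weights."""
    k = len(curves)
    mats = []
    for dist in (pearson_distance, dtw_distance):
        d = pairwise(curves, dist)
        top = max(max(row) for row in d)
        mats.append([[v / top for v in row] for row in d])
    # One indicator column per distance, one sample per unordered pair of curves.
    cols = [[m[i][j] for i in range(k) for j in range(i + 1, k)] for m in mats]
    w = entropy_weights(cols)
    fused = [[w[0] * mats[0][i][j] + w[1] * mats[1][i][j] for j in range(k)]
             for i in range(k)]
    return fused, w

def density_peak_centres(d, n_centres, dc):
    """Pick the n_centres points with the largest density * separation score."""
    k = len(d)
    rho = [sum(math.exp(-(d[i][j] / dc) ** 2) for j in range(k) if j != i)
           for i in range(k)]
    gamma = []
    for i in range(k):
        higher = [d[i][j] for j in range(k) if rho[j] > rho[i]]
        delta = min(higher) if higher else max(d[i])  # densest point: farthest distance
        gamma.append(rho[i] * delta)
    return sorted(range(k), key=lambda i: gamma[i], reverse=True)[:n_centres]
```

Under these assumptions, `fused_distance` would supply the similarity measure and `density_peak_centres` the indices of the initial centres for a subsequent fuzzy clustering step, which is omitted here.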

CLC Number: