A Mixed Clustering Algorithm for Distributed Data Streams
Zhang Yang; Mao Guojun
【Journal】Microcomputer Information (《微计算机信息》) 【Year (Volume), Issue】2011(027)011
【Abstract】In everyday applications, data is often collected over a distributed network, and distributed data streams have attracted growing attention in recent years. Most existing approaches to mining distributed data streams, however, suffer from problems of accuracy or efficiency. This paper designs a local synopsis data structure specifically for clustering distributed data streams and presents an Effective Mixed Clustering Algorithm (EMCA) that performs incremental clustering over such streams. Experiments show that EMCA effectively reduces communication cost while maintaining high clustering quality.
【Total pages】3 (pp. 120-122)
【Keywords】distributed data streams; local synopsis data structure; incremental clustering 【Authors】Zhang Yang; Mao Guojun
【Author affiliations】Beijing University of Technology, Beijing 100124; Central University of Finance and Economics, Beijing 100081
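
The abstract describes per-site local synopses that are clustered incrementally and combined at lower communication cost, but it does not specify the synopsis layout or the EMCA steps. The following is therefore only a minimal sketch of the general idea, not the authors' algorithm. It assumes a generic additive cluster-feature summary (count, linear sum, squared sum) as the local synopsis, and the names `ClusterFeature`, `LocalSite`, `Coordinator`, `absorb`, and the `threshold` parameter are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class ClusterFeature:
    """Assumed synopsis entry: additive summary of a group of points."""
    n: int            # number of points summarized
    ls: List[float]   # per-dimension linear sum
    ss: List[float]   # per-dimension sum of squares

    @classmethod
    def from_point(cls, p: Sequence[float]) -> "ClusterFeature":
        return cls(1, list(p), [x * x for x in p])

    def centroid(self) -> List[float]:
        return [s / self.n for s in self.ls]

    def merge(self, other: "ClusterFeature") -> None:
        # Summaries are additive, so merging needs no raw points.
        self.n += other.n
        self.ls = [a + b for a, b in zip(self.ls, other.ls)]
        self.ss = [a + b for a, b in zip(self.ss, other.ss)]


def dist(a: Sequence[float], b: Sequence[float]) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def absorb(synopsis: List[ClusterFeature], cf: ClusterFeature,
           threshold: float) -> None:
    """Merge cf into the nearest entry within threshold, else add a new entry."""
    if synopsis:
        nearest = min(synopsis, key=lambda c: dist(c.centroid(), cf.centroid()))
        if dist(nearest.centroid(), cf.centroid()) <= threshold:
            nearest.merge(cf)
            return
    synopsis.append(ClusterFeature(cf.n, list(cf.ls), list(cf.ss)))


class LocalSite:
    """Hypothetical remote site that summarizes its own stream incrementally."""

    def __init__(self, threshold: float) -> None:
        self.threshold = threshold
        self.synopsis: List[ClusterFeature] = []

    def observe(self, point: Sequence[float]) -> None:
        absorb(self.synopsis, ClusterFeature.from_point(point), self.threshold)

    def flush(self) -> List[ClusterFeature]:
        # Only the summaries leave the site; local state is then reset.
        out, self.synopsis = self.synopsis, []
        return out


class Coordinator:
    """Hypothetical central node that merges the summaries sent by the sites."""

    def __init__(self, threshold: float) -> None:
        self.threshold = threshold
        self.global_synopsis: List[ClusterFeature] = []

    def receive(self, summaries: List[ClusterFeature]) -> None:
        for cf in summaries:
            absorb(self.global_synopsis, cf, self.threshold)
```

In this sketch each site transmits one summary per local group instead of every raw point, which is one plausible way to trade communication for a compact representation; how EMCA itself balances cost against clustering quality is described in the paper, not here.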