What is the difference between scipy.cluster.vq.kmeans() and scipy.cluster.vq.kmeans2()?
The scipy.cluster.vq module provides two functions for k-means clustering: kmeans() and kmeans2(). The two behave quite differently, as described below.
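scipy.cluster.vq.kmeans(obs, k_or_guess, iter=20, thresh=1e-05, check_finite=True): kmeans() runs the k-means algorithm on a set of observation vectors to form k clusters. To decide whether the centroids have stabilized, it compares the change in the mean Euclidean distance between the observations and their assigned centroids against the threshold thresh. It returns a codebook (the array of centroids, in which the row index of a centroid serves as its code) together with the distortion, which is the mean Euclidean distance between the observations and their nearest centroids.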
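Because kmeans() returns only the codebook and the distortion, cluster labels have to be obtained separately. The sketch below pairs it with scipy.cluster.vq.vq() for that purpose; the data shape, the seed, and the variable names are assumptions chosen for illustration only.

```python
import numpy as np
from scipy.cluster.vq import kmeans, vq, whiten

# Illustrative data; the seed is arbitrary and only fixes the sample
rng = np.random.default_rng(0)
obs = whiten(np.vstack((rng.random((200, 2)) + 0.5, rng.random((150, 2)))))

# kmeans() returns the centroids and the distortion, but no labels
codebook, distortion = kmeans(obs, 3)

# vq() assigns each observation the index (code) of its nearest centroid
labels, dists = vq(obs, codebook)

print("codebook shape:", codebook.shape)
print("distortion:", round(float(distortion), 4))
print("cluster sizes:", np.bincount(labels))
```

The codebook can contain fewer than k rows if a cluster ends up empty, so code that consumes it should read the number of clusters from codebook.shape[0] rather than assume k.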
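scipy.cluster.vq.kmeans2(data, k, iter=10, thresh=1e-05, minit='random', missing='warn', check_finite=True): kmeans2() classifies a set of observation vectors into k clusters by running the k-means algorithm. Unlike kmeans(), it does not use the threshold to check convergence; thresh is accepted but currently unused, and the algorithm runs for iter iterations. kmeans2() also takes more parameters than kmeans(): minit selects how the centroids are initialized, and missing controls what happens when a cluster ends up empty. The check_finite parameter, which verifies that the input contains only finite numbers, is common to both functions. kmeans2() returns the centroids together with a cluster label for each observation.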
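As a rough illustration of the extra parameters, the sketch below runs kmeans2() with three of its initialization methods. It assumes SciPy 1.2 or later, where minit='++' is available; the data, the seed, and the results dictionary are made up for the example.

```python
import numpy as np
from scipy.cluster.vq import kmeans2, whiten

# Illustrative data; the seed is arbitrary and only fixes the sample
rng = np.random.default_rng(1)
data = whiten(np.vstack((rng.random((200, 2)) + 0.5, rng.random((150, 2)))))

results = {}
for method in ("random", "points", "++"):
    # missing="warn" emits a warning if a cluster becomes empty;
    # missing="raise" would raise an error instead
    centroids, labels = kmeans2(data, 3, iter=10, minit=method, missing="warn")
    results[method] = (centroids, labels)
    print(method, "cluster sizes:", np.bincount(labels, minlength=3))
```

The cluster sizes will generally differ between methods and between runs, since each initialization draws its starting centroids at random.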
Example
Computing k-means with kmeans():
# Import the required libraries
from numpy import vstack, array
from numpy.random import rand
from scipy.cluster.vq import whiten, kmeans

# Generate random data: two overlapping groups of 2-D points
data = vstack((rand(200, 2) + array([.5, .5]), rand(150, 2)))

# Normalize each feature to unit variance
data = whiten(data)

# Compute k-means with kmeans()
centroids, mean_value = kmeans(data, 3)
print("Code book :\n", centroids, "\n")
print("Mean Euclidean distance:", mean_value.round(4))

Output (the data is random, so the values differ between runs)
Code book :
 [[2.45420231 3.19421081]
 [2.77295342 1.74582367]
 [0.99156276 1.35546602]] 

Mean Euclidean distance: 0.791
Computing k-means with kmeans2() on data generated in the same way:
# Import the required libraries
from numpy import vstack, array
from numpy.random import rand
from scipy.cluster.vq import whiten, kmeans2

# Generate random data: two overlapping groups of 2-D points
data = vstack((rand(200, 2) + array([.5, .5]), rand(150, 2)))

# Normalize each feature to unit variance
data = whiten(data)

# Compute k-means with kmeans2()
centroids, clusters = kmeans2(data, 3, minit='random')
print("Code book :\n", centroids, "\n")
# The tuple is printed as-is, which is why the output shows its repr
print(("Clusters:", clusters))

Output (the data is random, so the values differ between runs)
Code book :
 [[3.07353603 2.71692674]
 [1.07148876 0.74285308]
 [1.64579292 2.29821454]] 

('Clusters:', array([2, 0, 0, 2, 0, 2, 0, 2, 0, 0, 0, 0, 0, 0, 2, 0, 0, 2, 2, 0, 0, 0, 2, 2, 1, 2, 2, 2, 0, 0, 2, 0, 2, 0, 2, 0, 2, 0, 2, 0, 0, 0, 2, 2, 0, 0, 0, 0, 0, 0, 2, 0, 0, 2, 2, 0, 0, 0, 2, 0, 0, 0, 2, 2, 2, 0, 0, 0, 2, 2, 0, 2, 0, 0, 0, 2, 2, 0, 2, 0, 0, 0, 0, 0, 0, 2, 0, 2, 2, 2, 2, 0, 0, 0, 0, 0, 0, 2, 2, 0, 0, 0, 0, 2, 0, 0, 0, 2, 2, 0, 2, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 2, 2, 0, 0, 2, 0, 0, 2, 0, 2, 0, 2, 0, 0, 2, 2, 0, 2, 2, 0, 2, 0, 0, 0, 0, 0, 0, 0, 2, 2, 0, 2, 0, 0, 0, 0, 0, 2, 0, 2, 0, 0, 2, 2, 2, 2, 2, 2, 0, 2, 2, 2, 0, 2, 0, 0, 2, 0, 2, 2, 0, 0, 0, 0, 0, 2, 0, 0, 2, 2, 0, 0, 0, 2, 0, 2, 0, 1, 1, 1, 1, 2, 1, 1, 1, 2, 1, 2, 1, 1, 1, 1, 1, 2, 2, 1, 1, 2, 2, 1, 1, 1, 2, 2, 2, 1, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 2, 2, 1, 1, 2, 1, 1, 2, 2, 2, 2, 2, 2, 1, 1, 2, 1, 1, 1, 1, 2, 2, 1, 2, 2, 1, 2, 1, 0, 2, 2, 2, 1, 2, 2, 1, 2, 2, 2, 2, 2, 1, 1, 2, 2, 1, 1, 2, 1, 1, 2, 1, 1, 2, 1, 2, 1, 1, 2, 2, 2, 1, 1, 1, 1, 1, 2, 2, 2, 1, 1, 1, 1, 2, 2, 1, 1, 2, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 2, 2, 2, 1, 1, 1, 1, 1, 2, 1, 2, 1, 1, 2, 2, 2, 1, 1, 2, 1, 1], dtype=int32))
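The two functions can also be combined. The sketch below is one possible arrangement, not something the examples above do: it takes the codebook found by kmeans() and passes it to kmeans2() with minit='matrix', so that kmeans2() starts from those centroids and returns labels for them. The data, the seed, and the choice of iter=5 are assumptions for illustration.

```python
import numpy as np
from scipy.cluster.vq import kmeans, kmeans2, whiten

# Illustrative data; the seed is arbitrary and only fixes the sample
rng = np.random.default_rng(2)
data = whiten(np.vstack((rng.random((200, 2)) + 0.5, rng.random((150, 2)))))

# Step 1: kmeans() finds centroids using its threshold-based stopping rule
codebook, distortion = kmeans(data, 3)

# Step 2: minit='matrix' makes kmeans2() treat its second argument
# as the initial centroids instead of a cluster count
centroids, labels = kmeans2(data, codebook, iter=5, minit="matrix")

print("largest centroid shift:", np.abs(centroids - codebook).max())
print("cluster sizes:", np.bincount(labels))
```

Since kmeans() has already stopped at a stable set of centroids, the shift reported here is expected to be small, though its exact size depends on the data.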