Examples of Working with PDFs and CDFs in Python

2023-07-31 15:32:05

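Once you have your data, one of the first things to do is look at how it is distributed. A distribution is usually described in one of two ways: the probability density function (PDF) or the cumulative distribution function (CDF).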
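As a quick illustration of how the two descriptions relate, the sketch below uses the standard normal distribution and only the Python standard library. The helper names (normal_pdf, normal_cdf, integrate_pdf) are made up for this example; the point is simply that the area under the PDF over an interval matches the difference of the CDF at its endpoints.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of the normal distribution at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def normal_cdf(x, mu=0.0, sigma=1.0):
    """P(X <= x) for the normal distribution, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def integrate_pdf(a, b, steps=10_000):
    """Trapezoidal estimate of the area under normal_pdf between a and b."""
    h = (b - a) / steps
    total = 0.5 * (normal_pdf(a) + normal_pdf(b))
    for i in range(1, steps):
        total += normal_pdf(a + i * h)
    return total * h

# The CDF difference over an interval equals the area under the PDF there.
area = integrate_pdf(-1.0, 1.0)
prob = normal_cdf(1.0) - normal_cdf(-1.0)
print(f"area under PDF on [-1, 1]: {area:.4f}")
print(f"CDF(1) - CDF(-1):          {prob:.4f}")
```

With real data the true PDF is unknown, so a histogram serves as an estimate of it, and accumulating the histogram gives an estimate of the CDF.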

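Here are several ways to generate a PDF in Python:

Use matplotlib's plotting function hist() to draw the PDF directly as a histogram;

Use numpy's histogram() function to produce the histogram data behind the PDF, which is convenient for further processing such as deriving the CDF;

Use seaborn's distplot(), which has the advantage of fitting a distribution to the histogram, so you can see what type of distribution your data follows.

The figure shows the PDF plots produced by these three methods. The source code follows.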

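from scipy import stats
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

# draw 100 samples from a standard normal distribution
arr = np.random.normal(size=100)

# method 1: plot the histogram directly
plt.subplot(221)
plt.hist(arr)

# method 2: obtain the histogram data, then plot the bin counts
plt.subplot(222)
hist, bin_edges = np.histogram(arr)
plt.plot(hist)

# method 3: histogram with a fitted gamma curve
# (distplot is deprecated in seaborn 0.11 and later)
plt.subplot(223)
sns.distplot(arr, kde=False, fit=stats.gamma, rug=True)
plt.show()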
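The histogram() call above returns raw counts. A minimal sketch of turning those into a proper density and comparing it with a fitted curve, without relying on distplot(), might look like the following. Two things here are assumptions of this sketch and not part of the original code: the fixed random seed, and the use of stats.norm for the fit (the original fits stats.gamma), chosen because the sample is drawn from a normal distribution.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)   # fixed seed: an assumption, for reproducibility
arr = rng.normal(size=1000)

# density=True rescales the counts so that sum(density * bin_width) == 1
density, bin_edges = np.histogram(arr, bins=20, density=True)
bin_widths = np.diff(bin_edges)
bin_centers = (bin_edges[:-1] + bin_edges[1:]) / 2
total_area = np.sum(density * bin_widths)

# Maximum-likelihood fit of a normal distribution; returns (loc, scale)
mu, sigma = stats.norm.fit(arr)
fitted_pdf = stats.norm.pdf(bin_centers, loc=mu, scale=sigma)

print(f"histogram area: {total_area:.6f}")
print(f"fitted mean={mu:.3f}, std={sigma:.3f}")
```

To draw the result, plt.bar(bin_centers, density, width=bin_widths) followed by plt.plot(bin_centers, fitted_pdf) puts the fitted curve on top of the density histogram.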

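Here are two ways to generate a CDF in Python:

Use numpy's histogram() function to produce the histogram data, then accumulate it into the CDF;

Use cumfreq() from scipy.stats to compute the cumulative frequencies directly, then plot them.

The figure shows the CDF plots produced by these two methods. The source code follows.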

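from scipy import stats
import matplotlib.pyplot as plt
import numpy as np

arr = np.random.normal(size=100)

# method 1: cumulative sum of the histogram counts
plt.subplot(121)
hist, bin_edges = np.histogram(arr)
cdf = np.cumsum(hist)
plt.plot(cdf)

# method 2: cumulative frequencies from scipy.stats.cumfreq
plt.subplot(122)
cdf = stats.cumfreq(arr)
plt.plot(cdf[0])  # cdf[0] is the array of cumulative counts

plt.show()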
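Both methods above depend on how the data is binned. As an alternative that the original does not cover, an empirical CDF can be computed without any bins by sorting the samples. The function name ecdf and the fixed seed below are assumptions of this sketch.

```python
import numpy as np

def ecdf(samples):
    """Return the sorted sample values and the fraction of samples <= each value."""
    x = np.sort(np.asarray(samples, dtype=float))
    y = np.arange(1, x.size + 1) / x.size
    return x, y

rng = np.random.default_rng(0)   # fixed seed: an assumption, for reproducibility
arr = rng.normal(size=100)

x, y = ecdf(arr)
print(f"smallest value {x[0]:.3f} has ECDF {y[0]:.2f}")
print(f"largest value  {x[-1]:.3f} has ECDF {y[-1]:.2f}")
```

Plotting with plt.step(x, y, where="post") shows the staircase shape, with one step of height 1/n per sample.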

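The cumfreq() example plots the cumulative counts against the bin index, so the x-axis does not show data values. The result object also carries lowerlimit and binsize, which is enough to rebuild the bin edges. The sketch below does that and scales the counts to the range 0 to 1; the variable names and the fixed seed are assumptions.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)   # fixed seed: an assumption, for reproducibility
arr = rng.normal(size=100)

res = stats.cumfreq(arr, numbins=10)

# Upper edge of each bin: lowerlimit + binsize * (1, 2, ..., numbins)
upper_edges = res.lowerlimit + res.binsize * np.arange(1, res.cumcount.size + 1)

# Divide by the sample size to turn cumulative counts into a CDF in [0, 1]
cdf = res.cumcount / arr.size

print(f"first upper edge: {upper_edges[0]:.3f}, last upper edge: {upper_edges[-1]:.3f}")
print(f"CDF at the last edge: {cdf[-1]:.2f}")
```

plt.plot(upper_edges, cdf) then draws the CDF against the data values instead of the bin index.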
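More often, it helps to plot the PDF and the CDF together, which shows the data distribution more clearly. Doing so requires normalizing the PDF and the CDF separately: the histogram counts are divided by the largest count, and the cumulative counts are divided by the total.

The figure shows the normalized PDF and CDF. The source code follows.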

import matplotlib.pyplot as plt
import numpy as np

arr = np.random.normal(size=100)

# histogram scaled so that the tallest bar has height 1
hist, bin_edges = np.histogram(arr)
width = (bin_edges[1] - bin_edges[0]) * 0.8
plt.bar(bin_edges[1:], hist / max(hist), width=width, color='#5B9BD5')

# cumulative fraction of samples up to each right bin edge
cdf = np.cumsum(hist / sum(hist))
plt.plot(bin_edges[1:], cdf, '-*', color='#ED7D31')

plt.xlim([-2, 2])
plt.ylim([0, 1])
plt.grid()

plt.show()
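The two normalizations can be wrapped in a small helper so the numbers can be checked before plotting. This is only a sketch: the function name normalized_hist_and_cdf is made up, and unlike the code above it returns bin centers for the bars (the code above places the bars at the right bin edges), while the CDF points stay at the right edges.

```python
import numpy as np

def normalized_hist_and_cdf(samples, bins=10):
    """Return bar positions, bar heights in [0, 1], CDF positions and CDF values."""
    counts, bin_edges = np.histogram(samples, bins=bins)
    heights = counts / counts.max()           # tallest bar becomes 1
    cdf = np.cumsum(counts) / counts.sum()    # last value becomes 1
    centers = (bin_edges[:-1] + bin_edges[1:]) / 2
    right_edges = bin_edges[1:]
    return centers, heights, right_edges, cdf

rng = np.random.default_rng(0)   # fixed seed: an assumption, for reproducibility
arr = rng.normal(size=100)

centers, heights, right_edges, cdf = normalized_hist_and_cdf(arr)
print(f"tallest bar: {heights.max():.1f}, final CDF value: {cdf[-1]:.1f}")
```

The returned arrays can be passed to plt.bar(centers, heights, width=...) and plt.plot(right_edges, cdf, '-*') to get the same kind of combined figure.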
