Sampling one class of training data with pandas in Python: an example

2023-07-31 11:22:04

No preamble here; let's go straight to the code.

# -*- coding: utf-8 -*-

import numpy
import scipy as sp
import pandas as pd
from sklearn import metrics
from sklearn import linear_model
from sklearn import preprocessing
# sklearn.cross_validation was removed in scikit-learn 0.20;
# model_selection is its replacement.
from sklearn import model_selection
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC
from sklearn.naive_bayes import MultinomialNB
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_iris
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.feature_selection import SelectKBest, chi2
# import iris_data

'''
creativeID,userID,positionID,clickTime,conversionTime,connectionType,
telecomsOperator,appPlatform,sitesetID,positionType,age,gender,
education,marriageStatus,haveBaby,hometown,residence,appID,appCategory,label
'''


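def test():
    # Load the comma-separated data file
    df = pd.read_csv("/var/lib/mysql-files/data1.csv", sep=",")
    # Keep only the feature columns of interest plus the label
    df1 = df[["connectionType", "telecomsOperator", "appPlatform", "sitesetID",
              "positionType", "age", "gender", "education", "marriageStatus",
              "haveBaby", "hometown", "residence", "appCategory", "label"]]
    # Class distribution before sampling
    print(df1["label"].value_counts())
    N_data = df1[df1["label"] == 0]  # negative examples
    P_data = df1[df1["label"] == 1]  # positive examples
    # Downsample the negatives, without replacement, to the number of positives
    N_data = N_data.sample(n=P_data.shape[0], replace=False, random_state=2)
    # print(df1.loc[:, "label"] == 0)
    print(P_data.shape)
    print(N_data.shape)

    # Combine both classes into one balanced data set
    data = pd.concat([N_data, P_data])
    print(data.shape)
    # Shuffle the rows and rebuild the index
    data = data.sample(frac=1).reset_index(drop=True)
    print(data[["label"]])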
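
The downsampling step can also be tried without the original CSV file. What follows is only a minimal sketch on a small synthetic DataFrame: the helper name undersample_majority, the toy columns and the 90/10 class split are all assumptions made for illustration, not part of the original data. The helper shrinks every class to the size of the smallest one, which for a binary label amounts to the same thing as cutting the negatives down to the number of positives.

```python
import pandas as pd


def undersample_majority(df, label_col="label", random_state=2):
    """Downsample every class to the size of the smallest class, then shuffle.

    Hypothetical helper written for this example only.
    """
    counts = df[label_col].value_counts()
    n_min = counts.min()
    parts = [
        df[df[label_col] == cls].sample(n=n_min, replace=False,
                                        random_state=random_state)
        for cls in counts.index
    ]
    balanced = pd.concat(parts)
    # Shuffle the rows so the classes are interleaved, then rebuild the index
    return balanced.sample(frac=1, random_state=random_state).reset_index(drop=True)


# Made-up toy data: 90 negative rows and 10 positive rows
toy = pd.DataFrame({"age": range(100), "label": [0] * 90 + [1] * 10})
balanced = undersample_majority(toy)
print(balanced["label"].value_counts())
```

Fixing random_state makes the draw reproducible; leaving it out gives a different subset of negatives on every run.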

Further notes: sampling a DataFrame with pandas

Random sampling

import pandas as pd
# Randomly draw 2000 rows from the DataFrame.
# sample is a DataFrame method; there is no pd.sample function.
df.sample(n=2000)
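
As a rough illustration of the main options of DataFrame.sample, the sketch below draws by absolute count and by fraction, and shows that a fixed random_state reproduces the same rows. The 10,000-row toy frame and the variable names are assumptions for the example.

```python
import pandas as pd

# Made-up toy frame with 10,000 rows
df = pd.DataFrame({"x": range(10000)})

by_count = df.sample(n=2000, random_state=0)       # a fixed number of rows
by_fraction = df.sample(frac=0.2, random_state=0)  # a fraction of the rows
again = df.sample(n=2000, random_state=0)          # same seed as by_count

print(len(by_count), len(by_fraction))
# Same seed and same arguments give the same selection of rows
print(by_count.index.equals(again.index))
```

Sampling is done without replacement by default, so no row appears twice; pass replace=True if repeated rows are acceptable.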

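Stratified sampling

scikit-learn's train_test_split offers a flexible way to sample: passing stratify=y keeps the class proportions of y approximately the same in both parts of the split.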

from sklearn.model_selection import train_test_split
# y is one of the attribute columns of the data; it is the column to stratify on
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, stratify=y)
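
To see what stratification does, here is a small sketch on synthetic data with an 80/20 class split. The toy frame and its column names are assumptions. The last lines show a pandas-only alternative, groupby(...).sample, which needs pandas 1.1 or newer; it is offered as another way to draw a stratified sample, not as something the original article used.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Made-up toy data: 800 rows with label 0 and 200 rows with label 1
df = pd.DataFrame({"feature": range(1000), "label": [0] * 800 + [1] * 200})
X = df[["feature"]]
y = df["label"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Share of positives in each part; both should match the full data set
print(y_train.mean(), y_test.mean())

# pandas-only alternative: draw 10% of the rows from each label group
stratified = df.groupby("label").sample(frac=0.1, random_state=0)
print(stratified["label"].value_counts())
```

Without stratify=y the split is purely random, so a rare class can end up over- or under-represented in the test set.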

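That is all for this example of sampling one class of training data with pandas in Python. We hope it serves as a useful reference, and we hope you will keep supporting 毛票票.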
