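A Detailed Guide to to_dict in pandas
Introduction: the to_dict method in pandas converts a DataFrame into a dictionary (or, for one setting, a list of dictionaries).
The orient parameter selects the output layout. The help text below lists six values for it: 'dict', 'list', 'series', 'split', 'records', and 'index'. Each one is described in turn.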
Help on method to_dict in module pandas.core.frame:

to_dict(orient='dict') method of pandas.core.frame.DataFrame instance
    Convert DataFrame to dictionary.

    Parameters
    ----------
    orient : str {'dict', 'list', 'series', 'split', 'records', 'index'}
        Determines the type of the values of the dictionary.

        - dict (default) : dict like {column -> {index -> value}}
        - list : dict like {column -> [values]}
        - series : dict like {column -> Series(values)}
        - split : dict like
          {index -> [index], columns -> [columns], data -> [values]}
        - records : list like
          [{column -> value}, ... , {column -> value}]
        - index : dict like {index -> {column -> value}}

          .. versionadded:: 0.17.0

        Abbreviations are allowed. `s` indicates `series` and `sp`
        indicates `split`.

    Returns
    -------
    result : dict like {column -> {index -> value}}
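1. orient='dict'
'dict' is the default. The data object below is a DataFrame, and the result has the structure {column -> {index -> value}}, which can be read as a dictionary of dictionaries:
- each column's values are paired with their index labels to form an inner dictionary
- the column names then become the keys of the outer dictionary, with those inner dictionaries as the values
Lookup: data_dict[key1][key2]
- data_dict is the name given to the result of orient='dict'
- key1 is the column name (outer key)
- key2 is the index label (inner key)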
In [9]: data
Out[9]:
      pclass        age     embarked                      home.dest     sex
1086     3rd  31.194181      UNKNOWN                        UNKNOWN    male
12       1st  31.194181    Cherbourg                  Paris, France  female
1036     3rd  31.194181      UNKNOWN                        UNKNOWN    male
833      3rd  32.000000  Southampton  Foresvik, Norway Portland, ND    male
1108     3rd  31.194181      UNKNOWN                        UNKNOWN    male
562      2nd  41.000000    Cherbourg                   New York, NY    male
437      2nd  48.000000  Southampton   Somerset / Bernardsville, NJ  female
663      3rd  26.000000  Southampton                        UNKNOWN    male
669      3rd  19.000000  Southampton                        England    male
507      2nd  31.194181  Southampton               Petworth, Sussex    male
In [10]: data_dict = data.to_dict(orient='dict')
In [11]: data_dict
Out[11]:
{'age': {12: 31.19418104265403,
  437: 48.0,
  507: 31.19418104265403,
  562: 41.0,
  663: 26.0,
  669: 19.0,
  833: 32.0,
  1036: 31.19418104265403,
  1086: 31.19418104265403,
  1108: 31.19418104265403},
 'embarked': {12: 'Cherbourg',
  437: 'Southampton',
  507: 'Southampton',
  562: 'Cherbourg',
  663: 'Southampton',
  669: 'Southampton',
  833: 'Southampton',
  1036: 'UNKNOWN',
  1086: 'UNKNOWN',
  1108: 'UNKNOWN'},
 'home.dest': {12: 'Paris, France',
  437: 'Somerset / Bernardsville, NJ',
  507: 'Petworth, Sussex',
  562: 'New York, NY',
  663: 'UNKNOWN',
  669: 'England',
  833: 'Foresvik, Norway Portland, ND',
  1036: 'UNKNOWN',
  1086: 'UNKNOWN',
  1108: 'UNKNOWN'},
 'pclass': {12: '1st',
  437: '2nd',
  507: '2nd',
  562: '2nd',
  663: '3rd',
  669: '3rd',
  833: '3rd',
  1036: '3rd',
  1086: '3rd',
  1108: '3rd'},
 'sex': {12: 'female',
  437: 'female',
  507: 'male',
  562: 'male',
  663: 'male',
  669: 'male',
  833: 'male',
  1036: 'male',
  1086: 'male',
  1108: 'male'}}
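The two-level lookup for orient='dict' can be tried on a much smaller frame. The sketch below is only an illustration: the three-row DataFrame is a reduced subset of the data above (rows 663, 562 and 437, three columns), and the names df, nested, age_437, sex_by_label and missing are made up for this example.

```python
import pandas as pd

# Reduced, illustrative subset of the article's data (rows 663, 562, 437).
df = pd.DataFrame(
    {
        "pclass": ["3rd", "2nd", "2nd"],
        "age": [26.0, 41.0, 48.0],
        "sex": ["male", "male", "female"],
    },
    index=[663, 562, 437],
)

nested = df.to_dict(orient="dict")  # orient="dict" is also the default

# Outer key is the column name, inner key is the index label.
age_437 = nested["age"][437]
sex_by_label = nested["sex"]

# An unknown index label raises KeyError with [], while dict.get returns None.
missing = nested["age"].get(9999)

print(age_437)
print(sex_by_label)
print(missing)
```

Because the inner keys are index labels rather than positions, nested["age"][0] would raise KeyError for this frame.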
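2. orient='list'
This is similar to case 1, except that each inner value is a list, giving the structure {column -> [values]}. The index labels are not kept.
Lookup: data_list[keys][index]
data_list is the name given to the result of orient='list'
keys is the column name, such as 'age' or 'embarked' in this example
index is an integer position, counting from 0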
In [19]: data_list = data.to_dict(orient='list')
In [20]: data_list
Out[20]:
{'age': [31.19418104265403,
  31.19418104265403,
  31.19418104265403,
  32.0,
  31.19418104265403,
  41.0,
  48.0,
  26.0,
  19.0,
  31.19418104265403],
 'embarked': ['UNKNOWN',
  'Cherbourg',
  'UNKNOWN',
  'Southampton',
  'UNKNOWN',
  'Cherbourg',
  'Southampton',
  'Southampton',
  'Southampton',
  'Southampton'],
 'home.dest': ['UNKNOWN',
  'Paris, France',
  'UNKNOWN',
  'Foresvik, Norway Portland, ND',
  'UNKNOWN',
  'New York, NY',
  'Somerset / Bernardsville, NJ',
  'UNKNOWN',
  'England',
  'Petworth, Sussex'],
 'pclass': ['3rd',
  '1st',
  '3rd',
  '3rd',
  '3rd',
  '2nd',
  '2nd',
  '3rd',
  '3rd',
  '2nd'],
 'sex': ['male',
  'female',
  'male',
  'male',
  'male',
  'male',
  'female',
  'male',
  'male',
  'male']}
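With orient='list' the lookup is positional, so the original index labels have to be translated to positions if they are still needed. A minimal sketch, again using a reduced three-row subset of the data above; df, as_lists, first_age, pos and age_437 are illustrative names, not part of the original session.

```python
import pandas as pd

# Reduced, illustrative subset of the article's data (rows 663, 562, 437).
df = pd.DataFrame(
    {
        "pclass": ["3rd", "2nd", "2nd"],
        "age": [26.0, 41.0, 48.0],
        "sex": ["male", "male", "female"],
    },
    index=[663, 562, 437],
)

as_lists = df.to_dict(orient="list")

# The second subscript is a 0-based position, not an index label.
first_age = as_lists["age"][0]

# The labels are dropped from the result; with a unique index,
# Index.get_loc maps a label back to its integer position.
pos = df.index.get_loc(437)
age_437 = as_lists["age"][pos]

print(first_age, pos, age_437)
```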
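3. orient='series'
The result has the structure {column -> Series(values)}
Lookup: data_series[key1][key2], or data_series[key1] to get the whole column
data_series is the name given to the result
key1 is the column name, such as 'age' or 'embarked' in this example
key2 is the original index label of the data (optional)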
In [21]: data_series = data.to_dict(orient='series')
In [22]: data_series
Out[22]:
{'age': 1086    31.194181
12      31.194181
1036    31.194181
833     32.000000
1108    31.194181
562     41.000000
437     48.000000
663     26.000000
669     19.000000
507     31.194181
Name: age, dtype: float64, 'embarked': 1086        UNKNOWN
12        Cherbourg
1036        UNKNOWN
833     Southampton
1108        UNKNOWN
562       Cherbourg
437     Southampton
663     Southampton
669     Southampton
507     Southampton
Name: embarked, dtype: object, 'home.dest': 1086                          UNKNOWN
12                      Paris, France
1036                          UNKNOWN
833     Foresvik, Norway Portland, ND
1108                          UNKNOWN
562                      New York, NY
437      Somerset / Bernardsville, NJ
663                           UNKNOWN
669                           England
507                  Petworth, Sussex
Name: home.dest, dtype: object, 'pclass': 1086    3rd
12      1st
1036    3rd
833     3rd
1108    3rd
562     2nd
437     2nd
663     3rd
669     3rd
507     2nd
Name: pclass, dtype: object, 'sex': 1086      male
12      female
1036      male
833       male
1108      male
562       male
437     female
663       male
669       male
507       male
Name: sex, dtype: object}
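Since each value under orient='series' is an ordinary pandas Series, both label-based and position-based access remain available, along with Series methods. The sketch below assumes a reduced three-row subset of the data above; df, cols, age_col and the other names are illustrative only. It uses .loc and .iloc rather than plain [] to make it explicit whether a label or a position is meant.

```python
import pandas as pd

# Reduced, illustrative subset of the article's data (rows 663, 562, 437).
df = pd.DataFrame(
    {
        "pclass": ["3rd", "2nd", "2nd"],
        "age": [26.0, 41.0, 48.0],
        "sex": ["male", "male", "female"],
    },
    index=[663, 562, 437],
)

cols = df.to_dict(orient="series")

age_col = cols["age"]              # a pandas Series that keeps the index labels
age_437 = age_col.loc[437]         # lookup by original index label
first_age = age_col.iloc[0]        # lookup by position
oldest_label = age_col.idxmax()    # Series methods still work on each value

print(type(age_col).__name__, age_437, first_age, oldest_label)
```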
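4. orient='split'
The result has the structure {index -> [index], columns -> [columns], data -> [values]}: the data, the index and the column names are separated out and stored under three keys of one dictionary.
Lookup: data_split['index'], data_split['data'], data_split['columns']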
data_split = data.to_dict(orient='split')
data_split
Out[38]:
{'columns': ['pclass', 'age', 'embarked', 'home.dest', 'sex'],
 'data': [['3rd', 31.19418104265403, 'UNKNOWN', 'UNKNOWN', 'male'],
  ['1st', 31.19418104265403, 'Cherbourg', 'Paris, France', 'female'],
  ['3rd', 31.19418104265403, 'UNKNOWN', 'UNKNOWN', 'male'],
  ['3rd', 32.0, 'Southampton', 'Foresvik, Norway Portland, ND', 'male'],
  ['3rd', 31.19418104265403, 'UNKNOWN', 'UNKNOWN', 'male'],
  ['2nd', 41.0, 'Cherbourg', 'New York, NY', 'male'],
  ['2nd', 48.0, 'Southampton', 'Somerset / Bernardsville, NJ', 'female'],
  ['3rd', 26.0, 'Southampton', 'UNKNOWN', 'male'],
  ['3rd', 19.0, 'Southampton', 'England', 'male'],
  ['2nd', 31.19418104265403, 'Southampton', 'Petworth, Sussex', 'male']],
 'index': [1086, 12, 1036, 833, 1108, 562, 437, 663, 669, 507]}
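One practical property of the 'split' layout is that its three keys match the data, index and columns arguments of the DataFrame constructor, so the frame can be rebuilt from the dictionary. The following is a sketch under that assumption, on a reduced three-row subset of the data above; df, parts and rebuilt are illustrative names.

```python
import pandas as pd

# Reduced, illustrative subset of the article's data (rows 663, 562, 437).
df = pd.DataFrame(
    {
        "pclass": ["3rd", "2nd", "2nd"],
        "age": [26.0, 41.0, 48.0],
        "sex": ["male", "male", "female"],
    },
    index=[663, 562, 437],
)

parts = df.to_dict(orient="split")

# Each piece is a plain Python list.
print(parts["columns"])
print(parts["index"])
print(parts["data"][0])   # first row, in column order

# Rebuild a DataFrame from the three pieces.
rebuilt = pd.DataFrame(
    data=parts["data"],
    index=parts["index"],
    columns=parts["columns"],
)
print(rebuilt.loc[437, "age"])
```

The row lists in parts['data'] follow the order of parts['columns'], so a cell is addressed as parts['data'][row_position][column_position].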
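5. orient='records'
The result has the structure [{column -> value}, ..., {column -> value}]
The whole result is a list, and each element is a dictionary built from one row of the original data. The index labels are not kept.
Lookup: data_records[index][key1], where index is the 0-based row position and key1 is the column name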
data_records = data.to_dict(orient='records')
data_records
Out[41]:
[{'age': 31.19418104265403,
  'embarked': 'UNKNOWN',
  'home.dest': 'UNKNOWN',
  'pclass': '3rd',
  'sex': 'male'},
 {'age': 31.19418104265403,
  'embarked': 'Cherbourg',
  'home.dest': 'Paris, France',
  'pclass': '1st',
  'sex': 'female'},
 {'age': 31.19418104265403,
  'embarked': 'UNKNOWN',
  'home.dest': 'UNKNOWN',
  'pclass': '3rd',
  'sex': 'male'},
 {'age': 32.0,
  'embarked': 'Southampton',
  'home.dest': 'Foresvik, Norway Portland, ND',
  'pclass': '3rd',
  'sex': 'male'},
 {'age': 31.19418104265403,
  'embarked': 'UNKNOWN',
  'home.dest': 'UNKNOWN',
  'pclass': '3rd',
  'sex': 'male'},
 {'age': 41.0,
  'embarked': 'Cherbourg',
  'home.dest': 'New York, NY',
  'pclass': '2nd',
  'sex': 'male'},
 {'age': 48.0,
  'embarked': 'Southampton',
  'home.dest': 'Somerset / Bernardsville, NJ',
  'pclass': '2nd',
  'sex': 'female'},
 {'age': 26.0,
  'embarked': 'Southampton',
  'home.dest': 'UNKNOWN',
  'pclass': '3rd',
  'sex': 'male'},
 {'age': 19.0,
  'embarked': 'Southampton',
  'home.dest': 'England',
  'pclass': '3rd',
  'sex': 'male'},
 {'age': 31.19418104265403,
  'embarked': 'Southampton',
  'home.dest': 'Petworth, Sussex',
  'pclass': '2nd',
  'sex': 'male'}]
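The 'records' layout is convenient for row-by-row processing, but it drops the index. If the labels matter, one common workaround is to turn the index into a column with reset_index() first. A minimal sketch on a reduced three-row subset of the data above; df, records, females and with_index are illustrative names, and the column name 'index' is what reset_index() produces for an unnamed index.

```python
import pandas as pd

# Reduced, illustrative subset of the article's data (rows 663, 562, 437).
df = pd.DataFrame(
    {
        "pclass": ["3rd", "2nd", "2nd"],
        "age": [26.0, 41.0, 48.0],
        "sex": ["male", "male", "female"],
    },
    index=[663, 562, 437],
)

records = df.to_dict(orient="records")

# First subscript is the row position, second is the column name.
third_age = records[2]["age"]

# Each element is an ordinary dict, so plain Python filtering works.
females = [row for row in records if row["sex"] == "female"]

# Keep the index labels by moving them into a column before converting.
with_index = df.reset_index().to_dict(orient="records")

print(third_age)
print(females)
print(with_index[2]["index"])
```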
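6. orient='index'
The result has the structure {index -> {column -> value}}. Lookup is the reverse of the 'dict' case: the outer key is the index label and the inner key is the column name, i.e. data_index[key2][key1] in the notation of case 1.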
data_index = data.to_dict(orient='index')
data_index
Out[43]:
{12: {'age': 31.19418104265403,
  'embarked': 'Cherbourg',
  'home.dest': 'Paris, France',
  'pclass': '1st',
  'sex': 'female'},
 437: {'age': 48.0,
  'embarked': 'Southampton',
  'home.dest': 'Somerset / Bernardsville, NJ',
  'pclass': '2nd',
  'sex': 'female'},
 507: {'age': 31.19418104265403,
  'embarked': 'Southampton',
  'home.dest': 'Petworth, Sussex',
  'pclass': '2nd',
  'sex': 'male'},
 562: {'age': 41.0,
  'embarked': 'Cherbourg',
  'home.dest': 'New York, NY',
  'pclass': '2nd',
  'sex': 'male'},
 663: {'age': 26.0,
  'embarked': 'Southampton',
  'home.dest': 'UNKNOWN',
  'pclass': '3rd',
  'sex': 'male'},
 669: {'age': 19.0,
  'embarked': 'Southampton',
  'home.dest': 'England',
  'pclass': '3rd',
  'sex': 'male'},
 833: {'age': 32.0,
  'embarked': 'Southampton',
  'home.dest': 'Foresvik, Norway Portland, ND',
  'pclass': '3rd',
  'sex': 'male'},
 1036: {'age': 31.19418104265403,
  'embarked': 'UNKNOWN',
  'home.dest': 'UNKNOWN',
  'pclass': '3rd',
  'sex': 'male'},
 1086: {'age': 31.19418104265403,
  'embarked': 'UNKNOWN',
  'home.dest': 'UNKNOWN',
  'pclass': '3rd',
  'sex': 'male'},
 1108: {'age': 31.19418104265403,
  'embarked': 'UNKNOWN',
  'home.dest': 'UNKNOWN',
  'pclass': '3rd',
  'sex': 'male'}}
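The relationship between 'index' and 'dict' can be checked directly: the same cell is reached with the two keys in opposite order. The sketch below uses a reduced three-row subset of the data above; df, by_row, by_col, row_437 and same_cells are illustrative names.

```python
import pandas as pd

# Reduced, illustrative subset of the article's data (rows 663, 562, 437).
df = pd.DataFrame(
    {
        "pclass": ["3rd", "2nd", "2nd"],
        "age": [26.0, 41.0, 48.0],
        "sex": ["male", "male", "female"],
    },
    index=[663, 562, 437],
)

by_row = df.to_dict(orient="index")   # {index -> {column -> value}}
by_col = df.to_dict(orient="dict")    # {column -> {index -> value}}

# Outer key is the index label, inner key is the column name.
row_437 = by_row[437]
age_437 = by_row[437]["age"]

# Every cell matches the 'dict' layout with the two keys swapped.
same_cells = all(
    by_row[label][col] == by_col[col][label]
    for label in df.index
    for col in df.columns
)

print(row_437)
print(age_437, same_cells)
```

This layout is handy when whole rows are looked up by label, for example by_row[437], whereas 'dict' is the better fit when whole columns are needed.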