Pandas: DataFrame Data Selection Methods (Indexing)
# First create the Series objects, then combine them into a DataFrame object
import pandas as pd
import numpy as np
area = pd.Series({'ChongQing': 188888, 'BeiJing': 92387928, 'Shanghai': 8374583746, 'Sydney': 82734})
population = pd.Series({'ChongQing': 1000, 'BeiJing': 2000, 'Shanghai': 2900, 'Sydney': 3000})
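data = pd.DataFrame({'area': area, 'population': population})
# Note: when building a dictionary, always follow the dictionary syntax,
# that is, enclose the dictionary in curly braces. This is an important habit.
print(data)
Output: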
                 area  population
ChongQing      188888        1000
BeiJing      92387928        2000
Shanghai   8374583746        2900
Sydney          82734        3000
Input (select one of the columns by name, using dictionary-style indexing):
data['area']
Output:
ChongQing        188888
BeiJing        92387928
Shanghai     8374583746
Sydney            82734
Name: area, dtype: int64
Input:
# List a column's data using attribute-style access; the example above used index-style access.
# This form is not used very often.
data.area
Output:
ChongQing        188888
BeiJing        92387928
Shanghai     8374583746
Sydney            82734
Name: area, dtype: int64
Input:
data.values
# A DataFrame is essentially a two-dimensional array; the values attribute lets us verify this.
Output:
array([[1.88888000e+05, 1.00000000e+03, 1.88888000e+02],
       [9.23879280e+07, 2.00000000e+03, 4.619390e+04],
       [8.37458375e+09, 2.90000000e+03, 2.88778750e+06],
       [8.27340000e+04, 3.00000000e+03, 2.75780000e+01]])
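The selection styles above can be checked against each other in one short script. The sketch below rebuilds the same DataFrame and compares dictionary-style with attribute-style access. It also addresses the array printed above, which has three columns although only area and population were defined: the third column's values match area divided by population, so the sketch assumes a derived column was added before data.values was called. The column name 'density' and the small 'clash' DataFrame are hypothetical and used only for illustration.

```python
import numpy as np
import pandas as pd

area = pd.Series({'ChongQing': 188888, 'BeiJing': 92387928,
                  'Shanghai': 8374583746, 'Sydney': 82734})
population = pd.Series({'ChongQing': 1000, 'BeiJing': 2000,
                        'Shanghai': 2900, 'Sydney': 3000})
data = pd.DataFrame({'area': area, 'population': population})

# Dictionary-style and attribute-style access select the same column.
same_column = data['area'].equals(data.area)

# Attribute access is only safe when the column name does not collide
# with an existing DataFrame method or attribute.
clash = pd.DataFrame({'pop': [1, 2]})   # hypothetical example frame
pop_is_method = callable(clash.pop)     # DataFrame.pop, not the column
pop_column = clash['pop']               # dictionary-style always gets the column

# With only the two original columns, .values is a 4 x 2 array.
shape_before = data.values.shape

# Assumption: a derived column like this one existed when the array
# above was printed. The name 'density' is hypothetical.
data['density'] = data['area'] / data['population']
values = data.values
shape_after = values.shape

print(same_column, pop_is_method, shape_before, shape_after)
print(values)
```

Because the frame then mixes integer and floating-point columns, data.values upcasts everything to a single floating-point array, which is why the integers in the output above appear in scientific notation.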