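Python: How to Check for Missing Dates in Pandas
To check for missing dates, first set up a dictionary of lists that contains date records. In our example these are the purchase dates: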
# dictionary of lists
d = {'Car': ['BMW', 'Lexus', 'Audi', 'Mercedes', 'Jaguar', 'Bentley'],
     'Date_of_purchase': ['2020-10-10', '2020-10-12', '2020-10-17',
                          '2020-10-16', '2020-10-19', '2020-10-22']}
Now create a DataFrame from the dictionary of lists above:
dataFrame = pd.DataFrame(d)
Next, set the Date_of_purchase column as the index:
dataFrame = dataFrame.set_index('Date_of_purchase')
Use to_datetime() to convert the index from strings to DateTime objects:
dataFrame.index = pd.to_datetime(dataFrame.index)
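As a quick sanity check on the conversion step, the sketch below converts the same date strings on their own and inspects the result. The names `raw` and `idx` are illustrative and not part of the original example, and the explicit `format` argument is an optional assumption that makes parsing strict for this particular layout.

```python
import pandas as pd

# Illustrative copy of the date strings used in this article
raw = ['2020-10-10', '2020-10-12', '2020-10-17',
       '2020-10-16', '2020-10-19', '2020-10-22']

# Passing a list of strings returns a DatetimeIndex.
# The explicit format is optional; it rejects strings in any other layout.
idx = pd.to_datetime(raw, format='%Y-%m-%d')

print(type(idx).__name__)  # DatetimeIndex
print(idx.min(), idx.max())
```

Once the index holds real datetime values rather than strings, it can be compared against a generated date range.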
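Display the dates in the range that are missing from the index. date_range() builds every calendar day between the start and end dates, and difference() keeps only the days that do not appear in the DataFrame index:
k = pd.date_range(start="2020-10-10", end="2020-10-22").difference(dataFrame.index)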
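The start and end dates do not have to be hard-coded. The following is a minimal sketch, assuming the range of interest runs from the earliest to the latest date actually present in the data. The names `idx`, `full_range`, and `missing` are illustrative.

```python
import pandas as pd

idx = pd.to_datetime(['2020-10-10', '2020-10-12', '2020-10-17',
                      '2020-10-16', '2020-10-19', '2020-10-22'])

# Build the full daily range from the earliest to the latest date present
full_range = pd.date_range(start=idx.min(), end=idx.max(), freq='D')

# Keep only the days of the full range that do not occur in the data
missing = full_range.difference(idx)

print(list(missing.strftime('%Y-%m-%d')))
# ['2020-10-11', '2020-10-13', '2020-10-14', '2020-10-15',
#  '2020-10-18', '2020-10-20', '2020-10-21']
```

Note that this variant cannot report gaps before the first or after the last recorded date, so fixed bounds remain the better choice when the expected period is known in advance.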
Example
The following is the complete code:
import pandas as pd

# dictionary of lists
d = {'Car': ['BMW', 'Lexus', 'Audi', 'Mercedes', 'Jaguar', 'Bentley'],
     'Date_of_purchase': ['2020-10-10', '2020-10-12', '2020-10-17',
                          '2020-10-16', '2020-10-19', '2020-10-22']}

# creating a DataFrame from the above dictionary of lists
dataFrame = pd.DataFrame(d)
print("DataFrame...")
print(dataFrame)

# Date_of_purchase set as index
dataFrame = dataFrame.set_index('Date_of_purchase')

# using to_datetime() to convert strings to DateTime objects
dataFrame.index = pd.to_datetime(dataFrame.index)

# remaining dates displayed as output
print("\nDisplaying remaining dates from a range of dates...")
k = pd.date_range(start="2020-10-10", end="2020-10-22").difference(dataFrame.index)
print(k)

Output
This will produce the following output:
DataFrame...
        Car  Date_of_purchase
0       BMW        2020-10-10
1     Lexus        2020-10-12
2      Audi        2020-10-17
3  Mercedes        2020-10-16
4    Jaguar        2020-10-19
5   Bentley        2020-10-22

Displaying remaining dates from a range of dates...
DatetimeIndex(['2020-10-11', '2020-10-13', '2020-10-14', '2020-10-15',
               '2020-10-18', '2020-10-20', '2020-10-21'],
              dtype='datetime64[ns]', freq=None)
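An alternative way to view the same gaps, which is not the approach used in the article, is to reindex the DataFrame against the full date range so that each missing date shows up as a row with NaN. This is a minimal sketch, assuming the dates in the index are unique (reindex() raises an error on duplicate labels). The names `df`, `full_range`, `filled`, and `missing` are illustrative.

```python
import pandas as pd

d = {'Car': ['BMW', 'Lexus', 'Audi', 'Mercedes', 'Jaguar', 'Bentley'],
     'Date_of_purchase': ['2020-10-10', '2020-10-12', '2020-10-17',
                          '2020-10-16', '2020-10-19', '2020-10-22']}

df = pd.DataFrame(d).set_index('Date_of_purchase')
df.index = pd.to_datetime(df.index)

# Every calendar day in the period of interest
full_range = pd.date_range(start="2020-10-10", end="2020-10-22")

# reindex() keeps existing rows and adds a NaN row for each absent date
filled = df.reindex(full_range)

# The missing dates are the index labels whose 'Car' value is NaN
missing = filled.index[filled['Car'].isna()]

print(filled)
print(len(missing))  # 7
```

This form is convenient when the next step is to fill or interpolate the gaps, since the placeholder rows already exist in date order.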