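I have already checked these two questions: "KeyError: 'Date'" and "Pandas DataFrame - KeyError: 'date'". Neither helped. I am getting KeyError: 'date' with no further explanation.
Here is my code: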
import pandas as pd, numpy as np
import csv
import warnings
from bs4 import BeautifulSoup, MarkupResemblesLocatorWarning
from sklearn.impute import SimpleImputer
from sklearn.exceptions import ConvergenceWarning
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.preprocessing import LabelEncoder
from sklearn.linear_model import LinearRegression, LogisticRegression, Perceptron
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import mean_squared_error, r2_score, accuracy_score, confusion_matrix, ConfusionMatrixDisplay
import seaborn as sns
import matplotlib.pyplot as plt
## Reading the data
dtypes = { 'Unnamed: 0': 'int32', 'drugName': 'category', 'condition': 'category', 'review': 'category', 'rating': 'float16', 'date': 'categorical', 'usefulCount': 'int16' }
train_df = pd.read_csv('/content/drugsComTrain_raw.tsv', sep='\t', quoting=2, dtype=dtypes)
# Randomly selecting 80% of the data from the training dataset
train_df = train_df.sample(frac=0.8, random_state=42)
test_df = pd.read_csv('/content/drugsComTest_raw.tsv', sep='\t', quoting=2, dtype=dtypes)
print(train_df.head())
## Converting date column to datetime format
train_df['date'], test_df['date'] = pd.to_datetime(train_df['date'], format='%b %d, %Y'), pd.to_datetime(test_df['date'], format='%b %d, %Y') # This is the line where I'm getting the error.
The last line is where the error is raised.
Error:
KeyError Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
3801 try:
-> 3802 return self._engine.get_loc(casted_key)
3803 except KeyError as err:
4 frames
/usr/local/lib/python3.10/dist-packages/pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
/usr/local/lib/python3.10/dist-packages/pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
KeyError: 'date'
The above exception was the direct cause of the following exception:
KeyError Traceback (most recent call last)
<ipython-input-17-056c9fab2e6c> in <cell line: 24>()
22 print(train_df.head())
23 ## Converting date column to datetime format
---> 24 train_df['date'], test_df['date'] = pd.to_datetime(train_df['date'], format='%b %d, %Y'), pd.to_datetime(test_df['date'], format='%b %d, %Y')
25
26 ## Extracting day, month, and year into separate columns
/usr/local/lib/python3.10/dist-packages/pandas/core/frame.py in __getitem__(self, key)
3805 if self.columns.nlevels > 1:
3806 return self._getitem_multilevel(key)
-> 3807 indexer = self.columns.get_loc(key)
3808 if is_integer(indexer):
3809 indexer = [indexer]
/usr/local/lib/python3.10/dist-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
3802 return self._engine.get_loc(casted_key)
3803 except KeyError as err:
-> 3804 raise KeyError(key) from err
3805 except TypeError:
3806 # If we have a listlike key, _check_indexing_error will raise
KeyError: 'date'
2 Answers
5kgi1eie1#
Everything works fine with your dataset. Maybe your file is corrupted. Try reading it directly from the URL:
Output:
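One way to test the "corrupted file" theory is to inspect the column labels pandas actually parsed before indexing into them. The following is only a minimal sketch: the in-memory `sample_tsv` string and the `missing_columns` helper are hypothetical stand-ins for illustration, and the trailing space in the header is just one assumed way a `date` column could go missing.

```python
import io
import pandas as pd

# Hypothetical sample standing in for the real TSV file. The header has a
# trailing space after "date" to mimic a damaged or altered file.
sample_tsv = "drugName\trating\tdate \nAspirin\t9.0\tMay 20, 2012\n"

df = pd.read_csv(io.StringIO(sample_tsv), sep="\t")


def missing_columns(frame, expected):
    """Return the expected column names that are absent from frame (hypothetical helper)."""
    return [name for name in expected if name not in frame.columns]


# Show the labels exactly as parsed; repr-style output exposes stray whitespace.
print(list(df.columns))

missing_before = missing_columns(df, ["date"])
print(missing_before)

# Stripping whitespace from the header is one possible repair (an assumption
# about the cause, not a confirmed diagnosis of the asker's file).
df.columns = df.columns.str.strip()
missing_after = missing_columns(df, ["date"])
print(missing_after)
```

If `'date'` shows up in the list of missing columns for the real file, `train_df['date']` will raise the same KeyError regardless of how the conversion line is written.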
czq61nw12#
The error in your code may lie in how the dtypes dictionary is defined. The value for the key 'date' should be 'object', not 'string'.
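A minimal sketch of that suggestion, assuming the file really does contain a column labelled `date`: the column is read as `object` and converted afterwards with the format string from the question. The in-memory `sample_tsv` data is hypothetical and only there to make the snippet self-contained. Note that a KeyError on `df['date']` means no column with that exact label exists, so a dtype change alone may not be enough.

```python
import io
import pandas as pd

# Hypothetical two-row sample; the real data comes from the asker's TSV files.
sample_tsv = (
    "drugName\trating\tdate\n"
    "Aspirin\t9.0\tMay 20, 2012\n"
    "Ibuprofen\t7.0\tApr 27, 2010\n"
)

# Read the date column as plain Python strings ('object'), as the answer suggests.
dtypes = {"drugName": "category", "date": "object"}
df = pd.read_csv(io.StringIO(sample_tsv), sep="\t", dtype=dtypes)

# Convert to datetime in a separate step, using the format from the question.
df["date"] = pd.to_datetime(df["date"], format="%b %d, %Y")
print(df.dtypes)
```

Keeping the read and the conversion as separate steps makes it easier to see which of the two fails on the real files.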