Converting a directory of JSON files to a DataFrame using pathlib

Posted by mzaanser on 2023-11-20 in Other

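I have a set of JSON files in a directory, and I am trying to loop over them with pathlib.Path('my_jsons_dir').iterdir() and read them into a pandas DataFrame.
The single-file case works fine: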

json_path = pathlib.Path('my_json_path')
dict_single = json.loads(json_path.read_bytes())
df_single = pd.DataFrame.from_dict(pd.json_normalize(dict_single), orient='columns')

But the loop fails:

file_paths = pathlib.Path(file_dir)
data = []

for file in file_paths.iterdir():
    if file.is_file():
        dict = json.loads(file.read_bytes())
        df = pd.DataFrame.from_dict(pd.json_normalize(dict), orient='columns')
        data.append(df)


Error message:

JSONDecodeError                           Traceback (most recent call last)
Cell In[34], line 8
      6 for file in file_paths.iterdir():
      7     if file.is_file():
----> 8         dict = json.loads(file.read_bytes())
      9         df = pd.DataFrame.from_dict(pd.json_normalize(dict), orient='columns')
     10         data.append(df)

File c:\Users\mdw0523\AppData\Local\Programs\Python\Python310\lib\json\__init__.py:346, in loads(s, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
    341     s = s.decode(detect_encoding(s), 'surrogatepass')
    343 if (cls is None and object_hook is None and
    344         parse_int is None and parse_float is None and
    345         parse_constant is None and object_pairs_hook is None and not kw):
--> 346     return _default_decoder.decode(s)
    347 if cls is None:
    348     cls = JSONDecoder

File c:\Users\mdw0523\AppData\Local\Programs\Python\Python310\lib\json\decoder.py:337, in JSONDecoder.decode(self, s, _w)
    332 def decode(self, s, _w=WHITESPACE.match):
    333     """Return the Python representation of ``s`` (a ``str`` instance
    334     containing a JSON document).
    335 
    336     """
--> 337     obj, end = self.raw_decode(s, idx=_w(s, 0).end())
...
    354 except StopIteration as err:
--> 355     raise JSONDecodeError("Expecting value", s, err.value) from None
    356 return obj, end

JSONDecodeError: Expecting value: line 1 column 1 (char 0)


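It looks like a simple input problem, but I don't understand why dict = json.loads(file.read_bytes()) behaves differently in the loop than it does in the single-file case. I prefer pathlib over os because the code needs to work on both Mac and Windows.
Thanks for your help!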

gupuwyp2


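It turned out the error was caused by one of the files containing no text: json.loads raises "Expecting value: line 1 column 1 (char 0)" when given empty input. This question can be closed.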
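For anyone hitting the same thing, here is a minimal sketch of a loop that tolerates empty or malformed files instead of crashing on the first one. The function name read_json_records is hypothetical (it is not from the question), and the sketch assumes each file holds a single JSON object.

```python
import json
import pathlib


def read_json_records(file_dir):
    """Parse every readable JSON file in file_dir.

    Hypothetical helper: returns (records, skipped), where skipped lists
    the names of files that were empty or were not valid JSON.
    """
    records = []
    skipped = []
    # sorted() only makes the order deterministic; iterdir() order is arbitrary.
    for path in sorted(pathlib.Path(file_dir).iterdir()):
        if not path.is_file():
            continue
        raw = path.read_bytes()
        # An empty (or whitespace-only) file is what triggers
        # "Expecting value: line 1 column 1 (char 0)".
        if not raw.strip():
            skipped.append(path.name)
            continue
        try:
            records.append(json.loads(raw))
        except json.JSONDecodeError:
            skipped.append(path.name)
    return records, skipped
```

Assuming pandas is imported as pd, the resulting list can be passed straight to pd.json_normalize(records), which accepts a list of dicts and returns a single DataFrame, so the extra pd.DataFrame.from_dict call and a later concat should not be needed. Printing skipped shows which files to inspect.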
