CSV contains only the last row of the DataFrame

yxyvkwin  posted on 2023-01-18  in Other

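After going through a bunch of similar answers, I still cannot solve this.
The CSV I end up with contains only the last row of the DataFrame that gets printed.
I need the entire DataFrame written to both the CSV and the parquet file.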

for r in records:
    content = '-----BEGIN CERTIFICATE-----' + '\n' + \
            r[1] + '\n'+'-----END CERTIFICATE-----'

    try:
        cert = x509.load_pem_x509_certificate(str.encode(content))
        cert_policy_value = cert.extensions.get_extension_for_oid(
        ExtensionOID.CERTIFICATE_POLICIES).value

        for ext in cert_policy_value:
            policy_check = ext.policy_identifier.dotted_string
            # logging.info(ext.policy_identifier.dotted_string)

        #Check whether the cert policy oid is Qualified or Non-QF        
        if policy_check in qualified_qv_cert_oid:
            flag = 'Non-QF'
            logging.info('NON-QLFY')
        else:
            flag = 'QLFY'

    except BaseException as e:
        logging.error(f'Error found for cert: {e}')
        pass

    # Preparing the DataFrame to write to a parquet file
    df = pd.DataFrame([{'id': r[0], 'flag':flag}])
    df.insert(2, 'timestamp', datetime.now().strftime('%Y-%m-%d %H:%M:%S'))
    df.to_csv('qv_output.csv', index=False, encoding='utf-8')
    df.to_parquet(path='qv_parsing.parquet', engine='auto', compression='snappy', index=False, partition_cols=None, storage_options=None)

Answer #1 (mklgxw1f)

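After reading Michael Butscher's answer, I tried appending my rows to the DataFrame instead. The original loop built a new one-row DataFrame on every iteration and overwrote both files with it, so only the last row survived.
I ended up with an approach similar to this answer: Create a pd dataframe by appending one row at a time
My code: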

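import logging
from datetime import datetime

import pandas as pd
from cryptography import x509
from cryptography.x509.oid import ExtensionOID

# `records` and `qualified_qv_cert_oid` are defined elsewhere (not shown in this post).

def append_row(df, row):
    '''
    Return a new dataframe with `row` (a pd.Series) appended to `df`.
    `df` may be None, in which case the result holds only `row`.
    '''
    new_df = pd.DataFrame([row], columns=row.index)
    if df is None:
        return new_df
    return pd.concat([df, new_df]).reset_index(drop=True)

def parse():
    df = None  # must exist before the first append_row call
    for r in records:
        content = '-----BEGIN CERTIFICATE-----' + '\n' + \
            r[1] + '\n'+'-----END CERTIFICATE-----'
        flag = None  # reset per record so a failed parse does not reuse the previous flag

        try:
            cert = x509.load_pem_x509_certificate(str.encode(content))
            cert_policy_value = cert.extensions.get_extension_for_oid(
                ExtensionOID.CERTIFICATE_POLICIES).value

            # Note: only the last policy identifier in the extension is kept
            policy_check = None
            for ext in cert_policy_value:
                policy_check = ext.policy_identifier.dotted_string
                # logging.info(ext.policy_identifier.dotted_string)

            # Check whether the cert policy oid is Qualified or Non-QF
            if policy_check in qualified_qv_cert_oid:
                flag = 'Non-QF'
                logging.info('NON-QLFY')
            else:
                flag = 'QLFY'

        except Exception as e:
            # Log and carry on; the row is still recorded, with flag left as None
            logging.error(f'Error found for cert: {e}')

        # Append this record's row to the accumulated dataframe
        timestamp = datetime.now().strftime('%Y-%m-%d %H:%M:%S')
        new_row = pd.Series({'id': r[0], 'flag':flag, 'timestamp':timestamp})
        df = append_row(df, new_row)

    # Write once, after the loop, so both files hold every row
    if df is not None:
        df.to_csv('qv_output.csv', index=False, encoding='utf-8')
        df.to_parquet(path='qv_parsing.parquet', engine='auto', compression='snappy', index=False, partition_cols=None, storage_options=None)
        logging.info(df)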
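
As an alternative to concatenating on every iteration, the rows can be collected in a plain list and turned into a DataFrame once, after the loop. The sketch below is only an illustration of that idea, not the code from this thread: `build_frame` and the `sample` data are hypothetical stand-ins for the per-certificate results computed in the loop above, and the CSV goes to a temporary directory so the example does not touch real files.

```python
import os
import tempfile
from datetime import datetime

import pandas as pd


def build_frame(results):
    """Build one DataFrame from an iterable of (id, flag) pairs.

    `results` is a hypothetical stand-in for the outcome of the
    certificate check performed for each record.
    """
    rows = []
    for record_id, flag in results:
        # Collect plain dicts; no DataFrame is created inside the loop
        rows.append({
            'id': record_id,
            'flag': flag,
            'timestamp': datetime.now().strftime('%Y-%m-%d %H:%M:%S'),
        })
    # One DataFrame for all rows, built after the loop
    return pd.DataFrame(rows, columns=['id', 'flag', 'timestamp'])


# Made-up sample data, for illustration only
sample = [(1, 'QLFY'), (2, 'Non-QF'), (3, 'QLFY')]
df = build_frame(sample)

# Write once; every row ends up in the file
out_path = os.path.join(tempfile.mkdtemp(), 'qv_output.csv')
df.to_csv(out_path, index=False, encoding='utf-8')

# Read the file back to confirm it holds all rows, not just the last one
written = pd.read_csv(out_path)
```

The same `df` could then be passed to `df.to_parquet(...)` a single time as well. Building the frame once avoids copying the accumulated data on each `pd.concat` call, which may matter when `records` is large.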
