Handling bad CSV records in CsvHelper

wlzqhblo  posted on 2023-06-27 in Other

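I want to iterate over all the records in a CSV file, add the good records to one collection, and handle the "bad" records separately. I can't seem to get this to work, so I assume I'm missing something.
If I try to catch BadDataException, every subsequent read fails, which means I can't continue reading the rest of the file: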

while (true)
{
    try
    {
        if (!reader.Read())
            break;

        var record = reader.GetRecord<Record>();
        goodList.Add(record);
    }
    catch (BadDataException ex)
    {
        // Exception is caught but I won't be able to read further rows in file
        // (all further reader.Read() result in same exception thrown)
        Console.WriteLine(ex.Message);
    }
}

Another option that has been discussed is to set the BadDataFound callback to handle it:

reader.Configuration.BadDataFound = x =>
{
    Console.WriteLine($"Bad data: <{x.RawRecord}>");
};

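However, although the callback is invoked, the bad record still ends up in my "good" list.
Is there some way to query the reader to see whether a record is good before adding it to the list?
For this example, my record definition is: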

class Record
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public int Age { get; set; }
}

And the data (first row bad, second row good):

"Jo"hn","Doe",43
"Jane","Doe",21

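Interestingly, handling a missing field with MissingFieldException seems to work exactly the way I want: the exception is thrown, but subsequent rows are still read correctly.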

dhxwm5r4 #1

Here is an example I provided:

void Main()
{
    using (var stream = new MemoryStream())
    using (var writer = new StreamWriter(stream))
    using (var reader = new StreamReader(stream))
    using (var csv = new CsvReader(reader))
    {
        writer.WriteLine("FirstName,LastName");
        writer.WriteLine("\"Jon\"hn\"\",\"Doe\"");
        writer.WriteLine("\"Jane\",\"Doe\"");
        writer.Flush();
        stream.Position = 0;

        var good = new List<Test>();
        var bad = new List<string>();
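        // Set by the callback whenever bad data is reported for the current row.
        var isRecordBad = false;
        csv.Configuration.BadDataFound = context =>
        {
            isRecordBad = true;
            bad.Add(context.RawRecord);
        };
        while (csv.Read())
        {
            var record = csv.GetRecord<Test>();
            // Keep the record only if the callback did not fire for this row.
            if (!isRecordBad)
            {
                good.Add(record);
            }

            // Reset the flag before reading the next row.
            isRecordBad = false;
        }

        // Dump() is a LINQPad extension method that prints the object.
        good.Dump();
        bad.Dump();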
    }
}

public class Test
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
}
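
For readers on a newer CsvHelper release, the same flag-based idea can be sketched with the callback set on the configuration object that is passed to the reader's constructor. This is only a sketch and not the answer author's code: it assumes a version in which CsvConfiguration takes a CultureInfo, the BadDataFound callback receives an argument that exposes RawRecord, and GetRecord<T>() reads the header row on first use. The Person and BadRecordDemo names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using CsvHelper;
using CsvHelper.Configuration;

// Hypothetical record type, equivalent to Test above.
public class Person
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

public static class BadRecordDemo
{
    public static void Main()
    {
        var good = new List<Person>();
        var bad = new List<string>();
        var isRecordBad = false;

        // Assumption: BadDataFound is configured up front instead of through
        // csv.Configuration, and its argument exposes RawRecord.
        var config = new CsvConfiguration(CultureInfo.InvariantCulture)
        {
            BadDataFound = args =>
            {
                isRecordBad = true;
                bad.Add(args.RawRecord);
            },
        };

        // Header row, then the bad and good rows from the question (without Age).
        const string data =
            "FirstName,LastName\n" +
            "\"Jo\"hn\",\"Doe\"\n" +
            "\"Jane\",\"Doe\"\n";

        using (var reader = new StringReader(data))
        using (var csv = new CsvReader(reader, config))
        {
            while (csv.Read())
            {
                var record = csv.GetRecord<Person>();

                // Keep the record only if the callback did not fire for this row.
                if (!isRecordBad)
                {
                    good.Add(record);
                }

                // Reset the flag before the next row is read.
                isRecordBad = false;
            }
        }

        Console.WriteLine($"Good records: {good.Count}");
        Console.WriteLine($"Bad raw records: {bad.Count}");
    }
}
```

The flag is checked only after GetRecord returns and is reset at the end of each iteration, so the sketch does not depend on whether the callback fires while the row is read or while the record is built.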

tf7tbtn2 #2

This can be done while loading the whole list in one go. Here is a generic implementation; see the inline comments.

public static List<TOut>? CsvLoad<TMap, TOut>(string path) where TMap : ClassMap<TOut> where TOut : class
{
    List<TOut>? modelList = null;

    if (File.Exists(path))
    {
        var config = new CsvConfiguration(CultureInfo.InvariantCulture);
        config.ReadingExceptionOccurred = re =>
        {
            // HERE YOU CAN DO ANYTHING YOU WANT WITH A BAD ROW
            Debug.WriteLine($"Bad Row in file '{path}'; CSV ERROR: {re.Exception}");
            return false; // <-- tells process to continue
        };      

        try
        {

            using (var stream = new StreamReader(path))
            using (var csv = new CsvReader(stream, config))
            {
                csv.Context.RegisterClassMap<TMap>();
                modelList = csv.GetRecords<TOut>().ToList(); // <-- get all records
            }
        }
        catch (Exception ex)
        {
            // This must be some bad exception
        }
    }
    else
    {
        // LOG, THROW EXCEPTION, whatever
    }
    return modelList;

}

Note: more configuration can be added as needed. An options parameter could be added to the method.
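
One way the options parameter mentioned in the note might look is sketched below. This is an assumption-laden variant, not part of CsvHelper or of the answer: the CsvLoader class, the badRows collection and the configure callback are hypothetical names, and the sketch assumes a CsvHelper version where the configuration properties are settable after construction (as in the answer's code) and where the exception passed to ReadingExceptionOccurred carries a Context with the parser's RawRecord.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Linq;
using CsvHelper;
using CsvHelper.Configuration;

public static class CsvLoader
{
    // Hypothetical variant of CsvLoad: 'badRows' and 'configure' are
    // illustrative additions and do not come from the library.
    public static List<TOut> CsvLoad<TMap, TOut>(
        string path,
        ICollection<string> badRows,
        Action<CsvConfiguration> configure = null)
        where TMap : ClassMap<TOut>
        where TOut : class
    {
        var config = new CsvConfiguration(CultureInfo.InvariantCulture);
        config.ReadingExceptionOccurred = args =>
        {
            // Collect the raw text of the failing row when it is available.
            var raw = args.Exception.Context?.Parser?.RawRecord ?? string.Empty;
            badRows.Add(raw);
            return false; // false: do not rethrow, continue with the next row
        };

        // Let the caller adjust any other setting (delimiter, trimming, ...).
        configure?.Invoke(config);

        using (var reader = new StreamReader(path))
        using (var csv = new CsvReader(reader, config))
        {
            csv.Context.RegisterClassMap<TMap>();
            return csv.GetRecords<TOut>().ToList();
        }
    }
}
```

Because the caller's callback runs after the handler is assigned, it can also replace ReadingExceptionOccurred entirely. If that is not wanted, the two steps could be swapped so the handler is assigned last.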
