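I am trying to use Apache Camel (version 2.25.3) Reactive Streams together with Spring Boot to read a large CSV file and unmarshal its lines with Bindy. This "works" in the sense that the application runs and detects files as they appear, but I then only see the first line of the file in my stream. I know it must be related to Bindy, because if I take the unmarshalling out of the equation, I get all lines of the CSV file back in my stream. I have simplified the problem to demonstrate it here. I am using Spring WebFlux to expose the resulting Publisher.
So my Camel route looks as follows: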

import lombok.RequiredArgsConstructor;
import org.apache.camel.builder.RouteBuilder;
import org.apache.camel.component.reactive.streams.api.CamelReactiveStreamsService;
import org.apache.camel.dataformat.bindy.csv.BindyCsvDataFormat;
import org.reactivestreams.Publisher;
import org.springframework.stereotype.Component;
import reactor.core.publisher.Flux;

@RequiredArgsConstructor
@Component
public class TransactionLineCsvRoute extends RouteBuilder {
    private final CamelReactiveStreamsService camelRs;

    @Override
    public void configure() {
        var bindy = new BindyCsvDataFormat(LineItem.class);

        from("file:input/?include=.*\\.csv&move=successImport&moveFailed=failImport")
                .unmarshal(bindy)
                .to("reactive-streams:lineItems");
    }

    public Flux<LineItem> getLineItemFlux() {
        Publisher<LineItem> lineItems = camelRs.fromStream("lineItems", LineItem.class);

        return Flux.from(lineItems);
    }
}

The Bindy class:

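import lombok.Getter;
import lombok.ToString;
import org.apache.camel.dataformat.bindy.annotation.CsvRecord;
import org.apache.camel.dataformat.bindy.annotation.DataField;

@ToString
@Getter
@CsvRecord(separator = ";", skipFirstLine = true, skipField = true)
public class LineItem {
    // Only the second column of each CSV line is mapped
    @DataField(pos = 2)
    private String description;
}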

And the endpoint that exposes the Flux:

@GetMapping(value = "/lineItems", produces = MediaType.TEXT_EVENT_STREAM_VALUE)
public Flux<LineItem> lineItems() {
    return lineItemFlux;
}

So when I now run curl:

curl localhost:8080/lineItems

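I only get the first line back, whereas when I remove the ".unmarshal(bindy)" line (and refactor the stream to be of type String instead of LineItem), I get all elements of the CSV file.
So I guess I am not using Bindy correctly in a Reactive Streams context. I followed this Camel documentation and tried to rewrite my route as follows: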

from("file:input/?include=.*\\.csv&move=successImport&moveFailed=failImport")
        .to("reactive-streams:rawLines");

from("reactive-streams:rawLines")
        .unmarshal(bindy)
        .to("reactive-streams:lineItems");

The log shows that the routes started correctly:

2021-01-04 10:13:26.798  INFO 26438 --- [           main] o.a.camel.spring.SpringCamelContext      : Route: route1 started and consuming from: file://input/?include=.*%5C.csv&move=successImport&moveFailed=failImport
2021-01-04 10:13:26.800  INFO 26438 --- [           main] o.a.camel.spring.SpringCamelContext      : Route: route2 started and consuming from: reactive-streams://rawLines
2021-01-04 10:13:26.801  INFO 26438 --- [           main] o.a.camel.spring.SpringCamelContext      : Total 2 routes, of which 2 are started

But then I get an exception stating "The stream has no active subscriptions":

Message History
---------------------------------------------------------------------------------------------------------------------------------------
RouteId              ProcessorId          Processor                                                                        Elapsed (ms)
[route1            ] [route1            ] [file://input/?include=.*%5C.csv&move=successImport&moveFailed=failImport      ] [         9]
[route1            ] [to1               ] [reactive-streams:rawLines                                                     ] [         5]

Stacktrace
---------------------------------------------------------------------------------------------------------------------------------------

java.lang.IllegalStateException: The stream has no active subscriptions
    at org.apache.camel.component.reactive.streams.engine.CamelPublisher.publish(CamelPublisher.java:108) ~[camel-reactive-streams-2.25.3.jar:2.25.3]
    at org.apache.camel.component.reactive.streams.engine.DefaultCamelReactiveStreamsService.sendCamelExchange(DefaultCamelReactiveStreamsService.java:144) ~[camel-reactive-streams-2.25.3.jar:2.25.3]
    at org.apache.camel.component.reactive.streams.ReactiveStreamsProducer.process(ReactiveStreamsProducer.java:52) ~[camel-reactive-streams-2.25.3.jar:2.25.3]

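Can anyone tell me how to use Bindy together with Reactive Streams? Thanks!
Edit
After burki's very helpful post, I was able to fix my code. The route definition changed to the following. As you can see, I removed the unmarshal step, so the route just picks up files from the file system as they arrive and puts them into a reactive stream: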

@Override
public void configure() {
    from("file:input/?include=.*\\.csv&move=successImport&moveFailed=failImport")
            .to("reactive-streams:extractedFile");
}

The stream of files is then exposed as a Flux:

public Flux<File> getFileFlux() {
    return Flux.from(camelRs.fromStream("extractedFile", File.class));
}

The code that parses the CSV looks as follows (using OpenCSV as burki suggested, but a different part of its API):

private Flux<LineItem> readLineItems() {
    return fileFlux
            .flatMap(message -> Flux.using(
                    () -> new CsvToBeanBuilder<LineItem>(createFileReader(message)).withSkipLines(1)
                            .withSeparator(';')
                            .withType(LineItem.class)
                            .build()
                            .stream(),
                    Flux::fromStream,
                    BaseStream::close)
            );
}

private FileReader createFileReader(File file) {
    System.out.println("Reading file from: " + file.getAbsolutePath());
    try {
        return new FileReader(file);
    } catch (FileNotFoundException e) {
        throw new RuntimeException(e);
    }
}

The resulting stream can now be exposed as an endpoint:

@GetMapping(value = "/lineItems", produces = MediaType.TEXT_EVENT_STREAM_VALUE)
public Flux<LineItem> lineItems() {
    return readLineItems();
}

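Now, when you run curl as I did above, you get the complete unmarshalled line items from the CSV.
What I still have to check is whether this actually loads the whole file into memory. I don't think so: I think I only get a pointer to the file, which is then streamed into the OpenCSV beans. But I need to verify this. It could be that I now read the whole file into memory first and then stream from memory, which would defeat the purpose.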
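
One way to check the memory question is to count how much input is actually pulled from the source before the first record arrives. The following is a minimal sketch using only the JDK, not OpenCSV or Camel; the class `LazyReadCheck`, the `CountingReader` wrapper, and the sample data are all hypothetical names introduced for illustration. It only shows that a `BufferedReader`-based line stream reads in buffer-sized chunks on demand.

```java
import java.io.BufferedReader;
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Optional;

class LazyReadCheck {

    /** Hypothetical helper: counts how many characters were pulled from the wrapped Reader. */
    static class CountingReader extends FilterReader {
        long charsRead = 0;

        CountingReader(Reader in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            int c = super.read();
            if (c >= 0) {
                charsRead++;
            }
            return c;
        }

        @Override
        public int read(char[] buf, int off, int len) throws IOException {
            int n = super.read(buf, off, len);
            if (n > 0) {
                charsRead += n;
            }
            return n;
        }
    }

    /** Builds an in-memory CSV with a header line and `rows` data lines, separated by ';'. */
    static String sampleCsv(int rows) {
        StringBuilder sb = new StringBuilder("id;description\n");
        for (int i = 0; i < rows; i++) {
            sb.append(i).append(";item ").append(i).append('\n');
        }
        return sb.toString();
    }

    /** Reads only the first data line (skipping the header) and reports how much input was consumed. */
    static long charsConsumedForFirstRecord(String csv) {
        CountingReader counting = new CountingReader(new StringReader(csv));
        try (BufferedReader reader = new BufferedReader(counting)) {
            Optional<String> first = reader.lines().skip(1).findFirst();
            if (!first.isPresent()) {
                throw new IllegalStateException("no data line found");
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return counting.charsRead;
    }

    public static void main(String[] args) {
        String csv = sampleCsv(100_000);
        long consumed = charsConsumedForFirstRecord(csv);
        System.out.println("consumed " + consumed + " of " + csv.length() + " characters");
    }
}
```

To check the OpenCSV pipeline itself, the same kind of wrapper could be placed around the FileReader returned by createFileReader before it is handed to CsvToBeanBuilder. Whether that stream is lazy is exactly what such a measurement would show.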

1 Answer

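My guess is that the file consumer simply passes the whole file to the unmarshal step.
So if you unmarshal the result of the file consumer to a LineItem, the whole file content is "reduced" to the first line.
Conversely, if you remove the unmarshalling, you get the whole file content. But the file consumer probably loads the whole file into memory before passing it on.
Reading the whole file is not what you want, though. To read a CSV file line by line, you need to split the file in streaming mode.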

from("file:...")
    .split(body().tokenize(LINE_FEED)).streaming()
    .to("direct:processLine");

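Like this, the splitter sends each line to the route direct:processLine for further processing.
The problem I had in this scenario was parsing a single CSV line. Most CSV libraries are designed to read and parse whole files, not single lines.
However, the rather old OpenCSV library has a CSVParser with a parseLine(String csvLine) method. So I used this to parse a "completely detached" CSV line.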
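
To illustrate what parsing a single detached line involves, here is a minimal sketch in plain Java. It is not OpenCSV's CSVParser and does not reproduce its behavior in detail; the class `SingleLineCsv` and its methods are hypothetical names. It assumes the ';' separator from the question, handles double-quoted fields with "" as an escaped quote, and does not handle quoted fields that span several lines.

```java
import java.util.ArrayList;
import java.util.List;

class SingleLineCsv {

    /** Splits one CSV line into its fields, honoring double-quoted fields. */
    static List<String> parseLine(String line, char separator) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        current.append('"'); // "" inside quotes is a literal quote
                        i++;
                    } else {
                        inQuotes = false; // closing quote
                    }
                } else {
                    current.append(c); // separators inside quotes are plain text
                }
            } else if (c == '"') {
                inQuotes = true; // opening quote
            } else if (c == separator) {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString()); // last field has no trailing separator
        return fields;
    }

    /** Returns the field at a 1-based position, similar in spirit to @DataField(pos = 2). */
    static String fieldAt(String line, char separator, int pos) {
        return parseLine(line, separator).get(pos - 1);
    }

    public static void main(String[] args) {
        // prints: Coffee; large
        System.out.println(fieldAt("42;\"Coffee; large\";3.50", ';', 2));
    }
}
```

A processor behind direct:processLine could call something like fieldAt on each line it receives. For real data a maintained parser such as the OpenCSV one mentioned above is the safer choice.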

2021-06-29
