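Does anyone know whether the Cerner Bunsen library (https://github.com/cerner/bunsen) can load FHIR R4 bundles and persist the data to a Spark SQL database? Any guidance or pointers would be great. At the moment I am just trying to load an example from https://simplifier.net/ukcore. The end goal is to persist incoming bundles to a Hive database so that an Apache Spark cluster can access them.
The sample code that tries to load a single-entry bundle is: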
Bundles bundles = Bundles.forR4();
URL fileUrl = R4Test.class.getClassLoader().getResource("ukcore/UKCore-AllergyIntolerance-Amoxicillin-Example.json");
JavaRDD bundlesRdd = bundles.loadFromDirectory(spark, fileUrl.toExternalForm(), 200);
Object c = bundlesRdd.collect();
bundles.saveAsDatabase(spark, bundlesRdd, "r4database", "AllergyIntolerance");
On bundlesRdd.collect() I get the following warnings:
INFO WholeTextFileRDD: Input split: Paths:/path/to/ukcore/UKCore-AllergyIntolerance-Amoxicillin-Example.json:0+2017
WARN LenientErrorHandler: Unknown element 'meta' found while parsing
WARN LenientErrorHandler: Unknown element 'clinicalStatus' found while parsing
WARN LenientErrorHandler: Unknown element 'verificationStatus' found while parsing
WARN LenientErrorHandler: Unknown element 'type' found while parsing
WARN LenientErrorHandler: Unknown element 'category' found while parsing
WARN LenientErrorHandler: Unknown element 'code' found while parsing
WARN LenientErrorHandler: Unknown element 'patient' found while parsing
WARN LenientErrorHandler: Unknown element 'encounter' found while parsing
WARN LenientErrorHandler: Unknown element 'recordedDate' found while parsing
WARN LenientErrorHandler: Unknown element 'recorder' found while parsing
WARN LenientErrorHandler: Unknown element 'asserter' found while parsing
WARN LenientErrorHandler: Unknown element 'reaction' found while parsing
Then the call to saveAsDatabase() fails with:
java.lang.IllegalArgumentException: Unsupported FHIR version: R4
at com.cerner.bunsen.definitions.StructureDefinitions.create(StructureDefinitions.java:120)
at com.cerner.bunsen.spark.SparkRowConverter.forResource(SparkRowConverter.java:75)
at com.cerner.bunsen.spark.SparkRowConverter.forResource(SparkRowConverter.java:54)
at com.cerner.bunsen.spark.Bundles.extractEntry(Bundles.java:211)
at com.cerner.bunsen.spark.Bundles.saveAsDatabase(Bundles.java:290)
The dependencies I am currently running with are:
<dependencies>
    <dependency>
        <groupId>com.cerner.bunsen</groupId>
        <artifactId>bunsen-r4</artifactId>
        <version>0.4.5</version>
    </dependency>
    <dependency>
        <groupId>com.cerner.bunsen</groupId>
        <artifactId>bunsen-core</artifactId>
        <version>0.5.7</version>
    </dependency>
    <dependency>
        <groupId>com.cerner.bunsen</groupId>
        <artifactId>bunsen-spark</artifactId>
        <version>0.5.7</version>
    </dependency>
    <!--
        to resolve java.lang.IllegalAccessError:
        "tried to access method com.google.common.base.Stopwatch.<init>()V from class
        org.apache.hadoop.mapreduce.lib.input.FileInputFormat"
    -->
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-mapreduce-client-core</artifactId>
        <version>2.7.2</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-common</artifactId>
        <version>2.7.2</version>
    </dependency>
    <!-- Spark dependencies -->
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-sql_2.11</artifactId>
        <version>2.4.5</version>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.11</artifactId>
        <version>2.4.5</version>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-hive_2.11</artifactId>
        <version>2.4.5</version>
    </dependency>
</dependencies>
Many thanks,
Dave
1 Answer
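R4 is currently not supported in the 0.5.x releases. It is a major change that is on our roadmap, but we do not have an ETA yet.
If you are trying to explore sample data, please test with version 0.4.6, which supports both STU3 and R4. Note that the older versions are no longer maintained.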
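As a rough sketch, moving the Bunsen artifacts to a single 0.4.6 version might look like the fragment below. It assumes that bunsen-core and bunsen-r4 are both published at 0.4.6 under the coordinates used in the question, and that the 0.4.x line has no separate bunsen-spark artifact; neither assumption is confirmed here, so check the published artifacts first. The Hadoop and Spark dependencies from the question are unchanged and omitted.

```xml
<dependencies>
    <!-- Assumption: both Bunsen artifacts exist at 0.4.6 with these coordinates -->
    <dependency>
        <groupId>com.cerner.bunsen</groupId>
        <artifactId>bunsen-core</artifactId>
        <version>0.4.6</version>
    </dependency>
    <dependency>
        <groupId>com.cerner.bunsen</groupId>
        <artifactId>bunsen-r4</artifactId>
        <version>0.4.6</version>
    </dependency>
    <!-- Assumption: bunsen-spark (0.5.x) is dropped rather than downgraded -->
</dependencies>
```

The point is to avoid mixing 0.4.x and 0.5.x Bunsen artifacts on one classpath, as the question's pom does. Package names may differ between the two lines, so imports in the calling code might need adjusting.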
Thanks,
Amaresh