How do I parse an Avro schema encoded as a string from Kafka in Apache Spark?
I am using Apache Spark Streaming. I have stored my clickstreams in Avro format, using the Divolte collector to capture the clicks, and I am using Kafka to feed the clickstream into Spark Streaming in real time. Now I want to deserialize this Avro string into a typed object so that Spark can work with it further.
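For reference, this is roughly how the Kafka stream is wired into Spark Streaming on my side; the broker address, topic name and batch interval below are simplified placeholders rather than my real configuration:

import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

object ClickstreamStreaming {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("divolte-clickstream")
    val ssc  = new StreamingContext(conf, Seconds(10))

    // Placeholder broker list and topic name.
    val kafkaParams = Map("metadata.broker.list" -> "localhost:9092")
    val topics      = Set("divolte")

    // Each Kafka message value is one Avro-encoded click event, read here as a String.
    val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
      ssc, kafkaParams, topics)

    // The deserialization I am asking about should happen inside parse().
    stream.map(_._2).foreachRDD { rdd =>
      rdd.foreach(event => kafkaParser.parse(event))
    }

    ssc.start()
    ssc.awaitTermination()
  }
}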
I tried to implement this with a Scala case class and the Gensler scalavro library, but without success. Below is the code:
case class Change(detectedDuplicate: Boolean,
                  detectedCorruption: Boolean,
                  firstInSession: Boolean,
                  timestamp: Long,
                  remoteHost: String,
                  referer: String,
                  location: String,
                  viewportPixelWidth: Int,
                  viewportPixelHeight: Int,
                  screenPixelWidth: Int,
                  screenPixelHeight: Int,
                  partyId: String,
                  sessionId: String,
                  pageViewId: String,
                  eventType: String,
                  userAgentString: String,
                  userAgentName: String,
                  userAgentFamily: String,
                  userAgentVendor: String,
                  userAgentType: String,
                  userAgentVersion: String,
                  userAgentDeviceCategory: String,
                  userAgentOsFamily: String,
                  userAgentOsVersion: String,
                  userAgentOsVendor: String)
import com.gensler.scalavro.types.AvroType

object kafkaParser {
  def parse(event: String): Change = {
    val m = AvroType[Change]
    return new Change(m.schema) // gives me an error at this point: "unspecified value parameters"
  }
}
m.schema gives me the schema of the Avro file.
http://genslerappspod.github.io/scalavro/
Please help me figure out how to make this work.
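What I think I need is something along these lines, using the plain Apache Avro API instead of scalavro; the record schema below is only a made-up subset of the fields Divolte actually writes, and I suspect the Kafka value has to be read as raw bytes rather than as a String for this to work:

import org.apache.avro.Schema
import org.apache.avro.generic.{GenericDatumReader, GenericRecord}
import org.apache.avro.io.DecoderFactory

object AvroClickParser {
  // Assumed schema: only a small illustrative subset of the Divolte record schema.
  val schemaJson: String =
    """{ "type": "record", "name": "ClickEvent", "fields": [
      |  { "name": "timestamp",  "type": "long"   },
      |  { "name": "remoteHost", "type": "string" },
      |  { "name": "location",   "type": "string" }
      |] }""".stripMargin

  val schema = new Schema.Parser().parse(schemaJson)
  val reader = new GenericDatumReader[GenericRecord](schema)

  // Decode one Avro-encoded event from the raw Kafka message bytes.
  def parse(bytes: Array[Byte]): GenericRecord = {
    val decoder = DecoderFactory.get().binaryDecoder(bytes, null)
    reader.read(null, decoder)
  }
}

The resulting GenericRecord could then be mapped field by field into the Change case class, but I am not sure whether this is the right approach.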