本文整理了Java中org.apache.spark.sql.DataFrame.rdd()
方法的一些代码示例,展示了DataFrame.rdd()
的具体用法。这些代码示例主要来源于Github
/Stackoverflow
/Maven
等平台,是从一些精选项目中提取出来的代码,具有较强的参考意义,能在一定程度帮忙到你。DataFrame.rdd()
方法的具体详情如下:
包路径:org.apache.spark.sql.DataFrame
类名称:DataFrame
方法名:rdd
暂无
代码示例来源:origin: amidst/toolbox
static JavaRDD<DataInstance> toDataInstanceRDD(DataFrame data, Attributes attributes) {
JavaRDD<double[]> rawRDD = data.rdd()
.toJavaRDD()
.map( row -> transformRow2DataInstance(row, attributes) );
return rawRDD.map(v -> new DataInstanceFromDataRow( new DataRowSpark(v, attributes) ) );
}
代码示例来源:origin: org.wso2.carbon.analytics/org.wso2.carbon.analytics.spark.core
private void writeDataFrameToDAL(DataFrame data) {
if (this.preserveOrder) {
logDebug("Inserting data with order preserved! Each partition will be written using separate jobs.");
for (int i = 0; i < data.rdd().partitions().length; i++) {
data.sqlContext().sparkContext().runJob(data.rdd(),
new AnalyticsWritingFunction(this.tenantId, this.tableName, data.schema(),
this.globalTenantAccess, this.schemaString, this.primaryKeys, this.mergeFlag,
this.recordStore, this.recordBatchSize), CarbonScalaUtils.getNumberSeq(i, i + 1),
false, ClassTag$.MODULE$.Unit());
}
} else {
data.foreachPartition(new AnalyticsWritingFunction(this.tenantId, this.tableName, data.schema(),
this.globalTenantAccess, this.schemaString, this.primaryKeys, this.mergeFlag,
this.recordStore, this.recordBatchSize));
}
}
内容来源于网络,如有侵权,请联系作者删除!