I followed the Apache Spark quick start step by step, but at the end these warning messages are displayed:
20/05/25 09:43:05 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
log4j:WARN No appenders could be found for logger (org.apache.spark.deploy.SparkSubmit$$anon$2).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
My code is:
package firstmaven;

import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.Dataset;

/**
 * Hello world!
 *
 */
public class App
{
    public static void main( String[] args )
    {
        String logFile = "/data/spark/README.md";
        SparkSession spark = SparkSession.builder().appName("Simple Application")
                .config("spark.master", "local").getOrCreate();
        Dataset<String> logData = spark.read().textFile(logFile).cache();
        long numAs = logData.filter(s -> s.contains("a")).count();
        long numBs = logData.filter(s -> s.contains("b")).count();
        System.out.println("Lines with a: " + numAs + ", lines with b: " + numBs);
        System.out.println("Hello world");
        spark.stop();
    }
}
What should I do to make it work? Thanks.
1 Answer
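You need to specify the value of the column you want to filter on. Please check the following code snippet:
Suppose input.txt looks like this:
The output of the above code snippet is as follows:
Hope this helps.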
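As for the warnings themselves: the NativeCodeLoader line says Hadoop is falling back to its built-in Java classes, and the log4j lines say that no logging configuration was found on the classpath. Neither should stop the job, so the "Lines with a: ..." output should still be printed. If you want the log4j warnings to go away, a minimal sketch of a configuration file is shown below. It assumes the log4j 1.2 that the warning's FAQ link points to and a standard Maven layout, so the path src/main/resources/log4j.properties is an assumption about your project. The WARN level and the pattern are only illustrative choices.

```properties
# Assumed location: src/main/resources/log4j.properties (standard Maven layout)
# Send everything at WARN and above to the console
log4j.rootLogger=WARN, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
# Illustrative pattern: timestamp, level, short logger name, message
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
```

With a file like this on the classpath, log4j 1.2 should pick it up at startup and stop printing the "No appenders could be found" messages. Change WARN to INFO if you want to see Spark's progress logs.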