I am a beginner with Pig.
Following the wiki, I wrote a program that converts the words in a file to upper case.
--cat UPPER.java
package com.bigdata.myUdf;
import java.io.IOException;
import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;
import org.apache.pig.impl.util.WrappedIOException;
public class UPPER extends EvalFunc<String> {
    public String exec(Tuple input) throws IOException {
        if (input == null || input.size() == 0)
            return null;
        try {
            String str = (String) input.get(0);
            return str.toUpperCase();
        } catch (Exception e) {
            throw WrappedIOException.wrap("Caught exception processing input row ", e);
        }
    }
}
--cat /home/hduser/lab/mydata/myscript.pig
REGISTER /home/hduser/software/myUdfs/UPPER.jar
std_det = LOAD '/pigdata/udf1.txt' USING PigStorage(',') as (name:chararray);
B = FOREACH std_det GENERATE com.bigdata.myUdf.UPPER(name);
dump B;
But when I run it, I get an error.
java -cp com.bigdata.myUdf.UPPER.jar org.apache.pig.Main -x local /home/hduser/lab/mydata/myscript.pig
Error
Error: Could not find or load main class org.apache.pig.Main
cat .bashrc
export PIG_INSTALL=/home/hduser/software/pig
export PATH="${PATH}:${PIG_INSTALL}/bin"
export PIG_CLASSPATH=$HADOOP_CONF_DIR:${PIG_INSTALL}:.
export CLASSPATH=.:${PIG_CLASSPATH}
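The Pig script is at: /home/hduser/lab/mydata/myscript.pig
The jar file is at: /home/hduser/software/myUdfs/UPPER.jar
Please help me understand what I am doing wrong. Thanks in advance.
After following Sivasakthi's instructions, the program ran, but it produced no output.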
pig -x local myScript.pig
15/01/05 04:47:57 INFO pig.ExecTypeProvider: Trying ExecType : LOCAL
15/01/05 04:47:57 INFO pig.ExecTypeProvider: Picked LOCAL as the ExecType
2015-01-05 04:47:57,920 [main] INFO org.apache.pig.Main - Apache Pig version 0.14.0 (r1640057) compiled Nov 16 2014, 18:02:05
2015-01-05 04:47:57,921 [main] INFO org.apache.pig.Main - Logging error messages to: /home/hduser/lab/piglog/pig_1420462077918.log
2015-01-05 04:47:57,959 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - user.name is deprecated. Instead, use mapreduce.job.user.name
2015-01-05 04:47:58,314 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2015-01-05 04:47:58,315 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2015-01-05 04:47:58,318 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: file:///
2015-01-05 04:47:58,463 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2015-01-05 04:47:59,070 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2015-01-05 04:47:59,227 [main] INFO org.apache.pig.Main - Pig script completed in 2 seconds and 505 milliseconds (2505 ms)
2 Answers
zd287kbt #1
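The error indicates that the Apache Pig jar (the one that contains org.apache.pig.Main) is not on the classpath.
-cp com.bigdata.myUdf.UPPER.jar
does not include the required jars; it includes only 'UPPER.jar'. You can look here to see how to include all the required jars on the classpath correctly.
I also think you should use the pig command from the command line instead of invoking java yourself. I have not used it myself, though, so this is only a guess.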
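To make the classpath point concrete, here is a minimal sketch of a probe that reports whether a given class can be found on the classpath the JVM was started with. It is only an illustration and not part of the original answer; the names `ClasspathCheck` and `isOnClasspath` are made up for this example.

```java
public class ClasspathCheck {

    // Returns true if the named class can be located by this class's loader.
    // The class is looked up but not initialized.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults to the class the JVM failed to find in the question.
        String name = args.length > 0 ? args[0] : "org.apache.pig.Main";
        System.out.println("java.class.path = " + System.getProperty("java.class.path"));
        System.out.println(name + " found: " + isOnClasspath(name));
    }
}
```

If this is started with the same `-cp` value as in the question, the lookup for org.apache.pig.Main should fail in the same way, because that classpath contains no Pig jar. Starting it with the Pig jar from the Pig installation directory added to `-cp` (the exact file name depends on the Pig version) should make the lookup succeed. The `pig` launcher script assembles that classpath for you, which is why using it is usually simpler.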
du7egjpx #2
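Can you try the following steps?
1. Download the 3 jar files from the links below (pig-0.11.1.jar, hadoop-common-0.21.0.jar and piggybank.jar).
2. Add all 3 jar files to the classpath.
3. Under the current directory, create the directory "com/bigdata/myUdf/".
4. Compile UPPER.java. Make sure JAVA_HOME is set correctly and that all three jar files are on the classpath, otherwise compilation will fail.
5. Move the compiled UPPER.class file into the "com/bigdata/myUdf/" folder.
6. Create a jar file named UPPER.jar.
7. Now register UPPER.jar in the Pig script and run the following command.
After running the above command you will get the actual output.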
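The directory in steps 3 to 5 has to mirror the package declared in the source (com.bigdata.myUdf), including upper and lower case, because a class is looked up inside a jar by that path. As a rough sketch, the check below derives the expected entry name from a fully qualified class name and tests whether a jar contains it. The names `JarEntryCheck`, `entryPath` and `jarContains` are invented for this illustration, and the default jar path is only a placeholder.

```java
import java.io.IOException;
import java.util.jar.JarFile;

public class JarEntryCheck {

    // Maps a fully qualified class name to the path it must have inside a jar,
    // e.g. "a.b.C" becomes "a/b/C.class".
    static String entryPath(String className) {
        return className.replace('.', '/') + ".class";
    }

    // Returns true if the jar holds an entry for the class. The lookup is
    // case-sensitive, so "myudf" and "myUdf" are different directories.
    static boolean jarContains(String jarPath, String className) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            return jar.getEntry(entryPath(className)) != null;
        }
    }

    public static void main(String[] args) throws IOException {
        // Placeholder defaults; pass the real jar path and class name as arguments.
        String jarPath = args.length > 0 ? args[0] : "UPPER.jar";
        String className = args.length > 1 ? args[1] : "com.bigdata.myUdf.UPPER";
        System.out.println("expected entry: " + entryPath(className));
        System.out.println("present: " + jarContains(jarPath, className));
    }
}
```

If the entry is reported as missing, the jar was probably built from the wrong directory level or with a directory name whose case differs from the package name.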
Example
Input
myscript.pig
Output:
Sample commands: