Accessing Hive data from MapReduce

xjreopfe · posted 2021-06-04 in Hadoop

I am trying to load data from one Hive table and write it into another table. The table I load from:

CREATE  TABLE `dmg_bindings`(
  `viuserid` string, 
  `puid` string, 
  `ts` bigint)
PARTITIONED BY ( 
  `dt` string, 
  `pid` string)

The table I write into:

CREATE  TABLE `newdmgbnd`(
  `ts` int, 
  `puid1` string, 
  `puid2` string)
PARTITIONED BY ( 
  `dt` string, 
  `platid1` string, 
  `platid2` string)

But I have a problem and cannot find where I went wrong. I get the following error:

    15/01/15 10:22:07 WARN conf.HiveConf: DEPRECATED: Configuration property hive.metastore.local no longer has any effect. Make sure to provide a valid value for hive.metastore.uris if you are connecting to a remote metastore.
    15/01/15 10:22:07 INFO hive.metastore: Trying to connect to metastore with URI thrift://srv112.test.local:9083
    15/01/15 10:22:07 INFO hive.metastore: Connected to metastore.
    15/01/15 10:22:08 WARN conf.HiveConf: DEPRECATED: Configuration property hive.metastore.local no longer has any effect. Make sure to provide a valid value for hive.metastore.uris if you are connecting to a remote metastore.
    15/01/15 10:22:08 INFO columnar.ColumnarSerDe: ColumnarSerDe initialized with: columnNames=[viuserid, puid, ts] columnTypes=[string, string, bigint] separator=[[B@6d88b065] nullstring=\N
    15/01/15 10:22:08 INFO columnar.ColumnarSerDe: ColumnarSerDe initialized with: columnNames=[viuserid, puid, ts] columnTypes=[string, string, bigint] separator=[[B@6e205d5c] nullstring=\N
    15/01/15 10:22:08 INFO columnar.ColumnarSerDe: ColumnarSerDe initialized with: columnNames=[viuserid, puid, ts] columnTypes=[string, string, bigint] separator=[[B@5b031819] nullstring=\N
    15/01/15 10:22:08 INFO columnar.ColumnarSerDe: ColumnarSerDe initialized with: columnNames=[viuserid, puid, ts] columnTypes=[string, string, bigint] separator=[[B@223e0fa1] nullstring=\N
    15/01/15 10:22:08 INFO columnar.ColumnarSerDe: ColumnarSerDe initialized with: columnNames=[viuserid, puid, ts] columnTypes=[string, string, bigint] separator=[[B@1d73aa82] nullstring=\N
    15/01/15 10:22:08 INFO columnar.ColumnarSerDe: ColumnarSerDe initialized with: columnNames=[viuserid, puid, ts] columnTypes=[string, string, bigint] separator=[[B@1b10b8a3] nullstring=\N
    15/01/15 10:22:08 INFO columnar.ColumnarSerDe: ColumnarSerDe initialized with: columnNames=[viuserid, puid, ts] columnTypes=[string, string, bigint] separator=[[B@506422f2] nullstring=\N
    15/01/15 10:22:08 INFO columnar.ColumnarSerDe: ColumnarSerDe initialized with: columnNames=[viuserid, puid, ts] columnTypes=[string, string, bigint] separator=[[B@3f0eca9f] nullstring=\N
    15/01/15 10:22:08 INFO columnar.ColumnarSerDe: ColumnarSerDe initialized with: columnNames=[viuserid, puid, ts] columnTypes=[string, string, bigint] separator=[[B@da24f04] nullstring=\N
    15/01/15 10:22:08 INFO columnar.ColumnarSerDe: ColumnarSerDe initialized with: columnNames=[viuserid, puid, ts] columnTypes=[string, string, bigint] separator=[[B@6ad66647] nullstring=\N
    15/01/15 10:22:08 INFO columnar.ColumnarSerDe: ColumnarSerDe initialized with: columnNames=[viuserid, puid, ts] columnTypes=[string, string, bigint] separator=[[B@2469fb45] nullstring=\N
    15/01/15 10:22:08 INFO columnar.ColumnarSerDe: ColumnarSerDe initialized with: columnNames=[viuserid, puid, ts] columnTypes=[string, string, bigint] separator=[[B@2b2b5f52] nullstring=\N
    15/01/15 10:22:08 INFO columnar.ColumnarSerDe: ColumnarSerDe initialized with: columnNames=[viuserid, puid, ts] columnTypes=[string, string, bigint] separator=[[B@4ba6fc80] nullstring=\N
    15/01/15 10:22:08 INFO columnar.ColumnarSerDe: ColumnarSerDe initialized with: columnNames=[viuserid, puid, ts] columnTypes=[string, string, bigint] separator=[[B@2a5c3214] nullstring=\N
    15/01/15 10:22:08 INFO columnar.ColumnarSerDe: ColumnarSerDe initialized with: columnNames=[viuserid, puid, ts] columnTypes=[string, string, bigint] separator=[[B@666e18bb] nullstring=\N
    15/01/15 10:22:08 INFO columnar.ColumnarSerDe: ColumnarSerDe initialized with: columnNames=[viuserid, puid, ts] columnTypes=[string, string, bigint] separator=[[B@6a974e] nullstring=\N
    15/01/15 10:22:08 INFO columnar.ColumnarSerDe: ColumnarSerDe initialized with: columnNames=[viuserid, puid, ts] columnTypes=[string, string, bigint] separator=[[B@2c09f7be] nullstring=\N
    15/01/15 10:22:08 INFO columnar.ColumnarSerDe: ColumnarSerDe initialized with: columnNames=[viuserid, puid, ts] columnTypes=[string, string, bigint] separator=[[B@362239c7] nullstring=\N
    15/01/15 10:22:08 INFO columnar.ColumnarSerDe: ColumnarSerDe initialized with: columnNames=[viuserid, puid, ts] columnTypes=[string, string, bigint] separator=[[B@7ac85bb5] nullstring=\N
    15/01/15 10:22:08 INFO columnar.ColumnarSerDe: ColumnarSerDe initialized with: columnNames=[viuserid, puid, ts] columnTypes=[string, string, bigint] separator=[[B@4d9e25f] nullstring=\N
    15/01/15 10:22:08 INFO columnar.ColumnarSerDe: ColumnarSerDe initialized with: columnNames=[viuserid, puid, ts] columnTypes=[string, string, bigint] separator=[[B@1a74fc3d] nullstring=\N
    15/01/15 10:22:08 INFO columnar.ColumnarSerDe: ColumnarSerDe initialized with: columnNames=[viuserid, puid, ts] columnTypes=[string, string, bigint] separator=[[B@17c02eb9] nullstring=\N
    15/01/15 10:22:08 INFO columnar.ColumnarSerDe: ColumnarSerDe initialized with: columnNames=[viuserid, puid, ts] columnTypes=[string, string, bigint] separator=[[B@847ac3e] nullstring=\N
    15/01/15 10:22:08 INFO columnar.ColumnarSerDe: ColumnarSerDe initialized with: columnNames=[viuserid, puid, ts] columnTypes=[string, string, bigint] separator=[[B@656a0389] nullstring=\N
    15/01/15 10:22:08 INFO columnar.ColumnarSerDe: ColumnarSerDe initialized with: columnNames=[viuserid, puid, ts] columnTypes=[string, string, bigint] separator=[[B@f775a5b] nullstring=\N
    15/01/15 10:22:08 INFO columnar.ColumnarSerDe: ColumnarSerDe initialized with: columnNames=[viuserid, puid, ts] columnTypes=[string, string, bigint] separator=[[B@53ef7ba0] nullstring=\N
    15/01/15 10:22:08 WARN conf.HiveConf: DEPRECATED: Configuration property hive.metastore.local no longer has any effect. Make sure to provide a valid value for hive.metastore.uris if you are connecting to a remote metastore.
    15/01/15 10:22:09 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
    15/01/15 10:22:10 WARN conf.HiveConf: DEPRECATED: Configuration property hive.metastore.local no longer has any effect. Make sure to provide a valid value for hive.metastore.uris if you are connecting to a remote metastore.
    15/01/15 10:22:10 INFO Configuration.deprecation: io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
    15/01/15 10:22:10 INFO mapred.FileInputFormat: Total input paths to process : 40
    15/01/15 10:22:10 INFO mapred.FileInputFormat: Total input paths to process : 40
    15/01/15 10:22:10 INFO mapred.FileInputFormat: Total input paths to process : 40
    15/01/15 10:22:10 INFO mapred.FileInputFormat: Total input paths to process : 2
    15/01/15 10:22:10 INFO mapred.FileInputFormat: Total input paths to process : 40
    15/01/15 10:22:10 INFO mapred.FileInputFormat: Total input paths to process : 1
    15/01/15 10:22:10 INFO mapred.FileInputFormat: Total input paths to process : 40
    15/01/15 10:22:10 INFO mapred.FileInputFormat: Total input paths to process : 40
    15/01/15 10:22:10 INFO mapred.FileInputFormat: Total input paths to process : 40
    15/01/15 10:22:10 INFO mapred.FileInputFormat: Total input paths to process : 40
    15/01/15 10:22:10 INFO mapred.FileInputFormat: Total input paths to process : 1
    15/01/15 10:22:10 INFO mapred.FileInputFormat: Total input paths to process : 1
    15/01/15 10:22:10 INFO mapred.FileInputFormat: Total input paths to process : 40
    15/01/15 10:22:10 INFO mapred.FileInputFormat: Total input paths to process : 2
    15/01/15 10:22:10 INFO mapred.FileInputFormat: Total input paths to process : 40
    15/01/15 10:22:11 INFO mapred.FileInputFormat: Total input paths to process : 1
    15/01/15 10:22:11 INFO mapred.FileInputFormat: Total input paths to process : 40
    15/01/15 10:22:11 INFO mapred.FileInputFormat: Total input paths to process : 40
    15/01/15 10:22:11 INFO mapred.FileInputFormat: Total input paths to process : 16
    15/01/15 10:22:11 INFO mapred.FileInputFormat: Total input paths to process : 1
    15/01/15 10:22:11 INFO mapred.FileInputFormat: Total input paths to process : 1
    15/01/15 10:22:11 INFO mapred.FileInputFormat: Total input paths to process : 40
    15/01/15 10:22:11 INFO mapred.FileInputFormat: Total input paths to process : 40
    15/01/15 10:22:11 INFO mapred.FileInputFormat: Total input paths to process : 40
    15/01/15 10:22:11 INFO mapred.FileInputFormat: Total input paths to process : 1
    15/01/15 10:22:11 INFO mapred.FileInputFormat: Total input paths to process : 1
    15/01/15 10:22:12 INFO mapred.JobClient: Running job: job_201412021320_0142
    15/01/15 10:22:13 INFO mapred.JobClient:  map 0% reduce 0%
    15/01/15 10:22:24 INFO mapred.JobClient: Task Id : attempt_201412021320_0142_m_000002_0, Status : FAILED
    java.lang.NullPointerException
        at org.apache.hive.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:167)
        at org.apache.hive.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:53)
        at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:558)
        at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:85)
        at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:106)
        at MapNewDmg.map(MapNewDmg.java:32)
        at MapNewDmg.map(MapNewDmg.java:15)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:140)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:672)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(Use
    attempt_201412021320_0142_m_000002_0: SLF4J: Class path contains multiple SLF4J bindings.
    attempt_201412021320_0142_m_000002_0: SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/jars/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    attempt_201412021320_0142_m_000002_0: SLF4J: Found binding in [jar:file:/mnt1/mapred/local/taskTracker/mvolosnikova/jobcache/job_201412021320_0142/jars/job.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    attempt_201412021320_0142_m_000002_0: SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
    attempt_201412021320_0142_m_000002_0: SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]

My Driver class:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.WritableComparable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.MultipleInputs;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;
import org.apache.hive.hcatalog.data.DefaultHCatRecord;
import org.apache.hive.hcatalog.data.schema.HCatFieldSchema;
import org.apache.hive.hcatalog.data.schema.HCatSchema;
import org.apache.hive.hcatalog.mapreduce.HCatInputFormat;
import org.apache.hive.hcatalog.mapreduce.HCatOutputFormat;
import org.apache.hive.hcatalog.mapreduce.InputJobInfo;
import org.apache.hive.hcatalog.mapreduce.OutputJobInfo;
import java.io.FileInputStream;
import java.text.DateFormat;
import java.text.SimpleDateFormat;
import java.util.*;

public class Driver extends Configured implements Tool{
    @Override
    public int run(String[] strings) throws Exception {
        Configuration conf = getConf();
        Job job = Job.getInstance(conf, "newDmg");
        HCatInputFormat.setInput(job, "default", "dmg_bindings", "dt=\"2014-09-01\"");
        job.setJarByClass(Driver.class);
        job.setMapperClass(MapNewDmg.class);
        job.setNumReduceTasks(0);
        job.setInputFormatClass(HCatInputFormat.class);
        job.setOutputKeyClass(WritableComparable.class);
        job.setOutputValueClass(DefaultHCatRecord.class);
        job.setOutputFormatClass(HCatOutputFormat.class);
        Map staticPartitions = new HashMap<String, String>(1);
        staticPartitions.put("dt", "2014-09-01");
        List dynamicPartitions = new ArrayList<String>();
        dynamicPartitions.add("platid1");
        dynamicPartitions.add("platid2");
        OutputJobInfo jobInfo = OutputJobInfo.create("default", "newdmgbnd", staticPartitions);
        jobInfo.setDynamicPartitioningKeys(dynamicPartitions);
        HCatOutputFormat.setOutput(job, jobInfo);
        HCatSchema schema = HCatOutputFormat.getTableSchema(job);
        schema.append(new HCatFieldSchema("platid1", HCatFieldSchema.Type.STRING, ""));
        schema.append(new HCatFieldSchema("platid2", HCatFieldSchema.Type.STRING, ""));
        HCatOutputFormat.setSchema(job, schema);
        return job.waitForCompletion(true) ? 0 : 1;
    }
    public static void main(String[] args) throws Exception {
        int exitcode = ToolRunner.run(new Driver(), args);
        System.exit(exitcode);
    }
}

My Mapper class:

import org.apache.hadoop.io.WritableComparable;
import org.apache.hive.hcatalog.data.DefaultHCatRecord;
import org.apache.hive.hcatalog.data.HCatRecord;
import org.apache.hive.hcatalog.data.schema.HCatFieldSchema;
import org.apache.hive.hcatalog.data.schema.HCatSchema;
import org.apache.hive.hcatalog.mapreduce.HCatInputFormat;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Mapper;

public class MapNewDmg extends Mapper<WritableComparable, HCatRecord, WritableComparable, HCatRecord> {
    @Override
    protected void map(WritableComparable key, HCatRecord value, Context context) throws IOException, InterruptedException {
        String viuserid = (String) value.get(0);
        String puid = (String) value.get(1);
        Long ts = (Long) value.get(2);
        String pid = (String) value.get(4);
        int newts = (int) (ts / 1000);
        HCatRecord record = new DefaultHCatRecord(6);
        record.set(0, newts);
        record.set(1, viuserid);
        record.set(2, puid);
        record.set(4, "586");
        record.set(5, pid);
        context.write(null, record);
    }
}

What am I doing wrong in my program? I don't understand why this error occurs, because my data is not null (yes, I checked). Please help me. Thank you.

Answer 1, from 9jyewag0:

In your mapper you call context.write(null, record); and that is wrong. If you don't want to specify a key, use NullWritable: change the declarations of your mapper and driver to reflect the new type, and change context.write(null, record); to context.write(NullWritable.get(), record);. If a reducer is involved, this is not the best solution (not your case, but just for reference); see here for details: https://support.pivotal.io/hc/en-us/articles/202810986-mapper-output-key-value-nullwritable-can-cause-reducer-phase-to-move-slowly
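
A minimal sketch of that change, assuming everything else in the posted Driver and MapNewDmg stays exactly as in the question; only the output key type and the write call switch to NullWritable (the record layout and field indexes below simply mirror the original post):

import java.io.IOException;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.WritableComparable;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hive.hcatalog.data.DefaultHCatRecord;
import org.apache.hive.hcatalog.data.HCatRecord;

// Same mapper as in the question, but the output key type is NullWritable
// instead of WritableComparable.
public class MapNewDmg extends Mapper<WritableComparable, HCatRecord, NullWritable, HCatRecord> {
    @Override
    protected void map(WritableComparable key, HCatRecord value, Context context)
            throws IOException, InterruptedException {
        String viuserid = (String) value.get(0);
        String puid = (String) value.get(1);
        Long ts = (Long) value.get(2);
        String pid = (String) value.get(4);

        HCatRecord record = new DefaultHCatRecord(6);
        record.set(0, (int) (ts / 1000));
        record.set(1, viuserid);
        record.set(2, puid);
        record.set(4, "586");
        record.set(5, pid);

        // Emit NullWritable.get() rather than null as the key.
        context.write(NullWritable.get(), record);
    }
}

In the driver, job.setOutputKeyClass(WritableComparable.class) would then become job.setOutputKeyClass(NullWritable.class); the rest of the job setup is left as posted.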
