How to perform data ingestion from Hive compressed ORC to Druid

Posted by k5hmc34c on 2021-06-26 in Hive


I am trying to ingest data into Druid from a compressed Hive ORC table stored in HDFS. Any suggestions on how to do this would be very helpful.

Hive druid

Source: https://stackoverflow.com/questions/44220960/how-to-perform-data-ingestion-from-hive-compressed-orc-to-druid

1 Answer


klr1opcd1#

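Assuming you already have Druid and YARN/MapReduce set up, you can launch an index_hadoop task that will do what you ask.
There is a Druid ORC extension that allows Druid to read ORC files. I don't think it ships with the standard release, so you will have to get it some other way (we built it from source).
(Extension list: http://druid.io/docs/latest/development/extensions.html)
Below is a sample spec that ingests a set of ORC files and appends one interval to a datasource. POST it to the Overlord at http://overlord:8090/druid/indexer/v1/task
(Docs: http://druid.io/docs/latest/ingestion/batch-ingestion.html)
You may need to adjust it for your distribution. I remember we ran into some class-not-found problems on Hortonworks (classpathPrefix will help adjust the MapReduce classpath).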

{
  "type": "index_hadoop",
  "spec": {
    "ioConfig": {
      "type": "hadoop",
      "inputSpec": {
        "type": "granularity",
        "inputFormat": "org.apache.hadoop.hive.ql.io.orc.OrcNewInputFormat",
        "dataGranularity": "hour",
        "inputPath": "/apps/hive/warehouse/table1",
        "filePattern": ".*",
        "pathFormat": "'partition='yyyy-MM-dd'T'HH"
      }
    },
    "dataSchema": {
      "dataSource": "cube_indexed_from_orc",
      "parser": {
        "type": "orc",
        "parseSpec": {
          "format": "timeAndDims",
          "timestampSpec": {
            "column": "timestamp",
            "format": "nano"
          },
          "dimensionsSpec": {
            "dimensions": ["cola", "colb", "colc"],
            "dimensionExclusions": [],
            "spatialDimensions": []
          }
        },
        "typeString": "struct<timestamp:bigint,cola:bigint,colb:string,colc:string,cold:bigint>"
      },
      "metricsSpec": [{
        "type": "count",
        "name": "count"
      }],
      "granularitySpec": {
        "type": "uniform",
        "segmentGranularity": "DAY",
        "queryGranularity": "HOUR",
        "intervals": ["2017-06-14T00:00:00.000Z/2017-06-15T00:00:00.000Z"]
      }
    },
    "tuningConfig": {
      "type": "hadoop",
      "partitionsSpec": {
        "type": "hashed",
        "targetPartitionSize": 5000000
      },
      "leaveIntermediate": false,
      "forceExtendableShardSpecs": true
    }
  }
}
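
Submitting the spec to the Overlord is an HTTP POST of the JSON body. Here is a minimal sketch of that step in Python, using only the standard library. The URL is the one from the answer and will differ on your cluster. The helper names `build_task_request` and `submit_task` are made up for illustration and are not part of any Druid client.

```python
import json
import urllib.request

# Assumption: Overlord host and port as given in the answer; adjust for your cluster.
OVERLORD_TASK_URL = "http://overlord:8090/druid/indexer/v1/task"


def build_task_request(spec: dict, url: str = OVERLORD_TASK_URL) -> urllib.request.Request:
    """Build (but do not send) the POST request that carries the ingestion spec."""
    body = json.dumps(spec).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def submit_task(spec: dict, url: str = OVERLORD_TASK_URL) -> dict:
    """Send the spec to the Overlord and return its decoded JSON response."""
    with urllib.request.urlopen(build_task_request(spec, url)) as resp:
        return json.load(resp)
```

To use it, save the spec to a file (for example a hypothetical `orc_index_task.json`), load it with `json.load`, and pass the resulting dict to `submit_task`. Keeping request construction separate from sending makes it easy to inspect the URL, headers, and body before anything reaches the cluster.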
