I am trying to run the HiveOperator on Airflow 1.9.
In Python (IntelliJ environment) the code shows no errors.
The connection is configured.
The code is:
import airflow
from airflow.operators.hive_operator import HiveOperator
from airflow.hooks.hive_hooks import HiveCliHook
from airflow.models import DAG
from datetime import timedelta
default_args = {
    'owner': 'airflow',
    'depends_on_past': False,
    'start_date': airflow.utils.dates.days_ago(2),
    'email': ['support@mail.com'],
    'email_on_failure': True,
    'retries': 2,
    'retry_delay': timedelta(seconds=30),
    'catchup': False,
}

HiveCli_hook = HiveCliHook(hive_cli_conn_id='hive_cli_default')

hql = 'INSERT INTO test.test_table SELECT DISTINCT id FROM test.tabl_test;'

dag = DAG(
    dag_id='Hive_in_action',
    default_args=default_args,
    schedule_interval='0 0 * * *',
    dagrun_timeout=timedelta(minutes=60))

create_test_table = HiveOperator(
    task_id="create_test_table",
    hql=hql,
    hive_cli_conn_id=HiveCli_hook,
    dag=dag
)
I use a tunnel, which is why the connection points to localhost.
I get this error:
ERROR - 'HiveCliHook' object has no attribute 'upper'
The relevant part of the log:
[2018-04-09 16:40:14,672] {models.py:1428} INFO - Executing Task(HiveOperator): create_test_table> on 2018-04-09 14:39:08
[2018-04-09 16:40:14,672] {base_task_runner.py:115} INFO - Running: ['bash', '-c', 'airflow run Hive_in_action create_test_table 2018-04-09T14:39:08 --job_id 19 --raw -sd DAGS_FOLDER/Hive_in_action.py']
[2018-04-09 16:40:15,283] {base_task_runner.py:98} INFO - Subtask: [2018-04-09 16:40:15,282] {__init__.py:45} INFO - Using executor SequentialExecutor
[2018-04-09 16:40:15,361] {base_task_runner.py:98} INFO - Subtask: [2018-04-09 16:40:15,360] {models.py:189} INFO - Filling up the DagBag from /Users/mypc/airflow/dags/Hive_in_action.py
[2018-04-09 16:40:15,387] {base_task_runner.py:98} INFO - Subtask: [2018-04-09 16:40:15,387] {base_hook.py:80} INFO - Using connection to: localhost
[2018-04-09 16:40:15,400] {cli.py:374} INFO - Running on host MyPC.local
[2018-04-09 16:40:15,413] {base_task_runner.py:98} INFO - Subtask: [2018-04-09 16:40:15,412] {hive_operator.py:96} INFO - Executing: INSERT INTO test.test_table SELECT DISTINCT id FROM test.tabl_test;
[2018-04-09 16:40:15,412] {models.py:1595} ERROR - 'HiveCliHook' object has no attribute 'upper'
Traceback (most recent call last):
File "/Users/mypc/anaconda/lib/python3.6/site-packages/airflow/models.py", line 1493, in _run_raw_task
result = task_copy.execute(context=context)
File "/Users/mypc/anaconda/lib/python3.6/site-packages/airflow/operators/hive_operator.py", line 97, in execute
self.hook = self.get_hook()
File "/Users/mypc/anaconda/lib/python3.6/site-packages/airflow/operators/hive_operator.py", line 86, in get_hook
mapred_job_name=self.mapred_job_name)
File "/Users/mypc/anaconda/lib/python3.6/site-packages/airflow/hooks/hive_hooks.py", line 71, in __init__
conn = self.get_connection(hive_cli_conn_id)
File "/Users/mypc/anaconda/lib/python3.6/site-packages/airflow/hooks/base_hook.py", line 77, in get_connection
conn = random.choice(cls.get_connections(conn_id))
File "/Users/mypc/anaconda/lib/python3.6/site-packages/airflow/hooks/base_hook.py", line 68, in get_connections
conn = cls._get_connection_from_env(conn_id)
File "/Users/mypc/anaconda/lib/python3.6/site-packages/airflow/hooks/base_hook.py", line 60, in _get_connection_from_env
environment_uri = os.environ.get(CONN_ENV_PREFIX + conn_id.upper())
AttributeError: 'HiveCliHook' object has no attribute 'upper'
[2018-04-09 16:40:15,416] {models.py:1622} INFO - All retries failed; marking task as FAILED
2 Answers

kt06eoxx1#
It looks like you are passing the HiveCliHook object as hive_cli_conn_id. HiveOperator expects a string here and calls upper() on it (to upper-case the connection id), so the line

hive_cli_conn_id=HiveCli_hook,

is what causes the error. Pass the connection id itself, e.g. hive_cli_conn_id='hive_cli_default'.

3pvhb19x2#
You should not give a variable or object the same name as the class:

HiveCliHook = HiveCliHook(...)

Use a different name instead:
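Putting both answers together, here is a minimal stand-alone sketch of why the traceback ends in conn_id.upper(). Note this needs no Airflow install: the HiveCliHook class and the connection_env_name helper below are stand-ins mirroring Airflow 1.9's base_hook._get_connection_from_env, not the real Airflow API.

```python
# Stand-in for airflow.hooks.hive_hooks.HiveCliHook (illustration only).
class HiveCliHook:
    def __init__(self, hive_cli_conn_id):
        self.hive_cli_conn_id = hive_cli_conn_id


def connection_env_name(conn_id):
    # Airflow 1.9 builds an AIRFLOW_CONN_<ID> environment-variable name,
    # so conn_id must be a string that supports .upper().
    return 'AIRFLOW_CONN_' + conn_id.upper()


# Variable renamed per the second answer (not shadowing the class name).
hive_cli_hook = HiveCliHook(hive_cli_conn_id='hive_cli_default')

# Passing the hook object, as in the question, fails exactly like the log:
try:
    connection_env_name(hive_cli_hook)
except AttributeError as err:
    print(err)  # 'HiveCliHook' object has no attribute 'upper'

# Passing the connection-id string works:
print(connection_env_name('hive_cli_default'))  # AIRFLOW_CONN_HIVE_CLI_DEFAULT
```

In the real DAG the fix is the one-line change hive_cli_conn_id='hive_cli_default' in the HiveOperator call; the HiveCliHook instance is not needed there at all.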