I am writing my queries in the queryList under hiveJob.
# Submit the Hive job to the Dataproc cluster
def submit_hive_job(dataproc, project, region, cluster_name):
    job_details = {
        'projectId': project,
        'job': {
            'placement': {
                'clusterName': cluster_name
            },
            'hiveJob': {
                'queryList': {
                    # How can I execute a .sql file here that is stored in a bucket?
                    'queries': [
                        "CREATE TABLE IF NOT EXISTS sai (eid int, name String, salary String, destination String)",
                        "INSERT INTO TABLE sai VALUES (26, 'Shiv', '1500', 'ac')"
                    ]
                }
            }
        }
    }
    result = dataproc.projects().regions().jobs().submit(
        projectId=project,
        region=region,
        body=job_details).execute()
    job_id = result['reference']['jobId']
    print('Submitted job Id {}'.format(job_id))
    return job_id
The hive.sql file in the bucket:
create table employee (employeeid int, employeename string, salary float) row format delimited fields terminated by ',';
describe employee;
select * from employee;
1 Answer
I found that we can keep the .sql file in the bucket and then specify queryFileUri, as shown below.
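A minimal sketch of that approach, assuming the same `dataproc` API client object as in the question. The helper names `build_hive_file_job` and `submit_hive_file_job` are made up for illustration, and the `gs://` path passed in is whatever bucket location actually holds the script. In the Dataproc `hiveJob` body, `queryFileUri` takes the place of `queryList`, so only one of the two is set.

```python
def build_hive_file_job(project, cluster_name, query_file_uri):
    """Build a jobs.submit request body that runs a Hive script stored in a bucket.

    build_hive_file_job is a hypothetical helper name, not part of any library.
    """
    return {
        'projectId': project,
        'job': {
            'placement': {
                'clusterName': cluster_name
            },
            'hiveJob': {
                # queryFileUri is used instead of queryList: it points at the
                # .sql script, e.g. a gs://<bucket>/hive.sql location.
                'queryFileUri': query_file_uri
            }
        }
    }


def submit_hive_file_job(dataproc, project, region, cluster_name, query_file_uri):
    """Submit the script-based Hive job using the same client calls as the question."""
    body = build_hive_file_job(project, cluster_name, query_file_uri)
    result = dataproc.projects().regions().jobs().submit(
        projectId=project,
        region=region,
        body=body).execute()
    job_id = result['reference']['jobId']
    print('Submitted job Id {}'.format(job_id))
    return job_id
```

It would be called the same way as `submit_hive_job`, with one extra argument for the script location, for example `submit_hive_file_job(dataproc, project, region, cluster_name, 'gs://my-bucket/hive.sql')`, where `my-bucket` is a placeholder bucket name.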