Flink job running on Dataproc cannot find Google Application Default Credentials

Posted by 6tr1vspr on 2021-06-21 in Flink


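According to most of the documentation (and there is not much of it), when an application runs on Google Compute Engine, the Google client libraries should automatically pick up the Application Default Credentials associated with the VM.
I am currently running a Flink cluster on Dataproc (managed Hadoop). Dataproc runs on Google Compute Engine and uses VMs for its master and worker nodes. When I deploy a job using YARN, the job fails because it cannot detect the Application Default Credentials.
Does anyone know whether Flink can automatically pick up the Application Default Credentials on the VM? Do I need to configure something, or is this simply not supported, so that I have to specify the service account JSON manually in code?
Edit:
Some more information.
The Flink job is a streaming job (it never ends) that picks up records and inserts them into a Google BigQuery table and a Google Cloud Storage bucket. For this I use two client libraries, as follows: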

<dependency>
   <groupId>com.google.cloud</groupId>
   <artifactId>google-cloud-bigquery</artifactId>
   <version>1.65.0</version>
</dependency>
<dependency>
   <groupId>com.google.cloud</groupId>
   <artifactId>google-cloud-storage</artifactId>
   <version>1.65.0</version>
</dependency>

I added a call to GoogleCredentials.getApplicationDefault() in my main run function to make sure the credentials are being picked up, but it throws the following error:

The Application Default Credentials are not available. They are available if running in Google Compute Engine. Otherwise, the environment variable GOOGLE_APPLICATION_CREDENTIALS must be defined pointing to a file defining the credentials

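The logs also contain the line "Failed to detect whether we are running on Google Compute Engine.", which leads me to believe the library cannot detect that it is running on the Compute Engine platform.
From some reading online, the metadata server is apparently what is used to detect this. We are running in a VPC, so I assume it cannot make that connection. Is that really the case? If so, is there another way I can get this to work?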
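
One way to test the VPC hypothesis is to probe the metadata server directly from the machine where the main function runs. The following is a minimal Java sketch using only the JDK, not the client library's own detection code. It assumes the conventional address http://metadata.google.internal/ and the Metadata-Flavor: Google header, which the metadata server requires on requests and, to my understanding, echoes on responses. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

final class MetadataProbe {
    // Conventional metadata server address on Compute Engine VMs (assumed here).
    static final String METADATA_URL = "http://metadata.google.internal/";

    /**
     * Returns true only if the endpoint answers within the timeout and
     * identifies itself with the "Metadata-Flavor: Google" response header.
     * Any I/O failure (DNS, refused connection, timeout) yields false.
     */
    public static boolean looksLikeMetadataServer(String url, int timeoutMillis) {
        HttpURLConnection conn = null;
        try {
            conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
            conn.setConnectTimeout(timeoutMillis);
            conn.setReadTimeout(timeoutMillis);
            // The metadata server rejects requests that lack this header.
            conn.setRequestProperty("Metadata-Flavor", "Google");
            conn.getResponseCode(); // forces the request to be sent
            return "Google".equals(conn.getHeaderField("Metadata-Flavor"));
        } catch (IOException e) {
            return false;
        } finally {
            if (conn != null) {
                conn.disconnect();
            }
        }
    }

    public static void main(String[] args) {
        System.out.println("metadata server reachable: "
                + looksLikeMetadataServer(METADATA_URL, 2000));
    }
}
```

If this prints false on the machine that submits the job, that machine is either not a Compute Engine VM or cannot reach the metadata server, and the library has to be given credentials another way, such as the GOOGLE_APPLICATION_CREDENTIALS variable named in the error message.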

google-cloud-platform apache-flink google-cloud-dataproc google-compute-engine service-accounts

Source: https://stackoverflow.com/questions/63962653/flink-job-running-on-dataproc-not-finding-google-application-default-credentials

1 Answer


idfiyjo81#

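This may not apply to everyone, but the problem was in the setup.
I use a Kubernetes pod to start a YARN session and use it to submit jobs to the Flink cluster. If you take this approach, keep in mind that the topology appears to run on the task managers, whereas the main function is invoked on the machine that starts the YARN session. In my case, that was the pod.
Mounting the service account credentials into the pod and setting GOOGLE_APPLICATION_CREDENTIALS to point to the mounted credentials fixed the issue.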
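
A quick sanity check on the submitting machine can confirm that the variable is set up the way the library expects. The sketch below is a hypothetical helper, not part of Flink or the Google client libraries, and uses only the JDK. It relies on the error message quoted in the question, which says the variable must point to a file defining the credentials, so it reports a directory as a misconfiguration.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

final class CredentialsEnvCheck {
    static final String ENV_VAR = "GOOGLE_APPLICATION_CREDENTIALS";

    /**
     * Classifies the value of the environment variable:
     * UNSET, MISSING, DIRECTORY, NOT_READABLE, or OK.
     */
    public static String check(String value) {
        if (value == null || value.isEmpty()) {
            return "UNSET";
        }
        Path path = Paths.get(value);
        if (!Files.exists(path)) {
            return "MISSING";
        }
        if (Files.isDirectory(path)) {
            return "DIRECTORY"; // the variable should name the key file itself
        }
        if (!Files.isReadable(path)) {
            return "NOT_READABLE";
        }
        return "OK";
    }

    public static void main(String[] args) {
        System.out.println(ENV_VAR + ": " + check(System.getenv(ENV_VAR)));
    }
}
```

Running this at the start of the main function, in the pod in the setup described above, shows whether the mounted key is visible to the process that calls GoogleCredentials.getApplicationDefault().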

Answered 2021-06-21
