I've hit a problem while setting up a local environment that streams data from Flink into an Iceberg table on MinIO:
[ERROR] Could not execute SQL statement. Reason:
org.apache.hadoop.hive.metastore.api.MetaException: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.s3a.S3AFileSystem not found
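The SQL statement that triggers this isn't shown above; for context, a MetaException like this typically surfaces when running catalog or table DDL against a Hive-backed Iceberg catalog whose warehouse lives on S3/MinIO, along these lines (catalog name, metastore URI, and bucket are hypothetical):

CREATE CATALOG iceberg_catalog WITH (
  'type' = 'iceberg',
  'catalog-type' = 'hive',
  'uri' = 'thrift://hive-metastore:9083',
  'warehouse' = 's3a://warehouse/'
);

Resolving the s3a:// scheme requires org.apache.hadoop.fs.s3a.S3AFileSystem, which is exactly the class the error reports as missing.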
Here is the Dockerfile I use for the Flink jobmanager/taskmanager and the SQL client:
FROM flink:1.16.2-scala_2.12-java11
ENV HADOOP_VERSION=3.3.2
RUN APACHE_HADOOP_URL=https://archive.apache.org/dist/hadoop/ \
&& HADOOP_VERSION=3.3.2 \
&& wget ${APACHE_HADOOP_URL}/common/hadoop-${HADOOP_VERSION}/hadoop-${HADOOP_VERSION}.tar.gz \
&& tar xzvf hadoop-${HADOOP_VERSION}.tar.gz \
&& HADOOP_HOME=`pwd`/hadoop-${HADOOP_VERSION}
ENV HADOOP_CLASSPATH=/opt/flink/hadoop-${HADOOP_VERSION}/etc/hadoop:/opt/flink/hadoop-${HADOOP_VERSION}/share/hadoop/common/lib/*:/opt/flink/hadoop-${HADOOP_VERSION}/share/hadoop/common/*:/opt/flink/hadoop-${HADOOP_VERSION}/share/hadoop/hdfs:/opt/flink/hadoop-${HADOOP_VERSION}/share/hadoop/hdfs/lib/*:/opt/flink/hadoop-${HADOOP_VERSION}/share/hadoop/hdfs/*:/opt/flink/hadoop-${HADOOP_VERSION}/share/hadoop/mapreduce/*:/opt/flink/hadoop-${HADOOP_VERSION}/share/hadoop/yarn:/opt/flink/hadoop-${HADOOP_VERSION}/share/hadoop/yarn/lib/*:/opt/flink/hadoop-${HADOOP_VERSION}/share/hadoop/yarn/*
COPY lib/flink-sql-connector-hive-3.1.2_2.12-1.16.2.jar /opt/flink/lib/
COPY lib/flink-sql-connector-kafka-1.16.2.jar /opt/flink/lib/
COPY lib/iceberg-flink-runtime-1.16-1.3.0.jar /opt/flink/lib/
COPY lib/iceberg-hive-runtime-1.3.0.jar /opt/flink/lib/
COPY lib/hive-metastore-3.1.3.jar /opt/flink/lib/
COPY lib/hadoop-aws-3.3.2.jar /opt/flink/lib/
COPY lib/aws-java-sdk-bundle-1.11.1026.jar /opt/flink/lib/
COPY lib/flink-s3-fs-hadoop-1.16.2.jar /opt/flink/plugins/
WORKDIR /opt/flink
And here are the docker-compose service definitions:
sqlclient:
  container_name: sqlclient
  build: flink
  command:
    - /opt/flink/bin/sql-client.sh
    - embedded
  depends_on:
    - jobmanager
  environment:
    - ENABLE_BUILT_IN_PLUGINS=flink-s3-fs-hadoop-1.16.2.jar
    - JOB_MANAGER_RPC_ADDRESS=jobmanager
    - AWS_ACCESS_KEY_ID=minio
    - AWS_SECRET_ACCESS_KEY=minio123
    - AWS_REGION=us-east-1
  volumes:
    - ./flink-sql:/etc/sql

jobmanager:
  build: flink
  hostname: "jobmanager"
  container_name: "jobmanager"
  expose:
    - "6123"
  ports:
    - "8081:8081"
  command: jobmanager
  environment:
    - ENABLE_BUILT_IN_PLUGINS=flink-s3-fs-hadoop-1.16.2.jar
    - JOB_MANAGER_RPC_ADDRESS=jobmanager
    - AWS_ACCESS_KEY_ID=minio
    - AWS_SECRET_ACCESS_KEY=minio123
    - AWS_REGION=us-east-1

taskmanager:
  build: flink
  hostname: "taskmanager"
  container_name: "taskmanager"
  expose:
    - "6121"
    - "6122"
  depends_on:
    - jobmanager
  command: taskmanager
  links:
    - jobmanager:jobmanager
  environment:
    - ENABLE_BUILT_IN_PLUGINS=flink-s3-fs-hadoop-1.16.2.jar
    - JOB_MANAGER_RPC_ADDRESS=jobmanager
    - AWS_ACCESS_KEY_ID=minio
    - AWS_SECRET_ACCESS_KEY=minio123
    - AWS_REGION=us-east-1
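With the stack up, the actual layout of the jars inside a running container can be inspected directly, which helps confirm where the plugin jar ends up:

docker compose exec jobmanager ls -R /opt/flink/plugins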
1 Answer
You are copying the S3 plugin into the root of the plugins folder. To use a pluggable file system, you must copy the corresponding JAR file into a directory of its own under the plugins directory of your Flink distribution before starting Flink. More on this at https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/filesystems/plugins/
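A minimal sketch of the fix in the Dockerfile above: give the file-system jar its own subdirectory under /opt/flink/plugins instead of copying it into the plugins root (the directory name s3-fs-hadoop is an arbitrary choice; Flink loads each plugins subdirectory in a separate classloader):

# Pluggable file systems must live in their own subdirectory under plugins/
RUN mkdir -p /opt/flink/plugins/s3-fs-hadoop
COPY lib/flink-s3-fs-hadoop-1.16.2.jar /opt/flink/plugins/s3-fs-hadoop/

Alternatively, since the compose file already sets ENABLE_BUILT_IN_PLUGINS=flink-s3-fs-hadoop-1.16.2.jar, the stock Flink image's entrypoint should copy that bundled jar from /opt/flink/opt into a proper plugins subdirectory at startup, which would make the manual COPY into plugins/ unnecessary.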