Zeppelin Python Flink无法打印到控制台

thtygnil  于 2022-12-09  发布在  Apache
关注(0)|答案(1)|浏览(301)

I'm using Kinesis Data Analytics Studio which provides a Zeppelin environment.
Very simple code:

%flink.pyflink

from pyflink.common.serialization import JsonRowDeserializationSchema
from pyflink.common.typeinfo import Types
from pyflink.datastream import StreamExecutionEnvironment
from pyflink.datastream.connectors import FlinkKafkaConsumer

# create env = determine app runs locally or remotely

env = s_env or StreamExecutionEnvironment.get_execution_environment()
env.add_jars("file:///home/ec2-user/flink-sql-connector-kafka_2.12-1.13.5.jar")

# create a kafka consumer

deserialization_schema = JsonRowDeserializationSchema.builder() \
    .type_info(type_info=Types.ROW_NAMED(
      ['id', 'name'], 
      [Types.INT(), Types.STRING()])
    ).build()

kafka_consumer = FlinkKafkaConsumer(
    topics='nihao',
    deserialization_schema=deserialization_schema,
    properties={
      'bootstrap.servers': 'kakfa-brokers:9092', 
      'group.id': 'group1'
})

kafka_consumer.set_start_from_earliest()

ds = env.add_source(kafka_consumer)

ds.print()

env.execute('job1')

I can get this working locally can sees change logs being produced to console. However I cannot get the same results in Zeppelin.
Also checked STDOUT in Flink web console task managers, nothing is there too.
Am I missing something? Searched for days and could not find anything on it.

cotxawn7

cotxawn71#

我不是100%肯定,但我认为您可能需要一个接收器来开始通过数据流拉取数据,您可能会使用包含的打印接收器函数

相关问题