当前位置: 代码迷 >> 综合 >> Spark Streaming 整合Flume 使用Pull方式,启动Flume报错
  详细解决方案

Spark Streaming 整合Flume 使用Pull方式,启动Flume报错

热度:46   发布时间:2023-12-04 14:24:16.0

报错1

2019-05-10 08:58:46,802 (conf-file-poller-0) [ERROR - org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:147)] Failed to load configuration data. Exception follows.
org.apache.flume.FlumeException: Unable to load sink type: org.apache.spark.streaming.flume.sink.SparkSink, class: org.apache.spark.streaming.flume.sink.SparkSinkat org.apache.flume.sink.DefaultSinkFactory.getClass(DefaultSinkFactory.java:70)at org.apache.flume.sink.DefaultSinkFactory.create(DefaultSinkFactory.java:43)at org.apache.flume.node.AbstractConfigurationProvider.loadSinks(AbstractConfigurationProvider.java:450)at org.apache.flume.node.AbstractConfigurationProvider.getConfiguration(AbstractConfigurationProvider.java:106)at org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:145)at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.ClassNotFoundException: org.apache.spark.streaming.flume.sink.SparkSinkat java.net.URLClassLoader.findClass(URLClassLoader.java:381)at java.lang.ClassLoader.loadClass(ClassLoader.java:424)at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:338)at java.lang.ClassLoader.loadClass(ClassLoader.java:357)at java.lang.Class.forName0(Native Method)at java.lang.Class.forName(Class.java:264)at org.apache.flume.sink.DefaultSinkFactory.getClass(DefaultSinkFactory.java:68)... 11 more

解决1:
将对应的spark-streaming-flume-sink_2.11-2.3.2.jar(根据自己使用的版本),复制到Flume 安装目录下的/lib 目录。
在这里插入图片描述

报错2

2019-05-10 09:09:08,633 (New I/O worker #2) [WARN - org.apache.avro.ipc.NettyServer$NettyServerAvroHandler.exceptionCaught(NettyServer.java:201)] Unexpected exception from downstream.
org.apache.avro.AvroRuntimeException: Excessively large list allocation request detected: 1663066400 items! Connection closed.at org.apache.avro.ipc.NettyTransportCodec$NettyFrameDecoder.decodePackHeader(NettyTransportCodec.java:167)at org.apache.avro.ipc.NettyTransportCodec$NettyFrameDecoder.decode(NettyTransportCodec.java:139)at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:425)at org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303)at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337)at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)at java.lang.Thread.run(Thread.java:748)

解决2:
原因1:Flume 配置文件中source 的类型与数据输入的类型不匹配
原因2:使用的netcat 的端口错误,向sink 端口(41414)发送了数据,本应该向source端口(44444)。(博主的错误)
在这里插入图片描述

完!

  相关解决方案