Here's the class:
https://github.com/twitter/hadoop-lzo/blob/master/src/main/java/com/hadoop/mapreduce/LzoTextInputFormat.java
You can read more about the Maven dependency here:
https://github.com/twitter/hadoop-lzo#maven-repository
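As a rough sketch, the sbt equivalent of the Maven coordinates from that README would look something like the line below; the version number here is illustrative, so check the README for the current release:

```scala
// build.sbt — minimal sketch; group/artifact follow the hadoop-lzo README,
// but verify the version number against the README before using.
libraryDependencies += "com.hadoop.gplcompression" % "hadoop-lzo" % "0.4.20"
```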

Thanks
Best Regards
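Putting the snippets from this thread together, a minimal end-to-end sketch might look like the following. It reuses the class names from the thread; the app name, batch interval, and reading the input directory from `args(0)` are illustrative choices, and it assumes hadoop-lzo and the native LZO libraries are available on the cluster:

```scala
import org.apache.hadoop.fs.Path
import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Sketch assembled from the snippets in this thread; assumes hadoop-lzo
// is on the classpath and the native LZO codec is installed.
object LzoFileStreamSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("lzo-filestream")
    val ssc = new StreamingContext(conf, Seconds(10))

    // Use LzoTextInputFormat instead of TextInputFormat so new .lzo
    // files dropped into the directory are decompressed correctly.
    val lines = ssc.fileStream[LongWritable, Text,
      com.hadoop.mapreduce.LzoTextInputFormat](
        args(0),            // directory to monitor (illustrative)
        (p: Path) => true,  // accept every new file
        false               // newFilesOnly = false
      ).map(_._2.toString)

    // An output action is required, or nothing is ever executed.
    lines.count().print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```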

On Thu, May 14, 2015 at 1:22 PM, lisendong <lisend...@163.com> wrote:

> LzoTextInputFormat where is this class?
> what is the maven dependency?
>
>
> On May 14, 2015, at 3:40 PM, Akhil Das <ak...@sigmoidanalytics.com> wrote:
>
> That's because you are using TextInputFormat i think, try
> with LzoTextInputFormat like:
>
> val list_join_action_stream = ssc.fileStream[LongWritable, Text,
> com.hadoop.mapreduce.LzoTextInputFormat](gc.input_dir, (t: Path) => true,
> false).map(_._2.toString)
>
> Thanks
> Best Regards
>
> On Thu, May 14, 2015 at 1:04 PM, lisendong <lisend...@163.com> wrote:
>
>> I do have an action on the DStream, because when I put a text file into
>> HDFS it runs normally, but if I put an lz4 file, nothing happens.
>>
>> On May 14, 2015, at 3:32 PM, Akhil Das <ak...@sigmoidanalytics.com> wrote:
>>
>> What do you mean by not detected? Maybe you forgot to trigger an action
>> on the stream to get it executed. Like:
>>
>> val list_join_action_stream = ssc.fileStream[LongWritable, Text,
>> TextInputFormat](gc.input_dir, (t: Path) => true,
>> false).map(_._2.toString)
>>
>> list_join_action_stream.count().print()
>>
>>
>>
>>
>> Thanks
>> Best Regards
>>
>> On Wed, May 13, 2015 at 7:18 PM, hotdog <lisend...@163.com> wrote:
>>
>>> In Spark Streaming, I want to use fileStream to monitor a directory, but
>>> the files in that directory are compressed using lz4, so new lz4 files
>>> are not detected by the following code. How can I detect these new files?
>>>
>>>     val list_join_action_stream = ssc.fileStream[LongWritable, Text,
>>> TextInputFormat](gc.input_dir, (t: Path) => true,
>>> false).map(_._2.toString)
>>>
>>>
>>>
>>> --
>>> View this message in context:
>>> http://apache-spark-user-list.1001560.n3.nabble.com/how-to-read-lz4-compressed-data-using-fileStream-of-spark-streaming-tp22868.html
>>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>>> For additional commands, e-mail: user-h...@spark.apache.org
>>>
>>>
>>
>>
>
>