Take a look at these options:
- HBase Sinks (send data into HBase):
http://flume.apache.org/FlumeUserGuide.html#hbasesinks
- Apache Flume Morphline Solr Sink (for heavy duty ETL processing and ingestion
into Solr):
http://flume.apache.org/FlumeUserGuide.html#morphlinesolrsink
> Best,
> Flavio
>
> On Fri, Jul 19, 2013 at 12:51 AM, Wolfgang Hoschek
> wrote:
> Take a look at these options:
>
> - HBase Sinks (send data into HBase):
>
> http://flume.apache.org/FlumeUserGuide.html#hbasesinks
>
> - Apache Flume Morphline Solr Sink (for
The Morphline Solr Sink ships as part of Apache Flume 1.4.0:
http://flume.apache.org/download.html
Documentation is here:
http://flume.apache.org/FlumeUserGuide.html#morphlinesolrsink
Basically, you configure it like any other Flume Sink, plus point it to a
morphline config fi
Looks like the DcXMLParser spits out a metadata field called "title" and
another title as part of the Tika XML stream. That metadata field is then added
to the solr document by solrcell. If you add "title" to the captures the title
from the XML stream gets added as well by solrcell.
JSON suppor
could add some more tests including readJson and the new
> xquery and xslt in trunk?
>
> Best,
> Flavio
> On Mon, Jul 22, 2013 at 8:12 PM, Wolfgang Hoschek
> wrote:
> Looks like the DcXMLParser spits out a metadata field called "title" and
> another title as pa
ory) but for the new xslt and xquery
> I'm not able to find the tests code..could you give me an hook?
>
> On Mon, Jul 22, 2013 at 9:21 PM, Wolfgang Hoschek
> wrote:
> There are many tests for this in the morphlines repo.
>
> Wolfgang.
>
> On Jul
u couldn't be more precise ;)
>
> Thanks,
> Flavio
>
> On Mon, Jul 22, 2013 at 11:02 PM, Wolfgang Hoschek
> wrote:
> Docs for the xquery and xslt morphline commands are here (look for "xquery"):
> https://github.com/cloudera/cdk/blob/master/cdk-morphlines/src/site/co
commons-daemon/1.0.3/commons-daemon-1.0.3.pom.
> Return code is: 409 -> [Help 1]
>
>
> On Tue, Jul 23, 2013 at 10:22 AM, Wolfgang Hoschek
> wrote:
> Tests pass on java 6 but fail on java 7. Correspondingly, I have filed
> https://issues.cloudera.org/browse/CDK-80. We'
Perhaps you could implement a custom command based on something like the Guava
RateLimiter class.
Wolfgang.
On Jul 23, 2013, at 4:00 PM, Flavio Pompermaier wrote:
> Hi to all,
>
> I need help in understanding how to manage the flow in Flume. More precisely,
> I need to call a command that req
Take a look at the Apache Flume Morphline Solr Sink, for example for heavy duty
ETL processing and
ingestion into Solr:
http://flume.apache.org/FlumeUserGuide.html#morphlinesolrsink
It provides a scripting engine that enables CEP on the flow of log events.
Wolfgang.
On Aug 26, 2013, at
There is no out-of-the-box command to remove the first line from an event body,
but you could write one yourself and plug it in.
If you just want to read CSV records from an event that contains a file, and do
so while ignoring the first line, you can use ignoreFirstLine : true on the
readCSV or
Thanks everybody! Looking forward to a good ride.
Wolfgang.
On Sep 24, 2013, at 3:39 PM, Hari Shreedharan wrote:
> On behalf of the Apache Flume PMC, I am excited to welcome Wolfgang Hoschek
> as a committer on the Apache Flume project. Wolfgang contributed a new sink
> with the abil
You can use module cdk-morphlines-all for that.
Wolfgang.
On Oct 2, 2013, at 2:22 PM, bitsof info wrote:
> Hi,
> New to flume and I am trying to use the MorphlineInterceptor per the
> documentation here:
>
> http://flume.apache.org/FlumeUserGuide.html#morphline-interceptor
>
> When I run flu
Here is some material to get started with morphlines:
http://flume.apache.org/FlumeUserGuide.html#morphline-interceptor
http://cloudera.github.io/cdk/docs/current/cdk-morphlines/index.html
http://cloudera.github.io/cdk/docs/current/cdk-morphlines/morphlinesReferenceGuide.html
http://cloudera.gi
Consider using the solrj client class CloudSolrServer, which queries zookeeper
as necessary.
This discussion isn't flume specific, so in the future please post to
solr-u...@lucene.apache.org instead.
Thanks,
Wolfgang.
On Nov 5, 2013, at 12:30 AM, Eric Bus wrote:
> Hi,
>
> I'm currently using
Consider if the splitKeyValue command is applicable here, perhaps in
combination with readLine, split and grok.
Example is here:
http://cloudera.github.io/cdk/docs/current/cdk-morphlines/morphlinesReferenceGuide.html#/splitKeyValue
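As a rough sketch of that combination, the entries below could go inside the commands list of a morphline: readLine puts each line into the message field, split breaks it into "key=value" tokens, and splitKeyValue turns each token into a record field. The field name pairs, the space separator and the empty prefix are assumptions for illustration, not taken from the original question; a grok step could be added in front if the line has a non key/value preamble.

```hocon
# one record per input line; the text lands in the "message" field
{ readLine { charset : UTF-8 } }

# break "a=1 b=2" into the values "a=1" and "b=2" (separator is an assumption)
{
  split {
    inputField : message
    outputField : pairs
    separator : " "
  }
}

# turn each "key=value" value into a record field named after the key
{
  splitKeyValue {
    inputField : pairs
    separator : "="
    outputFieldPrefix : ""
  }
}
```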
Wolfgang.
On Nov 12, 2013, at 3:18 PM, Matt Wise wrote:
> Pa
FWIW, here is an example for how this could be handled in a
MorphlineInterceptor:
morphlines : [
  {
    id : morphline1
    importCommands : ["org.kitesdk.**"]
    commands : [
      {
        tryRules {
          catchExceptions: true
          rules : [
            # first rule
Looks like you are running with a guava version that's different than the one
that was used to compile. Flume uses guava 11.0.2 per flume/pom.xml.
Wolfgang.
On Jan 10, 2014, at 7:49 AM, Chhaya Vishwakarma wrote:
> Hi
> Thank you so much that error is gone now I am getting some different error
>
Flume requires guava.
Wolfgang.
On Jan 10, 2014, at 12:40 PM, Chhaya Vishwakarma wrote:
> Hi,
> My flume version is 1.4.0 and I have not put guava jar in classpath
>
> -----Original Message-----
> From: Wolfgang Hoschek [mailto:whosc...@cloudera.com]
> Sent: Friday, Januar
'tail' with exec source spits out one flume event per line into the
interceptor. But readMultiLine expects multiple lines per event, not one line
per event. In other words, by the time the data arrives in the interceptor it's
already too late for readMultiLine to make sense.
Wolfgang.
On Jan
> Subject: RE: flume agent not starting
>
> Hi ,
>
> What i should use then for interceptor to work shall I use CAT ??
>
> -----Original Message-----
> From: Wolfgang Hoschek [mailto:whosc...@cloudera.com]
> Sent: Tuesday, January 21, 2014 6:30 PM
> To: user@flume.apac
Firstly, to print diagnostic information such as the content of records as they
pass through the morphline commands, consider enabling TRACE log level, for
example by adding the following line to your log4j.properties file:
log4j.logger.org.kitesdk.morphline=TRACE
Secondly, is it expected that
To fix up invalid JSON you can try readClob (or maybe readLine) followed by
findReplace or grok, followed by toByteArray, followed by setValues {
_attachment_body : "@{message}" }, followed by readJson.
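Spelled out as entries for a morphline's commands list, that chain might look roughly like the following. The findReplace pattern (replacing a bare NaN with null) is only a placeholder assumption for whatever makes the JSON invalid in your data, and a naive literal replacement like this would also touch matching text inside string values.

```hocon
# read the whole event body as one string into the "message" field
{ readClob { charset : UTF-8 } }

# repair the text; pattern and replacement are placeholders
{
  findReplace {
    field : message
    pattern : "NaN"
    isRegex : false
    replacement : "null"
  }
}

# convert the repaired string back into bytes
{ toByteArray { field : message } }

# make those bytes the attachment that readJson will parse
{ setValues { _attachment_body : "@{message}" } }

{ readJson {} }
```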
Wolfgang.
On Mar 26, 2014, at 8:59 PM, Andrew Sammut wrote:
>
> Hi all
>
> I'm a relativ
The “contains” command tests whether X is one of the elements in list Y, not
whether X is a substring of some other string. You can use a mini script with
the “java” command for that.
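To illustrate the difference, here is a minimal sketch of both forms as they might appear among an if command's conditions. The field names tags and message and the search string "error" are assumptions for illustration.

```hocon
# contains: succeeds only if some value of the "tags" field
# is equal to one of the listed elements
{ contains { tags : [error, warning] } }

# substring test via the java command: succeeds if any value of the
# "message" field has "error" somewhere inside it
{
  java {
    code : """
      for (Object value : record.get("message")) {
        if (value.toString().contains("error")) {
          return child.process(record);
        }
      }
      return false;
    """
  }
}
```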
Wolfgang.
On Mar 27, 2014, at 10:55 PM, Andrew Sammut wrote:
>
> Hi all,
>
> I'm attempting to place a conditional state
detectMimetype can’t detect whether it’s valid JSON; at most it can see whether
the first few bytes look like JSON. Consider wrapping the readJson in a
tryRules command to handle it.
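A sketch of such a wrapper, as one entry in a morphline's commands list, could look like this. The fallback rule that logs and drops the record is an assumption about what "handle it" should mean; routing the record elsewhere would work just as well.

```hocon
{
  tryRules {
    catchExceptions : true
    throwExceptionIfAllRulesFailed : false
    rules : [
      # first rule: the event body parses as JSON
      {
        commands : [
          { readJson {} }
        ]
      }

      # fallback rule: log the offending record and drop it
      {
        commands : [
          { logWarn { format : "Ignoring unparseable record: {}", args : ["@{}"] } }
          { dropRecord {} }
        ]
      }
    ]
  }
}
```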
On Mar 30, 2014, at 10:11 PM, Andrew Sammut wrote:
>
> Hi all,
>
> Has anyone used detectMimetype to v
My sense is that a) is interesting if it evolves into a capable true native
tailer, whereas b) is already available in flume, and c) and d) are already
available in flume via the MorphlineInterceptor.
Wolfgang.
On May 3, 2014, at 12:18 AM, Israel Ekpo wrote:
> Flume Community,
>
> I created a
There is no backwards-incompatible change in the code regardless of whether
it's kite 0.10 or 0.11 or 0.12 or 0.13 or 0.14. The dependencies have been made
“optional” in flume-ng-sinks/flume-ng-morphline-solr-sink/pom.xml via
<optional>true</optional>, thus the dependencies don’t ship automatically with
the build.
A morphline receives one flume event at a time. What and how much is contained
in the flume event is up to you, but flume isn’t really designed to send large
events such as whole files or parts of files; it’s designed to send small
discrete events, such as one log line per event.
There is
This means that your TSV data file contains invalid data. Every opening quote
character needs to eventually be followed by a closing quote character in the
data file. Such a closing quote is apparently missing.
Consider fixing your input data, or perhaps try to handle it with readLine +
split r
A sink can emit zero or more records per input event, whereas an interceptor
can emit only zero or one record per input event. Also, an interceptor can be
used to route events to channels and hence sinks.
Wolfgang.
On Jul 24, 2014, at 10:23 AM, Guillermo Ortiz wrote:
> I want t
Congrats Roshan!
On Nov 5, 2014, at 11:54 AM, Saravanan Nagarajan
wrote:
> Congratulations Roshan!
>
> On Wed, Nov 5, 2014 at 7:32 AM, Ahmed Radwan wrote:
>
>> Congrats Roshan!
>>
>> On Tue, Nov 4, 2014 at 2:12 PM, Arvind Prabhakar
>> wrote:
>>
>>> On behalf of Apache Flume PMC, it is my
31 matches