Hey Guys,

So I'm trying to implement the following Route:

FTP --> Get large log file (3 GB) --> Copy to local working directory -->
Aggregate the file's lines into batches of a fixed size (BatchSize = 1000) -->
Write each batched Exchange to a Kafka topic.

I'm able to get the route working for small files (< 10 MB). But the
moment I use something large, say 1 GB or 2 GB, I get Java
OutOfMemoryErrors (specifically heap space issues).

Here is my code: 

    public void configure() throws Exception {
        from(ftpToKafkaObj.getFtpEndpoint())
                .routeId("ROUTEID::::: " + ftpToKafkaObj.getRouteId())
                // Converting the body to a String loads the whole file into memory
                .split(bodyAs(String.class).tokenize("\n"))
                .streaming()
                .process(new FtpToLocalFileProcessor())
                // Single correlation group: collect lines until batchSizePredicate()
                // matches or the group has been inactive for 3 seconds
                .aggregate(constant(true), batchAggregationStrategy())
                .completionPredicate(batchSizePredicate())
                .completionTimeout(3000L)
                .to(ftpToKafkaObj.getKafkaEndpoint())
                .end();
    }

You'll notice that the "split" call converts the incoming file to
String.class. That is feasible for small files, but for anything bigger I
shouldn't be converting the body to a String, since that pulls the contents
of the entire file into memory and therefore requires a very large heap.
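
To make the memory concern concrete, here is a minimal plain-Java sketch (no
Camel involved) of the behaviour I'm after: read one line at a time and hand
off batches, so that only one batch is held in memory at once. LineBatcher and
forEachBatch are hypothetical names made up for this illustration; they are not
part of my route or of the Camel API.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper for illustration only; not part of Camel or of my route.
final class LineBatcher {

    // Reads the input one line at a time and passes each batch to the sink.
    // At most batchSize lines are buffered at any moment.
    // Returns the number of batches delivered.
    static int forEachBatch(BufferedReader reader, int batchSize,
                            Consumer<List<String>> sink) throws IOException {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<String> batch = new ArrayList<>(batchSize);
        int batches = 0;
        String line;
        while ((line = reader.readLine()) != null) {
            batch.add(line);
            if (batch.size() == batchSize) {
                sink.accept(batch);
                batch = new ArrayList<>(batchSize); // start a fresh batch
                batches++;
            }
        }
        // Deliver the final, possibly smaller, batch.
        if (!batch.isEmpty()) {
            sink.accept(batch);
            batches++;
        }
        return batches;
    }

    public static void main(String[] args) throws IOException {
        // Tiny in-memory input standing in for the large log file.
        BufferedReader reader = new BufferedReader(new StringReader("a\nb\nc\nd\ne\n"));
        int count = forEachBatch(reader, 2, batch -> System.out.println(batch));
        System.out.println(count + " batches");
    }
}
```

With a real file, the reader would wrap a file stream rather than a
StringReader, and the sink would be whatever publishes a batch to Kafka. In
contrast, converting the body to a String first means the full 3 GB has to fit
in the heap before the first line is ever processed.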

So the route above works well for small files: I can aggregate and send the
messages to the Kafka endpoint as expected. But when I remove the
String.class conversion from the splitter, I no longer get the contents of
the file. Instead I get a single line containing this:

-rw-r--r--    1 501      501      3139636252 Feb 26 19:40 test_file.txt

Now I'm kind of confused. Where do I go from here? How do I get the
Splitter to actually split on the newline character and pass the file
through line by line?


