Hey Guys,
So I'm trying to implement the following Route:
FTP --> Get large log file (3 GB) --> Copy to local working directory -->
Aggregate the file into chunks of a fixed size (BatchSize = 1000) --> Write each
batched Exchange to a Kafka topic.
I'm able to get the route working for small files (< 10 MB). But the
moment I use something large, say 1 GB or 2 GB, I get Java OutOfMemoryErrors
(specifically, the heap is exhausted).
Here is my code:
public void configure() throws Exception {
    from(ftpToKafkaObj.getFtpEndpoint())
        .split(bodyAs(String.class).tokenize("\n"))
        .streaming()
        .process(new FtpToLocalFileProcessor())
        .routeId("ROUTEID::::: " + ftpToKafkaObj.getRouteId())
        .aggregate(constant(true), batchAggregationStrategy())
        .completionPredicate(batchSizePredicate())
        .completionTimeout(3000L)
        .to(ftpToKafkaObj.getKafkaEndpoint())
        .end();
}
You'll notice that the "split" call converts the incoming file body to
"String.class". That is feasible for small files, but for anything bigger I
shouldn't be converting the body to a String, since that pulls the contents
of the entire file into memory and therefore requires a very large heap.
The route as written works well for small files: I can aggregate and send
the messages to the Kafka endpoint as expected. But when I remove the
String.class conversion from the splitter, I no longer get the contents of
the file. Instead I get a single line containing this:
-rw-r--r-- 1 501 501 3139636252 Feb 26 19:40 test_file.txt
Now I'm kind of confused. Where do I go from here? How do I get the
splitter to actually split on the newline character and pass the file along
line by line?
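
For what it's worth, here is a minimal plain-Java sketch (no Camel involved) of the behaviour I'm after: read the input lazily, one line at a time, and hand off batches of a fixed size, so that only one batch is ever held in memory. The names LineBatcher and forEachBatch are made up for illustration and are not part of any Camel API. The small in-memory string stands in for the large log file.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper, for illustration only (not a Camel class).
final class LineBatcher {

    /**
     * Reads lines one at a time from the source and passes them to the sink
     * in batches of at most batchSize lines. Only the current batch is kept
     * in memory. Returns the number of batches handed to the sink.
     */
    static int forEachBatch(Reader source, int batchSize,
                            Consumer<List<String>> sink) throws IOException {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int batches = 0;
        List<String> batch = new ArrayList<>(batchSize);
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                batch.add(line);
                if (batch.size() == batchSize) {
                    sink.accept(batch);
                    batch = new ArrayList<>(batchSize); // start a fresh batch
                    batches++;
                }
            }
        }
        // Flush the final, possibly smaller, batch.
        if (!batch.isEmpty()) {
            sink.accept(batch);
            batches++;
        }
        return batches;
    }

    public static void main(String[] args) throws IOException {
        // Tiny stand-in for the real file; the real input would be a FileReader.
        Reader input = new StringReader("a\nb\nc\nd\ne\n");
        int count = forEachBatch(input, 2, b -> System.out.println(b));
        System.out.println(count + " batches");
    }
}
```

In the real route the sink would be the Kafka endpoint and the batch size would be 1000. The point is only that nothing here ever needs the whole file as a single String.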
--
View this message in context:
http://camel.465427.n5.nabble.com/Camel-Large-File-Processing-Issues-tp5781221.html
Sent from the Camel - Users mailing list archive at Nabble.com.