[ https://issues.apache.org/jira/browse/CAMEL-21288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Claus Ibsen updated CAMEL-21288:
--------------------------------
    Component/s: camel-core

> Log processor causing memory leak in split for very large data sets
> -------------------------------------------------------------------
>
>                 Key: CAMEL-21288
>                 URL: https://issues.apache.org/jira/browse/CAMEL-21288
>             Project: Camel
>          Issue Type: Bug
>          Components: camel-core
>    Affects Versions: 4.1.0
>            Reporter: Michal Stepan
>            Priority: Major
>
> Given a random data generator function:
>
> {code:java}
> public static List<Map<String, Object>> seed(int numberOfRows, int numberOfColumns) {
>     List<Map<String, Object>> dataList = new ArrayList<>();
>     Random random = new Random();
>     for (int i = 0; i < numberOfRows; i++) {
>         Map<String, Object> row = new HashMap<>();
>         for (int j = 1; j <= numberOfColumns; j++) {
>             String columnName = "col" + j;
>             var value = random.nextInt(1000);
>             row.put(columnName, value);
>         }
>         dataList.add(row);
>     }
>     return dataList;
> } {code}
> And two processors, where the first generates 20 batches and the second
> generates 20k rows for each batch (tweak as needed):
>
> {code:java}
> public class OutsideSplitProcessor implements Processor {
>     @Override
>     public void process(Exchange exchange) throws Exception {
>         exchange.getIn().setBody(seed(20, 1));
>     }
> } {code}
>
> {code:java}
> public class InsideSplitProcessor implements Processor {
>
>     @Override
>     public void process(Exchange exchange) throws Exception {
>         exchange.getIn().setBody(seed(20000, 20));
>     }
> } {code}
> And a route:
>
> {code:xml}
> <route>
>     <from uri="direct:test"/>
>     <process ref="outsideSplitProcessor"/>
>     <split stopOnException="true" parallelProcessing="false" streaming="true">
>         <simple>${body}</simple>
>         <process ref="insideSplitProcessor"/>
>         <log message="Ha, now you fail ${body.size()}"/>
>         <setBody><constant/></setBody>
>     </split>
>     <to uri="mock:test"/>
> </route> {code}
> Processing fails with an OOM when run with a limited memory setting (-Xmx512m
> in my case, on a MacBook M1 Pro with 16 GB RAM).
> The problem is on this line:
>
> {code:xml}
> <log message="Ha, now you fail ${body.size()}"/> {code}
> Upon analysis, the expression evaluation stores the content of the body
> in memory (ok), but keeps it referenced even after leaving the {*}split{*}.
> This happens only when the generated data are objects (Random usage in
> this case); when using unboxed *int* values, the problem does not occur. Our
> original case used the *sql* component, which returned database data (boxed
> in objects).
>
> You can mitigate the problem by using an external processor instead of log:
>
> {code:xml}
> <process ref="logProcessor"/> {code}
> {code:java}
> public class LogProcessor implements Processor {
>     // Logger added so the snippet compiles; SLF4J is assumed here.
>     private static final Logger log = LoggerFactory.getLogger(LogProcessor.class);
>
>     @Override
>     public void process(Exchange exchange) throws Exception {
>         log.info("Haha, now you will not fail: {}",
>                 exchange.getIn().getBody(List.class).size());
>     }
> } {code}
> or by using groovy:
> {code:xml}
> <groovy>
>     request.headers.bodySize = body.size()
> </groovy> {code}
> In both cases the references are cleaned up, so no OOM occurs.
>
> This behavior seems very unexpected.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)