[ https://issues.apache.org/jira/browse/SOLR-16265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17558690#comment-17558690 ]
Chris M. Hostetter commented on SOLR-16265:
-------------------------------------------

Given that {{Http2SolrClient.createRequest}} is private and only used in a few places, it seems like we could probably rethink its API a bit to avoid needing the {{ByteArrayOutputStream}} and let the {{ContentWriter}} write directly to an {{OutputStreamContentProvider}} ... but skimming the Jetty docs on {{ContentProvider}}, I gather this might cause some behavior changes relating to retries.

So perhaps we could have our own custom {{ContentProvider}} that works similarly to {{OutputStreamContentProvider}} but makes a new call to {{ContentWriter.write(...)}} each time {{iterator()}} is called?

But in the meantime, just switching the {{ByteArrayOutputStream}} to the existing {{BinaryRequestWriter.BAOS}} class would eliminate the {{byte[]}} copy and give a quick/small improvement ... so I'll open a sub-task for that.

> reduce memory usage of ContentWriter based requests in Http2SolrClient
> ----------------------------------------------------------------------
>
>                 Key: SOLR-16265
>                 URL: https://issues.apache.org/jira/browse/SOLR-16265
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public (Default Security Level. Issues are Public)
>            Reporter: Chris M. Hostetter
>            Priority: Major
>
> I recently noticed the code below exists in {{Http2SolrClient.createRequest}}...
> {code}
>     if (contentWriter != null) {
>       Request req = httpClient.newRequest(url + wparams.toQueryString()).method(method);
>       ByteArrayOutputStream baos = new ByteArrayOutputStream();
>       contentWriter.write(baos);
>       // TODO reduce memory usage
>       return req.content(
>           new BytesContentProvider(contentWriter.getContentType(), baos.toByteArray()));
> {code}
> * AFAICT there is no (other) existing jira discussing this TODO
> * This method is called for most "simple" HTTP2 based requests
> ** {{Http2SolrClient}} or {{CloudHttp2SolrClient}} -- but not {{ConcurrentUpdateHttp2SolrClient}}
> * This block triggers for anything with a {{ContentWriter}}
> ** ie: all {{UpdateRequests}} ... and in theory other custom requests
> * Part of the issue seems to be that this code repurposes the {{ContentWriter}} "push" style API into a "pull" style Jetty client API
> ** Even though {{Http2SolrClient}} has other code used only by {{ConcurrentUpdateHttp2SolrClient}} ({{initOutStream(...)}}) which does leverage a "push" style Jetty client API: {{OutputStreamContentProvider}}
> * But more silly: we make one (serialized) {{byte[]}} of the data in memory inside the {{ByteArrayOutputStream}} then we call {{toByteArray()}} which makes a second copy of the {{byte[]}}.
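
For illustration, the double {{byte[]}} copy and the quick fix can be sketched with the JDK alone. This is a rough sketch, not the actual Solr change: {{PayloadWriter}} is a hypothetical stand-in for {{ContentWriter}}, {{ExposedBaos}} is a hypothetical stand-in for a BAOS-style subclass that exposes its internal buffer, and wiring the resulting buffer into a Jetty content provider is deliberately left out.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class NoCopyBufferSketch {

  /** Hypothetical stand-in for a "push" style writer such as ContentWriter. */
  interface PayloadWriter {
    void write(OutputStream os) throws IOException;
  }

  /** Hypothetical BAOS-style subclass: exposes the internal array instead of copying it. */
  static class ExposedBaos extends ByteArrayOutputStream {
    byte[] internalBuffer() {
      return buf; // protected field inherited from ByteArrayOutputStream
    }
  }

  /** Runs the writer against the stream, rethrowing I/O failures unchecked. */
  private static void drain(PayloadWriter writer, ByteArrayOutputStream target) {
    try {
      writer.write(target);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Pattern from the issue: toByteArray() allocates a second array of size() bytes. */
  static ByteBuffer serializeWithCopy(PayloadWriter writer) {
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    drain(writer, baos);
    return ByteBuffer.wrap(baos.toByteArray());
  }

  /** Quick fix: wrap only the valid prefix of the internal array, no second allocation. */
  static ByteBuffer serializeWithoutCopy(PayloadWriter writer) {
    ExposedBaos baos = new ExposedBaos();
    drain(writer, baos);
    return ByteBuffer.wrap(baos.internalBuffer(), 0, baos.size());
  }

  public static void main(String[] args) {
    PayloadWriter writer =
        os -> os.write("<add><doc/></add>".getBytes(StandardCharsets.UTF_8));
    ByteBuffer copied = serializeWithCopy(writer);
    ByteBuffer shared = serializeWithoutCopy(writer);
    // Both buffers expose the same readable bytes; only the allocations differ.
    System.out.println("same content: " + copied.equals(shared));
  }
}
```

Note that the wrapped buffer shares storage with the stream, so this only makes sense if nothing writes to the stream again after the buffer has been handed off. It also still holds one full serialized copy in memory, which is why the streaming {{ContentProvider}} idea remains the larger improvement.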