[ https://issues.apache.org/jira/browse/CXF-3068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
aaron pieper updated CXF-3068:
------------------------------

    Attachment: MimeBodyPartInputStreamTest.java

This unit test uses a custom InputStream which only ever returns 2 bytes,
regardless of how many are asked for. It demonstrates that
MimeBodyPartInputStream is returning 0 where it should return -1. I'm not
confident in the assertions in this test, but I'm at least confident that it
shouldn't return 0 like it is doing now.

> MimeBodyPartInputStream illegally returns "0" from a read call with chunked
> InputStream
> ---------------------------------------------------------------------------------------
>
>                 Key: CXF-3068
>                 URL: https://issues.apache.org/jira/browse/CXF-3068
>             Project: CXF
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 2.2.3, 2.2.4, 2.2.5, 2.2.6, 2.2.7, 2.2.8, 2.2.9, 2.2.10
>         Environment: Windows
>            Reporter: aaron pieper
>         Attachments: MimeBodyPartInputStreamTest.java
>
>
> I'm having a problem with some MTOM attachments. It started when I upgraded
> from CXF 2.2.2 to CXF 2.2.3. The bug is that after calling a service which
> returned an MTOM attachment, when I try to parse the attachment, I sometimes
> get an error:
> java.io.IOException: Underlying input stream returned zero bytes
>       at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:268)
>       at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:306)
>       at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:158)
>       at java.io.InputStreamReader.read(InputStreamReader.java:167)
>       at java.io.Reader.read(Reader.java:123)
>       at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1128)
>       at org.apache.commons.io.IOUtils.copy(IOUtils.java:1104)
>       at org.apache.commons.io.IOUtils.copy(IOUtils.java:1050)
>       at org.apache.commons.io.IOUtils.toString(IOUtils.java:359)
>       at com.pragmatics.AsyncUtils.messageToString(AsyncUtils.java:18)
>
> The error only happens for some attachments - about 25% of them.
> It's a seemingly arbitrary 25% - it's not, say, the biggest 25% or the ones
> that have special characters. I was able to track this down to
> MimeBodyPartInputStream. MimeBodyPartInputStream has some logic in
> processBuffer for reading the boundary. It goes like this:
>
>     while ((boundaryIndex < boundary.length)
>             && (value == boundary[boundaryIndex])) {
>         if (!hasData(buffer, initialI, i + 1, off, len)) {
>             return initialI - off;
>         }
>         value = buffer[++i];
>         boundaryIndex++;
>     }
>
> So, basically, when MimeBodyPartInputStream finds the start of a boundary, it
> reads from the stream until either there are no more characters to read, or
> until it has read the entire boundary. The problem with this logic is that it
> assumes the entire boundary will be read in the same call to the underlying
> InputStream. This assumption isn't always true. Specifically, when I'm
> fetching an attachment in my application, this MimeBodyPartInputStream is
> backed by an HttpURLConnection.HttpInputStream. This HttpInputStream
> sometimes fetches as few as 24 characters at a time; I guess that's just how
> the HttpInputStream works. But if these 24 characters happen to fall on one
> of these MIME boundaries, it can cause problems.
>
> One problem, which I'm running into here, is that the
> MimeBodyPartInputStream's read(byte[],int,int) method returns 0, since the
> only bytes that were read were parts of the MIME boundary. In returning 0, it
> breaks InputStream's contract, which states that for a non-zero len the read
> method blocks until at least one byte is available and returns a positive
> count, or returns -1 if the end of the stream has been reached. There are
> probably other possible problems - it seems possible that
> MimeBodyPartInputStream might misjudge whether or not it has hit a boundary
> in some cases. I haven't run into that problem though.
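
For anyone who wants to reproduce the conditions without opening the attachment, here is a minimal sketch of the two pieces involved. It is not the attached MimeBodyPartInputStreamTest.java, and the names ChunkedReadSketch, ChunkedInputStream and drainChecked are hypothetical, made up for this illustration. The first piece is a stream that hands out only a couple of bytes per call (much like the short reads I see from HttpInputStream); the second is a drain loop that fails loudly if read(byte[],int,int) returns 0 for a non-zero len. I have left out the step of wrapping a multipart payload in MimeBodyPartInputStream, since that wiring depends on the CXF version.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class ChunkedReadSketch {

    /** Hypothetical helper: returns at most maxChunk bytes per read call. */
    static class ChunkedInputStream extends FilterInputStream {
        private final int maxChunk;

        ChunkedInputStream(InputStream in, int maxChunk) {
            super(in);
            this.maxChunk = maxChunk;
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            // Cap the request so callers see short reads, however much they ask for.
            return super.read(b, off, Math.min(len, maxChunk));
        }
    }

    /**
     * Hypothetical helper: reads a stream to the end and returns the byte count.
     * Throws if read returns 0 for a non-zero length request, which the
     * InputStream contract does not allow.
     */
    static int drainChecked(InputStream in) throws IOException {
        byte[] buf = new byte[64];
        int total = 0;
        int n;
        while ((n = in.read(buf, 0, buf.length)) != -1) {
            if (n == 0) {
                throw new IOException("read returned 0 for a non-zero length request");
            }
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "some attachment payload".getBytes(StandardCharsets.US_ASCII);
        // Two bytes per call, as in the unit test described above.
        InputStream in = new ChunkedInputStream(new ByteArrayInputStream(data), 2);
        System.out.println("bytes read: " + drainChecked(in));
    }
}
```

A well-behaved stream passes through drainChecked without complaint even when it is chunked this finely; a stream that returns 0 partway through, as I believe MimeBodyPartInputStream does when a chunk ends inside a boundary, would trip the IOException instead.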