Hello,

I haven't posted anything about the component below on JIRA yet, as I was thinking a bit more about it this weekend... Should it be a separate component, as shown below (named "stax" after the underlying technology it uses), or should it be merged with the existing "xpath" component? The former may be a bit confusing for API users; the latter would require more work but would be cleaner.
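For anyone who does not want to read the three classes quoted below, here is a minimal, JDK-only sketch of the streaming idea: walk the document with StAX, keep track of the current element path, and hand back one serialized fragment each time a matching element closes. This is not the component code. The class name StaxSplitSketch is hypothetical, it uses XMLStreamReader instead of XMLEventReader, it only accepts a single absolute path such as /foo/bar (no "//" or "|" patterns), and it ignores namespaces.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;
import java.util.NoSuchElementException;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

/** Hypothetical sketch only; not the StaxIterator from the quoted message. */
final class StaxSplitSketch implements Iterator<String> {

    private final XMLStreamReader reader;
    private final String targetPath;                        // assumption: absolute path, e.g. "/foo/bar"
    private final Deque<String> path = new ArrayDeque<>();  // currently open elements, outermost first
    private String nextItem;

    StaxSplitSketch(InputStream in, String targetPath) throws XMLStreamException {
        this.reader = XMLInputFactory.newInstance().createXMLStreamReader(in);
        this.targetPath = targetPath;
        this.nextItem = readNextItem(); // read one item ahead so hasNext() is cheap
    }

    @Override
    public boolean hasNext() {
        return nextItem != null;
    }

    @Override
    public String next() {
        if (nextItem == null) {
            throw new NoSuchElementException();
        }
        String current = nextItem;
        try {
            nextItem = readNextItem();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Failed to read next item", e);
        }
        return current;
    }

    /** Consumes events until one matching element has been fully read, or the input ends. */
    private String readNextItem() throws XMLStreamException {
        StringBuilder item = null; // non-null only while inside a matching element
        while (reader.hasNext()) {
            switch (reader.next()) {
                case XMLStreamConstants.START_ELEMENT: {
                    path.addLast(reader.getLocalName());
                    if (item == null && currentPath().equals(targetPath)) {
                        item = new StringBuilder();
                    }
                    if (item != null) {
                        item.append('<').append(reader.getLocalName());
                        for (int i = 0; i < reader.getAttributeCount(); i++) {
                            item.append(' ').append(reader.getAttributeLocalName(i)).append("=\"")
                                .append(escape(reader.getAttributeValue(i))).append('"');
                        }
                        item.append('>');
                    }
                    break;
                }
                case XMLStreamConstants.CHARACTERS: {
                    if (item != null) {
                        item.append(escape(reader.getText()));
                    }
                    break;
                }
                case XMLStreamConstants.END_ELEMENT: {
                    if (item != null) {
                        item.append("</").append(reader.getLocalName()).append('>');
                    }
                    boolean closesMatch = item != null && currentPath().equals(targetPath);
                    path.removeLast();
                    if (closesMatch) {
                        return item.toString();
                    }
                    break;
                }
                default:
                    break; // comments, processing instructions, etc. are skipped in this sketch
            }
        }
        return null;
    }

    private String currentPath() {
        StringBuilder sb = new StringBuilder();
        for (String segment : path) {
            sb.append('/').append(segment);
        }
        return sb.toString();
    }

    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;").replace("\"", "&quot;");
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<foo><bar id=\"1\">a</bar><other/><bar id=\"2\"><baz>b</baz></bar></foo>";
        Iterator<String> items = new StaxSplitSketch(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), "/foo/bar");
        while (items.hasNext()) {
            System.out.println(items.next()); // one <bar>...</bar> fragment per line
        }
    }
}
```

As far as I can tell, the loop itself would look the same whether it ends up behind a separate stax() expression or behind the existing xpath one; only the DSL entry point would differ.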
What do you think about it?

Regards,
Xavier

On Wed, May 18, 2011 at 5:11 PM, Xavier Coulon <[email protected]> wrote:

> Hello,
>
> As a complement to the contribution of Romain (who is a colleague of mine, but
> in a different team), I would like to submit another component to the Camel
> project. This component splits XML inputs with streaming, which, according
> to the documentation, is not possible yet. The rule for splitting is an
> XPath expression, and the input source can be a GenericFile or an
> InputStream.
>
> The code is based on 3 classes, so I put it directly in this message (I
> just excluded the JUnit tests here):
>
> public class StaxExpressionBuilder implements Expression {
>
>     private static final Logger LOGGER = LoggerFactory
>             .getLogger(StaxExpressionBuilder.class);
>
>     /** The XPath value that inputstream elements must match to be split. */
>     private final String path;
>
>     public StaxExpressionBuilder(String path) {
>         this.path = path;
>     }
>
>     @SuppressWarnings("unchecked")
>     @Override
>     public <T> T evaluate(Exchange exchange, Class<T> type) {
>         try {
>             Endpoint fromEndpoint = exchange.getFromEndpoint();
>             fromEndpoint.getEndpointKey();
>             Object body = exchange.getIn().getBody();
>             InputStream inputStream = null;
>             if (body instanceof GenericFile) {
>                 GenericFile<File> file = (GenericFile<File>) body;
>                 inputStream = new FileInputStream(file.getFile());
>             }
>             if (inputStream != null) {
>                 return (T) new StaxIterator(inputStream, path);
>             }
>             LOGGER.error("No inputstream for message body of type "
>                     + body.getClass().getCanonicalName());
>         } catch (FileNotFoundException e) {
>             LOGGER.error("Failed to read incoming file", e);
>         } catch (XMLStreamException e) {
>             LOGGER.error(
>                     "Failed to create STaX iterator on incoming file message",
>                     e);
>         }
>         return null;
>     }
> }
>
> --------------------------------
> public class StaxIterator implements Iterator<String> {
>
>     private final AtomicInteger counter = new AtomicInteger(0);
>     private static final Logger LOGGER = LoggerFactory
>             .getLogger(StaxIterator.class);
>
>     private final XMLEventReader eventReader;
>     private final XPathLocation currentLocation = new XPathLocation();
>     private final List<String> matchPathes;
>     private final XMLInputFactory inputFactory = XMLInputFactory.newInstance();
>
>     private String nextItem = null;
>
>     public StaxIterator(InputStream inputStream, String pathes)
>             throws XMLStreamException {
>         this.matchPathes = new ArrayList<String>();
>         for (String path : pathes.split("\\|")) {
>             this.matchPathes.add(path.trim());
>         }
>         this.eventReader = inputFactory.createXMLEventReader(inputStream);
>         this.nextItem = readNextItem();
>     }
>
>     @Override
>     public boolean hasNext() {
>         return (nextItem != null);
>     }
>
>     @Override
>     public String next() {
>         String currentItem = this.nextItem;
>         this.nextItem = readNextItem();
>         return currentItem;
>     }
>
>     private String readNextItem() {
>         try {
>             StringBuilder itemBuilder = null;
>             boolean found = false;
>             String item = null;
>             while (eventReader.hasNext() && !found) {
>                 XMLEvent event = eventReader.nextEvent();
>                 if (event.isStartElement()) {
>                     StartElement element = event.asStartElement();
>                     String localName = element.getName().getLocalPart();
>                     currentLocation.appendSegment(localName);
>                     if (currentLocation.matches(matchPathes)) {
>                         itemBuilder = new StringBuilder();
>                     }
>                     startRecording(itemBuilder, element);
>                 } else if (event.isCharacters()) {
>                     record(itemBuilder, event.asCharacters());
>                 } else if (event.isEndElement()) {
>                     // If we reach the end of an item element we stop recording.
>                     endRecordingElement(itemBuilder, event.asEndElement());
>                     if (currentLocation.matches(matchPathes)) {
>                         found = true;
>                         item = itemBuilder.toString();
>                         counter.incrementAndGet();
>                     }
>                     currentLocation.removeLastSegment();
>                 }
>             }
>             return item;
>         } catch (XMLStreamException e) {
>             LOGGER.error("Failed to read item #" + counter.get()
>                     + " from inputstream", e);
>             return null;
>         }
>     }
>
>     private void endRecordingElement(StringBuilder itemBuilder,
>             EndElement endElement) {
>         if (itemBuilder == null) {
>             return;
>         }
>         itemBuilder.append("</").append(endElement.getName().getLocalPart())
>                 .append(">");
>     }
>
>     private void record(StringBuilder itemBuilder, Characters characters) {
>         if (itemBuilder == null) {
>             return;
>         }
>         itemBuilder.append(characters.getData());
>     }
>
>     private void startRecording(StringBuilder itemBuilder, StartElement element) {
>         if (itemBuilder == null) {
>             return;
>         }
>         itemBuilder.append("<").append(element.getName().getLocalPart());
>         @SuppressWarnings("unchecked")
>         Iterator<Attribute> attributes = element.getAttributes();
>         while (attributes.hasNext()) {
>             Attribute attr = attributes.next();
>             itemBuilder.append(" ").append(attr.getName()).append("=\"")
>                     .append(attr.getValue()).append("\"");
>         }
>         itemBuilder.append(">");
>     }
>
>     @Override
>     public void remove() {
>         throw new UnsupportedOperationException(
>                 "remove() method is not supported by this Iterator, in the context of StAX input reading only.");
>     }
> }
>
> --------------------------------
> public class XPathLocation {
>
>     private static final String NODE_SEPARATOR = "/";
>
>     private static final String DOUBLE_NODE_SEPARATOR = "//";
>
>     /** location with initial value. */
>     private String location = NODE_SEPARATOR;
>
>     /**
>      * Constructor
>      */
>     public XPathLocation() {
>         super();
>     }
>
>     /**
>      * Full Constructor.
>      *
>      * @param value
>      *            initial value
>      */
>     public XPathLocation(String value) {
>         super();
>         this.location = value;
>     }
>
>     public String getLocation() {
>         return location;
>     }
>
>     public String appendSegment(String segment) {
>         location = new StringBuilder(location).append(NODE_SEPARATOR)
>                 .append(segment).toString();
>         location = location.replaceAll("//", "/");
>         return location;
>     }
>
>     public String removeLastSegment() {
>         location = StringUtils.substringBeforeLast(location, NODE_SEPARATOR);
>         if (location.isEmpty()) {
>             location = NODE_SEPARATOR;
>         }
>         return location;
>     }
>
>     /**
>      * Returns true if one of the given patterns matches the current location,
>      * false otherwise
>      *
>      * @param orPatterns
>      *            the given patterns
>      * @return true or false
>      */
>     public boolean matches(final List<String> orPatterns) {
>         for (String pattern : orPatterns) {
>             if (matches(pattern)) {
>                 return true;
>             }
>         }
>         return false;
>     }
>
>     /**
>      * Returns true if the given pattern matches the current location, false
>      * otherwise
>      *
>      * @param pattern
>      *            the given pattern
>      * @return true or false
>      */
>     public boolean matches(final String pattern) {
>         if (pattern == null || pattern.isEmpty()) {
>             return false;
>         } else if (pattern.startsWith(NODE_SEPARATOR)) {
>             return matchStartWith(pattern);
>         } else if (pattern.contains(DOUBLE_NODE_SEPARATOR)) {
>             return matchContains(pattern);
>         } else {
>             String lastSegments = StringUtils.substringAfterLast(location,
>                     pattern + NODE_SEPARATOR);
>             return (!lastSegments.isEmpty()) && location.endsWith(lastSegments)
>                     && !lastSegments.contains(NODE_SEPARATOR);
>         }
>     }
>
>     private boolean matchContains(String pattern) {
>         String firstSegments = StringUtils.substringBefore(pattern,
>                 DOUBLE_NODE_SEPARATOR) + NODE_SEPARATOR;
>         String lastSegments = NODE_SEPARATOR
>                 + StringUtils.substringAfter(pattern, DOUBLE_NODE_SEPARATOR);
>
>         return location.contains(firstSegments)
>                 && location.endsWith(lastSegments)
>                 && location.indexOf(lastSegments, firstSegments.length()) >= (location
>                         .indexOf(firstSegments) + firstSegments.length() - NODE_SEPARATOR
>                         .length());
>     }
>
>     private boolean matchStartWith(String pattern) {
>         if (pattern.startsWith(DOUBLE_NODE_SEPARATOR)) {
>             return location.endsWith(StringUtils.substringAfter(pattern,
>                     DOUBLE_NODE_SEPARATOR));
>         } else {
>             return pattern.equals(location);
>         }
>     }
> }
> --------------------------------
>
> In the code, here is how to use it:
>
> public class MyRouteBuilder
>         extends RouteBuilder {
>
>     @Override
>     public void configure() {
>         from(file:..).*split(stax("//foo/bar")).streaming()*.to(...);
>     }
>
>     private Expression stax(String path) {
>         return new StaxExpressionBuilder(path);
>     }
> }
>
> Here's how it works:
> - when splitting the incoming message body, the stax() method returns a new
>   type of Iterator.
> - when streaming, the iterator's next() method is called. Using StAX
>   inside, it moves into the inputstream and keeps track of the element
>   locations it traverses.
> - when an element's location matches the given XPathLocation, the iterator
>   'records' the inputstream content and returns it at the end of the element.
>
> Note that the stax() method is part of my RouteBuilder, but it could be
> moved to the RouteBuilder super class for generic usage.
>
> What do you think about it?
> Is this something you're interested in?
>
> Best regards,
> Xavier
>
> On Fri, May 13, 2011 at 8:21 AM, Romain Manni-Bucau <[email protected]>
> wrote:
>
>> Hi,
>>
>> Thank you Richard and Claus for your feedback.
>>
>> I modified the classloading stuff, the NPE catch and added the XMLUtil
>> class to get the tag name.
>>
>> I added support for input stream as input (adding some converters) but the
>> problem is that Camel already has a lot of converters and you can load
>> back the whole file very fast if you don't take care.
>>
>> - Romain
>>
>> 2011/5/13 Claus Ibsen <[email protected]>
>>
>> > Hi
>> >
>> > Yeah it does look very cool. Good work.
>> >
>> > Would be great if the StaxComponent could also cater for non file
>> > based inputs. You may have the message body as a Source already. But
>> > that can always be improved.
>> >
>> > And yes as Richard mentions the class loading should use the
>> > ClassResolver. You can get it from the CamelContext. exchange -> camel
>> > context -> class resolver.
>> >
>> > And the stuff that finds the annotations. We may have some common code
>> > for that. Or later refactor that into a util class.
>> >
>> > Anyway keep it up.
>> >
>> >
>> > On Fri, May 13, 2011 at 1:29 AM, Richard Kettelerij
>> > <[email protected]> wrote:
>> > > Hi Romain,
>> > >
>> > > Nice work. I've taken a look at your component. A few minor suggestions
>> > > for improvement, in case you want to contribute it to Apache:
>> > >
>> > > - The component currently uses getContextClassLoader().loadClass() for
>> > >   classloading. Camel actually has an abstraction to make this portable
>> > >   across various runtime environments. You can just replace it with
>> > >   org.apache.camel.spi.ClassResolver().resolveClass().
>> > >
>> > > - Avoid catching the NullPointerException in the
>> > >   StAXJAXBIteratorExpression.
>> > >
>> > > - Do you plan to add a DSL method for the StAXJAXBIteratorExpression
>> > >   (requires patching camel-core)? So you can write for example
>> > >   "split(stax(Record.class))" in your route.
>> > >
>> > > Regards,
>> > > Richard
>> > >
>> > > On Thu, May 12, 2011 at 5:55 PM, Romain Manni-Bucau
>> > > <[email protected]> wrote:
>> > >
>> > >> Hi all,
>> > >>
>> > >> I worked a bit around stax (thanks to Claus for his advice).
>> > >>
>> > >> You can find what I've done here:
>> > >> http://code.google.com/p/rmannibucau/source/browse/camel/camel-stax/
>> > >>
>> > >> The test shows what can be done with it:
>> > >>
>> > >> http://code.google.com/p/rmannibucau/source/browse/camel/camel-stax/src/test/java/org/apache/camel/stax/test/StAXRouteTest.java
>> > >>
>> > >> - validation using sax (just need a converter)
>> > >> - parsing using a sax contenthandler and a stax stream reader (a
>> > >>   simple component)
>> > >> - parsing of sub tree to get jaxb objects using a stax event reader
>> > >>   for the whole tree and jaxb for the sub objects
>> > >>
>> > >>
>> > >> - Romain
>> > >>
>> > >
>> >
>> >
>> > --
>> > Claus Ibsen
>> > -----------------
>> > FuseSource
>> > Email: [email protected]
>> > Web: http://fusesource.com
>> > CamelOne 2011: http://fusesource.com/camelone2011/
>> > Twitter: davsclaus
>> > Blog: http://davsclaus.blogspot.com/
>> > Author of Camel in Action: http://www.manning.com/ibsen/
>> >
>
> --
> Xavier
>

--
Xavier
