ajs6f and others who like talking about immutability and fluent APIs:

Stuck on a train with no WiFi, I explored the idea of a fluent Parser (& writer) API for Commons RDF a bit further. I have committed this idea to the fluent-parser branch for now:
https://github.com/apache/commons-rdf/compare/fluent-parser

(Note that I have not wired up any of the implementations.)

Earlier we talked about mutability/immutability. I think I might have found some middle ground:

RDF.parser(RDFSyntax) returns a Parser instance (perhaps we should call it RDFParser?)
https://s.apache.org/RDFparser

If the syntax is unsupported, it returns Optional.empty(). The argument null can be used for syntax guessing or content negotiation, e.g. as supported by Jena RIOT (but not by JSONLD-Java).

The Parser
http://stain.github.io/commons-rdf/fluent-parser/org/apache/commons/rdf/api/io/Parser.html
takes a single ParserConfig argument
http://stain.github.io/commons-rdf/fluent-parser/org/apache/commons/rdf/api/io/ParserConfig.html
which is the (hopefully serializable) bean describing the parse job: most importantly a ParserSource (e.g. an IRI or Path) and a ParserTarget (e.g. a Dataset).

You can start building these from either ParserConfig.immutable() or ParserConfig.mutable() and then use the fluent interface with .withBase(), .withSource() etc. The interfaces these expect have similar helpers. E.g.

    RDF rdf = new JenaRDF();
    ParserConfig c = ParserConfig.mutable()
        .withSyntax(RDFSyntax.TURTLE)
        .withSource(ParserSource.fromPath(Paths.get("/tmp/file.ttl")))
        .withTarget(ParserTarget.toDataset(rdf.createDataset()))
        .asImmutableConfig();

.withOption() can be used to set a key-value pair for vendor-specific Options. I think further work is needed there to ensure those are serializable.

This can then be used like this:

    Parsed parsed = rdf.parser(null).parse(c);

(The syntax will be picked up from c in this case.)

Parsed
http://stain.github.io/commons-rdf/fluent-parser/org/apache/commons/rdf/api/io/Parsed.html
is just the source and target of the completed job, as well as the count of how many were parsed. This is obviously more interesting in an async parser case where you have many parser sessions.
(I'll come to that in another email.)

The mutable ParserConfig bean modifies the config values in place, so you can set them many times, returning the same bean:
https://s.apache.org/MutableParserConfig

This is naive, simple, and definitely not thread-safe, but it should have minimal performance hits.

.asImmutableConfig() can make a snapshot copy to a similar bean that is immutable (its with* methods return new instances). That would not really be needed if you don't keep the ParserConfig instance around and don't do async parser jobs, as both the mutable and immutable versions comply with the ParserConfig interface. (Discuss!)

ParserConfig.immutable() works just the same, but keeps every step immutable, starting with a "null" config. Thus every step can be used both as an argument for multiple Parser sessions and for thread-safe re-use of a ParserConfig (e.g. to vary the source).

Here, instead of copying lots of fields, we start with an empty config with no fields and all methods returning Optional.empty():
https://s.apache.org/ImmutableParserConfigImpl

Any call to a .withSomething() will then create a "child" config that keeps the new value and delegates to its parent for the unchanged properties. For instance
https://s.apache.org/WithTarget
only overrides .target(), while .source() will go to the (immutable) parent config. Unset properties will therefore fall back to the underlying Optional.empty() in the initial bean.

Note that "parent" here does not mean subclass, but delegation. (To confuse matters and avoid code smell, all parent delegation is done by the WithParent superclass.)

As such the immutable parser configs are thread-safe and can be re-used many times. They are also Serializable (which makes a single snapshot, https://s.apache.org/SnapshotParserConfig), while the MutableParserConfig is on purpose not serializable.

BTW: The With* classes are quite light-weight and so would hopefully not come with a much larger memory footprint than the mutable bean.
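To make the delegation idea concrete, here is a minimal, self-contained sketch. It is not the code on the branch: the Config interface, its two String properties and the EmptyConfig class are hypothetical simplifications, and only the names WithParent and WithTarget are borrowed from the description above.

```java
import java.util.Optional;

// Hypothetical, cut-down stand-in for ParserConfig with only two properties.
interface Config {
    Optional<String> source();
    Optional<String> target();

    // Each with* call wraps the current config in a new immutable child.
    default Config withSource(String source) { return new WithSource(this, source); }
    default Config withTarget(String target) { return new WithTarget(this, target); }

    // The initial "null" config: nothing is set.
    static Config immutable() { return new EmptyConfig(); }
}

// Root of every chain; all properties are unset.
final class EmptyConfig implements Config {
    @Override public Optional<String> source() { return Optional.empty(); }
    @Override public Optional<String> target() { return Optional.empty(); }
}

// All delegation to the parent lives here, so each subclass
// only overrides the single property it carries.
abstract class WithParent implements Config {
    private final Config parent;
    WithParent(Config parent) { this.parent = parent; }
    @Override public Optional<String> source() { return parent.source(); }
    @Override public Optional<String> target() { return parent.target(); }
}

final class WithSource extends WithParent {
    private final String source;
    WithSource(Config parent, String source) { super(parent); this.source = source; }
    @Override public Optional<String> source() { return Optional.of(source); }
}

final class WithTarget extends WithParent {
    private final String target;
    WithTarget(Config parent, String target) { super(parent); this.target = target; }
    @Override public Optional<String> target() { return Optional.of(target); }
}
```

With this sketch, Config.immutable().withSource("a.ttl") can be kept as a shared base, and calling .withTarget(...) on it returns a new child without touching the base. A lookup of .source() on that child walks one step up to the parent, which is why long override chains mean longer lookups.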
There is some potential memory waste here if a property is overridden many times, e.g. c.withSource(a).withSource(b).withSource(c), as the intermediate immutable configs in the "parent tree" are still kept. In extreme cases this could cause a stack overflow when looking up properties. (Someone optimizing that can use c.asMutableConfig().asImmutableConfig() to create a new base snapshot.)

This is the config interface, which happens to be fluent, but it is not opinionated: e.g. you are allowed to "forget" a parser source, which generally won't make sense. There are no overloaded shortcuts such as .withSource(path). Also, you have to fetch the Parser and pass the config along yourself.

The advantage I see here is that the JenaParser implementation gets a very clean Parser interface to implement, with no abstract class needed. It might however need to check for itself that the config is complete enough for its needs, as in theory it could be a null-config with nothing set.

I'll come back to the alternative ParserBuilder interface, which guides the client caller step by step straight into a parsed file.

-- 
Stian Soiland-Reyes
The University of Manchester
http://www.esciencelab.org.uk/
http://orcid.org/0000-0001-9842-9718