On Wed, Sep 18, 2013 at 03:41:34PM +0100, Peter Keller wrote:
> I hope that this isn't quite what you meant.... There are already
> mutually-incompatible CIF dialects out there that have been created
> by developers coding to their own understanding and interpretations
> of the CIF/STAR format. I am sure that you would not want to be the
> creator of yet another one :-) Correct tokenising is a necessary
> (but not sufficient) condition for preventing the problem getting
> worse.

This reminded me that I was looking into the CIF grammar several years ago.
I took "Appendix A: A formal grammar for CIF":
http://www.iucr.org/resources/cif/spec/version1.1/cifsyntax#bnf
and used it (after the necessary syntax modifications) in Boost.Spirit,
which is one of many parser generators.

Then I noted two things that may be errors in the specification:

- no whitespace between LoopHeader and LoopBody;
  see <DataItems>: <LoopHeader> ends with <Tag>, <LoopBody> starts
  with <Value>, but there is no <WhiteSpace> between them.
- extra "|" in <TokenizedComments> (...<eol> |}...)

Am I right?

Marcin
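
The first point can be illustrated with a small sketch. It is in Python rather than Boost.Spirit and is not the original parser. It assumes heavily simplified stand-ins for the productions: the real <WhiteSpace> also admits comments, and the real <Value> also admits quoted and multi-line strings. The names eat, parse_loop and SAMPLE are hypothetical. With greedy, non-backtracking matching, the header stops right after the last tag, so the body cannot begin with a value unless whitespace is consumed first.

```python
import re

# Simplified stand-ins for the CIF 1.1 productions (assumptions, not the spec).
LOOP = re.compile(r"loop_", re.IGNORECASE)   # <LOOP_>
TAG = re.compile(r"_\S+")                    # <Tag>: '_' then non-blank chars
VALUE = re.compile(r"[^\s_#]\S*")            # unquoted <Value>, simplified
WS = re.compile(r"\s+")                      # <WhiteSpace> without comments


def eat(pattern, text, pos):
    """Return the position after a greedy match at pos, or None on failure."""
    if pos is None:
        return None
    m = pattern.match(text, pos)
    return m.end() if m else None


def parse_loop(text, ws_before_body):
    """Match <LoopHeader> <LoopBody> greedily, without backtracking.

    ws_before_body=False reads the grammar as described in the message;
    True inserts a <WhiteSpace> between header and body.
    Returns True only if the whole text is consumed.
    """
    pos = eat(LOOP, text, 0)
    if pos is None:
        return False
    # <LoopHeader> ::= <LOOP_> { <WhiteSpace> <Tag> }+
    n_tags = 0
    while True:
        nxt = eat(TAG, text, eat(WS, text, pos))
        if nxt is None:
            break
        pos, n_tags = nxt, n_tags + 1
    if n_tags == 0:
        return False
    if ws_before_body:
        pos = eat(WS, text, pos)
    # <LoopBody> ::= <Value> { <WhiteSpace> <Value> }*
    pos = eat(VALUE, text, pos)
    if pos is None:
        return False
    while True:
        nxt = eat(VALUE, text, eat(WS, text, pos))
        if nxt is None:
            break
        pos = nxt
    return pos == len(text)


SAMPLE = "loop_\n_atom_x\n_atom_y\n1.0 2.0\n3.0 4.0"
print(parse_loop(SAMPLE, ws_before_body=False))
print(parse_loop(SAMPLE, ws_before_body=True))
```

Under these assumptions the first call rejects the sample and the second accepts it. A backtracking matcher could instead accept the literal grammar by splitting a tag such as _atom_y into the tag _atom_ and the value y, which is presumably not what the specification intends.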