"Michael F. Stemper" <michael.stem...@gmail.com> writes: > On 21/09/2021 13.49, alister wrote: >> On Tue, 21 Sep 2021 13:12:10 -0500, Michael F. Stemper wrote: > It's my own research, so I can give myself the data in any format that I > like. > >> as far as I can see the main issue with XML is bloat, it tries to do >> too many things & is a very verbose format, often the quantity of >> mark-up can easily exceed the data contained within it. other formats >> such a JSON & csv have far less overhead, although again not always >> suitable. > > I've heard of JSON, but never done anything with it.
Then you should certainly try to get a basic understanding of it. One
thing JSON shares with XML is that it is best left to machines to
produce and consume. Because both can be viewed in a text editor there
is a common misconception that they are easy to edit. Not so: commas
are a common bugbear in JSON, and non-trivial edits in (XML-unaware)
text editors are tricky.

Consider which overhead you should actually worry about. If it is file
size, then XML, JSON and CSV should all compress to a similar size.

> How does CSV handle hierarchical data? For instance, I have
> generators[1], each of which has a name, a fuel and one or more
> incremental heat rate curves. Each fuel has a name, UOM, heat content,
> and price. Each incremental cost curve has a name, and a series of
> ordered pairs (representing a piecewise linear curve).
>
> Can CSV files model this sort of situation?

The short answer is no. CSV files represent spreadsheet row-column
values with nothing fancier such as formulas or other redirections.

CSV is quite good as a lowest common denominator exchange format. I say
"quite" because CSV is really a family of dialects, characterized by 8
attributes (delimiter, quote character, line terminator and the like);
you need to pick a dialect, such as MS Excel's, which pins those down.

XML and JSON are much better controlled. You can easily verify that you
conform to those formats and guarantee that *any* conformant parser can
read your content. XML is more powerful than JSON in that respect: you
can define and enforce schemas, so in your case the fuel name, UOM,
etc. can be validated with standard tools. In JSON all that checking is
left entirely to the consuming program(s).

>> As in all such cases it is a matter of choosing the most apropriate tool
>> for the job in hand.
>
> Naturally. That's what I'm exploring.

You might also like to consider HDF5. It is targeted at large volumes
of scientific data and its capabilities are well above what you need.
MATLAB, Octave and Scilab use it as their native format. PyTables and
h5py provide Python/NumPy bindings to it.

--
Pete Forman
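
P.S. To make that concrete, here is a rough sketch of how your
generators might look in JSON and in HDF5 via h5py. The field names
(fuel, uom, heat_content, points and so on) and the numbers are only
illustrative, not an agreed schema:

import json

import h5py  # only needed for the HDF5 part

# Illustrative data only -- the names and numbers are made up.
generators = [
    {
        "name": "GEN1",
        "fuel": {
            "name": "natural gas",
            "uom": "MMBtu",
            "heat_content": 1.035,
            "price": 4.25,
        },
        "heat_rate_curves": [
            {
                "name": "normal",
                # ordered (MW, MMBtu/MWh) pairs of a piecewise linear curve
                "points": [[10.0, 11.0], [50.0, 9.5], [100.0, 9.0]],
            },
        ],
    },
]

# JSON: the nesting maps directly onto dicts and lists.
with open("generators.json", "w") as f:
    json.dump({"generators": generators}, f, indent=2)

# HDF5: groups give the hierarchy, attributes hold scalar metadata,
# datasets hold the numeric curves as real arrays.
with h5py.File("generators.h5", "w") as f:
    root = f.require_group("generators")
    for gen in generators:
        grp = root.create_group(gen["name"])
        fuel = grp.create_group("fuel")
        for key, value in gen["fuel"].items():
            fuel.attrs[key] = value
        curves = grp.create_group("heat_rate_curves")
        for curve in gen["heat_rate_curves"]:
            ds = curves.create_dataset(curve["name"], data=curve["points"])
            ds.attrs["columns"] = "MW, MMBtu/MWh"

The JSON file stays human-readable, while the HDF5 file keeps the
curves as real numeric arrays that NumPy, MATLAB, Octave et al. can
read directly.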