Re: Performance improvement for xml loads (+comments)

2000-12-18 Thread Rob Browning
Bill Carlson <[EMAIL PROTECTED]> writes:
> diff -r1.5 io-gncxml-r.c
> 2217a2218,
> > /* commit all groups, this completes the BeginEdit started when the
> > * account_end_handler finished reading the account.
> > */
> > xaccAccountGroupCommitEdit (ag);
>
> 2599a2605,2609
> > /*

Re: Performance improvement for xml loads (+comments)

2000-12-10 Thread Al Snell
On 10 Dec 2000, Derek Atkins wrote: > Have fun. You can also look (at least under Linux) at > /usr/include/rpcsvc/*.x for examples of definition files. You can > also look at rpcgen(1). You may also find the contents of: http://love.warhead.org.uk/~gc_xdr/ interesting :-) ABS --

Re: Performance improvement for xml loads (+comments)

2000-12-10 Thread Derek Atkins
Rob Browning <[EMAIL PROTECTED]> writes: > Thanks very much. I glanced at the manpage briefly, and it's quite > interesting. I appreciate the pointer, and I'll definitely have to > spend a bit of time grokking it. Have fun. You can also look (at least under Linux) at /usr/include/rpcsvc/*.x f

Re: Performance improvement for xml loads (+comments)

2000-12-10 Thread Rob Browning
Derek Atkins <[EMAIL PROTECTED]> writes: > man xdr(3n) > > XDR is the eXternal Data Representation from SunRPC. It is the > on-the-wire "format" used by NFS, NIS, Spray, and any other Sun/ONC > RPC system. It's been around since the early 80's. [...] > Hopefully I've given you enough pointer
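The XDR rules Derek describes are simple enough to sketch by hand: every primitive is packed into big-endian 4-byte units, and strings are length-prefixed and NUL-padded to a 4-byte boundary. A minimal illustration in Python (the helper names are made up; real code would use xdr(3) or rpcgen-generated stubs):

```python
import struct

def xdr_pack_int(value):
    # XDR integers are 4-byte big-endian words
    return struct.pack(">i", value)

def xdr_pack_string(s):
    # XDR strings: 4-byte length, then the bytes, NUL-padded to a
    # multiple of 4 so every field stays word-aligned
    data = s.encode("ascii")
    pad = (4 - len(data) % 4) % 4
    return struct.pack(">I", len(data)) + data + b"\x00" * pad

# A tiny two-field record, laid out the way rpcgen might
encoded = xdr_pack_int(42) + xdr_pack_string("abc")
print(encoded.hex())  # 0000002a0000000361626300
```

The fixed word size and deterministic padding are what make the format fast to parse and portable across machines.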

Re: Performance improvement for xml loads (+comments)

2000-12-10 Thread Derek Atkins
Rob Browning <[EMAIL PROTECTED]> writes: > If I had known anything about XDR (what is it?) when we were trying to > figure out what do to with the dead binary format, I would certainly > have considered it. I didn't, and I don't recall anyone else bringing > it up as a trivial solution then eith

Re: Performance improvement for xml loads (+comments)

2000-12-09 Thread Rob Browning
Before I continue, let me state for the record, that dealing gracefully with arbitrary hardware/os-failures was never a stated goal of the file format, so I'm going to ignore the points relating to that. I also maintain that all of this talk is essentially wasted, so I'm probably going to quit r

Re: Performance improvement for xml loads (+comments)

2000-12-08 Thread Al Snell
On 8 Dec 2000, Derek Atkins wrote: > Also, I'm looking at network protocols. When designing a network > protocol, I don't like assuming you have 100BaseT between your client > and server. That means you have to make protocols as compact as > possible. I also don't like to compress a network pr

Re: Performance improvement for xml loads (+comments)

2000-12-08 Thread Rob Browning
Derek Atkins <[EMAIL PROTECTED]> writes: > Rob Browning <[EMAIL PROTECTED]> writes: > > > The 50MB of RAM you're worried about is a *BUG*, and it needs to be > > fixed. Other than that, and some performance work that seems fairly > > Is this a bug we can fix? Or is it a bug in libXML? I've n

Re: Performance improvement for xml loads (+comments)

2000-12-08 Thread Patrick Spinler
I wrote: > > To a first approximation. At least, all the databases appear to support > the data types required by the SQL92 standard, and this standard > includes all the types you'd expect (several sizes of integer & floating > point variables, character strings, date fields, money fields, blob

Re: Performance improvement for xml loads (+comments)

2000-12-08 Thread Patrick Spinler
Derek Atkins wrote: > > > > Not everything in the GnuCash data is an SQL primitive data type. > > > > I'm having a hard time thinking of anything. I was planning that we > > go out of the way to make sure we use primitives for all the primary > > stuff. As I mentioned before, this might even me

Re: Performance improvement for xml loads (+comments)

2000-12-08 Thread Derek Atkins
Rob Browning <[EMAIL PROTECTED]> writes: > The 50MB of RAM you're worried about is a *BUG*, and it needs to be > fixed. Other than that, and some performance work that seems fairly Is this a bug we can fix? Or is it a bug in libXML? I've not done any profiling to determine this. > straightfo

Re: Performance improvement for xml loads (+comments)

2000-12-07 Thread Bill Gribble
On Thu, Dec 07, 2000 at 06:36:58PM -0500, Derek Atkins wrote: > So the question is: are these data files meant as a storage backend > for GnuCash, or are they meant for users to interface to directly? I > think some people here are in the latter camp, while I am firmly in > the former. Both. On

Re: Performance improvement for xml loads (+comments)

2000-12-07 Thread Rob Browning
Derek Atkins <[EMAIL PROTECTED]> writes: > You're a developer. You don't count. When I wasn't a developer, I still cared. And if you poke through the gnucash logs over the past few years, I think you'll find I'm not alone, but again, we obviously have different experiences here. The 50MB of

Re: Performance improvement for xml loads (+comments)

2000-12-07 Thread Derek Atkins
Patrick Spinler <[EMAIL PROTECTED]> writes: > This is the true reason why all the major unix subsystems do their > configurations in ASCII flat files. Think about apache, for instance, > or even worse, sendmail. That's one hellish config file, especially > since it has to be parsed on startup _

Re: Performance improvement for xml loads (+comments)

2000-12-07 Thread Al Snell
On 7 Dec 2000, Derek Atkins wrote: > I admit that I don't know very much about DBMS systems. Are columns > in a table labeled? And can you arbitrarily add a new column to an > existing table (I suppose you could create a new table with the > existing column information and add the new column, t

Re: Performance improvement for xml loads (+comments)

2000-12-07 Thread Jason Godfrey
On Thu, Dec 07, 2000 at 05:44:15PM -0500, Derek Atkins wrote: > I admit that I don't know very much about DBMS systems. Are columns > in a table labeled? And can you arbitrarily add a new column to an > existing table (I suppose you could create a new table with the > existing column informatio

Re: Performance improvement for xml loads (+comments)

2000-12-07 Thread Patrick Spinler
Derek Atkins wrote: > > I admit that I don't know very much about DBMS systems. Are columns > in a table labeled? And can you arbitrarily add a new column to an > existing table Yes, and yes. While it is possible to access SQL data in a position-dependent manner, it is considered bad pract
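Patrick's two yeses can be demonstrated in a few lines. A hypothetical sketch using SQLite (table and column names are invented for illustration, not GnuCash's actual schema):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE splits (guid TEXT, value INTEGER)")
conn.execute("INSERT INTO splits VALUES ('abc', 100)")

# Columns are named, so a new one can be added without rewriting old rows
conn.execute("ALTER TABLE splits ADD COLUMN memo TEXT")

# Existing rows simply report NULL for the new column, and access is
# by name rather than position
conn.row_factory = sqlite3.Row
row = conn.execute("SELECT guid, value, memo FROM splits").fetchone()
print(row["memo"])  # None
```

Accessing columns by name is what makes this kind of schema evolution safe: code that never asks for `memo` is unaffected by its addition.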

Re: Performance improvement for xml loads (+comments)

2000-12-07 Thread Derek Atkins
Patrick Spinler <[EMAIL PROTECTED]> writes: > Derek Atkins wrote: > > > > Well, using MySQL or PostgreSQL is just one part of it. It's a > > storage mechanism, but you still need to create the data formats that > > are stored. You still need to define the transaction objects or split > > objec

Re: Performance improvement for xml loads (+comments)

2000-12-07 Thread Derek Atkins
Rob Browning <[EMAIL PROTECTED]> writes: > Derek Atkins <[EMAIL PROTECTED]> writes: > > > Honestly, I think this is a red herring. I'm not at all convinced > > that if said tools did exist they would at all be useful. Sure, you > > have a tagged data tree, but you have to know what the tags me

Re: Performance improvement for xml loads (+comments)

2000-12-07 Thread Derek Atkins
Patrick Spinler <[EMAIL PROTECTED]> writes: > I disagree. > > Now, I may very well be talking out my butt here, since I've never > looked at XML closely, but my understanding is that one of the key > aspects of XML is that there's a tagged description of the data format > in the data itself, y

Re: Performance improvement for xml loads (+comments)

2000-12-07 Thread Patrick Spinler
I disagree. Now, I may very well be talking out my butt here, since I've never looked at XML closely, but my understanding is that one of the key aspects of XML is that there's a tagged description of the data format in the data itself, yes ? That is, XML always includes the meta data along w
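Patrick's understanding is right as far as it goes: the tag names travel with the data, so a generic tool can walk a document it has never seen. A small sketch (element names invented for illustration):

```python
import xml.etree.ElementTree as ET

# A fragment in the spirit of the discussion (tag names invented)
doc = "<account><name>Cash</name><type>BANK</type></account>"

# A generic tool can discover the structure from the tags alone,
# with no schema compiled into the tool
root = ET.fromstring(doc)
for child in root:
    print(child.tag, "=", child.text)
```

Derek's counterpoint elsewhere in the thread still applies, though: such a tool learns the structure, not what the tags mean.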

Re: Performance improvement for xml loads (+comments)

2000-12-07 Thread Derek Atkins
But this is just the same as: account_tree = load_file_version_1(filename); save_file_version_2(filename2, account_tree); You're not making an extensible format, you're changing the format of the data. -derek Patrick Spinler <[EMAIL PROTECTED]> writes: > Ayup - snipped from th
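Derek's two-line pseudocode can be fleshed out into a runnable sketch. Everything here is hypothetical (JSON stands in for the real formats); the point is only to show why this is a one-shot conversion rather than an extensible format:

```python
import json, os, tempfile

def load_file_version_1(path):
    # hypothetical v1 format: a bare list of account names
    with open(path) as f:
        return json.load(f)

def save_file_version_2(path, account_tree):
    # hypothetical v2 format: wraps the same data with a version marker
    with open(path, "w") as f:
        json.dump({"version": 2, "accounts": account_tree}, f)

# The one-shot conversion Derek is describing: the format is changed,
# not extended -- a v1-only reader cannot open the result
src = tempfile.NamedTemporaryFile("w", suffix=".v1", delete=False)
json.dump(["Assets", "Equity"], src)
src.close()

dst = src.name + ".v2"
save_file_version_2(dst, load_file_version_1(src.name))
```

Every such format change needs a new load/save pair, which is exactly the maintenance burden an extensible format is meant to avoid.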

Re: Performance improvement for xml loads (+comments)

2000-12-07 Thread Patrick Spinler
Rob Browning wrote: > > it's also possible > that when we move to SQL, that we might drop the kvp_frames > altogether. That's something we'll have to discuss. One of the > reasons for implementing them was that it made adding new fields when > we needed them much easier, but it's my impression

Re: Performance improvement for xml loads (+comments)

2000-12-07 Thread Rob Browning
Derek Atkins <[EMAIL PROTECTED]> writes: > Honestly, I think this is a red herring. I'm not at all convinced > that if said tools did exist they would at all be useful. Sure, you > have a tagged data tree, but you have to know what the tags mean in > order to do anything with them. Well, for m

Re: Performance improvement for xml loads (+comments)

2000-12-07 Thread Rob Browning
Al Snell <[EMAIL PROTECTED]> writes: > Storing tree structures in SQL is hairy. Very hairy. Sufficiently so that > it may be worthwhile storing the kvp tree for an object in a binary format > in a BLOB field. Well, I definitely appreciate your suggestions, but it's also possible that when we mov
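Al's BLOB suggestion is easy to prototype: serialize the whole kvp tree with any stable encoding and store the bytes in a single column, sidestepping the hairy tree-to-tables mapping. A hypothetical sketch with SQLite (JSON is used purely for illustration; the choice of encoding is the open question):

```python
import json, sqlite3

# An arbitrarily deep kvp_frame tree (keys invented for illustration)
kvp = {"online-id": "12345", "reconcile": {"last-date": "2000-12-07"}}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (guid TEXT, kvp BLOB)")
# Serialize the whole tree into one opaque column instead of mapping
# it onto relational tables
conn.execute("INSERT INTO accounts VALUES (?, ?)",
             ("abc123", json.dumps(kvp).encode()))

blob, = conn.execute(
    "SELECT kvp FROM accounts WHERE guid = 'abc123'").fetchone()
restored = json.loads(blob)
print(restored == kvp)  # True
```

The trade-off is that the database can no longer query inside the tree; the BLOB is opaque to SQL.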

Re: Performance improvement for xml loads (+comments)

2000-12-07 Thread Al Snell
> > Also, if you do decide to try and whip something up, make sure you're > > aware that we use kvp_frames now, in various places, so you will have > > to be able to accommodate items with arbitrarily deep, recursive > > key/value trees. [...] > > Of course you're welcome to, but why would you wast

Re: Performance improvement for xml loads (+comments)

2000-12-07 Thread Patrick Spinler
Derek Atkins wrote: > > Well, using MySQL or PostgreSQL is just one part of it. It's a > storage mechanism, but you still need to create the data formats that > are stored. You still need to define the transaction objects or split > objects or whatever that get stored in the database. Well,

Re: Performance improvement for xml loads (+comments)

2000-12-07 Thread Derek Atkins
Rob Browning <[EMAIL PROTECTED]> writes: > I think the synergy here is that people think that if you use XML, > it's more likely that there will be tools that will be available to > allow you to manipulate your data outside the app. This is in fact > true. Writing a parser/transformer to do some

Re: Performance improvement for xml loads (+comments)

2000-12-07 Thread Rob Browning
Derek Atkins <[EMAIL PROTECTED]> writes: > > If you are worried about load times and memory usage, we should consider > > using a SAX interface to read in the XML. See this link for tradeoffs: > > http://www.daa.com.au/~james/gnome/xml-sax/xml-sax.html > > Unfortunately the problem isn't just a
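For the load side, the SAX approach mentioned in the link works roughly like this: the parser delivers callbacks as it reads, so no document tree is ever built in memory. A minimal sketch (element names are invented, not the actual GnuCash schema):

```python
import xml.sax

# The parser fires events while it reads; no full tree is ever built
class CountingHandler(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.transactions = 0

    def startElement(self, name, attrs):
        if name == "transaction":
            self.transactions += 1

doc = b"<ledger><transaction/><transaction/><transaction/></ledger>"
handler = CountingHandler()
xml.sax.parseString(doc, handler)
print(handler.transactions)  # 3
```

Peak memory stays proportional to the deepest element nesting, not the file size, which is the trade the linked page discusses.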

Re: Performance improvement for xml loads (+comments)

2000-12-07 Thread Rob Browning
Derek Atkins <[EMAIL PROTECTED]> writes: > Besides, if you compress the data, you lose random-access into > the file ;) We never have, and probably never will use that, so it's not relevant IMO. > Seriously, I'm not against having XML import/export, but I don't > think it's a reasonable primary

Re: Performance improvement for xml loads (+comments)

2000-12-07 Thread Derek Atkins
Christopher Browne <[EMAIL PROTECTED]> writes: > XML has the merit of being easily serialized; it wouldn't be too > difficult to use it for THAT purpose, and using other formats that are > isomorphic to it for more direct access. Lots of formats can be easily serialized. So can ASN.1. So can X

Re: Performance improvement for xml loads (+comments)

2000-12-07 Thread Derek Atkins
Tyson Dowd <[EMAIL PROTECTED]> writes: > On 06-Dec-2000, Derek Atkins <[EMAIL PROTECTED]> wrote: > > Nobody is suggesting going back to the old binary format. I'm > > certainly not. I *AM*, however, suggesting a NEW binary format. > > Any new binary format will have to be at least as extensibl

Re: Performance improvement for xml loads (+comments)

2000-12-06 Thread Tyson Dowd
On 06-Dec-2000, Christopher Browne <[EMAIL PROTECTED]> wrote: > > The other thing to consider is that I've heard you can generate a > > near-optimal binary representation automatically from a DTD. If you are > > suggesting an approach like this for generating a binary format, then > > that would

Re: Performance improvement for xml loads (+comments)

2000-12-06 Thread Christopher Browne
On Thu, 07 Dec 2000 16:13:06 +1100, the world broke into rejoicing as Tyson Dowd <[EMAIL PROTECTED]> said: > On 06-Dec-2000, Derek Atkins <[EMAIL PROTECTED]> wrote: > > Nobody is suggesting going back to the old binary format. I'm > > certainly not. I *AM*, however, suggesting a NEW binary form

Re: Performance improvement for xml loads (+comments)

2000-12-06 Thread Tyson Dowd
On 06-Dec-2000, Derek Atkins <[EMAIL PROTECTED]> wrote: > Nobody is suggesting going back to the old binary format. I'm > certainly not. I *AM*, however, suggesting a NEW binary format. Any new binary format will have to be at least as extensible as XML. After all, there's no point writing a ni
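Tyson's extensibility requirement does not by itself rule out binary: a tag-length-value layout gives a binary format the same "old readers skip new fields" property that XML gets from ignorable elements. A hypothetical sketch:

```python
import struct

def write_record(tag, payload):
    # 2-byte tag, 4-byte big-endian length, then the payload
    return struct.pack(">HI", tag, len(payload)) + payload

def read_records(data, known_tags):
    # Readers skip tags they don't recognize, so a field added by a
    # newer writer doesn't break an older reader
    out, i = {}, 0
    while i < len(data):
        tag, length = struct.unpack_from(">HI", data, i)
        i += 6
        if tag in known_tags:
            out[tag] = data[i:i + length]
        i += length
    return out

blob = write_record(1, b"Cash") + write_record(99, b"a future field")
print(read_records(blob, known_tags={1}))  # {1: b'Cash'}
```

This is essentially the shape ASN.1 BER takes, and it is where "a new binary format" and "as extensible as XML" stop being contradictory.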

Re: Performance improvement for xml loads (+comments)

2000-12-06 Thread Derek Atkins
Nobody is suggesting going back to the old binary format. I'm certainly not. I *AM*, however, suggesting a NEW binary format. -derek Robert Graham Merkel <[EMAIL PROTECTED]> writes: > Derek Atkins writes: > > That's still not a fair comparison. I can go compress the > > old binary format,

RE: Performance improvement for XML loads (+comments)

2000-12-05 Thread Phillip Shelton
Not really. It is possible that a compact binary could have a pathological series of bytes that make trying to compress it produce a bigger file. The fact that it is already a compact file means that it is close to being as information rich as possible. > -Original Message- > > The inform
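Phillip's point is easy to verify: high-entropy input has nothing left to squeeze, so the compressor's own framing makes the output slightly larger. A quick demonstration with zlib:

```python
import os, zlib

# Stand-in for an already-compact binary file: incompressible bytes
dense = os.urandom(4096)
compressed = zlib.compress(dense)

# Nothing to squeeze, so zlib's header and block framing make it grow
print(len(compressed) > len(dense))  # True
```

This is the pathological case he describes: data that is already near-maximally information-dense can only get bigger under compression.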

Re: Performance improvement for xml loads (+comments)

2000-12-05 Thread Robert Graham Merkel
Derek Atkins writes: > That's still not a fair comparison. I can go compress the > old binary format, too. Let's compare apples to apples here > and leave file compression out of it. You certainly can go and compress the old binary format, and it shrinks a file down to about 1/3rd the size

Re: Performance improvement for xml loads (+comments)

2000-12-05 Thread Derek Atkins
That's still not a fair comparison. I can go compress the old binary format, too. Let's compare apples to apples here and leave file compression out of it. Besides, if you compress the data, you lose random-access into the file ;) Seriously, I'm not against having XML import/export, but I don

Re: Performance improvement for xml loads (+comments)

2000-12-05 Thread Al Snell
On Tue, 5 Dec 2000, Bill Gribble wrote: > I don't like XML all that much either, but be fair: zlib makes the XML > file format actually SMALLER than the old binary data format, and one > of the main reasons it's not in is because we decided to leave > plain-text for a while as a debugging tool.

Re: Performance improvement for xml loads (+comments)

2000-12-05 Thread Bill Gribble
On Tue, Dec 05, 2000 at 06:38:41PM -0500, Derek Atkins wrote: > The fact that data is being exploded by a factor of 6-10 is just > unacceptable. I don't like XML all that much either, but be fair: zlib makes the XML file format actually SMALLER than the old binary data format, and one of the main
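Bill's claim is plausible precisely because XML's verbosity is highly redundant, and redundancy is what deflate removes. A quick illustration with zlib (the markup is invented, not the real file format):

```python
import zlib

# Verbose, repetitive markup is exactly what deflate is good at removing
xml = b"<trn><date>2000-12-05</date><value>100/100</value></trn>" * 500
packed = zlib.compress(xml)

# The compressed XML is a small fraction of its original size
print(len(packed) < len(xml) // 10)  # True
```

Since the repeated tag names compress to almost nothing, a gzipped XML file can plausibly end up smaller than a terser uncompressed binary format.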

Re: Performance improvement for xml loads (+comments)

2000-12-05 Thread Al Snell
On 5 Dec 2000, Derek Atkins wrote: > Just to add my $.02, I'd rather see us using ASN.1 (and I absolutely > DETEST Asinine-1) than XML. Indeed, I'd like to see us come up with a > relatively-extensible binary format; if we're planning to use binary > for SQL, we might as well use it for flat-fil

Re: Performance improvement for xml loads (+comments)

2000-12-05 Thread Derek Atkins
Just to add my $.02, I'd rather see us using ASN.1 (and I absolutely DETEST Asinine-1) than XML. Indeed, I'd like to see us come up with a relatively-extensible binary format; if we're planning to use binary for SQL, we might as well use it for flat-file as well. The fact that data is being expl

Re: Performance improvement for xml loads (+comments)

2000-12-05 Thread Dave Peticolas
Al Snell writes: > On 5 Dec 2000, Rob Browning wrote: > > > > The only problem I have now is that it is still MUCH slower than the > > > binary. The file size is about 6x the size before (9 Meg vs. 1.5 > > > Meg) and to actually do the write it seems to use about 50 Meg of > > > ram, because xml

Re: Performance improvement for xml loads (+comments)

2000-12-05 Thread Al Snell
On 5 Dec 2000, Rob Browning wrote: > > The only problem I have now is that it is still MUCH slower than the > > binary. The file size is about 6x the size before (9 Meg vs. 1.5 > > Meg) and to actually do the write it seems to use about 50 Meg of > > ram, because xml builds the whole tree in mem
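The write-side memory problem has the same streaming cure as the read side: emit the document as a sequence of events instead of building the whole tree first. A sketch using xml.sax.saxutils.XMLGenerator (element names are illustrative):

```python
import io
from xml.sax.saxutils import XMLGenerator

out = io.StringIO()          # stands in for the real output file
gen = XMLGenerator(out, encoding="utf-8")
gen.startDocument()
gen.startElement("ledger", {})
for i in range(3):           # imagine iterating the real transaction list
    gen.startElement("transaction", {})
    gen.characters(str(i))
    gen.endElement("transaction")
gen.endElement("ledger")
gen.endDocument()

# Each event is written as it happens, so memory use stays flat no
# matter how many transactions are emitted
print(out.getvalue().count("<transaction>"))  # 3
```

Writing this way trades the convenience of an in-memory tree for memory use that no longer scales with the number of transactions.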

Re: Performance improvement for xml loads (+comments)

2000-12-05 Thread Rob Browning
Bill Carlson <[EMAIL PROTECTED]> writes: > I've been trying to make the xml stuff go a bit faster. The > following patch will cut my large file load (~1 transactions) > from about 2 minutes to about 30 seconds. This is obviously good > (and along the lines of a change I made a while a

Re: Performance improvement for xml loads (+comments)

2000-12-05 Thread Conrad Canterford
Dave Peticolas wrote: > Personally, I think it would be a good idea. With a default > XML interface, new users would not need to worry about being > a DBA, but there would still be an option for storing in sql. > Do you think it would be possible/good to use something like libdba > (the gnome gene

Re: Performance improvement for xml loads (+comments)

2000-12-05 Thread Dave Peticolas
Bill Carlson writes: > Hi, > > I've been trying to make the xml stuff go a bit > faster. The following patch will cut my large file load > (~1 transactions) from about 2 minutes to about 30 seconds. > This is obviously good (and along the lines of a change > I made a while ago for the

Performance improvement for xml loads (+comments)

2000-12-04 Thread Bill Carlson
Hi, I've been trying to make the xml stuff go a bit faster. The following patch will cut my large file load (~1 transactions) from about 2 minutes to about 30 seconds. This is obviously good (and along the lines of a change I made a while ago for the binary file load). The