On Jul 7, 2008, at 1:41 PM, David Trotz wrote: > Troy, >> >> You of all people should know this... :) SWORD stores all the tags >> inline in the text. When you start turning things OFF is when all >> the >> filters have to do work to remove things. If everything is left on, >> then the text is just handed back as it was stored. I'm still not >> convinced. :) >> > I should have known it, but I was honestly confused, but it does make > more sense the way you describe it. My guess is I need a better way to > grow my text buffer. It must be what is slowing me down so much. > 'Cause > if I push in a single very long string I can render very fast, but > when > I push in a verse at a time, I render very slowly. I will do some more > testing.
I've run into this before. Here are some things I have tried. Maybe one of them will work well in your situation: 1) Over-allocate for growth. When allocating exactly for growth (e.g. 1 verse at a time) the algorithm is O(n^^2) Using a doubling algorithm (everytime you run out of space, allocate to the next 2^^n boundary, or allocate 2x what you need. This will be O(nlogn), IIRC. I've tried a doubling algorithm, whose impact is that it can grossly over allocate. It would be better to use an algorithm that starts out by aggressive over-allocation, but eases up with time. If you know nothing about your input, then the following might be useful. I'll be replacing Lucene's doubling algorithm with the following taken from Python: /* This over-allocates proportional to the list size, making room * for additional growth. The over-allocation is mild, but is * enough to give linear-time amortized behavior over a long * sequence of appends() in the presence of a poorly-performing * system realloc(). * The growth pattern is: 0, 4, 8, 16, 25, 35, 46, 58, 72, 88, ... */ new_allocated = (newsize >> 3) + (newsize < 9 ? 3 : 6) + newsize; I think you can do better when you know something about the input. For example, you can know the average size of a verse and the maximum length of a chapter. Together you can use these to create a custom series for growth. It is also possible to cheaply get the size of each chapter in the module. This could be useful in creating a heuristic estimate. 2) Consumer-producer. I don't know if it is possible to have threads or co-processes. But you might be able to arrange for a producer to provide the next desired verse and a consumer to render the HTML. 3) Use "pages" for large output. Currently, the user can only see a bit at a time and has to do something to see the next bit. So in a way the user is used to paging. We do this in Bible Desktop when the output is more than 50 verses. (The user can configure this limit and we are changing it to 200 because common desktop hardware is getting better.) Visually we use bottom tabs, but our users find it confusing and we are looking for something more obvious. In Him, DM _______________________________________________ sword-devel mailing list: sword-devel@crosswire.org http://www.crosswire.org/mailman/listinfo/sword-devel Instructions to unsubscribe/change your settings at above page