Chris,

Your suggestion is very similar to JSword's implementation. It has simplified 
code maintenance.

There are three types of module files: index, compression index and data files. 
It may be well to handle these separately.
The index consists of fixed-size entries, each made up of parts. For a raw module 
these are offset and size. For a compressed module they are block, offset and size.
The block and offset are always 32 bits. Only the size varies in 
width: today, either 2 or 4 bytes.
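
To make the entry layout concrete, here is a rough C++ sketch of decoding one index row. It is only an illustration: the struct and function names are hypothetical (they are not existing Sword or JSword code), and little-endian on-disk byte order is an assumption of the sketch.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical in-memory form of one raw index row.
struct RawIndexEntry {
    std::uint32_t offset = 0;
    std::uint32_t size = 0;   // always 32 bits in memory, whatever the file width
};

// Hypothetical in-memory form of one compressed index row.
struct ZIndexEntry {
    std::uint32_t block = 0;
    std::uint32_t offset = 0;
    std::uint32_t size = 0;
};

// Read an unsigned integer of `width` bytes (1..4).
// Little-endian byte order is an assumption of this sketch.
inline std::uint32_t readLE(const unsigned char *p, std::size_t width) {
    std::uint32_t v = 0;
    for (std::size_t i = 0; i < width; ++i)
        v |= static_cast<std::uint32_t>(p[i]) << (8 * i);
    return v;
}

// Raw row: 4-byte offset, then a size field of `sizeWidth` bytes.
inline RawIndexEntry decodeRaw(const unsigned char *row, std::size_t sizeWidth) {
    RawIndexEntry e;
    e.offset = readLE(row, 4);
    e.size = readLE(row + 4, sizeWidth);
    return e;
}

// Compressed row: 4-byte block, 4-byte offset, then the size field.
inline ZIndexEntry decodeZ(const unsigned char *row, std::size_t sizeWidth) {
    ZIndexEntry e;
    e.block = readLE(row, 4);
    e.offset = readLE(row + 4, 4);
    e.size = readLE(row + 8, sizeWidth);
    return e;
}
```

The only thing that changes between the 2-byte and 4-byte variants is the sizeWidth argument; the in-memory entry is the same either way.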

So I'd suggest two more classes: RawIndex and a sub-class ZIndex. (Maybe 4, 
also struct/class RawIndexEntry and ZIndexEntry).

A couple of observations. A row in the file is of fixed width. The size of the 
file divided by the width of the row gives the number of entries. Finding the 
i-th entry is simple and obvious.
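
Spelled out as a sketch (the IndexLayout name and its members are hypothetical, and std::optional needs C++17):

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>

// Hypothetical description of an index file's row layout.
struct IndexLayout {
    std::size_t fixedBytes;  // 4 for raw (offset), 8 for compressed (block + offset)
    std::size_t sizeWidth;   // width of the size field: 2 or 4 today

    // Width of one row in bytes.
    std::size_t rowWidth() const { return fixedBytes + sizeWidth; }

    // Number of entries: file size divided by row width.
    std::uint64_t entryCount(std::uint64_t fileBytes) const {
        return fileBytes / rowWidth();
    }

    // Byte position of the i-th row (0-based), or nothing if i is out of range.
    std::optional<std::uint64_t> rowPosition(std::uint64_t fileBytes,
                                             std::uint64_t i) const {
        if (i >= entryCount(fileBytes)) return std::nullopt;
        return i * rowWidth();
    }
};
```

For example, a raw index with 2-byte sizes would be IndexLayout{4, 2}: rows are 6 bytes wide, a 600-byte file holds 100 entries, and entry 10 starts at byte 60.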

We've started on the above, but still have code duplication because the index 
code lives in more than one module driver.

Also, I don't see the point of the 3-byte entry size. The only thing it affects is 
the size of the index file. In memory it will be 32 bits. For a Bible it would 
save about 65K to use a 3-byte rather than a 4-byte size. Rather, I'd suggest that 
from now on our module making tools only make 4-byte index files. For a Bible, 
this would add about 128K to the module size.
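
As a rough sketch of what "only make 4-byte index files" could look like on the writing side: the function names below are hypothetical, not taken from any existing module-making tool, and little-endian byte order is again an assumption. The range check shows the failure that narrower size fields risk and a 4-byte field avoids.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Append `value` to `out` as an integer of `width` bytes (1..4),
// least significant byte first (assumed byte order).
// Throws if the value does not fit in the field, which is what happens
// with a 2-byte size field once an entry reaches 64 KiB.
inline void appendLE(std::vector<unsigned char> &out,
                     std::uint32_t value, std::size_t width) {
    if (width < 1 || width > 4)
        throw std::invalid_argument("width must be 1..4");
    if (width < 4 && (value >> (8 * width)) != 0)
        throw std::out_of_range("value does not fit in field");
    for (std::size_t i = 0; i < width; ++i)
        out.push_back(static_cast<unsigned char>((value >> (8 * i)) & 0xFF));
}

// Hypothetical writer for one raw index row with a fixed 4-byte size field.
inline void appendRawRow4(std::vector<unsigned char> &out,
                          std::uint32_t offset, std::uint32_t size) {
    appendLE(out, offset, 4);
    appendLE(out, size, 4);
}
```

With a fixed 4-byte size the writer never has to pick a width or reject an entry for being too large.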

In Him,
        DM

On Mar 18, 2014, at 2:43 AM, Chris Little <chris...@crosswire.org> wrote:

> We've got quite a few classes in Sword that essentially duplicate code found 
> elsewhere in Sword, with minor changes. The module drivers are a prime 
> example.
> 
> Specific examples include RawText & RawText4, RawCom & RawCom4, zText & 
> zText4 (new as of today), zCom & zCom4 (new as of today), and RawLD & RawLD4, 
> each pair of which differs in that one member uses a 16-bit value to store 
> entry size and the other member uses a 32-bit value. (The 16-bit sizes permit 
> entries up to 64KiB; the 32-bit sizes permit entries up to 4GiB.)
> 
> There are also the pairs RawText & RawCom, RawText4 & RawCom4, zText & zCom, 
> and zText4 & zCom4, each pair of which differs very little.
> 
> My proposal is to collapse the above classes into three classes:
> RawText, zText, and RawLD
> 
> Each of these classes would support entry sizes of 2, 3, or 4 bytes (16-bit = 
> 64KiB entries, 24-bit = 16MiB entries, 32-bit = 4GiB entries). Internally, 
> the classes would always store sizes as a uint32_t, but would serialize as 2, 
> 3, or 4 byte size integers, depending on the parameters passed to the 
> constructor. This will necessitate changing many of the class method 
> signatures to accept uint32_ts instead of shorts & longs.
> 
> Similarly, the classes zVerse & zVerse4, RawVerse & RawVerse4, and RawLD & 
> RawLD4 would be condensed into zVerse, RawVerse, & RawLD capable of reading 
> files with 2, 3, or 4-byte entry sizes.
> 
> This would not require changes to existing modules. A RawLD4 module will 
> still work, but we'll use the RawLD driver to read it and parse the '4' from 
> the end of the driver name to determine that we will read 4-byte entry sizes.
> 
> RawCom, zCom, & SWCom classes would then be derived from RawText, zText, & 
> SWText respectively. Maybe we can even eliminate the *Com classes and simply 
> add a member variable to indicate whether to act like a commentary or a Bible.
> 
> 
> Advantages of this proposal include all of the things that come with reduced 
> code duplication:
> Less code, reduced API complexity, smaller library size, etc.
> Greater consistency, without having to page through half a dozen distinct 
> classes to keep code consistent.
> Bugs only need to be fixed in one location instead of many.
> Whatever else makes DRY practices better than WET.
> 
> The method described also makes it trivial for us to add the 3-byte entry 
> size drivers, which should be enough for anything practical (up to 16MiB per 
> entry). And down the road, we could add 5-byte entry size support with ease 
> for entry sizes up to 1TiB. (No, I'm not suggesting that.)
> 
> If you're wondering why RawGenBook & zLD are left out of the proposal, it's 
> because they both use 4-byte entry sizes already and no 2-byte versions exist.
> 
> --Chris
> 
> 
> 
> _______________________________________________
> sword-devel mailing list: sword-devel@crosswire.org
> http://www.crosswire.org/mailman/listinfo/sword-devel
> Instructions to unsubscribe/change your settings at above page
