Re: [DB-SIG] dbf files and compact indices

Ethan Furman Sat, 18 Sep 2010 10:53:47 -0700

Carl Karsten wrote:

On Sat, Sep 18, 2010 at 11:16 AM, Ethan Furman <et...@stoneleaf.us> wrote:

Carl Karsten wrote:

On Sat, Sep 18, 2010 at 1:11 AM, Ethan Furman <et...@stoneleaf.us> wrote:

Does anybody have any pointers, tips, web pages, already-written
routines, etc., on parsing *.cdx files?  I have found the pages on MS's
site for FoxPro, but they neglect to describe the compaction algorithm
used, and my Google-fu has failed to find any sites with that
information.

Any and all help greatly appreciated!



"Compound Index File Structure (.cdx)"

http://msdn.microsoft.com/en-us/library/k35b9hs2%28v=VS.80%29.aspx

which basically links to:
http://msdn.microsoft.com/en-us/library/s8tb8f47%28v=VS.80%29.aspx

Is that what you need?


Thanks for the link; unfortunately, I am already familiar with the page.
What I need help with is the first sentence of the note at the bottom:

Each entry consists of the record number, duplicate byte count and
trailing byte count, all compacted. The key text is placed at the
logical end of the node, working backwards, allowing for previous key
entries.

Here's a dump of the last interior node:

-----
node type: 2
number of keys: 57
free space: 1 (or 256) (and is this bits, bytes, keys, what?)
--
record number mask: c8 0e 40 b0
duplicate byte count mask: 28
trailing byte count mask: 00
--
bits used for record number: 178
bits used for duplicate count: 29
bits used for trail count: 64
bytes used for rec num, dup count, trail count: 192
-----
12 00 ff 3f 00 00 1f 1f 0e 05 05 03 01 00 c8 0e 40 b0 28 00
b2 1d 40 c0 29 00 d0 42 40 d0 54 80 c0 43 40 a8 14 40 b8 40
40 c8 02 40 d0 08 00 b0 4c 80 b0 3a 40 a0 50 80 d0 3b 40 a8
09 40 b8 0a 80 88 3c 80 c0 2a 00 d8 21 c0 c0 3d 40 c0 4a 80
b0 26 40 b8 2b 40 c0 2c 00 c0 41 40 b8 4d 80 c8 37 00 c0 04
40 c8 44 80 c0 1b 40 c8 15 80 c8 27 40 c8 16 00 a8 2d c0 c8
51 80 b8 2e 40 c0 1e 00 b0 17 40 b8 46 40 b0 2f 80 c8 4f 80
a8 13 00 c8 59 00 c8 31 00 c8 1f 00 a8 3e 40 c0 22 40 a8 07
00 c8 23 80 d0 32 80 b0 52 80 c0 34 80 b0 20 40 b0 24 40 c0
47 80 c0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 4e 44 45 4e 49 44 53 4f 4e 43 43 41 4d 4d 4f 4e 54 54 48
45 57 53 53 4c 45 4e 52 54 49 4e 45 5a 4e 4e 4d 41 47 45 45
49 45 42 45 52 4d 41 4e 45 57 49 4e 53 4c 41 56 45 4e 42 45
52 47 4b 41 56 41 4e 4a 4f 4e 45 53 49 52 49 53 48 53 54 45
54 4c 45 52 52 41 4e 4f 4c 53 54 45 49 4e 45 41 44 4c 45 59
48 41 54 48 41 57 41 59 52 49 4d 45 53 45 41 53 4f 4e 53 53
47 4c 41 44 53 54 4f 4e 45 55 52 52 59 4f 53 54 52 49 4e 4b
52 42 45 53 4f 4c 45 59 46 49 4c 45 4e 45 4e 49 53 4e 47 4c
55 4e 44 45 42 45 52 4c 45 4f 44 53 4f 4e 49 4e 47 4c 45 52
4d 41 52 45 53 54 45 43 4b 45 52 54 4f 4e 44 41 59 57 47 45
52 52 4e 45 49 4c 2d 53 55 4e 44 54 4f 4f 4b 53 45 59 4c 45
4e 44 45 4e 49 4e 55 4e 48 49 41 50 50 45 54 54 41 52 4e 41
48 41 4e 43 41 4c 44 57 45 4c 4c 55 54 54 52 55 43 45 4f 43
41 52 44 45 4c 4f 4f 4d 42 45 52 47 4e 53 45 4c 45 45 52 42
41 43 48 55 47 55 53 54 4e 44 45 52 53 4f 4e 41 4c 4c 41 4e
-----

The last half (roughly) consists of last names compressed together,
while the first half consists of 57 (in this case) entries of the record
number, duplicate byte count and trailing byte count, all compacted --
how do I uncompact them?
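
One possible reading of the dump, offered as a sketch rather than a confirmed answer: if the first bytes shown (12 00 ff 3f 00 00 1f 1f 0e 05 05 03) are the 2-byte free-space field followed by the record number mask, the two count masks, the three bit widths and the entry size, then each entry is 3 bytes holding a 14-bit record number, a 5-bit duplicate count and a 5-bit trailing count, packed little-endian. The labeled values in the dump would then have been read a few bytes too far into the node. The key length of 30 is an assumption (it is not in the dump), and the names `unpack_entry` and `decode_entries` are made up for illustration.

```python
import struct

# First bytes of the dump above. Assumption: they begin at the 2-byte
# free-space field, followed by the masks, bit widths and entry size,
# and then the compacted entries themselves.
head = bytes.fromhex(
    "12 00 ff 3f 00 00 1f 1f 0e 05 05 03"
    " 01 00 c8 0e 40 b0 28 00 b2 1d 40 c0"
)
# Last 17 bytes of the dump above; key text is stored from the end of
# the node, working backwards.
tail = bytes.fromhex("55 47 55 53 54 4e 44 45 52 53 4f 4e 41 4c 4c 41 4e")

KEY_LEN = 30      # assumption: length of the index key, not in the dump
HEADER_SIZE = 12  # bytes of `head` consumed by the fields unpacked below

(free_space, rec_mask, dup_mask, trail_mask,
 rec_bits, dup_bits, trail_bits, entry_size) = struct.unpack_from(
    "<HIBBBBBB", head, 0)


def unpack_entry(raw):
    """Split one compacted entry into (record number, dup, trail)."""
    value = int.from_bytes(raw, "little")
    recno = value & rec_mask
    dup = (value >> rec_bits) & dup_mask
    trail = (value >> (rec_bits + dup_bits)) & trail_mask
    return recno, dup, trail


def decode_entries(head, tail, count):
    """Rebuild the first `count` keys from the entries and key text."""
    keys = []
    prev = b""
    end = len(tail)
    for i in range(count):
        start = HEADER_SIZE + i * entry_size
        recno, dup, trail = unpack_entry(head[start:start + entry_size])
        stored = KEY_LEN - dup - trail   # bytes actually kept in the node
        text = tail[end - stored:end]
        end -= stored
        key = prev[:dup] + text          # reuse prefix of the previous key
        keys.append((recno, key.decode("ascii")))
        prev = key
    return keys


decoded = decode_entries(head, tail, 4)
for recno, key in decoded:
    print(recno, key)
```

Under these assumptions the first four entries come out as records 1, 14, 40 and 29 with keys ALLAN, ANDERSON, ANDERSON and AUGUST, which is at least consistent with a sorted last-name index. The trailing count is taken here to mean trailing blanks that were stripped from the key.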



huh, I see what you mean.

What are you working on?

I know a few people that may have the answer, but it would help to
explain why it is being worked on.

I have a pure-python module to read db3 and vfp 6 dbf files, and I find
that I need to read (and write) the idx and cdx index files that foxpro
generates. We are in the process of switching from homegrown foxpro
apps to homegrown python apps, but I have to support the foxpro file
formats until the switch is complete. Once I have the index files down,
I'll publish another release of it (an older version can be found on PyPI).


Thanks for your help!

--
~Ethan~