Note the 'raw data' wording below; it doesn't say 'unmarked-up text'. The RawText driver implies a human-readable flat file as a datastore. It doesn't say anything about the markup within the file. The .conf file that goes with each module tells you what kind of markup to expect in the data.
BUT, AGAIN, this is NOT the preferred way to do things. A LOT of work has gone into the API so that you DON'T NEED TO DO ANY OF THIS. You should never even KNOW what driver or native markup the module is in. The idea is that when you construct the SWMgr class, it will FIND all installed modules, attach the correct filters for your requested output, and make a set of SWModule * derivatives available to you for use. You just need to say, "Hey, give me the KJV, position it to Jn 3:16, now get the text rendered as HTML."
There are many other things you can do for research, like checking what options are available on a module (Strong's numbers, etc.) and turning them on and off. You can get entry 'attributes' for an entry in a module, which might tell you things like all the KJV English translations and frequencies of a Greek word entry in the Greek Lexicon (though I think we've pulled the module that can do that for the moment).
Unless you are doing your 'micro-Diatheke' experiment as a way 'to learn more about how sword works' so that you might help on the engine code, I would strongly discourage this route. If you just want to use the API, you won't learn anything you SHOULD ever have to know. In fact the RawText driver is becoming less and less used as we migrate our modules over to the zText (compressed) format.
Of course, you ARE always welcome to help on the actual engine code, and if that is your intention, then experiment away!
-Troy.
Lynn Allan wrote:
So, if you are looking for a way to read raw data in a module stored in the RawText format, then you can:
    SWModule *bible = new RawText("/path/to/kjv",...);
    bible->SetKey("jn 3:16");
    cout << *bible;
Troy
I'm attempting to put together a "micro-Diatheke" to learn more about how Sword works. It uses the approach suggested above, except only including <canon.h> and doing raw reads of the .vss files (no versekey, filters, RawText, etc.).
It works ok for an actual "raw" file like the BBE, and only requires about 30 lines of code (+ canon.h for lookup). However, when I look at texts\rawtext\kjv\ot and texts\raw\rawtext\web\ot, they aren't really "raw". There is quite a bit of "mark-up" included, such as <WCF>, etc. I notice a variety of prep and filter routines to take out these tags.
Are there actual "raw" files for kjv\ot, kjv\nt, web\ot, and web\nt? Would they be in an older archive, along with the corresponding .vss files? Can they be generated? (Same questions apply for acv, bwe, godsword, isv, litv, mkjv, rsv, etc.)
TIA, Lynn Allan
*******************
*******************
#include <stdio.h>
#include <fcntl.h>
#include <io.h>
#include "canon.h"

int main(int argc, char **argv)
{
    int otFdVss = _open("X:\\DevTools\\Crosswire\\Sword\\modules\\texts\\rawtext\\bbe\\ot.vss", _O_RDONLY);
    int otFdRawText = _open("X:\\DevTools\\Crosswire\\Sword\\modules\\texts\\rawtext\\bbe\\ot", _O_RDONLY);

    char buf[1024];
    long offset = 0, testament = 1, book = 2, chapter = 20;
    long index, start;
    unsigned short size;
    long *offsets[2][2] = {{otbks, otcps}, {ntbks, ntcps}};

    // Ten Commandments: Exodus 20:1-17
    for (long verse = 1; verse <= 17; ++verse) {
        // book -> chapter -> verse lookup through the canon.h tables
        offset = offsets[testament-1][0][book];
        offset = offsets[testament-1][1][(int)offset + chapter];
        // each .vss entry is 6 bytes: 4-byte start offset + 2-byte size
        index = (offset + verse) * 6;
        lseek(otFdVss, index, SEEK_SET);
        read(otFdVss, &start, 4);
        read(otFdVss, &size, 2);
        lseek(otFdRawText, start, SEEK_SET);
        read(otFdRawText, buf, (int)size);
        buf[size] = 0;
        printf("%s\n", buf);
    }
    return 0;
}
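For anyone following the lookup in the program above, here is a minimal sketch of just the index step, working on in-memory buffers instead of file descriptors. It assumes only what that program shows: each index entry is 6 bytes, a 4-byte start offset followed by a 2-byte size. The little-endian byte order is an assumption (it matches a raw read on x86), and the names IndexEntry, decodeIndexEntry and readEntry are made up for illustration; they are not part of the SWORD API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// One entry of the verse index as the program above reads it:
// a 4-byte start offset followed by a 2-byte size (6 bytes per entry).
struct IndexEntry {
    std::uint32_t start;
    std::uint16_t size;
};

// Decode entry number `slot` from an in-memory copy of the index file.
// Assumption: both fields are stored little-endian.
IndexEntry decodeIndexEntry(const std::vector<unsigned char> &index, std::size_t slot)
{
    const std::size_t pos = slot * 6;
    if (pos + 6 > index.size())
        throw std::out_of_range("index slot past end of index data");

    IndexEntry e;
    e.start = static_cast<std::uint32_t>(index[pos])
            | (static_cast<std::uint32_t>(index[pos + 1]) << 8)
            | (static_cast<std::uint32_t>(index[pos + 2]) << 16)
            | (static_cast<std::uint32_t>(index[pos + 3]) << 24);
    e.size = static_cast<std::uint16_t>(index[pos + 4] | (index[pos + 5] << 8));
    return e;
}

// Return the text that entry `slot` points at, bounds-checked against the
// data buffer so a bad index entry cannot read past the end.
std::string readEntry(const std::string &data,
                      const std::vector<unsigned char> &index,
                      std::size_t slot)
{
    const IndexEntry e = decodeIndexEntry(index, slot);
    if (static_cast<std::size_t>(e.start) + e.size > data.size())
        throw std::out_of_range("index entry points past end of text data");

    std::string text = data.substr(e.start, e.size);
    assert(text.size() == e.size);  // sanity check on the slice length
    return text;
}
```

Calling readEntry(data, index, slot) with slot equal to (offset + verse) from the canon.h tables would correspond to one pass of the loop above. Sizing the result from the entry itself, rather than using a fixed 1024-byte buffer, also sidesteps an overflow if an entry is ever longer than expected.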
_______________________________________________ sword-devel mailing list [EMAIL PROTECTED] http://www.crosswire.org/mailman/listinfo/sword-devel