Re: Reading in dictionary from txt file: options for speed

2009-04-20 Thread Miles
Exactly what I was thinking last night. I did this this morning -- split up the file into 26 files. Now everything is great. Thanks everyone! On Thu, Apr 16, 2009 at 11:16 PM, Greg Guerin wrote: > Miles wrote: > > I'm creating a game where the dictionary file will never get modified, >> so
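For anyone following along, here is a rough sketch of how a lookup could pick which of the 26 files to load. It assumes the split was done by first letter, which the message does not actually say, and the file naming scheme (dict_a.txt through dict_z.txt) and the function name sub_dictionary_name are made up for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: writes the name of the sub-dictionary file that would
   hold `word` into `out`. Assumes one file per first letter, named
   dict_a.txt .. dict_z.txt (an invented convention, not from the thread).
   Returns 0 on success, -1 if the word does not start with a letter a-z
   or the output buffer is too small. */
static int sub_dictionary_name(const char *word, char *out, size_t out_size)
{
    assert(word != NULL && out != NULL);

    /* Fold the first character to lower case so "Joy" and "joy" agree. */
    int c = tolower((unsigned char)word[0]);
    if (c < 'a' || c > 'z')
        return -1;

    int n = snprintf(out, out_size, "dict_%c.txt", c);
    return (n > 0 && (size_t)n < out_size) ? 0 : -1;
}
```

With this scheme only one small file has to be read per lookup, which is presumably where the speedup comes from.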

Re: Reading in dictionary from txt file: options for speed

2009-04-17 Thread Michael Ash
On Thu, Apr 16, 2009 at 10:52 PM, Marcel Weiher wrote: > > On Apr 16, 2009, at 18:59 , Michael Ash wrote: > >> On Thu, Apr 16, 2009 at 2:47 PM, WT wrote: >>> >>> since he'll be dealing with the string's raw bytes, won't Miles have to >>> manually add a null byte to terminate the search string? >>

Re: Reading in dictionary from txt file: options for speed

2009-04-16 Thread Greg Guerin
Miles wrote: I'm creating a game where the dictionary file will never get modified, so I'm not really worried about that. I was pretty sure the dictionary was read-only, but that doesn't mean it's always error-free. I was actually thinking of production errors, where the dictionary

Re: Reading in dictionary from txt file: options for speed

2009-04-16 Thread Marcel Weiher
On Apr 16, 2009, at 18:59 , Michael Ash wrote: On Thu, Apr 16, 2009 at 2:47 PM, WT wrote: since he'll be dealing with the string's raw bytes, won't Miles have to manually add a null byte to terminate the search string? The strnstr() function takes a length, and can thus be safely used on

Re: Reading in dictionary from txt file: options for speed

2009-04-16 Thread Michael Ash
On Thu, Apr 16, 2009 at 2:47 PM, WT wrote: > Marcel, > > since he'll be dealing with the string's raw bytes, won't Miles have to > manually add a null byte to terminate the search string? The strnstr() function takes a length, and can thus be safely used on buffers which contain no NUL byte. On
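To illustrate the point about length-bounded searching: strnstr() is a BSD/Mac OS X function and is not available everywhere, so the sketch below uses a small hand-written equivalent instead. The name find_bytes is hypothetical, not a library call. Both the buffer and the search string are passed with explicit lengths, so neither needs a NUL byte.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a library function): returns a pointer to the
   first occurrence of `needle` (needle_len bytes) within the first hay_len
   bytes of `hay`, or NULL if there is none. Neither buffer has to be
   NUL-terminated, because the search never reads past the given lengths. */
static const char *find_bytes(const char *hay, size_t hay_len,
                              const char *needle, size_t needle_len)
{
    assert(hay != NULL && needle != NULL);

    if (needle_len == 0)
        return hay;
    if (needle_len > hay_len)
        return NULL;

    /* Last start position at which the needle still fits in the buffer. */
    size_t last = hay_len - needle_len;
    for (size_t i = 0; i <= last; i++) {
        if (hay[i] == needle[0] && memcmp(hay + i, needle, needle_len) == 0)
            return hay + i;
    }
    return NULL;
}
```

This is a plain linear scan, so it is no faster than strstr(); the only difference is that it is safe on raw bytes such as those returned by -[NSData bytes].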

Re: Reading in dictionary from txt file: options for speed

2009-04-16 Thread Miles
I'm creating a game where the dictionary file will never get modified, so I'm not really worried about that. The strings that I am searching for are not from user input, but from user selection, so I'm guaranteed to have the case be correct -- so I don't need to worry about upper vs lower-case

Re: Reading in dictionary from txt file: options for speed

2009-04-16 Thread Greg Guerin
Miles wrote: const char *fileBytes = [stringFileContents bytes]; char *ptr = strstr(fileBytes, cString); Are you certain the bytes returned by [stringFileContents bytes] are null-terminated? How was that data initialized? If it's reading from a file, is t

Re: Reading in dictionary from txt file: options for speed

2009-04-16 Thread Marcel Weiher
On Apr 16, 2009, at 12:01 , Miles wrote: It looks like I have the search working like this, but I have to double-space the dictionary file to have a leading \n. No, you just need one initial extra newline at the start, the newline from the end of the last string matches up with the start
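A minimal sketch of the single-leading-newline idea, assuming the dictionary buffer looks like "\nWORD1\nWORD2\n...\nWORDn\n" (one extra newline at the very start, one word per line). The function name dict_contains is made up for illustration. Each word is delimited by the newline that ends the previous entry and the newline that ends its own line, so no double spacing is needed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: reports whether `word` appears as a whole line in
   `dict`. Assumes `dict` holds dict_len bytes of the form
   "\nWORD1\nWORD2\n...\nWORDn\n" and is not necessarily NUL-terminated. */
static bool dict_contains(const char *dict, size_t dict_len, const char *word)
{
    assert(dict != NULL && word != NULL);

    size_t wlen = strlen(word);
    size_t klen = wlen + 2;            /* leading '\n' + word + trailing '\n' */
    if (wlen == 0 || klen > dict_len)
        return false;

    for (size_t i = 0; i + klen <= dict_len; i++) {
        /* A match must start right after a newline and end right before one,
           which rules out partial hits such as "JOY" inside "JOYFUL". */
        if (dict[i] == '\n' &&
            memcmp(dict + i + 1, word, wlen) == 0 &&
            dict[i + 1 + wlen] == '\n')
            return true;
    }
    return false;
}
```

The trailing newline of one entry doubles as the leading newline of the next, which is why consecutive words still match.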

Re: Reading in dictionary from txt file: options for speed

2009-04-16 Thread Kyle Sluder
On Thu, Apr 16, 2009 at 8:09 PM, Miles wrote: > char *ptr = strstr(fileBytes, cString); OK, this makes sense, but you're still doing a linear scan through the data you've loaded (and I'm assuming creating NSString objects from what you find because plain old C strings wi

Re: Reading in dictionary from txt file: options for speed

2009-04-16 Thread Miles
I'm all for just a little guidance, and thank you for it. Everything you said makes sense now, and I have it working: NSString *searchStr= @"\nJOY\n"; const char *cString = [searchStr UTF8String]; const char *fileBytes = [stringFileContents bytes]; char *ptr

Re: Reading in dictionary from txt file: options for speed

2009-04-16 Thread Scott Ribe
No null terminator... Now it's perfectly possible the bytes will get written into a zeroed block, and so there will be a null terminator purely by chance sometimes, and not other times. Why the NSData? Why not just get a C string from searchStr, if that's what you want? Otherwise, in general an N

Re: Reading in dictionary from txt file: options for speed

2009-04-16 Thread Greg Guerin
Miles wrote: NSString *searchStr= @"\njoy\n"; NSData *strData= [searchStr dataUsingEncoding:NSUTF8StringEncoding]; const char *strBytes= [strData bytes]; Think about what you're doing here, then look at the NSString method - UTF8String. Also, think about

Re: Reading in dictionary from txt file: options for speed

2009-04-16 Thread Miles
Also, any idea why these lines would give different results in different projects? NSString *searchStr= @"\njoy\n"; NSData *strData= [searchStr dataUsingEncoding:NSUTF8StringEncoding]; const char *strBytes= [strData bytes]; When this code is in the project you

Re: Reading in dictionary from txt file: options for speed

2009-04-16 Thread Miles
It looks like I have the search working like this, but I have to double-space the dictionary file to have a leading \n. NSString *searchStr= @"\njoy\n"; NSData *strData= [searchStr dataUsingEncoding:NSUTF8StringEncoding]; const char *strBytes= [strData bytes];

Re: Reading in dictionary from txt file: options for speed

2009-04-16 Thread WT
Marcel, since he'll be dealing with the string's raw bytes, won't Miles have to manually add a null byte to terminate the search string? Wagner On Apr 16, 2009, at 7:57 PM, Marcel Weiher wrote: On Apr 16, 2009, at 10:10 , Miles wrote: Marcel, NOW we're talking. This has really been such

Re: Reading in dictionary from txt file: options for speed

2009-04-16 Thread WT
On Apr 16, 2009, at 7:43 PM, Marcel Weiher wrote: Hi Wagner, we have rather impressive hardware these days, and Objective-C can access all that power if you let it. No kidding. Incidentally, the - [start timeIntervalSinceNow] you used in your code is a really clever trick for getting e

Re: Reading in dictionary from txt file: options for speed

2009-04-16 Thread Marcel Weiher
On Apr 16, 2009, at 10:10 , Miles wrote: Marcel, NOW we're talking. This has really been such an eye-opening thread. Now it's googling time to try to figure out how to search for a string in there. 1. Get the bytes out of your search string in the encoding that your dictionary is in 2.

Re: Reading in dictionary from txt file: options for speed

2009-04-16 Thread Marcel Weiher
Hi Wagner, we have rather impressive hardware these days, and Objective-C can access all that power if you let it. The 0.007 time for the simulator you got sounds about right, my 0.043 was a typo, I was missing a leading zero (fast MacPro). Incidentally, the - [start timeIntervalSinceN

Re: Reading in dictionary from txt file: options for speed

2009-04-16 Thread Miles
Marcel, NOW we're talking. This has really been such an eye-opening thread. Now it's googling time to try to figure out how to search for a string in there. Thanks! On Wed, Apr 15, 2009 at 7:01 PM, WT wrote: > Hi Marcel, > > that's quite impressive. On the simulator on my machine, it took

Re: Reading in dictionary from txt file: options for speed

2009-04-15 Thread WT
Hi Marcel, that's quite impressive. On the simulator on my machine, it took 0.007 seconds, consistently. Learned something new with your message. Thanks! Wagner On Apr 16, 2009, at 12:35 AM, Marcel Weiher wrote: I would do the following: 1. map the file into memory using -[NSData dataWi
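The quoted step 1 maps the file through NSData. As a rough plain-C analogue (a swapped-in technique, not the NSData call from the thread), the same effect can be sketched with POSIX mmap(). The helper names map_file and unmap_file are hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: maps the whole file at `path` read-only and stores
   its size in *len_out. Returns NULL (and a length of 0) on failure or for
   an empty file. The mapped bytes are NOT NUL-terminated. */
static const char *map_file(const char *path, size_t *len_out)
{
    assert(path != NULL && len_out != NULL);
    *len_out = 0;

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size <= 0) {
        close(fd);
        return NULL;
    }

    void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                      /* the mapping stays valid after close */
    if (p == MAP_FAILED)
        return NULL;

    *len_out = (size_t)st.st_size;
    return (const char *)p;
}

/* Releases a mapping obtained from map_file(). */
static void unmap_file(const char *p, size_t len)
{
    if (p != NULL && len > 0)
        munmap((void *)p, len);
}
```

Pages are read in lazily as they are touched, so nothing is parsed or copied up front. Because the mapping has no terminating NUL, any search over it has to be bounded by the returned length.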

Re: Reading in dictionary from txt file: options for speed

2009-04-15 Thread Marcel Weiher
On Apr 14, 2009, at 11:12 , Miles wrote: I'm trying to find the best way to load in a 2MB text file of dictionary words and be able to do quick searches. Simply loading the uncompressed txt file takes about 0.5 seconds which I can handle. What do you do to load the txt file? 0.5 second

Re: Reading in dictionary from txt file: options for speed

2009-04-14 Thread Michael Ash
On Tue, Apr 14, 2009 at 8:26 PM, Kyle Sluder wrote: > On Tue, Apr 14, 2009 at 7:27 PM, Michael Ash wrote: >> I should specify, it has no trouble reading a misaligned int pointer >> *in x86-64 mode*. I did actually test it that way, although the first >> time I ran the test I compiled it 32-bit an

Re: Reading in dictionary from txt file: options for speed

2009-04-14 Thread WT
"All you need then is a search function, which essentially traverses the tree, concatenating keys and comparing them with the target word." Actually, I just realized that that's pretty stupid. The efficient way is to split the word into its characters and traverse the tree, testing each suc

Re: Reading in dictionary from txt file: options for speed

2009-04-14 Thread WT
You know, Miles, I've been thinking about something else you asked earlier, about storing the trie on disk and loading it that way, rather than load the data first and build the trie afterwards. A trie is a tree structure, and so is a plist, so I think you could combine both and save time i
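For reference, a minimal in-memory trie sketch in C, with the lookup done one character at a time as suggested elsewhere in this thread. It assumes words made only of the letters a-z (case-insensitive); the type and function names are invented for illustration, and this is not the plist-backed layout discussed above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define ALPHABET 26

/* One node per prefix; is_word marks prefixes that are complete words. */
typedef struct TrieNode {
    struct TrieNode *child[ALPHABET];
    bool is_word;
} TrieNode;

/* Allocates an empty node (all children NULL), or returns NULL on failure. */
static TrieNode *trie_new(void)
{
    return calloc(1, sizeof(TrieNode));
}

/* Maps a letter to 0..25, or -1 for anything outside a-z / A-Z. */
static int letter_index(char c)
{
    if (c >= 'a' && c <= 'z') return c - 'a';
    if (c >= 'A' && c <= 'Z') return c - 'A';
    return -1;
}

/* Adds `word`, creating nodes as needed. Returns false on a non-letter
   character or allocation failure (nodes already created are left in place). */
static bool trie_insert(TrieNode *root, const char *word)
{
    assert(root != NULL && word != NULL);
    TrieNode *n = root;
    for (; *word != '\0'; word++) {
        int i = letter_index(*word);
        if (i < 0)
            return false;
        if (n->child[i] == NULL) {
            n->child[i] = trie_new();
            if (n->child[i] == NULL)
                return false;
        }
        n = n->child[i];
    }
    n->is_word = true;
    return true;
}

/* Follows one child pointer per character and stops at the first missing
   branch, so a lookup costs at most strlen(word) steps. */
static bool trie_contains(const TrieNode *root, const char *word)
{
    assert(root != NULL && word != NULL);
    const TrieNode *n = root;
    for (; *word != '\0'; word++) {
        int i = letter_index(*word);
        if (i < 0 || n->child[i] == NULL)
            return false;
        n = n->child[i];
    }
    return n->is_word;
}
```

The trade-off is memory: 26 pointers per node adds up quickly for a dictionary of a few hundred thousand words, which matters on a phone.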

Re: Reading in dictionary from txt file: options for speed

2009-04-14 Thread Miles
Yep, I used your dictionaries, not mine. I'm on a Core 2 2.8 GHz MacBook Pro. On Tue, Apr 14, 2009 at 6:15 PM, WT wrote: > On Apr 15, 2009, at 2:48 AM, Miles wrote: > > Sorry, Wagner, I'm a little spaced -- I didn't realize your test included >> getting the contents into an array! This is great

Re: Reading in dictionary from txt file: options for speed

2009-04-14 Thread WT
On Apr 15, 2009, at 2:48 AM, Miles wrote: Sorry, Wagner, I'm a little spaced -- I didn't realize your test included getting the contents into an array! This is great. No harm, no foul. :) Here are some VERY interesting results. The simulator seems to be faster with bin and text, while the

Re: Reading in dictionary from txt file: options for speed

2009-04-14 Thread Miles
Sorry, Wagner, I'm a little spaced -- I didn't realize your test included getting the contents into an array! This is great. Here are some VERY interesting results. The simulator seems to be faster with bin and text, while the device is quite a bit slower with text, and about even with bin and xml

Re: Reading in dictionary from txt file: options for speed

2009-04-14 Thread Kyle Sluder
On Tue, Apr 14, 2009 at 7:27 PM, Michael Ash wrote: > I should specify, it has no trouble reading a misaligned int pointer > *in x86-64 mode*. I did actually test it that way, although the first > time I ran the test I compiled it 32-bit and then felt kind of > stupid You're right, I haven't

Re: Reading in dictionary from txt file: options for speed

2009-04-14 Thread WT
On Apr 15, 2009, at 1:23 AM, Miles wrote: There are a handful of games out there that are small in size, have fast loading times, and quick dictionary checking. If only I knew how they did it! Have you considered contacting their authors directly and asking? You might get lucky and find

Re: Reading in dictionary from txt file: options for speed

2009-04-14 Thread WT
On Apr 15, 2009, at 1:23 AM, Miles wrote: *2)* Wagner, thanks for that test, that's great to know! I'll check it out in a bit, right now I'm focusing on saving time getting the data into an array or some other format to suit my needs. Then you might want to look at what I did sooner rather

Re: Reading in dictionary from txt file: options for speed

2009-04-14 Thread Kyle Sluder
On Tue, Apr 14, 2009 at 7:23 PM, Miles wrote: > *1)* I've been trying Kyle's suggestion for a few hours and I can't get it > working right. I broke it into this simple example, and it's not able to > convert it to the 'word' struct. At this point, better solutions that involve less hackery have b

Re: Reading in dictionary from txt file: options for speed

2009-04-14 Thread Michael Ash
On Tue, Apr 14, 2009 at 7:34 PM, Kyle Sluder wrote: > On Tue, Apr 14, 2009 at 7:26 PM, Michael Ash wrote: >> That has to be the most confusing technical document I've ever seen. >> I'm not certain, but after reading it about three times, I'm pretty >> sure that it's discussing data alignment on I

Re: Reading in dictionary from txt file: options for speed

2009-04-14 Thread Kyle Sluder
On Tue, Apr 14, 2009 at 7:26 PM, Michael Ash wrote: > That has to be the most confusing technical document I've ever seen. > I'm not certain, but after reading it about three times, I'm pretty > sure that it's discussing data alignment on IA-64 (Itanium), not > x86-64. Microsoft certainly thinks t

Re: Reading in dictionary from txt file: options for speed

2009-04-14 Thread Michael Ash
On Tue, Apr 14, 2009 at 7:26 PM, Michael Ash wrote: > And a quick test confirms that my Mac Pro, at least, has no trouble > reading a misaligned int pointer. I should specify, it has no trouble reading a misaligned int pointer *in x86-64 mode*. I did actually test it that way, although the first

Re: Reading in dictionary from txt file: options for speed

2009-04-14 Thread Michael Ash
On Tue, Apr 14, 2009 at 7:18 PM, Kyle Sluder wrote: > On Tue, Apr 14, 2009 at 7:09 PM, Michael Ash wrote: >> This is not so. It's extremely rare to find a platform which >> *requires* aligned access, and you certainly won't find one running OS >> X. What's more common is finding a platform which

Re: Reading in dictionary from txt file: options for speed

2009-04-14 Thread Miles
You guys are awesome! *1)* I've been trying Kyle's suggestion for a few hours and I can't get it working right. I broke it into this simple example, and it's not able to convert it to the 'word' struct. NSMutableData *data1; NSString *myString = @"\\x06hello\\x00"; const char *utfMyString = [mySt

Re: Reading in dictionary from txt file: options for speed

2009-04-14 Thread Kyle Sluder
On Tue, Apr 14, 2009 at 7:09 PM, Michael Ash wrote: > This is not so. It's extremely rare to find a platform which > *requires* aligned access, and you certainly won't find one running OS > X. What's more common is finding a platform which *prefers* aligned > access, punishing misaligned access wi

Re: Reading in dictionary from txt file: options for speed

2009-04-14 Thread Michael Ash
On Tue, Apr 14, 2009 at 3:03 PM, Kyle Sluder wrote: > 2) Pointer access needs to be 4-byte aligned (on PC, don't know about > iPhone); This is not so. It's extremely rare to find a platform which *requires* aligned access, and you certainly won't find one running OS X. What's more common is findi
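Whatever a given CPU does with misaligned loads, one way to sidestep the question when pulling integers out of a packed byte buffer is to copy them out with memcpy() rather than casting the pointer. This is a general C idiom, not something proposed in the thread, and the function name read_u32_unaligned is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reads a 32-bit value in host byte order from an
   address with no alignment guarantee. Copying through memcpy avoids
   dereferencing a misaligned uint32_t pointer; compilers typically turn
   this into a single load where the hardware allows it. */
static uint32_t read_u32_unaligned(const void *p)
{
    assert(p != NULL);
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
```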

Re: Reading in dictionary from txt file: options for speed

2009-04-14 Thread Michael Ash
On Tue, Apr 14, 2009 at 2:12 PM, Miles wrote: > [This is sort of in continuation of the thread "Build Settings for Release: > App/Library is bloated", which gradually changed topics.] > I'm trying to find the best way to load in a 2MB text file of dictionary > words and be able to do quick searche

Re: Reading in dictionary from txt file: options for speed

2009-04-14 Thread WT
Hi Miles, I wrote a little iPhone app to test loading the standard UNIX dictionary (/usr/share/dict/web2, 234,936 words). If you'd like, you can download the Xcode 3.1.x project from here: http://www.restlessbrain.com/DictTest.zip I don't actually have an iPhone, so I only tested it on the

Re: Reading in dictionary from txt file: options for speed

2009-04-14 Thread Kyle Sluder
On Tue, Apr 14, 2009 at 2:12 PM, Miles wrote: > I'm not super concerned about the 2MB of disk space the txt file takes up, > although I wouldn't be mad about decreasing it somehow. And once I get the > whole dictionary in an array, the searches are basically fast enough for my > purposes. I've sti

Re: Reading in dictionary from txt file: options for speed

2009-04-14 Thread WT
Have you tried splitting the full dictionary into sub-dictionaries (as an offline, pre-processing step) and then having your application load them in sequence, one at a time? It might be that creating separate arrays and then joining them is faster than creating one array for the entire dic

Reading in dictionary from txt file: options for speed

2009-04-14 Thread Miles
[This is sort of in continuation of the thread "Build Settings for Release: App/Library is bloated", which gradually changed topics.] I'm trying to find the best way to load in a 2MB text file of dictionary words and be able to do quick searches. Simply loading the uncompressed txt file takes abou