> > > Wouldn't this function be a candidate for crawling into the 'strutils' > > > unit? > > > > Probably, but I would change the interface: > > > > function Split(const str: string; const separator: string; var Res : array of > > string) : Integer; > > > > So the function returns the number of elements in the array. > > The function could also use some speedup. Deleting the parts > > which are no longer needed causes a lot of memory fragmentation.
> I'd second your opinion about the interface, but regarding speedup and > memory fragmentation makes me think. > > For speed I'd need some hints, what technique would be faster/fastest? [HUMOR ON] This is not good. All you asked for was a clean function that did a job. Then you had some people come in and *assume that there was a bottleneck* in your application, which was never indeed ever stated. Knuth, are you there? Knuth!!!! Are you there? I'm going to take a wild guess here... Most of the time when programmers need to split a string into an array, it is fairly small strings. Not often would one have a huge 500MB string with millions of delimiters in them. If one does have a 500MB string with millions of delimiters in them, THEN WE worry about optimization. ON THE OTHER HAND... if it is a case that some people quite often are splitting up large strings.. and if Knuth was wrong... then... I ponder possibly using something called a CapArray which is like my CapString algorithm. Preset the array to about or 100 (or 500 if you anticipate it.. the initial capacitance).. and grow the array chunk by chunk while the single pass parser eats up the strings between delimitters and throws them into the array. Only one chunk growth is needed if the capacitance is estimated correctly or near correctly. Instead of using the Occurs() function which means that a double pass parser is needed.. it could instead be a single pass parser with the array items being added to the CapArray one by one, only triggering chunk growth if the array capacitance becomes filled. When the items are finished being added to the caparray, the caparray is set back to the count strings that was incremented while parsing. Or one could use several stack based Fixed arrays.. for example array[0 to 99] if you only anticipate 100 .. but if there are more than 100, then pop another array2 open using a pointer to the first array (second array is on the heap, but at least the first one wasn't). But, But But..No one has yet informed us that: a) there is a bottleneck b) the string will ever be large enough for any of the above optimizations to have ANY affect Take for example if we were splitting an HTTP request string of some sort somedata;somedata2;somedata3; or somedata=ccc&somedata2=bbb&somedata3=aaa We only have to chop it three times by the ampersands, and then split it once by equal signs.. Making three or six setlength calls isn't going to kill anybody. That's the least of your worries. It would be a worry if you had a huge stream of text.. like a 500MB file, yes. Who said we had a 500mb file? Who has done a survey showing that splitting is usually done on 500MB files? There's More.. But wait.. why use Array's, when a stringlist could be used instead? Then our code becomes even more complex because someone has to allocate the stringlist with a create, and then someone has to free it. Hmm. Or, since arrays and stringlists are actually just poor forms of databases, maybe we should build the split function to pump out the data into an an SQL database instead of an array, as a premature databasization (kind of like a premature optimization). After all, if you have a 500MB file with millions of delimiters in it, this data surely needs to be managed, doesn't it? But why? Maybe all the guy wanted was a simple array and maybe speed was the absolute last thing the guy was worrying about. Okay, but this still helped me brainwash the idea of a CapArray, even if this all was just academic premature optimization. Plus, if it is fairly easy to optimize a function, i.e. it doesn't take much time, then it isn't so harmful to optimize it. Plus, it is fun to solve puzzles too, hence the shootout contest and all that stuff. [HUMOR OFF] _______________________________________________ fpc-pascal maillist - fpc-pascal@lists.freepascal.org http://lists.freepascal.org/mailman/listinfo/fpc-pascal