This is an interesting problem, Andrew.  Thanks for posting it.

One modest, easy performance gain may be to change this:

  repeat for each element tThisItem in tInventoryArray
         if tThisItem["description"] contains tSearchQuery then
            put tThisItem into tSortedInventoryArray[ \
              (the number of elements of tSortedInventoryArray) + 1]
         end if
  end repeat

...to:

  put 0 into i
  repeat for each element tThisItem in tInventoryArray
         if tThisItem["description"] contains tSearchQuery then
            add 1 to i
            put tThisItem into tSortedInventoryArray[i]
         end if
  end repeat

As originally written, it needs to traverse the entire destination array each time through the loop just to get the next ordinal index number; maintaining that manually in i should be a little faster.

A much faster optimization may be possible by leaving the data as a delimited string. It's hard to say up front whether that will be the case, but converting strings to and from arrays is expensive, and both "repeat for each line..." and "put <foundthing>&cr after..." are very fast operations.
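
As a rough sketch of the delimited-string approach: this assumes the original tab-delimited text is still in a variable (called tInventoryData here, a name I've made up, as is tFoundLines) and that the description is the second tab-delimited item on each line. I haven't run it against your data, so treat it as a starting point rather than a finished handler:

  -- Assumption: tInventoryData holds the raw tab-delimited text,
  -- one record per line, with the description in item 2.
  set the itemDelimiter to tab
  put empty into tFoundLines
  repeat for each line tLine in tInventoryData
     if item 2 of tLine contains tSearchQuery then
        -- appending with "after" avoids any index bookkeeping
        put tLine & cr after tFoundLines
     end if
  end repeat
  delete the last char of tFoundLines -- drop the trailing return

The result is a return-delimited list of matching records, which can be displayed directly or converted to an array afterward if the rest of the code needs one.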

If these ideas don't yield the performance gain you're looking for, and if the data is not sensitive, feel free to email the data and a sample stack with your current search scripts and I'll see what I can do.

It's useful for much of the work I do to know which methods of querying data will perform better, and I rarely come across good real-world data like this, so I'd be happy to give it a shot to see what can be learned from it.

My test data no doubt differs from yours, but FWIW most of my testing has been done with data containing between 10k and 100k records, and few take as long as a second or two for a given query.

Either some of what I've learned may help your situation, or your situation will teach me new things to consider.

--
 Richard Gaskin
 Fourth World Systems


Andrew Bell wrote:
Is there a quick way to search a large multidimensional array that I am missing? I'm working on an inventory system and trying to figure out some more efficient methods.

Currently I'm taking a tab-delimited spreadsheet provided by the client and converting it to an array, but there are currently > 48000 keys in the array so my repeat loop for searching is taking several minutes. I quickly figured out that making the barcode (unique value) the primary key of the array cut down the time for a simple SKU search, but I'm also trying to search based on other values (like the item description).


A line of sample data looks like this:
66290 PHOTO, Early to Mid 1960's, Womens Hair Style, 27x21" Blue Background w/ White Vine Edging, Gold Frame 1 $200.00


An item in the array looks like this:
tInventoryArray[66290]["barcode"]
tInventoryArray[66290]["description"]
tInventoryArray[66290]["details"]
tInventoryArray[66290]["qty"]
tInventoryArray[66290]["cost"]


My slow, albeit working, search code looks like this:
repeat for each element tThisItem in tInventoryArray
       if tThisItem["description"] contains tSearchQuery then
          put tThisItem into tSortedInventoryArray[(the number of elements of tSortedInventoryArray) + 1]
       end if
end repeat


This does work, but is taking almost 2 minutes to search through the 48000+ item database. Can someone point out a flaw in my process? My next experiment is converting this array to a SQLite database and just throwing SELECT * WHERE commands at it.

--Andrew Bell

