First, thanks to everyone who replied, but especially to Nosanity. Your code reminded me that you can effectively tell when you are inside an encapsulated bit of data by an odd/even count of the encapsulation character. So, for anyone who wants it, here is a generalized function that I just wrote to parse a CSV file, regardless of the field or record delimiters (commas, tabs or whatever) and to deal with encapsulation appropriately.
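
To make the odd/even idea concrete, here is a minimal sketch. The handler name showEncapsulatedItems and its parameter pText are made up for illustration and are not part of the function below. It splits the text on the quote character and labels each piece: even-numbered items fall inside a pair of quotes, odd-numbered items fall outside.

```livecode
-- Hypothetical helper, for illustration only: lists each quote-delimited
-- piece of pText and marks whether it lies inside or outside quotes.
function showEncapsulatedItems pText
   local tResult, tIndex
   set the itemDelimiter to quote
   repeat with tIndex = 1 to the number of items in pText
      if tIndex mod 2 = 0 then
         -- Even item: an odd number of quotes precede it, so it is encapsulated
         put "inside: " & item tIndex of pText & return after tResult
      else
         -- Odd item: an even number of quotes precede it, so it is not encapsulated
         put "outside: " & item tIndex of pText & return after tResult
      end if
   end repeat
   return tResult
end showEncapsulatedItems
```

For the text a,"b,c",d this should report a, as outside, b,c as inside and ,d as outside, so only the comma in the second piece needs protecting.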

This assumes you read the entire CSV file into a variable you pass into pData, so a call would look like:

put csvToArray(myEntireCSVData,return,comma,quote) into myDataAsArray

I have tested it a bit over the last 30 minutes and it works in the cases I tried, but I have not tested it exhaustively or checked performance on large datasets. If anyone uses this and runs into an issue, please let me know.

function csvToArray pData, pRecordDelimiter, pFieldDelimiter, pEncapsulationDelimiter
  local tReservedRecordDelimiter, tReservedFieldDelimiter, tArray

  # Initialize the temporary record and field delimiters. Change these if your CSV file may contain them.
  # numToChar(1) and numToChar(2) are control characters that rarely occur in text data.
  put numToChar(1) into tReservedRecordDelimiter
  put numToChar(2) into tReservedFieldDelimiter

# Step 1: Replace any Record or Field delimiters that are encapsulated with temporary characters
  set itemdel to pEncapsulationDelimiter
  repeat with i = 1 to the number of items in pData
    if trunc(i/2) = (i/2) then
      replace pFieldDelimiter with tReservedFieldDelimiter in item i of pData
      replace pRecordDelimiter with tReservedRecordDelimiter in item i of pData
    end if
  end repeat

  # Step 2: Remove all occurrences of the encapsulation delimiter
  replace pEncapsulationDelimiter with empty in pData

  # Step 3: Parse records and fields into the array, restoring any reserved record and field delimiters in each element
  set itemdel to pFieldDelimiter
  set lineDel to pRecordDelimiter
  repeat with i = 1 to the number of lines in pData
    repeat with j = 1 to the number of items in line i of pData
      get item j of line i of pData
      replace tReservedRecordDelimiter with pRecordDelimiter in it
      replace tReservedFieldDelimiter with pFieldDelimiter in it
      put it into tArray[i][j]
    end repeat
  end repeat

  # Step 4: return the array
  return tArray
end csvToArray
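
As a quick check, a sketch along these lines can be used. The handler name testCsvToArray and the sample data are made up for illustration, and it assumes csvToArray above is in the same script. It builds a two-record CSV whose second record has a quoted field containing a comma, then returns that field from the parsed array.

```livecode
-- Hypothetical test handler; assumes csvToArray (above) is in the same script.
function testCsvToArray
   local tCSV, tArray
   -- Record 1: name,comment
   -- Record 2: Paul,"hello, world"
   put "name,comment" & return & \
         "Paul," & quote & "hello, world" & quote into tCSV
   put csvToArray(tCSV, return, comma, quote) into tArray
   -- Record 2, field 2 should keep its embedded comma and lose its quotes
   return tArray[2][2]
end testCsvToArray
```

If the parse works as intended, testCsvToArray() should return hello, world as a single field. Note that because Step 2 removes every encapsulation character, a doubled quote used to escape a literal quote inside a field is dropped as well.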


--
Paul Dupuis
Cofounder
Researchware, Inc.
http://www.researchware.com/

