On 8/13/2012 12:13 PM, Ken Ray wrote:
> On Aug 13, 2012, at 11:10 AM, James Hale wrote:
>
>> Is this a bug?
> No, it's a 'convention'… it mimics the way that HyperCard recognized a
> "word"; as stated in the Dictionary under "word":
>
> "A word is delimited by one or more spaces, tabs, or returns, or enclosed by
> double quotes. A single word can contain multiple characters and multiple
> items, but not multiple lines."
>
>> If not, has anyone got a workaround that doesn't require me testing for a
>> punctuation character at the end of every word or replacing them all with
>> spaces?
> As Mark pointed out, the use of "token" helps separate the wheat from the
> chaff (see the entry on "token" in the Dictionary for how a token is defined).

One caution: token does not separate . (period), ! (exclamation mark), or ? (question mark) from the word they follow. If you are really trying to process English text, you will probably want to write your own punctuation remover, since it can then tell the difference between a period at the end of a sentence and a period at the end of an abbreviation like "Dr." or "Mr."

--
Paul Dupuis
Cofounder
Researchware, Inc.
http://www.researchware.com/
http://www.twitter.com/researchware
http://www.facebook.com/researchware
http://www.linkedin.com/company/researchware-inc
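
For illustration only, here is a minimal sketch of the kind of punctuation remover described above. It is written in Python rather than LiveCode, purely to show the logic. The abbreviation list, the set of punctuation characters, and the function name are all assumptions made for this example, not anything from the Dictionary or from this thread.

```python
# Hypothetical abbreviation list (an assumption); extend it for your own text.
ABBREVIATIONS = {"dr.", "mr.", "mrs.", "ms.", "prof."}

# Punctuation characters to strip from the end of each word (also an assumption).
TRAILING = ".,!?;:"


def strip_trailing_punctuation(text):
    """Split text on whitespace and remove trailing punctuation from each
    word, leaving the period on known abbreviations such as "Dr." intact."""
    words = []
    for word in text.split():
        # Only strip when the whole word is not a known abbreviation.
        if word.lower() not in ABBREVIATIONS:
            word = word.rstrip(TRAILING)
        # Skip anything that was punctuation only.
        if word:
            words.append(word)
    return words


if __name__ == "__main__":
    print(strip_trailing_punctuation("Dr. Smith arrived. Was Mr. Jones late?"))
```

A simple lookup like this cannot tell when an abbreviation also ends a sentence, so it always keeps the period on a listed abbreviation. Real text will likely need a longer list and some handling for quotes and brackets.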