Hi Bob,

I have an app that I use for manuscript analysis and have done some of this 
"prep."

For example, to deal with decimal points, since the character following the 
decimal point is always a digit (0-9):

   repeat with i = 0 to 9
      replace "." & i with "\" & i in tText
   end repeat

So 3.14 becomes 3\14 and 2.3456 becomes 2\3456.
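
Putting the periods back afterwards is the same loop in reverse. A rough 
sketch, assuming the text had no backslashes of its own before the prep step 
(otherwise a different placeholder character would be needed):

   -- assumption: tText contained no "\" before the prep step
   repeat with i = 0 to 9
      replace "\" & i with "." & i in tText
   end repeat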

But that's only the beginning. There are a massive number of abbreviations: 
Mr. Smith, etc., Jan., A.D., Dr.

I have a list of about 70.
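
The idea is the same as for the decimals. Here is a cut-down sketch of it, 
not my actual script: tAbbrevs is a made-up variable holding just four 
entries, and it again assumes the text has no backslashes of its own.

   -- tAbbrevs: illustrative list, one abbreviation per line
   put "Mr." & cr & "Dr." & cr & "Jan." & cr & "A.D." into tAbbrevs
   -- match case exactly, so "Jan." does not also hit the end of "trojan."
   set the caseSensitive to true
   repeat for each line tAbbrev in tAbbrevs
      -- build the masked form, e.g. "A.D." becomes "A\D\"
      put tAbbrev into tMasked
      replace "." with "\" in tMasked
      replace tAbbrev with tMasked in tText
   end repeat

This is a plain substring replace, so it can still misfire. In "He died in 
44 A.D." the abbreviation ends the sentence, and masking it hides the 
sentence break as well.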

But then there are names: P. G. Wodehouse and on and on.
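
Initials can be caught, very roughly, by looking for a word that is just one 
capital letter plus a period. A sketch under that assumption, using "\" as 
the placeholder again and plain A-Z only:

   set the caseSensitive to true
   repeat with i = 1 to the number of words of tText
      put word i of tText into tWord
      if the length of tWord is 2 and char 2 of tWord is "." then
         if char 1 of tWord is in "ABCDEFGHIJKLMNOPQRSTUVWXYZ" then
            -- mask the period of a lone initial such as "P."
            put "\" into char 2 of word i of tText
         end if
      end if
   end repeat

It is easy to fool. "So did I." and "We went with plan B." both end in 
something that looks like an initial, and initials inside a double-quoted 
passage are missed because LiveCode treats a quoted string as a single word.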

I find my script useful for my own needs, but I don't see any commercial 
application. Just too many unexpected places where a period will show up.

So I really can't see the purpose of RR's "sentence" chunk. I wish they would 
explain.

Jim

> 
> Message: 25
> Date: Wed, 12 Mar 2014 23:58:46 +0000
> From: Bob Sneidar <bobsnei...@iotecdigital.com>
> To: How to use LiveCode <use-livecode@lists.runrev.com>
> Subject: Re: New chunks
> Message-ID: <e3ab1833-c8fc-47f0-af8e-358e6ec80...@iotecdigital.com>
> Content-Type: text/plain; charset="Windows-1252"
> 
> Pretty sure LiveCode is going to do a simple delimiter on period. You would 
> have to prep the data first by replacing periods in any word that is a number 
> with a placeholder, processing your sentences, then restoring the 
> placeholders (if you need to). 
> 
> You could get fancy by setting the lineDelimiter to space, then finding every 
> line that ends in a period and processing everything in-between. It's 
> doubtful a number would end in a period without it being the end of a 
> sentence. 
> 
> Bob
> 
> 
> On Mar 11, 2014, at 15:34 , Jim Hurley <jhurley0...@sbcglobal.net> wrote:
> 
>> Can someone explain how the "sentence" chunk would work?
>> How are decimal points, and points in an abbreviation, distinguished from 
>> the "period" that delineates the end of a "sentence"?
>> Does it presume that the existing text has special embedded "periods"?
>> 
>> I've written my own, but it is very cumbersome and not flawless. I use it to 
>> do manuscript analysis.
>> Like: Find all sentences in which "time" and "party" occur anywhere in the 
>> same sentence.
>> 
>> My ignorance of Unicode is profound.
>> Jim
> 


_______________________________________________
use-livecode mailing list
use-livecode@lists.runrev.com
Please visit this url to subscribe, unsubscribe and manage your subscription 
preferences:
http://lists.runrev.com/mailman/listinfo/use-livecode
