Hi Terry, I see, thanks for sharing your handler.  I'm going to run it on
some text and see the output.  LC is so good with chunks... I find it
really fast as well.

All the best, Tom



On Thu, Oct 25, 2018 at 5:07 PM, Terry Judd via use-livecode <
use-livecode@lists.runrev.com> wrote:

> On 26/10/2018 4:27 am, "use-livecode on behalf of Tom Glod via
> use-livecode" <use-livecode-boun...@lists.runrev.com on behalf of
> use-livecode@lists.runrev.com> wrote:
>
>     Hi Terry, glad you found a solution.....
>
>     I have a similar challenge.
>
>     I did a word count, but would love to recognize the same phrases.  Did
>     you just compare chunks? ... hash them? (probably redundant?)
>
>     Are there any more hints you can drop about this?
>
>     Thanks,
>
>     Tom
>
> Hi Tom - I've just done something like the code below, which accepts a
> block of text and the maximum 'phrase' length as input and provides an
> array with sorted counts of word runs (so not necessarily sensible phrases)
> of different lengths as output. I think it will be good enough for my
> purposes.
>
> function getWordAndPhraseCounts pText, pMaxPhraseLength
>    -- tA1[n][phrase] holds the raw count of each n-word run
>    put empty into tA1
>    set the itemDel to tab
>    repeat for each sentence tSentence in pText
>       put the number of words in tSentence into tMax
>       -- i is the run length, j is the index of the run's first word
>       repeat with i = 1 to pMaxPhraseLength
>          repeat with j = 1 to (tMax-i+1)
>             -- words are whitespace-delimited, so attached punctuation
>             -- stays part of the phrase
>             put word j to (j+i-1) of tSentence into tPhrase
>             add 1 to tA1[i][tPhrase]
>          end repeat
>       end repeat
>    end repeat
>    -- tA2[n] becomes a list of "phrase<tab>count" lines, most frequent first
>    put empty into tA2
>    repeat for each line tLength in the keys of tA1
>       put empty into tList
>       repeat for each line tPhrase in the keys of tA1[tLength]
>          put tPhrase&tab& tA1[tLength][tPhrase]&cr after tList
>       end repeat
>       -- drop the trailing return before sorting
>       delete last char of tList
>       sort lines of tList descending numeric by item 2 of each
>       put tList into tA2[tLength]
>    end repeat
>    return tA2
> end getWordAndPhraseCounts
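>
> A minimal, untested sketch of how the handler above might be called. The
> field names "Abstracts" and "Results" and the variable tCounts are
> placeholders for illustration only, not part of the original script:
>
> ```livecode
> on mouseUp
>    -- "Abstracts" and "Results" are hypothetical field names
>    put getWordAndPhraseCounts(field "Abstracts", 3) into tCounts
>    -- tCounts[2] lists the two-word runs, most frequent first
>    put line 1 to 10 of tCounts[2] into field "Results"
> end mouseUp
> ```
>
> Each line of tCounts[n] is a phrase, a tab and its count, so with the
> itemDel set to tab, item 2 of a line should give the count.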
>
>
>     On Thu, Oct 25, 2018 at 4:27 AM Terry Judd via use-livecode <
>     use-livecode@lists.runrev.com> wrote:
>
>     > OK - was easier than I thought. I have something that works fast enough
>     > by iterating through runs of words in each sentence in a block of text,
>     > incrementing counts into an array and then sorting the contents of that
>     > array by phrase length and frequency.
>     >
>     > Terry...
>     >
>     > On 25/10/2018 4:56 pm, "use-livecode on behalf of Terry Judd via
>     > use-livecode" <use-livecode-boun...@lists.runrev.com on behalf of
>     > use-livecode@lists.runrev.com> wrote:
>     >
>     >     Hi – I’m looking to analyse some large blocks of text (journal
>     >     abstracts from key educational technology journals over a
>     >     several-year period) to find common words and phrases. Finding
>     >     common words should be easy enough but I’m not clear on what
>     >     approach to take for finding common phrases (iterating through the
>     >     text capturing overlapping word runs of various lengths?). Any ideas
>     >     on how best to proceed?
>     >
>     >     TIA,
>     >
>     >     Terry...
>     >     _______________________________________________
>     >     use-livecode mailing list
>     >     use-livecode@lists.runrev.com
>     >     Please visit this url to subscribe, unsubscribe and manage your
>     > subscription preferences:
>     >     http://lists.runrev.com/mailman/listinfo/use-livecode
>     >
>
