I've already shovelled Ruyton of the Eleven Towns quite effectively:

https://www.dropbox.com/s/n7r7u0c2m9ny3eb/Text%20analyzer%20X.livecode.zip?dl=0

No tokenising, in fact very basic stuff indeed.

Not wishing to bang on about over-complicating things . . . . .

Probably time for both Thee and Me to get out and get some fresh air before we ruin our weekends.

Richmond.

On 1/9/2018 2:05 pm, Mark Waddingham via use-livecode wrote:
On 2018-09-01 12:50, Richmond Mathewson via use-livecode wrote:
Yup: indeed: fairly coarse.

However, see my next posting re "Ruyton of the Eleven Towns"

that should make some folk feel that they need a set of sewing needles
rather than "just" a silver teaspoon.

I think you'll find my 'silver teaspoon' approach (as you put it) deals with all those cases :D

Interestingly, as I said, the multi-word match problem can be reduced to your 'shovel' - with pre- and post-processing.

Let's say that the phrase list is:

  Ruyton of the Eleven Towns
  East Hartfordshire
  Colchester
  Chester

First create a mapping from phrase words to individual characters (the choice of character is arbitrary):

  Ruyton <-> A
  of <-> B
  the <-> C
  Eleven <-> D
  Towns <-> E
  East <-> F
  Hartfordshire <-> G
  Colchester <-> H
  Chester <-> I
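
A minimal Python sketch of this mapping step, for illustration only. The function name build_word_map, the use of A-Z as the alphabet, and the case-insensitive keys are assumptions made for the sketch, not part of the description above.

```python
import string

def build_word_map(phrases):
    """Assign one distinct character to each distinct word in the phrase list."""
    # Hypothetical choice of alphabet: limits this sketch to 26 distinct words.
    alphabet = string.ascii_uppercase
    word_to_char = {}
    for phrase in phrases:
        for word in phrase.split():
            key = word.lower()  # assumption: matching is case-insensitive
            if key not in word_to_char:
                if len(word_to_char) >= len(alphabet):
                    raise ValueError("more distinct words than available characters")
                # Words get letters in order of first appearance: A, B, C, ...
                word_to_char[key] = alphabet[len(word_to_char)]
    return word_to_char

phrases = [
    "Ruyton of the Eleven Towns",
    "East Hartfordshire",
    "Colchester",
    "Chester",
]
word_map = build_word_map(phrases)
```

With a longer phrase list the alphabet would need to be larger (any set of distinct characters that excludes the 'unknown' marker would do).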

Now iterate through the source text, replacing each word with its letter from the new alphabet, or with an 'unknown' letter '*' if the word does not appear in the mapping. For example:

The man from Ruyton of the Eleven Towns, who is of the order of shovels, travelled from Chester to Colchester via the towns in East Hartfordshire

Would become:

  C**ABCDE**BC*B***I*H*CE*FG
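
The encoding step might be sketched in Python as follows. This assumes that matching is case-insensitive (so "The" and "towns" map like "the" and "Towns") and that punctuation merely separates words; the names WORD_MAP and encode are made up for the sketch. Under these assumptions every "the", including the one in "via the towns", becomes C.

```python
import re

# Mapping copied from the table above, lower-cased
# (assumption: matching is case-insensitive).
WORD_MAP = {
    "ruyton": "A", "of": "B", "the": "C", "eleven": "D", "towns": "E",
    "east": "F", "hartfordshire": "G", "colchester": "H", "chester": "I",
}

def encode(text, word_map, unknown="*"):
    """Replace each word of text with its mapped character, or `unknown` if unmapped."""
    # Assumption: a word is a run of letters/apostrophes; punctuation is dropped.
    words = re.findall(r"[A-Za-z']+", text)
    return "".join(word_map.get(w.lower(), unknown) for w in words)

source = (
    "The man from Ruyton of the Eleven Towns, who is of the order of shovels, "
    "travelled from Chester to Colchester via the towns in East Hartfordshire"
)
encoded = encode(source, WORD_MAP)
```

The same function can be applied to each phrase in the phrase list, so text and phrases end up in the same one-character-per-word alphabet.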

The original phrase list is processed similarly to give:

  ABCDE
  FG
  H
  I

Searching the transformed source text using your algorithm with the list of transformed phrases would give the correct set of found phrases as required by the original problem.
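
As a rough illustration of that last step, here is a Python sketch that assumes the 'shovel' amounts to a plain substring search for each phrase; that is a guess at the approach, not a description of the actual stack. The function name find_phrases is hypothetical, and the encoded text is the example sentence with every "the" encoded as C.

```python
def find_phrases(encoded_text, encoded_phrases):
    """Return {encoded phrase: [word offsets]} for each phrase found in the encoded text."""
    hits = {}
    for code in encoded_phrases:
        positions = []
        start = encoded_text.find(code)
        while start != -1:
            # One character per word, so a character offset is also a word offset.
            positions.append(start)
            start = encoded_text.find(code, start + 1)
        if positions:
            hits[code] = positions
    return hits

# Example sentence after encoding, one character per word.
encoded_text = "C**ABCDE**BC*B***I*H*CE*FG"

# Encoded phrase -> original phrase, used to translate hits back (post-processing).
encoded_phrases = {
    "ABCDE": "Ruyton of the Eleven Towns",
    "FG": "East Hartfordshire",
    "H": "Colchester",
    "I": "Chester",
}

hits = find_phrases(encoded_text, encoded_phrases)
found = {encoded_phrases[code]: positions for code, positions in hits.items()}
```

Because each word is a single character, a substring match in the encoded text can only start and end on word boundaries. In particular 'Chester' (I) is not reported inside 'Colchester' (H), which a case-insensitive substring search over the raw text would do.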

Warmest Regards,

Mark.


