I've already shovelled Ruyton of the Eleven Towns quite effectively:
https://www.dropbox.com/s/n7r7u0c2m9ny3eb/Text%20analyzer%20X.livecode.zip?dl=0
No tokenising, in fact very basic stuff indeed.
Not wishing to bang on about over-complicating things . . . . .
Probably time for both Thee and Me to get out and get some fresh air
before we ruin our weekends.
Richmond.
On 1/9/2018 2:05 pm, Mark Waddingham via use-livecode wrote:
On 2018-09-01 12:50, Richmond Mathewson via use-livecode wrote:
Yup: indeed: fairly coarse.
However, see my next posting re "Ruyton of the Eleven Towns"
that should make some folk feel that they need a set of sewing needles
rather than "just" a silver teaspoon.
I think you'll find my 'silver teaspoon' approach (as you put it)
deals with all those cases :D
Interestingly, as I said, the multi-word match problem can be reduced
to your 'shovel' - with pre and post processing.
Let's say that the phrase list is:
Ruyton of the Eleven Towns
East Hartfordshire
Colchester
Chester
First create a mapping from phrase words to individual characters (the
choice of character is arbitrary):
Ruyton <-> A
of <-> B
the <-> C
Eleven <-> D
Towns <-> E
East <-> F
Hartfordshire <-> G
Colchester <-> H
Chester <-> I
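
A minimal Python sketch of this mapping step (Python rather than LiveCode, purely for illustration). The name build_word_map is hypothetical, and lowercasing the words is an assumption inferred from the example, where "The" and "the" are treated alike. It hands out letters in order of first appearance, which happens to reproduce the table above.

```python
import string

def build_word_map(phrases):
    """Assign one letter to each distinct phrase word, in order of first appearance."""
    # Assumption: at most 26 distinct words; a larger list needs a bigger alphabet.
    alphabet = iter(string.ascii_uppercase)
    word_map = {}
    for phrase in phrases:
        for word in phrase.lower().split():
            if word not in word_map:
                word_map[word] = next(alphabet)
    return word_map

phrases = [
    "Ruyton of the Eleven Towns",
    "East Hartfordshire",
    "Colchester",
    "Chester",
]
word_map = build_word_map(phrases)
for word, letter in word_map.items():
    print(word, "<->", letter)
```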
Now iterate through the source text word by word, generating an output
text made up of letters from the new alphabet, plus an 'unknown' letter
'*' for any word that is not in the mapping.
For example:
The man from Ruyton of the Eleven Towns, who is of the order of
shovels, travelled from Chester to Colchester via the towns in East
Hartfordshire
Would become:
C**ABCDE**BC*B***I*H*CE*FG
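
The transformation can be sketched in Python as follows. The helper name encode and the use of the regular expression \w+ to split off punctuation are assumptions, not something specified in this thread. With a case-insensitive lookup every "the" encodes as C, including the one before "towns".

```python
import re

# Word-to-letter table copied from the mapping above (keys lowercased).
WORD_MAP = {
    "ruyton": "A", "of": "B", "the": "C", "eleven": "D", "towns": "E",
    "east": "F", "hartfordshire": "G", "colchester": "H", "chester": "I",
}

def encode(text, word_map, unknown="*"):
    """Replace each word by its letter, or by `unknown` if it is not mapped."""
    words = re.findall(r"\w+", text)  # assumption: punctuation just separates words
    return "".join(word_map.get(word.lower(), unknown) for word in words)

source = (
    "The man from Ruyton of the Eleven Towns, who is of the order of "
    "shovels, travelled from Chester to Colchester via the towns in East "
    "Hartfordshire"
)
encoded = encode(source, WORD_MAP)
print(encoded)  # C**ABCDE**BC*B***I*H*CE*FG
```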
The original phrase list is processed similarly to give:
ABCDE
FG
H
I
Searching the transformed source text using your algorithm with the
list of transformed phrases would give the correct set of found
phrases as required by the original problem.
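
As a rough illustration of this last step, here is a Python sketch that uses a plain substring search as a stand-in for the 'shovel' (the actual algorithm is not shown in this thread, so that choice is an assumption, as is the name find_phrases). Because every word is now a single character, "Chester" (I) can no longer match inside "Colchester" (H), and each match offset is simply the word index in the source text.

```python
def find_phrases(encoded_text, encoded_phrases):
    """Return (word_index, original_phrase) for every occurrence, in text order."""
    hits = []
    for code, phrase in encoded_phrases.items():
        start = encoded_text.find(code)
        while start != -1:
            hits.append((start, phrase))
            start = encoded_text.find(code, start + 1)
    return sorted(hits)

# The example sentence encoded with a case-insensitive word lookup.
ENCODED_TEXT = "C**ABCDE**BC*B***I*H*CE*FG"

# Transformed phrase list, keyed back to the original phrases.
ENCODED_PHRASES = {
    "ABCDE": "Ruyton of the Eleven Towns",
    "FG": "East Hartfordshire",
    "H": "Colchester",
    "I": "Chester",
}

hits = find_phrases(ENCODED_TEXT, ENCODED_PHRASES)
for word_index, phrase in hits:
    print(word_index, phrase)
```

Word indices are zero-based here, so "Ruyton of the Eleven Towns" is reported at index 3, the fourth word of the sentence.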
Warmest Regards,
Mark.