Hi.
I may not really understand what you want, but doesn't the "Find string" 
variant solve your problem?
Suppose field 1 has "cat" on line 1, "cât" on line 2, and "cat" on line 3; 
that is, the word on line 2 has numToChar(137) in place of the standard "a". 
Then:
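on mouseUp
   -- "find string" matches the character anywhere, not only at the start of a word
   find string numToChar(137) in fld 1
   -- the foundChunk looks like "char 6 to 6 of field 1", so word 2 of it is the
   -- starting char; counting the words up to that char gives the word number
   put the number of words in char 1 to word 2 of the foundChunk of fld 1 into temp
   answer "Word" && temp && "=" && word temp of fld 1
end mouseUp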
The point being that once you have the result of "find String", you can 
engineer all the other stuff you need, such as the words that contain the odd 
char, the lines they reside in, etc.
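To make that concrete, here is a minimal sketch of collecting every whole word 
that contains a given character. The handler name wordsContaining and its 
parameters are made up for illustration, not anything built in. It also assumes 
words are split on whitespace, so attached punctuation such as "(Poème" stays 
with the word and may need trimming.

```livecode
function wordsContaining pChar, pText
   local tResult
   repeat for each word tWord in pText
      -- "is in" matches the character anywhere in the word, not only at the start
      if pChar is in tWord then put tWord & return after tResult
   end repeat
   delete the last char of tResult -- drop the trailing return
   return tResult
end wordsContaining
```

Called as wordsContaining("è", fld 1), it should give back one matching word 
per line, in the order the words appear in the field.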
Craig



-----Original Message-----
From: Peter Bogdanoff via use-livecode <use-livecode@lists.runrev.com>
To: How to use LiveCode <use-livecode@lists.runrev.com>
Cc: Peter Bogdanoff <bogdan...@me.com>
Sent: Sat, Mar 14, 2020 7:48 pm
Subject: Finding words with diacriticals

Hi,

I have a text search for which I’m trying to improve the UI.

I have this text:

Edgard Varèse (Poème électronique) was a pioneer in the application of tape 
recording technology to composition.

The search database, built with Scott McDonald’s rrpSearch plugin, can only be 
searched using the exact characters. So, I’m building a supplementary array of 
words with alternate spellings that the user might type in the search box. I 
would reference the array to get an equivalent word and so provide the user 
with a usable result.

So if the user types in “poeme” — I would find “poeme” in the array and its 
equivalent “Poème” and I would actually search for “Poème” — and the user would 
get a result that included “Poème”.
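
The lookup described above could be sketched roughly like this. The handler 
name searchTermFor and the array pEquivalents are hypothetical, and the sketch 
assumes the array keys are stored in lowercase without diacriticals.

```livecode
function searchTermFor pTyped, pEquivalents
   -- assumed layout: pEquivalents["poeme"] holds "Poème"
   local tKey
   put toLower(pTyped) into tKey
   if pEquivalents[tKey] is not empty then return pEquivalents[tKey]
   -- no known equivalent, so search for what the user typed
   return pTyped
end searchTermFor
```

The string it returns is what would be handed to the actual search, so a term 
with no entry in the array passes through unchanged.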


So I want to build this array of word equivalents. The search database is built 
by rrpSearch from text on cards, so I have to go back to these cards to get my 
data. I’m using the find command to search cards to find every instance of “è” 
or “é” or “ü” or “î” or whatever. There are many non-English words in the text. 
The foundText function should give me the words that contain that 
character—except it doesn’t in every case. It only finds words that BEGIN with 
the search text. So

électronique — found (char begins the word)
Varèse — not found (char is in middle of the word)
Poème — not found (char is in middle of the word)

I’m using “find” and “the foundText” which returns the whole word that contains 
the search character. No other form of find will return the whole word. The 
dictionary for foundText:

<For example, the command find "hurl" can find any word that starts with the 
string "hurl", such as "hurling" or "hurler". In this case, the entire word 
--not just the portion specified in the find command --is surrounded by a box, 
and the foundText returns the entire word.>

Is there another relatively simple way to get the whole word in which the 
desired characters live? There are dozens of fields on thousands of cards to 
search.

(I realize that there are far better ways to handle a search, and in the 
future, I will have a database that I will design myself--but not yet.)

Thanks,

Peter Bogdanoff
ArtsInteractive





_______________________________________________
use-livecode mailing list
use-livecode@lists.runrev.com
Please visit this url to subscribe, unsubscribe and manage your subscription 
preferences:
http://lists.runrev.com/mailman/listinfo/use-livecode