On 11/03/14 20:15, Benjamin Beaumont wrote:
> Hi All,
> We're in the process of adding some new chunk types in LiveCode 7 and we
> would appreciate suggestions for a particular chunk name.
> The new chunk types are:
> naturalword (breaks on Unicode word boundaries)
Well, in theory that looks good until you start to think about languages
such as Sanskrit, which are written with no obvious word boundaries and
which have both vowel mutation (sandhi) at what would be word boundaries
and consonant fusion.
Languages such as Inuit and Hungarian are agglutinative, and in some
cases what we (speakers of West European languages) would term a sentence
consists of a single word with loads of affixes, some of them at the
front (prefixes).
Many Austronesian languages use infixes (i.e. twiddly bits shoved into the
middle of 'words'). These also crop up in Afro-Asiatic languages such as
Arabic. There are also some examples in English such as
"fan-f*cking-tabulous".
We could also get sweaty about circumfixes, where a bit gets put on the
front and a bit gets put on the back as a sort of split morpheme (not to
be confused with split-pea bara).
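To make the word-boundary problem concrete, here is a minimal sketch in
Python (not LiveCode; the helper name and the sample strings are my own
illustrations, not anything from the LiveCode 7 engine). It applies the
simplest possible rule, "a word is whatever sits between whitespace", and
shows how little that says about text that does not separate words with
spaces, or that packs several morphemes into one word.

```python
def split_on_whitespace(text):
    """Return the whitespace-delimited pieces of text (a naive 'word' rule)."""
    return text.split()

# Space-delimited English: the naive rule happens to match intuition.
english = "this is a string of words"

# Chinese is normally written without spaces between words
# ("I like programming"), so the naive rule sees one unbroken piece.
chinese = "我喜欢编程"

# Hungarian "házaimban" ("in my houses") is a single orthographic word
# carrying several suffixes; whitespace splitting cannot look inside it.
hungarian = "házaimban"

for sample in (english, chinese, hungarian):
    print(len(split_on_whitespace(sample)), split_on_whitespace(sample))
```

The English string yields six pieces, while the other two each come back
as a single piece. A real Unicode word-break implementation does more than
this, but it still has to decide what counts as a "word" in each script.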
> sentence (breaks on Unicode sentence boundaries)
That looks a bit fishy.
How are you going to work out what marks a sentence boundary in every
language that one can write with Unicode? And there are languages where
the idea of a 'sentence' is absent.
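As a rough illustration of why sentence boundaries are slippery, the
Python sketch below (again my own example, with a made-up helper name, and
not a description of how LiveCode 7 will do it) splits wherever '.', '!'
or '?' is followed by whitespace. That rule is tuned to West European
punctuation habits, and it misfires even there.

```python
import re

def naive_sentences(text):
    """Split after '.', '!' or '?' when followed by whitespace (a naive rule)."""
    return re.split(r"(?<=[.!?])\s+", text.strip())

# Two sentences, but the abbreviation "Dr." triggers a spurious break.
print(naive_sentences("Dr. Smith arrived. He sat down."))

# Chinese ends sentences with the ideographic full stop and usually no
# following space ("I came. I saw."), so the rule finds no break at all.
print(naive_sentences("我来了。我看见了。"))
```

The first call returns three pieces for two sentences; the second returns
one piece for two sentences. Any fixed list of terminators has this kind
of blind spot somewhere.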
> paragraph (Same behaviour as current 'line' chunk)
> The first chunk is called 'naturalword' because 'word' is already in use.
> Renaming the current 'word' chunk to 'token' to free up 'word' is not an
> option for backward compatibility. We are also limited by the current
> parser which doesn't allow us to use the form:
> put natural word 1 of "this is a string of words"
> 'naturalword' is the clearest internal suggestion at the moment and we'd
> love to get the input from community members if there is an even clearer
> option.
I'm sorry to be such a "pill", but word and sentence boundaries are such
culture-bound concepts that these chunk types will only be any good for
languages that actually mark word and sentence boundaries.
Assuming otherwise is about the same as stating dogmatically that "all
bananas are yellow", when they are not.
> Warm regards and thank you for your input.
You may not thank me.
Richmond.
> Ben
> _____________________________________________
> Benjamin Beaumont . RunRev Ltd