Re: Parsing and Extracting Text from ePub XHTML

2015-07-30 Thread James Hale
Hi Brahmanathaswami, My code was begun back in LC 5.5, slowly making the transition through 6 and then 7. I think I still have a switch in there, in case the stack is opened in LC 6, to ensure it does some of the fudging required. I have learnt in doing all this that standards (such as ePub 2) seem

Re: Parsing and Extracting Text from ePub XHTML

2015-07-30 Thread Mark Schonewille
Yep, Unicode is the future :-) It is also a considerable part of the digital past already, just not so much LiveCode's. -- Best regards, Mark Schonewille

Re: Parsing and Extracting Text from ePub XHTML

2015-07-29 Thread Brahmanathaswami
Pursuant to my last long-winded email, I've boiled it down to something very simple. The company doing our ePubs is mixing 1) Unicode HTML decimal entities for diacriticals in the IAST roman character range, 2) Unicode UTF strings for Tamil and Devanagari script, and 3) old-fashioned punctuation in the r
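For readers following along, here is a minimal sketch of how the first two kinds of content described above (numeric HTML entities mixed with raw UTF-8 script) might be normalized into one Unicode string. It is written in Python rather than LiveCode, purely as an illustration; the function name `normalize_xhtml_text` and the sample string are hypothetical, and the named `&mdash;` entity in the sample is an assumption, not something taken from the actual ePub files.

```python
import html

def normalize_xhtml_text(raw_bytes):
    """Decode UTF-8 bytes, then resolve HTML character references.

    Raw UTF-8 text (e.g. Tamil script) survives the decode step unchanged;
    html.unescape() then turns references such as &#346; into characters.
    """
    text = raw_bytes.decode("utf-8")
    return html.unescape(text)

# Hypothetical sample: a decimal entity for an IAST letter, a named
# entity (assumed), and Tamil script stored directly as UTF-8.
sample = "&#346;iva &mdash; \u0b9a\u0bbf\u0bb5\u0ba9\u0bcd".encode("utf-8")
normalized = normalize_xhtml_text(sample)
```

Decoding the bytes first and resolving entities second keeps the two concerns separate, so a file that mixes both styles ends up as a single consistent Unicode string.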

Re: Parsing and Extracting Text from ePub XHTML

2015-07-29 Thread Brahmanathaswami
Aloha, James: great. thanks for this... seems we are each re-inventing the wheel here. Your code is useful though. I see you are still having to deal with the pesky "—" But only in your TOC xml processing routines. What I don't understand (which makes it hard to make good strategic decisio

Re: Parsing and Extracting Text from ePub XHTML

2015-07-28 Thread James Hale
Hi Brahmanathaswami, I wrote a sample stack that opens and displays ePubs, if that is of any use. You can find it here... http://livecodeshare.runrev.com/stack/761/Epub-Opener If it is of help, let me know :-) James

Re: Parsing and Extracting Text from ePub XHTML

2015-07-27 Thread Brahmanathaswami
Oh My! You are Lord Ganesha's Grace manifest in human form (smile, Elephant Faced One is the "Brains of the universe"). This works great! I get output like this: awesome! Dancing with Siva There is on Earth no diversity. He gets death after death who perceives here seeming diversity. As a unity

Re: Parsing and Extracting Text from ePub XHTML

2015-07-26 Thread Mark Schonewille
Hi Brahmanathaswami, This works on LC 6.7.3:

on mouseUp
   put fld 1 into x
   if the platform is not "MacOS" then
      // not sure why this works
      put isoToMac(x) into x
   end if
   put uniDecode(uniEncode(x,"UTF8")) into x
   set the htmlText of fld 2 to x
end mouseUp

Re: Parsing and Extracting Text from ePub XHTML

2015-07-26 Thread Richmond
Presumably you have already tried SET THE HTMLTEXT OF FIELD "BLAH" TO . . . Richmond. from my jail-broken, recycled iPad 1 On 26 Jul 2015, at 20:31, Brahmanathaswami wrote: > We do a lot of work with the contents of ePubs. For those who don't know the > spec: > > "someBook.epub" is just "s

Parsing and Extracting Text from ePub XHTML

2015-07-26 Thread Brahmanathaswami
We do a lot of work with the contents of ePubs. For those who don't know the spec: "someBook.epub" is just "someBook.zip", which when inflated has a mini-portable web site based on responsive CSS (all percentages). You get

someBook
  /ops        # "Open Publication Structure"
    /fonts
    /images
    /styles
    /x
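Since an ePub is an ordinary zip archive, its contents can be listed with any zip library. The following is a minimal Python sketch (not LiveCode) of that idea; the helper names and the member names in the stand-in archive, such as `ops/chapter1.xhtml`, are hypothetical and only loosely follow the layout described above. The sample is built in memory so the snippet runs without a real book file, and it is not a valid ePub.

```python
import io
import zipfile

def list_epub_members(epub_bytes):
    """Return the member names of an ePub, read as a plain zip archive."""
    with zipfile.ZipFile(io.BytesIO(epub_bytes)) as archive:
        return archive.namelist()

def build_sample_archive():
    """Build a tiny stand-in archive in memory (hypothetical layout)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("mimetype", "application/epub+zip")
        archive.writestr("ops/styles/book.css", "p { margin: 2%; }")
        archive.writestr("ops/chapter1.xhtml", "<p>Dancing with Siva</p>")
    return buffer.getvalue()

members = list_epub_members(build_sample_archive())
# Keep only the XHTML documents, which hold the text to be extracted.
xhtml_members = [name for name in members if name.endswith(".xhtml")]
```

With a real book, the bytes would come from reading "someBook.epub" from disk instead of `build_sample_archive()`.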