I imagine the problem isn't storing it, but rather dealing with all the
possible deviations from proper HTML.

Perhaps you could pass it through some kind of HTML fixer first. A single
Google search turns up a half dozen websites and an open-source project:
http://tidy.sourceforge.net/
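
For example, from LiveCode you could shell out to tidy and then hand the
cleaned-up output to the revXML library. This is just a sketch, assuming
tidy is installed and on your PATH; the handler name and the temp-file
path are made up:

-- minimal sketch: fetch a page, run it through tidy, then load the
-- well-formed result into an XML tree
-- (assumes tidy is on the PATH; handler name and paths are hypothetical)
on fetchAndCleanPage pUrl
   put URL pUrl into tRawHtml
   put tRawHtml into URL "binfile:/tmp/page.html"
   -- -q quiets warnings, -asxml forces well-formed XML output
   put shell("tidy -q -asxml /tmp/page.html") into tCleanHtml
   -- revXMLCreateTree returns a tree id, or an "xmlerr" string on failure
   put revXMLCreateTree(tCleanHtml, false, true, false) into tTreeId
   if tTreeId begins with "xmlerr" then
      answer "Could not parse the cleaned page"
   end if
end fetchAndCleanPage

Once the tree parses, a recursive walk with revXMLChildNames and
revXMLNodeContents could turn it into the array Mike mentioned.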
On Nov 14, 2015 06:50, "Mike Kerner" <mikeker...@roadrunner.com> wrote:

> Has anyone embarked on parsing out a web page?  I would think the best
> thing to do would be to encode it as an array, but I'm open to other ideas.
>
> My scraper is straining.  I need to try something different...
>
> --
> On the first day, God created the heavens and the Earth
> On the second day, God created the oceans.
> On the third day, God put the animals on hold for a few hours,
>    and did a little diving.
> And God said, "This is good."
>
