>
> Try
> [...]encoding='ISO-8859-4'[...]
>
> ISO-8859-1 (aka Latin-1) covers W. Europe; ISO-8859-4 is the specific
> Scandinavian character set (almost, but not quite, the same as -1).
>
> If this does not work, have a look at using UTF-8 (but this means those
> accented characters wil
Chas Owens [mailto:[EMAIL PROTECTED]] wrote:
> On 21 Jun 2001 10:38:08 +0200, Morgan wrote:
> > This script is excellent but I need the script to read the letters "åäö"
> > and "ÅÄÖ" too.
> > Cuz this is part of my language (Swedish) and those letters are in the
> > articles.
> I am working o
On 21 Jun 2001 10:38:08 +0200, Morgan wrote:
> This script is excellent but I need the script to read the letters "åäö"
> and "ÅÄÖ" too.
> Cuz this is part of my language (Swedish) and those letters are in the
> articles.
I am working on this, I don't understand what it is doing with them. If
I ad
This script is excellent but I need the script to read the letters "åäö"
and "ÅÄÖ" too.
Cuz this is part of my language (Swedish) and those letters are in the
articles.
And I need to have the word between the tags in too.
Finally, how do I enclose the article with cuz
Chas is right there, it can
>
> Notwithstanding my other comment, this code is also inefficient,
> both tactically and strategically.
I know it was horrendous code, it was just the first thing that popped
into my head. After I had it working I was going to make it more
efficient.
>
> Take for example the string "\200a
> "Randal" == Randal L Schwartz <[EMAIL PROTECTED]> writes:
Randal> Take for example the string "\200abc"...
Randal> After you replace "\200" with "&200;",
Oh blah. I can't do math this early. "\200" is replaced with "&#128;",
but the rest of the comment stands. :)
--
Randal L. Schwartz - St
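
The loop quoted in this thread walks the string one character at a time with substr. One common alternative is a single global substitution. The following is only a minimal sketch, assuming the text is raw single-byte data and that every byte above 127 should become a decimal numeric character reference; the helper name encode_high_bytes is hypothetical and does not come from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): replace each byte above 127
# with a decimal numeric character reference such as &#128;.
sub encode_high_bytes {
    my ($text) = @_;
    # One pass over the string; /e evaluates the replacement as Perl code,
    # /g applies it to every match.
    $text =~ s/([\x80-\xFF])/'&#' . ord($1) . ';'/ge;
    return $text;
}

print encode_high_bytes("\200abc"), "\n";   # prints &#128;abc
```

Note that the quoted loop tests `$char > 128`, which as written would leave byte 128 ("\200") itself untouched; the sketch uses the range \x80-\xFF so that byte is included.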
> "Chas" == Chas Owens <[EMAIL PROTECTED]> writes:
Chas> #replace anything not in lower ASCII, Damn Americans
Chas> for (my $i = 0; $i < length($file); $i++) {
Chas> my $char = ord(substr($file, $i, 1));
Chas> if ($char > 128) {
Chas> print "replacing ", chr($
On 20 Jun 2001 11:54:12 -0400, Chas Owens wrote:
> open FH, ">$ARGV[0].tmp.$$" or die "Could not open $ARGV[0]:$!";
>
> print FH $file;
>
> close FH;
>
>
> and change
>
> $parser->parsefile($ARGV[0]);
>
> to
>
> $parser->parsefile("$ARGV[0].tmp.$$")
I have removed the writing of $file t
Hrmmm...
There is a classic joke:
What do you call someone who speaks many languages?
A polyglot.
What do you call someone who speaks two languages?
A bilingual.
What do you call someone who speaks one language?
An American.
My first attempt to fix this was to add:
open FH, $ARGV[0] or die "Cou
Thank you very much for this script.
And if English had been my native language it would have been perfect.
I need the script to add my native letters as well; I'm Swedish and
therefore use "åäöÅÄÖ"
in the articles. Is this possible?
And in the text some words are wrapped in tags; is it
possible to remove
> > I don't know about how XML::Parser handles memory - last time
> > I tried to use it to parse content.rdf from http://dmoz.org ,
> > it soaked up all my memory, then bombed. Sometimes, you need
> > to write your own parsing subs :)
>
> Is the file you referred to a really big file?
dmoz is
From: Nigel Wetters [mailto:[EMAIL PROTECTED]]
> I don't know about how XML::Parser handles memory - last time
> I tried to use it to parse content.rdf from http://dmoz.org ,
> it soaked up all my memory, then bombed. Sometimes, you need
> to write your own parsing subs :)
A casual reader coul
TMTOWTDI.
I don't know about how XML::Parser handles memory - last time I tried to use it to
parse content.rdf from http://dmoz.org , it soaked up all my memory, then bombed.
Sometimes, you need to write your own parsing subs :)
>>> Chas Owens <[EMAIL PROTECTED]> 06/19/01 09:39pm >>>
Please, p
Please, please, please, do not try to parse XML with regexps. They only
work in the simplest cases. There are perfectly good XML modules
designed to parse XML for you and they are not that hard to use.
The following code parses an XML file similar to the one you described,
but has an additional
On Tue, 19 Jun 2001, Morgan wrote:
> Here is the problem.
> I will receive news articles three times a day in XML format and I need to
> automatically publish those articles on a web page; on the first page it
> should only show the tags down to
> tag and a link to the whole page.
Well - as mention
I think I can give you some clues. Here's some code out of the Perl Cookbook (6.8
Extracting a Range of Lines), which I've adapted for you. You should be able to nest
such structures to get what you want.
my $extracted_lines = '';
while (<>) {
    if (/BEGIN PATTERN/ .. /END PATTERN/) {
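
The fragment above is cut off in the archive. As a rough, self-contained illustration of the same range-operator technique (not the poster's actual code), the sketch below collects every line from a start marker through an end marker. The helper name extract_range and the BEGIN/END marker lines are assumptions made for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: return all lines from the first one matching
# $begin through the next one matching $end, inclusive.
sub extract_range {
    my ($begin, $end, @lines) = @_;
    my $extracted_lines = '';
    for (@lines) {
        # In scalar context the range operator is false until the left
        # pattern matches, then stays true until the right pattern matches.
        $extracted_lines .= $_ if /$begin/ .. /$end/;
    }
    return $extracted_lines;
}

my @input = ("before\n", "BEGIN\n", "inside\n", "END\n", "after\n");
print extract_range(qr/BEGIN/, qr/END/, @input);
```

The operator keeps its on/off state between iterations, so input that never reaches the end marker leaves the range open; that case is not handled here.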