Does it matter what language, or what charset? 

<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1"/>

You can always look at this and any lang="foo" attributes to try to determine
what language, or at least what charset, the page is in. Of course, a
charset (like iso-8859-1) can cover many languages, but at least you
can narrow it down a little if you don't find a lang="foo" attribute.
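
As a rough sketch of that check (in Python, using only the standard
library's html.parser; the names LangCharsetSniffer and sniff are made up
for illustration, and this is just one way it might be done), you could
collect every lang attribute plus the charset declared in the Content-Type
meta tag:

```python
from html.parser import HTMLParser


class LangCharsetSniffer(HTMLParser):
    """Hypothetical helper: records lang attributes and the declared charset."""

    def __init__(self):
        super().__init__()
        self.langs = []      # every non-empty lang="..." value, in document order
        self.charset = None  # first charset found in a Content-Type <meta> tag

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        attrs = dict(attrs)
        if attrs.get("lang"):
            self.langs.append(attrs["lang"].lower())
        is_content_type = (attrs.get("http-equiv") or "").lower() == "content-type"
        if tag == "meta" and self.charset is None and is_content_type:
            # content looks like "text/html; charset=iso-8859-1"
            for part in (attrs.get("content") or "").split(";"):
                name, _, value = part.strip().partition("=")
                if name.strip().lower() == "charset" and value.strip():
                    self.charset = value.strip().lower()


def sniff(html_text):
    """Return (list of lang values, declared charset or None) for a page."""
    parser = LangCharsetSniffer()
    parser.feed(html_text)
    parser.close()
    return parser.langs, parser.charset
```

Calling sniff(page_source) gives you whatever hints the page declares. If
both come back empty, you are left guessing from the text itself.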

Cheers,
Kevin

On Sun, Sep 08, 2002 at 08:05:18AM +0300, Octavian Rasnita ([EMAIL PROTECTED]) said 
something similar to:
> Hi all,
> 
> I want to create a search engine. Please tell me how can I find out the
> languages used in a web page.
> I know that HTML 4.01 uses <html lang="en"> for example, but most of the web
> pages don't use this tag.
> 
> What should I test to find the language used?
> 
> Thank you.
> 
> Teddy's Center: http://teddy.fcc.ro/
> Mail: [EMAIL PROTECTED]
> 
> 
> 
> -- 
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]

-- 
[Writing CGI Applications with Perl - http://perlcgi-book.com]
You are all the Buddha.
        -- Buddha (last words)
