Re: [perl-win32-gui-users] General Perl Text Extraction doubt

2004-01-09 Thread Jonathan Southwick
I agree.  I have written some HTML parsers in the past (they could probably be
written better) for pulling definitions from the web and for getting quick
stock information on various stock symbols.  This saves me from having to
open a web browser, go to the site, enter the text I am searching for, and
wait for the response.  Since the sites I have chosen have a number of
graphics, the wait in a browser is slightly longer still.


These command-line and GUI Perl scripts are convenient.  The time it takes
to figure out how to parse the data isn't too bad, so I would recommend a
unique hack per site, as Jez suggests.
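
A minimal sketch of what "a unique hack per site" could look like, not the
scripts described above: one small extractor per host, kept in a dispatch
table. The host names, the HTML layouts and the helper name
extract_for_site are all invented for illustration, and the HTML is assumed
to have been fetched already.

```perl
use strict;
use warnings;

# Hypothetical per-site extractors, keyed by host name. The hosts and the
# HTML layouts they expect are assumptions made up for this sketch.
my %extractor_for = (
    'dictionary.example' => sub {
        my ($html) = @_;
        return $html =~ m{<p class="definition">(.*?)</p>}s ? $1 : undef;
    },
    'quotes.example' => sub {
        my ($html) = @_;
        return $html =~ m{<span id="last-price">([\d.]+)</span>} ? $1 : undef;
    },
);

# Pick the extractor written for a host and run it on already-fetched HTML.
sub extract_for_site {
    my ($host, $html) = @_;
    my $extract = $extractor_for{$host}
        or die "No extractor written for $host yet\n";
    return $extract->($html);
}

print extract_for_site('quotes.example',
    '<span id="last-price">42.50</span>'), "\n";
```

Adding a new site then means adding one entry to the table, and a layout
change on one site only breaks that site's extractor.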


Also, as Jez said, a very good source for general Perl questions not
related to the Win32::GUI module is http://www.perlmonks.org.  There is a
wealth of information and help there.


Jonathan

Jonathan Southwick
[EMAIL PROTECTED]
Technical & Network Services
Allegheny College
Meadville, PA 16335
(814) 332-2755

At 1/8/2004  05:49 PM, Jez White wrote:

Hi,

As a basic reply: coming up with a generic HTML parser for the kind of thing
you're doing will be difficult. You may find it quicker (in terms of
development time) to do a custom hack for every website you'll be looking
at. Other Perl sites/lists would be able to help you better; a good place to
start would be www.perlmonks.net. That said, there might be an online
database for the kind of information you are looking for (I'd be surprised
if there isn't).

For what it's worth, you've probably chosen the correct language - but there
will be a learning curve - it won't be easy :)

cheers,

jez.







[perl-win32-gui-users] pointers

2004-01-09 Thread Chris
I'm working on controlling Winamp, but I have a small issue. When I get the
song name from Winamp, it is returned as a pointer, and I'm not sure how to
handle the pointer so that I can get text output from it. Thanks





Re: [perl-win32-gui-users] pointers

2004-01-09 Thread Steve Pick
Hi Chris.

I recommend you download my Winamp track-spamming Perl plugin script for
XChat (www.xchat.org) from http://baxpace.com/?page=projects and look at the
code. It obtains the title from the current Winamp window title, and also
gets stuff like bitrate, frequency and kbps. It uses Win32::API to obtain
the values. I hope it helps; I'm not sure whether it works with Winamp 5.
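
As a rough illustration of the window-title route (not the code from the
script mentioned above), here is a sketch that pulls the track name out of a
title string once it has been read from the window. The
"N. Track name - Winamp" layout and the helper name track_from_title are
assumptions; check what your Winamp version actually shows in its title bar.

```perl
use strict;
use warnings;

# Hypothetical helper: pull the track name out of a Winamp-style window
# title. The "N. Track name - Winamp" layout is an assumption.
sub track_from_title {
    my ($title) = @_;
    return unless defined $title;
    # Optional playlist number, then the track text, then the " - Winamp"
    # suffix, optionally followed by a state such as " [Paused]".
    if ($title =~ /^(?:\d+\.\s+)?(.+?)\s+-\s+Winamp(?:\s+\[[^\]]*\])?\s*$/) {
        return $1;
    }
    return;    # title did not match the assumed layout
}

# Sample string standing in for the text read from the window.
my $sample = '12. Some Artist - Some Song - Winamp';
my $track  = track_from_title($sample);
print defined $track ? "$track\n" : "no match\n";
```

Working from the title text avoids handling a pointer at all, at the cost of
depending on the title format staying the same.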

Steve

- Original Message - 
From: "Chris" <[EMAIL PROTECTED]>
To: 
Sent: Friday, January 09, 2004 5:09 PM
Subject: [perl-win32-gui-users] pointers


> I'm working on controlling winamp, but I have a small issue. When I get
> the song name from winamp, it returns it as a pointer, and I'm not sure
> how to handle the pointer, so that I can get text output from it. Thanks
>
> ---
> This SF.net email is sponsored by: Perforce Software.
> Perforce is the Fast Software Configuration Management System offering
> advanced branching capabilities and atomic changes on 50+ platforms.
> Free Eval! http://www.perforce.com/perforce/loadprog.html
> ___
> Perl-Win32-GUI-Users mailing list
> Perl-Win32-GUI-Users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/perl-win32-gui-users
>




Re: [perl-win32-gui-users] General Perl Text Extraction doubt

2004-01-09 Thread Steve Pick
Hi,

I suggest you obtain HTML::Parser from CPAN (it might be included with
ActivePerl, I don't know).
http://search.cpan.org

You're probably going to need to be VERY accomplished to achieve something
like this :/ While it's pretty easy to regex out phone numbers and things,
it's not easy to obtain the other data. You'd need some kind of artificial
intelligence routines to recognise every possible organisation of the data;
I wouldn't quite know where to begin.
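
To make the "easy" half concrete, here is a minimal sketch of regexing out
e-mail addresses and phone numbers from a chunk of page text. Both patterns
are deliberately loose assumptions rather than validators, and the helper
name extract_contacts and the sample data are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: collect things that look like e-mail addresses and
# phone numbers. The patterns are loose guesses and will need tuning for
# real pages.
sub extract_contacts {
    my ($text) = @_;
    # local-part @ domain with at least one dot in the domain
    my @emails = $text =~ /\b([\w.+-]+\@[\w-]+(?:\.[\w-]+)+)\b/g;
    # a digit run of 8+ characters allowing spaces, dots, dashes, parens
    my @phones = $text =~ /(\(?\+?\d[\d ().-]{6,}\d)/g;
    return { emails => \@emails, phones => \@phones };
}

# Made-up sample standing in for a fetched staff page.
my $sample = <<'END';
<td>Dr. A. Example</td><td>a.example@university.example</td>
<td>Tel: (65) 5550 1234</td>
END

my $found = extract_contacts($sample);
print "email: $_\n" for @{ $found->{emails} };
print "phone: $_\n" for @{ $found->{phones} };
```

Fields such as research interests or publications have no comparable
surface pattern, which is why they are the hard part.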

As other people have said, this list is primarily for Win32::GUI, so if
your questions aren't Win32::GUI oriented you would probably get better
results from perlmonks or some other list.

Steve


- Original Message - 
From: "#SHUCHI MITTAL#" <[EMAIL PROTECTED]>
To: 
Sent: Thursday, January 08, 2004 5:04 PM
Subject: [perl-win32-gui-users] General Perl Text Extraction doubt


> Hi all
>
> Since everyone here is a Perl expert and I'm a total newbie, I would be very
> very grateful if someone could help me out with my doubts.
>
> I am doing a project to develop a student professor system including
> databases etc. To start off I need lots of professor data from various
> websites of educational institutions (for populating my database). To
> extract this data and get started I decided to use Perl since its text
> extraction capabilities are known to one and all.
>
> The problem is all these sites have a totally different HTML format and
> structure and differ in how the info of all profs is listed, and I can't
> seem to come up with generic Perl code to extract this data and put it in
> text files on my local hard disk. Therefore I think I'll need to use REGEX
> and PATTERN MATCHING to do the task but I'm not sure how to go about it. I
> wrote one script that works with www.ntu.edu.sg/sce/staffacad.asp but this is
> way too specific and doesn't work with any other staff sites.
>
> I need to do the following:
>
> 1. Visit the base site of any institute and extract professor information
> which includes NAME, EMAIL, DEGREE, RESEARCH INTERESTS AND PUBLICATIONS
> RELEASED
> 2. For publications the listing either appears via a link on the profs'
> homepages or as a chunk of data under the heading "PUBLICATIONS" etc. I
> think I can get the data if it's via a link but I dunno how to extract that
> exact chunk in the middle of a page.
> 3. All this info should be extracted to external text files
>
> I can manage if someone just helps me with snippets of code to get started
> with the extraction...accurate extraction of information from any random
> site of an institution which has profs listed etc.
> For example some sites are www.ntu.edu.sg/sce/staffacad.asp ,
> http://www.ntu.edu.sg/eee/people/, http://www.ie.cuhk.edu.hk/index.php?id=6,
> http://www.ntu.edu.sg/mpe/admin/staff.asp
>
> Greatly appreciate any help in any direction...totally lost here..please
> feel free to ask if you have any doubts regarding my question!
>
> shuchi