wget isn't the right tool for that job. However, its brother wput may be able to do it.

On Sat, 22 Sep 2012, Gary Dale wrote:
> On 22/09/12 11:27 AM, Gary Dale wrote:
> > On 22/09/12 11:01 AM, cr...@gtek.biz wrote:
> > > Greetings,
> > >
> > > I have a small book collection (~150) that I thought would be neat to
> > > catalog by the Library of Congress catalog numbers. I have found a LOC
> > > search form that will allow me to input the ISBN, and it will return the
> > > information I want:
> > >
> > > [code]http://www.loc.gov/cgi-bin/zgate?ACTION=INIT&FORM_HOST_PORT=/prod/www/data/z3950/locils2.html,z3950.loc.gov,7090[/code]
> > >
> > > I have the list of book ISBNs in a text file, so scripting this should be
> > > quite easy. The problem is I can't figure out how to submit the form from
> > > the command line. I figured wget would be the best way, but everything I
> > > try results in downloading a single line that reads "Your form didn't
> > > include an ACTION!" So I thought I would turn to here for help. The test
> > > ISBN I am using is for The Linux Cookbook: 1886411484, QA76.76.O63S788
> > > 2001.
> > >
> > > And a related side question. From my reading, I've learned that the Z39.50
> > > protocol is used to query databases, usually library related. Is anyone
> > > aware of an ISBN database table that can be downloaded by the user,
> > > preferably in a format that can be imported into MySQL or PostgreSQL?
> > >
> > > Thanks, Craig
> > >
> > The url you give is for the form. If you enter an ISBN number it will do the
> > search.
> >
> > What you need to do is capture the http header sent when you click "submit
> > query" then replace the test ISBN number with whatever number you want to
> > search. Wireshark can do this. Simply look for the query packet(s).
> >
>
> The fields you need are shown in the page source:
>
> <FORM METHOD="POST" ACTION="/cgi-bin/zgate">
> <INPUT NAME="ACTION" VALUE="SEARCH" TYPE="HIDDEN">
> <INPUT NAME="DBNAME" VALUE="VOYAGER" TYPE="HIDDEN">
> <INPUT NAME="ESNAME" VALUE="B" TYPE="HIDDEN">
> <INPUT NAME="MAXRECORDS" VALUE="20" TYPE="HIDDEN">
> <INPUT NAME="RECSYNTAX" VALUE="1.2.840.10003.5.10" TYPE="HIDDEN">
> <INPUT NAME="REINIT" TYPE="HIDDEN" VALUE="/cgi-bin/zgate?ACTION=INIT&FORM_HOST_PORT=/prod/www/data/z3950/locils2.html,z3950.loc.gov,7090">
> <INPUT NAME="srchtype" VALUE="1,1016,2,102,3,3,4,2,5,100,6,1" TYPE="HIDDEN">
>
> <P>
> <STRONG>Enter Search Term(s):</STRONG><br>(The search term can be a single
> word or a phrase from anywhere in the record. Enter an author's name in
> indirect order, i.e., last_name, first_name.)<p>
> <INPUT NAME="TERM_1" SIZE="60">
> <p>
> <INPUT TYPE="SUBMIT" VALUE="Submit Query">
> <INPUT Type="RESET" VALUE="Clear Form">
> <HR>
> Use of this form results in a search of the LC Voyager database (approximately
> 14 million records). This database contains records in all bibliographic
> formats (i.e., books, serials, music, maps, manuscripts, computer files, and
> visual materials), and includes the retrospective, unedited older bibliographic
> records known as the PreMARC File. LC name and subject authority records
> cannot be searched.
> <INPUT NAME="SESSION_ID" VALUE="5923056" TYPE="HIDDEN">
> </FORM>
>
>
> You need to construct the query using those fields with those values, with
> TERM_1 containing the ISBN number.
>
> From the error you are getting, it seems like your query either didn't include
> the SEARCH action or the header wasn't understood.
>

---------------------------------------------------------------------------
jude <jdash...@shellworld.net>
Adobe fiend for failing to Flash
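
For what it's worth, here is a rough Python sketch of what "construct the query using those fields" could look like. It only uses the field names and values from the form source quoted in this thread. The endpoint is assumed to be the form's ACTION path on the same host as the original URL, and the names ZGATE_URL, HIDDEN_FIELDS, build_search_request and search are made up for illustration. SESSION_ID looks session-specific, so the value in the quoted page is probably stale; whether the server insists on a fresh one from the INIT page is not something the thread settles, so the sketch takes it as a parameter.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumption: the form's ACTION ("/cgi-bin/zgate") resolved against the
# host of the URL given in the original question.
ZGATE_URL = "http://www.loc.gov/cgi-bin/zgate"

# Hidden fields copied from the quoted form source.
HIDDEN_FIELDS = {
    "ACTION": "SEARCH",
    "DBNAME": "VOYAGER",
    "ESNAME": "B",
    "MAXRECORDS": "20",
    "RECSYNTAX": "1.2.840.10003.5.10",
    "REINIT": "/cgi-bin/zgate?ACTION=INIT&FORM_HOST_PORT="
              "/prod/www/data/z3950/locils2.html,z3950.loc.gov,7090",
    "srchtype": "1,1016,2,102,3,3,4,2,5,100,6,1",
}


def build_search_request(isbn, session_id):
    """Build (but do not send) a POST request mimicking the search form."""
    fields = dict(HIDDEN_FIELDS)
    fields["TERM_1"] = isbn              # the search term box
    fields["SESSION_ID"] = session_id    # assumed to come from a fresh form page
    body = urlencode(fields).encode("ascii")
    return Request(ZGATE_URL, data=body, method="POST")


def search(isbn, session_id):
    """Send the request and return the raw response body as bytes."""
    with urlopen(build_search_request(isbn, session_id)) as response:
        return response.read()
```

Calling search("1886411484", session_id) for each ISBN in the text file would be the obvious next step. If the reply is still "Your form didn't include an ACTION!", comparing the body built here against a captured browser request should show what differs.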