On 9/22/12 6:01 PM, [email protected] wrote:
> Greetings,
>
> I have a small book collection (~150) that I thought would be neat to
> catalog by the Library of Congress catalog numbers. I have found a
> LOC search form that will allow me to input the ISBN, and it will
> return the information I want:
>
> [code]http://www.loc.gov/cgi-bin/zgate?ACTION=INIT&FORM_HOST_PORT=/prod/www/data/z3950/locils2.html,z3950.loc.gov,7090[/code]
>
> I have the list of book ISBNs in a text file, so scripting this
> should be quite easy. The problem is I can't figure out how to submit
> the form from the command line. I figured wget would be the best way,
> but everything I try results in downloading a single line that reads
> "Your form didn't include an ACTION!" So I thought I would turn to
> here for help. The test ISBN I am using is for The Linux Cookbook:
> 1886411484, QA76.76.O63S788 2001.
[snip]

If you want to screen scrape, the URI would be like this:

http://www.loc.gov/cgi-bin/zgate?ACTION=SEARCH&DBNAME=VOYAGER&ESNAME=B&MAXRECORDS=20&RECSYNTAX=1.2.840.10003.5.10&REINIT=/cgi-bin/zgate?ACTION=INIT&FORM_HOST_PORT=/prod/www/data/z3950/locils2.html,z3950.loc.gov,7090&srchtype=1,1016,2,102,3,3,4,2,5,100,6,1&SESSION_ID=4493330&TERM_1=1886411484

However, the session ID expires after only a few minutes, so you will
need a fresh one.

Regards,
/Lars
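
For the list of ISBNs in a text file, the search URI above could be generated per ISBN with something like the following Python sketch. It only reproduces the parameter layout of the example URI; the function names build_search_url and read_isbns are made up for illustration, and fetching a fresh session ID is left out because the thread does not show how the INIT page exposes it. The session ID is therefore passed in by hand.

```python
import sys
from urllib.parse import urlencode

BASE_URL = "http://www.loc.gov/cgi-bin/zgate"


def build_search_url(isbn, session_id):
    """Build a search URI with the same parameters as the example above.

    The example URI leaves the REINIT value unescaped, so its embedded
    "&FORM_HOST_PORT=..." acts as a separate parameter. That layout is
    kept here as-is; whether the server also accepts an escaped form is
    not known from the thread.
    """
    params = [
        ("ACTION", "SEARCH"),
        ("DBNAME", "VOYAGER"),
        ("ESNAME", "B"),
        ("MAXRECORDS", "20"),
        ("RECSYNTAX", "1.2.840.10003.5.10"),
        ("REINIT", "/cgi-bin/zgate?ACTION=INIT"),
        ("FORM_HOST_PORT",
         "/prod/www/data/z3950/locils2.html,z3950.loc.gov,7090"),
        ("srchtype", "1,1016,2,102,3,3,4,2,5,100,6,1"),
        ("SESSION_ID", session_id),
        ("TERM_1", isbn),
    ]
    # safe= keeps "/", "?", "=" and "," literal, matching the example URI.
    return BASE_URL + "?" + urlencode(params, safe="/?=,")


def read_isbns(lines):
    """Yield one ISBN per non-blank line, with hyphens and spaces removed."""
    for line in lines:
        isbn = line.strip().replace("-", "").replace(" ", "")
        if isbn:
            yield isbn


if __name__ == "__main__":
    # Usage (hypothetical script name): python loc_urls.py SESSION_ID < isbns.txt
    if len(sys.argv) != 2:
        sys.exit("usage: loc_urls.py SESSION_ID < isbns.txt")
    for isbn in read_isbns(sys.stdin):
        print(build_search_url(isbn, sys.argv[1]))
```

The script prints one URI per line, so the output can be saved to a file and handed to wget with -i. Since the session ID expires after a few minutes, the whole batch would need to be fetched soon after the ID is obtained.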

