Re: [Tutor] PDF Scraping

Python Beginner Wed, 25 Nov 2015 06:07:57 -0800

Oh, I forgot to mention that I am using Python 3.4. Thanks again for your
help pointing me in the right direction.


~Chris

On Tue, Nov 24, 2015 at 1:36 PM, Python Beginner <
pythonbeginner...@gmail.com> wrote:

> Hi,
>
> I am looking for the best way to scrape the following PDFs:
>
> (1)
> http://minerals.usgs.gov/minerals/pubs/commodity/gold/mcs-2015-gold.pdf
> (table on page 1)
>
> (2)
> http://minerals.usgs.gov/minerals/pubs/commodity/gold/myb1-2013-gold.pdf
> (table 1)
>
> I have done a lot of research and have read that pdftables 0.0.4 is an
> excellent way to scrape tabular data from PDFs (see
> https://blog.scraperwiki.com/2013/07/pdftables-a-python-library-for-getting-tables-out-of-pdf-files/
> ).
>
> I downloaded pdftables 0.0.4 (see https://pypi.python.org/pypi/pdftables).
>
> I am new to Python and having trouble finding good documentation for how
> to use this library.
>
> Could anybody who has used pdftables help me get started, or point
> me to the ideal library for scraping the PDF links above? I have read that
> different PDF libraries are used depending on the format of the PDF. What
> library would be best for the PDF formats above? Knowing this will help me
> get started, then I can write up some code and ask further questions if
> needed.
>
> Thanks in advance for your help!
>
> ~Chris
>
_______________________________________________
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor
