OK, I am a bit of an idiot, but in case someone else reads this thread 
looking for answers, here are the resolutions I've chased down so far.

1. **I did not read the docs for pdf-read well enough** because right there 
at the top of the main page, it says it only works on Linux and macOS. 
*facepalm*

2. When I initially had the problem with pdf-read on Windows, I incorrectly 
thought that pdf-read needed me to install the racket-poppler package in 
order to work.

3. I did not realize that racket-poppler and pdf-read were completely 
independent of each other, and that racket-poppler on its own provides 
roughly everything that pdf-read does.

4. By using racket-poppler directly I can get a *little* closer on Windows, 
but requiring it still fails:

    #lang racket
    (require racket-poppler)

   →

    AppData\Roaming\Racket\6.12\pkgs\racket-poppler\racket-poppler\ffi.rkt:99:0: ffi-obj: couldn't get "g_filename_to_uri" from "C:\\Program Files\\Racket\\lib\\libglib-2.0-0.dll" (The specified procedure could not be found.; errid=127)

5. I have the growing sense that it's probably better, in this case, to 
roll my own functions rather than risk hassles for Windows users (not that 
I'm any fan of Windows). I've learned a lot, though!

6. So, nudged by Neil's reply above, I went ahead and wrote my own 
super-lazy functions to do what I need for now. Here they are:

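(define (page-count pdf-filename)
  ; call-with-input-file closes the port when the count is done
  (call-with-input-file pdf-filename
    (lambda (pdf)
      (for/sum ([line (in-port read-line pdf)])
        ; Count every "/Type /Page" on the line, but not "/Type /Pages"
        (length (regexp-match* #px"/Type\\s*/Page(?!s)" line))))))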

; Look for the first occurrence of the form "/MediaBox [0 0 612.0 792.0]".
; Returns the width and height of the box as a list, or #f if there is none.
(define (has-media-box? str)
  ; Each + sits inside its group so that the whole number is captured
  (define mediabox-px
    #px"/MediaBox\\s*\\[\\s*([0-9\\.]+)\\s+([0-9\\.]+)\\s+([0-9\\.]+)\\s+([0-9\\.]+)\\s*\\]")
  (let ([x (regexp-match mediabox-px str)])
    (cond
      [x
       (match-let ([(list start-x start-y end-x end-y)
                    (map string->number (rest x))])
         (list (- end-x start-x) (- end-y start-y)))]
      [else #f])))

(define (pagesize pdf-filename)
  ; call-with-input-file closes the port once the first match is found
  (call-with-input-file pdf-filename
    (lambda (pdf)
      (for/last ([line (stop-after (in-port read-line pdf) has-media-box?)])
        (has-media-box? line)))))

I can already think of a few cases where these would return inaccurate 
results (unlinked page objects, varying page sizes, etc.). But they are 
reasonably fast and have worked correctly on the dozen or so test PDFs I've 
thrown at them.
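
For the varying-page-sizes case, one possible approach is to collect every 
/MediaBox in the file instead of stopping at the first one. The following is 
only a rough sketch: `all-media-box-sizes` and `page-sizes` are names made 
up for this example, it assumes the boxes use plain non-negative numbers, 
and it reads the whole file into memory, so it may be slow on large PDFs.

```racket
#lang racket

;; Same idea as the /MediaBox pattern above: four numbers inside brackets.
;; Assumption: numbers are non-negative and written like 0 or 612.0.
(define mediabox-px
  #px"/MediaBox\\s*\\[\\s*([0-9.]+)\\s+([0-9.]+)\\s+([0-9.]+)\\s+([0-9.]+)\\s*\\]")

;; Hypothetical helper: return a (list width height) for every /MediaBox
;; found in str, in order of appearance. Returns '() if there are none.
(define (all-media-box-sizes str)
  (for/list ([groups (regexp-match* mediabox-px str #:match-select rest)])
    (match-let ([(list x0 y0 x1 y1) (map string->number groups)])
      (list (- x1 x0) (- y1 y0)))))

;; Hypothetical helper: the distinct box sizes that occur in a PDF file.
;; file->string replaces bytes that are not valid UTF-8, which is fine here
;; because only the ASCII /MediaBox entries are of interest.
(define (page-sizes pdf-filename)
  (remove-duplicates (all-media-box-sizes (file->string pdf-filename))))
```

A result with more than one entry from `(page-sizes "some-file.pdf")` would 
suggest that the pages are not all the same size. It still shares the other 
blind spots of the functions above, such as boxes stored inside compressed 
streams.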

7. Thank you all for helping and putting up with me.

—Joel D.

On Friday, March 23, 2018 at 12:50:41 PM UTC-5, Matthew Flatt wrote:
>
> At Fri, 23 Mar 2018 09:39:41 -0700 (PDT), Joel Dueck wrote: 
> > On Friday, March 23, 2018 at 9:16:55 AM UTC-5, Greg Trzeciak wrote: 
> > > 
> > > So it's clear it's not the path issue, see neighbouring thread on another 
> > > possibility - some dependency of libpoppler is missing on Windows 
> > > 
> > > 
> >  Ok, I tried Dependency Walker on libpoppler-glib-8.dll, and got all kinds 
> > of missing dependencies. 
> > 
> > Mostly of the form API-MS-WIN-CORE-PROFILE-L1-1-0.DLL 
>
> You can ignore anything under the system DLLs like "kernel32.dll". 
>
>
> Are you using "racket-poppler/ffi.rkt" to load the DLLs, or are you 
> loading the DLL yourself? If you're not using "racket-poppler/ffi.rkt", 
> then you'll need to do something like the way it explicitly loads all 
> of the dependencies of a DLL before loading a DLL: 
>
>
> https://github.com/soegaard/racket-poppler/blob/master/racket-poppler/ffi.rkt#L34
>  
>
> I think "libgobject-2.0-0.dll" is missing there, but it gets loaded 
> anyway by `racket/draw/unsafe/glib`. 
>
>
