Hi Hugh,

I have been planning to do some Go work with PDF files, so your email prompted me to do some research.
I'm not sure whether using heussd/pdftotext-go is critical for you, or whether you are just trying to read the text in a PDF. I tried to get pdftotext installed, but my dev laptop is still running macOS Monterey and I couldn't get it working, so I looked for other options. If you are only interested in reading PDF text and have no specific need to use pdftotext, then one of the options I looked at might work.

I came across a package originally developed by Russ Cox that has been forked by many others. To evaluate it, I forked one of those forks and converted it from using a reader to returning a slice of strings so I could easily split out the new lines. (I could probably have made it work with the reader, but I was just going for quick.)

If you think it can help your use case, please check it out (but be aware that my additions to the forked code are rather hacky):

https://github.com/mikeschinkel/go-pdf-content-reader

-Mike

> On Jan 22, 2025, at 11:08 AM, Hugh Myrie <hugh.my...@gmail.com> wrote:
>
> I want to extract text from a PDF and preserve any table or at least convert
> it to a CSV. I am using the PDFtoText package (which uses the Poppler
> software). The text is extracted vertically (i.e. one column at a time) and
> each text is separated by a space. There is no line break, making it difficult
> to manipulate. I want to extract the text horizontally to preserve and
> possibly add line breaks to allow for further manipulation.
>
> Your help in this matter is appreciated. Suggest alternatives if available.
>
> Here is the Go code:
>
> package main
>
> import (
> 	"fmt"
> 	"log"
> 	"os"
>
> 	pdftotext "github.com/heussd/pdftotext-go"
> )
>
> func main() {
> 	// Replace "test.pdf" with the path to your PDF file
> 	pdfPath := "test.pdf"
> 	// Open the PDF file
> 	f, err := os.Open(pdfPath)
> 	if err != nil {
> 		log.Fatalf("Failed to open PDF file: %v", err)
> 	}
> 	defer f.Close()
> 	// Read the file content
> 	content, err := os.ReadFile(pdfPath)
> 	if err != nil {
> 		log.Fatalf("Failed to read PDF file: %v", err)
> 	}
> 	// Extract text from the PDF file
> 	text, err := pdftotext.Extract(content)
> 	if err != nil {
> 		log.Fatalf("Failed to extract text from PDF file: %v", err)
> 	}
> 	// Print the extracted text
> 	fmt.Println(text)
> }
>
> --
> You received this message because you are subscribed to the Google Groups
> "golang-nuts" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to golang-nuts+unsubscr...@googlegroups.com.
> To view this discussion visit
> https://groups.google.com/d/msgid/golang-nuts/c19e212d-a81f-4525-ae0d-a9abb0b292fbn%40googlegroups.com.

--
To view this discussion visit https://groups.google.com/d/msgid/golang-nuts/86F9E39B-789B-4D39-8AB1-3C3A20367035%40newclarity.net.
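
On the question of getting text out horizontally and into CSV, here is a rough sketch that might be a starting point. It assumes the Poppler `pdftotext` command-line tool is installed and on your PATH, and it uses that tool's `-layout` option, which tries to keep the original physical layout so that table rows stay on one line. The helper names (`layoutText`, `splitRows`), the `pdf2csv` usage string, and the rule of treating two or more spaces as a column gap are my own assumptions for illustration; they are not part of heussd/pdftotext-go or of my fork, and the heuristic will misfire on tables with empty cells or tightly spaced columns.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"log"
	"os"
	"os/exec"
	"regexp"
	"strings"
)

// columnGap matches runs of two or more spaces. In -layout output these
// usually separate table columns. This is a heuristic, not a table parser.
var columnGap = regexp.MustCompile(` {2,}`)

// layoutText (hypothetical helper) runs Poppler's pdftotext with -layout.
// The trailing "-" asks pdftotext to write the text to stdout.
func layoutText(pdfPath string) (string, error) {
	out, err := exec.Command("pdftotext", "-layout", pdfPath, "-").Output()
	if err != nil {
		return "", fmt.Errorf("running pdftotext: %w", err)
	}
	return string(out), nil
}

// splitRows (hypothetical helper) turns layout-preserved text into rows of
// cells, one row per non-blank line.
func splitRows(text string) [][]string {
	var rows [][]string
	// pdftotext separates pages with form feeds; treat them as line breaks.
	text = strings.ReplaceAll(text, "\f", "\n")
	for _, line := range strings.Split(text, "\n") {
		line = strings.TrimSpace(line)
		if line == "" {
			continue
		}
		rows = append(rows, columnGap.Split(line, -1))
	}
	return rows
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: pdf2csv file.pdf")
		return
	}
	text, err := layoutText(os.Args[1])
	if err != nil {
		log.Fatal(err)
	}
	// Write the rows as CSV to stdout; WriteAll flushes before returning.
	w := csv.NewWriter(os.Stdout)
	if err := w.WriteAll(splitRows(text)); err != nil {
		log.Fatal(err)
	}
}
```

Shelling out keeps the sketch independent of any particular Go wrapper. Cells that contain a single space (such as "Widget A") stay together, because only runs of two or more spaces are treated as column boundaries.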