The documentation for Extract (https://pkg.go.dev/github.com/heussd/pdftotext-go#Extract) says:

    Extract PDF text content in simplified format

That might mean it returns plain text only and does not preserve tables, etc.

You might get better support if you raise a GitHub issue at https://github.com/heussd/pdftotext-go/issues.

Also, an LLM such as Gemini or ChatGPT might point you in a good direction:

* https://g.co/gemini/share/e146a8428b63
* https://chatgpt.com/share/679166c7-3c9c-8008-9d21-e97ae2d90f4a
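
If the package's simplified output cannot keep rows together, one alternative worth trying is to call Poppler's pdftotext command directly with its -layout option, which tries to keep the physical layout of the page, so table rows usually stay on one line. Below is a minimal sketch, not a tested solution for your PDF. It assumes pdftotext is on your PATH and that table columns end up separated by two or more spaces, which will not hold for every document. The names layoutToCSV and extractLayout are made up for this example.

```go
package main

import (
	"bytes"
	"encoding/csv"
	"fmt"
	"log"
	"os"
	"os/exec"
	"regexp"
	"strings"
)

// columnGap matches runs of two or more spaces. Treating such a run as a
// column boundary is an assumption about the -layout output, not a guarantee.
var columnGap = regexp.MustCompile(` {2,}`)

// layoutToCSV turns layout-preserved text into CSV: one record per
// non-empty line, with fields split on columnGap.
func layoutToCSV(text string) (string, error) {
	var buf bytes.Buffer
	w := csv.NewWriter(&buf)
	for _, line := range strings.Split(text, "\n") {
		line = strings.TrimSpace(line) // also drops \r and page-break \f
		if line == "" {
			continue
		}
		if err := w.Write(columnGap.Split(line, -1)); err != nil {
			return "", err
		}
	}
	w.Flush()
	return buf.String(), w.Error()
}

// extractLayout runs the Poppler CLI. "-layout" keeps the physical layout
// and the trailing "-" sends the text to stdout instead of a file.
func extractLayout(pdfPath string) (string, error) {
	out, err := exec.Command("pdftotext", "-layout", pdfPath, "-").Output()
	if err != nil {
		return "", fmt.Errorf("pdftotext failed: %w", err)
	}
	return string(out), nil
}

func main() {
	if len(os.Args) != 2 {
		fmt.Fprintln(os.Stderr, "usage: pdf2csv file.pdf")
		return
	}
	text, err := extractLayout(os.Args[1])
	if err != nil {
		log.Fatalf("Failed to extract text: %v", err)
	}
	out, err := layoutToCSV(text)
	if err != nil {
		log.Fatalf("Failed to convert to CSV: %v", err)
	}
	fmt.Print(out)
}
```

Every non-empty line becomes a CSV record, including headings and paragraphs outside the table, so you will probably need to filter lines by field count or tune the gap width for your PDF.
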
On Wednesday, January 22, 2025 at 10:08:51 AM UTC-6 Hugh Myrie wrote:

> I want to extract text from a PDF and preserve any table or at least
> convert it to a CSV. I am using the PDFtoText package (which uses the
> Poppler software). The text is extracted vertically (i.e. one column at a
> time) and each text is separated by a space. There is no line break making
> it difficult to manipulate. I want to extract the text horizontally to
> preserve and possibly add line breaks to allow for further manipulation.
>
> Your help in this matter is appreciated. Suggest alternatives if available.
>
> Here is the Go code:
>
> package main
>
> import (
> 	"fmt"
> 	"log"
> 	"os"
>
> 	pdftotext "github.com/heussd/pdftotext-go"
> )
>
> func main() {
> 	// Replace "test.pdf" with the path to your PDF file
> 	pdfPath := "test.pdf"
> 	// Open the PDF file
> 	f, err := os.Open(pdfPath)
> 	if err != nil {
> 		log.Fatalf("Failed to open PDF file: %v", err)
> 	}
> 	defer f.Close()
> 	// Read the file content
> 	content, err := os.ReadFile(pdfPath)
> 	if err != nil {
> 		log.Fatalf("Failed to read PDF file: %v", err)
> 	}
> 	// Extract text from the PDF file
> 	text, err := pdftotext.Extract(content)
> 	if err != nil {
> 		log.Fatalf("Failed to extract text from PDF file: %v", err)
> 	}
> 	// Print the extracted text
> 	fmt.Println(text)
> }