Hi there,

Thank you for the suggestions. I'll explore Document AI as a potential solution and will look into PDF-to-XML converters as well.

Best regards,
Hugh

On Thu, Jan 23, 2025, 4:35 PM <golang-nuts@googlegroups.com> wrote:

> Today's topic summary
>
> - PDF to text - 6 Updates
>
> PDF to text
> <http://groups.google.com/group/golang-nuts/t/7fb689c074dc6704?utm_source=digest&utm_medium=email>
>
> Edgar Madrigal <zaib...@gmail.com>: Jan 22 01:44PM -0800
>
> The function extract
> https://pkg.go.dev/github.com/heussd/pdftotext-go#Extract actually says:
> Extract PDF text content in simplified format
> That might mean it will return text only and not ...more
>
> Mike Schinkel <m...@newclarity.net>: Jan 22 10:25PM -0500
>
> Hi Hugh,
>
> I have been planning to do some Go work with PDF files, so your email
> triggered me to do some research.
>
> Not sure it using heussd/pdftotext-go is critical to you, or if you are
> just ...more
>
> Hugh Myrie <hugh.my...@gmail.com>: Jan 23 07:29AM -0500
>
> Hi Mike,
>
> Thanks for the suggestion! I'm interested in checking out your forked code.
> It seems like a good alternative to what I'm currently using.
>
> Hugh
>
> ...more
>
> Michael Bright <mjbrigh...@gmail.com>: Jan 23 09:17AM -0800
>
> Hi Mike,
>
> Not wanting to suggest that you take the Python route, but just sharing my
> experience.
>
> I've tried Acrobat Reader's "Save as Text" functionality, and also one or
> two Python libraries ...more
>
> robert engels <reng...@ix.netcom.com>: Jan 23 11:55AM -0600
>
> You typically can’t convert a PDF to text and do what you are trying to do.
>
> Look for PDF to XML converters - you need the “blocks” and the hierarchy
> in order to interpret most PDFs with any ...more
>
> Sharon Mafgaoker <sha...@cloud5.co.il>: Jan 23 08:56PM +0200
>
> Hey,
>
> I’m using
> https://cloud.google.com/document-ai
>
> I’m sending my pdf and getting back extracted text json object.
>
> Work fast and not expensive 🙏
>
> I hope this will help you .
> ...more

--
You received this message because you are subscribed to the Google Groups "golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts+unsubscr...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/golang-nuts/CAN-X3%3DZq7Cv0tb3%2B3C5WL72pqZnOERKK5dOi8%3DcuaFXU9TuxSA%40mail.gmail.com.