Re: [go-nuts] Re: Getting new coverage output format from go test -cover

2025-01-23 Thread Byungjun You
I'm so glad to hear it helped you! > I wonder where the blog author found that flag? It's not exposed in `go test` I guess the "-test.gocoverdir" parameter is not well exposed to go test users. There is another comment saying that it was very hard to find documentation about the "-test.gocoverdir" parameter.

Re: [go-nuts] Looking for Go module path (contains version) suggestion for github.com/spdx/spdx-go-model

2025-01-23 Thread 'Dan Kortschak' via golang-nuts
On Thu, 2025-01-23 at 19:17 -0800, Meng Zhuo wrote: > Hi, > > What should we do if module named with version? > ``` > import ( >     "testing" > >     Spdx3_0_1 "github.com/spdx/spdx-go-model/v3_0_1" > ``` > > https://github.com/spdx/spdx-go-model/pull/1 From the code that's in that change, t

[go-nuts] Looking for Go module path (contains version) suggestion for github.com/spdx/spdx-go-model

2025-01-23 Thread Meng Zhuo
Hi, What should we do if a module is named with a version? ``` import ( "testing" Spdx3_0_1 "github.com/spdx/spdx-go-model/v3_0_1" ``` https://github.com/spdx/spdx-go-model/pull/1
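For context on the question above: Go modules treat a trailing path element as a major version suffix only when it has the form vN with N >= 2 (for example /v2 or /v3). An element such as v3_0_1 does not match that form, so the go command reads it as an ordinary path element. The sketch below is a rough stdlib-only check of that rule; isMajorVersionSuffix and lastElem are hypothetical helpers written for illustration, not part of any Go tool, and the gopkg.in ".vN" style is deliberately not handled.

```go
package main

import (
	"fmt"
	"strings"
)

// isMajorVersionSuffix reports whether elem has the form "vN" with N >= 2
// and no leading zero, which is the shape Go modules accept as a major
// version suffix. Hypothetical helper, for illustration only.
func isMajorVersionSuffix(elem string) bool {
	if len(elem) < 2 || elem[0] != 'v' {
		return false
	}
	digits := elem[1:]
	if digits[0] == '0' {
		return false // rejects "v0" and forms like "v02"
	}
	for _, r := range digits {
		if r < '0' || r > '9' {
			return false // rejects "v3_0_1", "v3.0.1", and similar
		}
	}
	return digits != "1" // "v1" is never written as a suffix
}

// lastElem returns the final slash-separated element of an import path.
func lastElem(path string) string {
	return path[strings.LastIndex(path, "/")+1:]
}

func main() {
	paths := []string{
		"github.com/spdx/spdx-go-model/v3_0_1", // path from the question
		"github.com/spdx/spdx-go-model/v3",     // hypothetical alternative
	}
	for _, p := range paths {
		fmt.Printf("%s: major version suffix = %v\n", p, isMajorVersionSuffix(lastElem(p)))
	}
}
```

Under this rule the path from the question is still importable; the v3_0_1 element simply carries no version meaning for the go command.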

Re: [go-nuts] Abridged summary of golang-nuts@googlegroups.com - 6 updates in 1 topic

2025-01-23 Thread Hugh Myrie
Hi there, Thank you for the suggestions. I'll explore Document AI as a potential solution and will look into PDF to XML converters as well. Best regards, Hugh On Thu, Jan 23, 2025, 4:35 PM wrote: > golang-nuts@googlegroups.com >

Re: [go-nuts] PDF to text

2025-01-23 Thread Hugh Myrie
Hi Michael, You're absolutely right, PDF extraction can be a real headache! I've tried Mike's suggestion, but unfortunately, it didn't quite work as I'd hoped – it put each character on a separate line, which made it just as difficult to work with. I think I'll give OCR a shot and see if that yie

Re: [go-nuts] PDF to text

2025-01-23 Thread Robert Engels
Glyph boundaries maintain the positional information but you still need to effectively treat it as an image - it’s just very coarse. Which leads to the OCR/vision AI model. If the PDF author is intentionally hindering the ability to “grab the data” then there is no text at all - and it is an image

Re: [go-nuts] PDF to text

2025-01-23 Thread Duncan Harris
Amusingly we wrote our PDF table extractor largely in Go: https://pdftables.com/ It identifies tables and cells by looking at the statistical distribution of glyph boundaries on the pages rather than inferring anything from the way the text is logically grouped within the PDF. There are many ap
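To make the idea of working from glyph boundaries more concrete, here is a minimal Go sketch. It is not the pdftables algorithm, which the message does not describe in detail. It assumes each glyph has already been reduced to a horizontal extent (the Glyph type, the Gap type, and ColumnGaps are all hypothetical names) and reports the horizontal bands that no glyph covers, which are candidate column separators.

```go
package main

import (
	"fmt"
	"sort"
)

// Glyph is a hypothetical horizontal extent of one glyph on a page,
// in page units. A real extractor would also carry the vertical extent.
type Glyph struct {
	X0, X1 float64
}

// Gap is a horizontal band that no glyph covers.
type Gap struct {
	Start, End float64
}

// ColumnGaps returns the bands between glyphs that are at least minWidth
// wide. It sorts glyphs by left edge and sweeps left to right, tracking
// the rightmost edge seen so far.
func ColumnGaps(glyphs []Glyph, minWidth float64) []Gap {
	if len(glyphs) == 0 {
		return nil
	}
	sorted := make([]Glyph, len(glyphs))
	copy(sorted, glyphs)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].X0 < sorted[j].X0 })

	var gaps []Gap
	right := sorted[0].X1
	for _, g := range sorted[1:] {
		if g.X0-right >= minWidth {
			gaps = append(gaps, Gap{Start: right, End: g.X0})
		}
		if g.X1 > right {
			right = g.X1
		}
	}
	return gaps
}

func main() {
	// Two clusters of glyphs separated by a wide empty band.
	glyphs := []Glyph{{0, 10}, {12, 20}, {50, 60}, {61, 70}}
	for _, gap := range ColumnGaps(glyphs, 5) {
		fmt.Printf("candidate column gap from %.0f to %.0f\n", gap.Start, gap.End)
	}
}
```

The same sweep applied to vertical extents would suggest row separators; choosing minWidth is the part that needs statistics over the whole page.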

Re: [go-nuts] PDF to text

2025-01-23 Thread Sharon Mafgaoker
Hey, I’m using https://cloud.google.com/document-ai I’m sending my PDF and getting back a JSON object with the extracted text. It works fast and is not expensive 🙏 I hope this will help you. Sharon Mafgaoker – Senior Solutions Architect M. 050 995 99 16 | sha...@cloud5.co.il On Thu, 23 Jan 2025 at 19:56 r

Re: [go-nuts] PDF to text

2025-01-23 Thread robert engels
You typically can’t convert a PDF to text and do what you are trying to do. Look for PDF to XML converters - you need the “blocks” and the hierarchy in order to interpret most PDFs with any sort of complex formatting. But even with XML, tables may not work, because there is no guarantee that the

Re: [go-nuts] PDF to text

2025-01-23 Thread Michael Bright
Hi Mike, Not wanting to suggest that you take the Python route, but just sharing my experience. I've tried Acrobat Reader's "Save as Text" functionality, and also one or two Python libraries to extract text from PDFs (PyPDF2 is the one I've settled on). But what I learnt - without really dig

Re: [go-nuts] PDF to text

2025-01-23 Thread Hugh Myrie
Hi Mike, Thanks for the suggestion! I'm interested in checking out your forked code. It seems like a good alternative to what I'm currently using. Hugh On Wed, Jan 22, 2025, 10:25 PM Mike Schinkel wrote: > Hi Hugh, > > I have been planning to do some Go work with PDF files, so your email > tri