Re: [go-nuts] Re: pure-Go MATLAB file I/O package ?

2025-11-03 Thread a.ko...@gmail.com
Hi! Such a package now exists: **github.com/scigolib/matlab** (v0.1.1-beta)

✅ Pure Go (no CGo)
✅ Read/Write v5-v7.3 formats
✅ All numeric types + complex numbers
✅ 90% test coverage, production tested

```bash
go get github.com/scigolib/matlab@latest
```

Docs: https://github.com/scigolib/matlab

Part of SciGoLib ecosystem. v0.2.0 with v5 writer coming December 2025.
On Saturday, 6 May 2017 at 13:41:59 UTC+3 Sebastien Binet wrote:

> Elia,
>
> On Fri, May 5, 2017 at 8:31 PM,  wrote:
>
>> Hi Sebastien, I'm writing a program that needs to do the exact same thing 
>> you were asking, have you by any chance developed that library? :P
>>
>
> I started it:
> https://github.com/sbinet/matfio
>
> but had to put it a bit on the back burner.
> it's also a candidate for migration to a hypothetical github.com/gonum/io 
> package...
>
> PRs accepted :)
>
> -s 
>
>>
>>
>> On Thursday, 7 July 2016 11:27:36 UTC+2, Sebastien Binet wrote:
>>>
>>> hi there,
>>>
>>> before I go ahead and implement it, has anyone released a pure-Go 
>>> read/write package for MATLAB files ?
>>>
>>> https://www.mathworks.com/help/pdf_doc/matlab/matfile_format.pdf
>>>
>>> I've found:
>>>  https://github.com/ready-steady/mat
>>>
>>> but that package is actually using the C-API exposed by a MATLAB 
>>> installation...
>>>
>>> -s
>>>
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion visit 
https://groups.google.com/d/msgid/golang-nuts/55b38bad-035d-4c53-9365-536fb70e9d5fn%40googlegroups.com.


Re: [go-nuts] Re: Scientific file formats: HDF5,FITS, NetCDF?

2025-10-30 Thread a.ko...@gmail.com

Subject: Pure Go HDF5 implementation - 9 years later

Hi all,

Replying to this 9-year-old thread because the original question has 
finally been answered.

Back in 2015, the consensus was that HDF5 is "so complicated that there is 
only one implementation" and too difficult for pure Go. I'm happy to report 
that's no longer true.

**What exists now (2025)**:

A pure Go HDF5 library with full read support and beta write support:
- Repository: https://github.com/scigolib/hdf5
- Read: Feature-complete (superblock v0/2/3, all layouts, compression, 
attributes)
- Write: Beta (v0.11.1-beta - chunked datasets, GZIP, dense groups, 
attributes)

```go
// Reading (works today)
file, _ := hdf5.Open("data.h5")
dataset := file.Datasets["/temperature"]
data := dataset.Data() // []float64, []int32, etc.

// Writing (beta, but functional)
file, _ := hdf5.CreateForWrite("output.h5", hdf5.Truncate)
file.CreateDataset("data", myData,
    hdf5.WithChunked([]uint64{100, 100}),
    hdf5.WithCompression(6),
)
```

**How it was done**:

The C library (https://github.com/HDFGroup/hdf5) served as the reference 
implementation. Instead of "figuring out" the format from scratch, we 
ported proven algorithms to Go. Format spec + reference code = solvable 
problem.

Development time: ~1 year from concept to write MVP (with AI assistance for 
rapid prototyping).

**Why it matters**:

- No CGo = actual cross-compilation, no C dependencies
- Type safety = Go's compiler catches HDF5 format errors at compile time
- Standard library integration (io.ReaderAt, encoding/binary)
- Single binary deployment

**Current status**:

- Test coverage: 70-88% depending on package
- Platforms: Linux, macOS, Windows
- Recognition: HDF Group acknowledged it on their forum 
  (https://forum.hdfgroup.org/t/loking-for-an-hdf5-version-compatible-with-go1-9-2/10021/7)
- Beta limitations: Some write features in progress (dense storage 
read-modify-write, h5dump compatibility)

**For the scientific Go community**:

If you're working with HDF5 files and want to avoid CGo, this is now 
viable. Looking for beta testers with real-world datasets (astronomy, 
climate, genomics, etc.).

Installation: `go get github.com/scigolib/[email protected]`

The format is indeed complex, but it's been tackled. Thought this group 
might appreciate the update after 9 years.

Best,
Andrey Kolkov

P.S. NetCDF4 being "stripped down HDF5" means this library could 
potentially support it too, though that's not implemented yet.
On Monday, 2 April 2012 at 12:34:23 UTC+4 Sebastien Binet wrote:

> Rémy Oudompheng  writes:
>
> > Le 31 mars 2012 22:30, Fazlul Shahriar  a écrit :
> >> I started on hdf4 a while ago: https://bitbucket.org/fhs/gohdf
> >> I haven't had time to work on it further, but I'm very much interested.
> >>
> >> I also want to work on hdf5 but I don't deal with hdf5 as often as
> >> hdf4. NetCDF4 format is pretty much a stripped down version of HDF5 as
> >> far as I know, and I think NetCDF3 is the simplest format to
> >> implement.
> >>
> >> Of course, you can always use cgo. Both pytables and pyhdf are
> >> bindings to the C libraries.
> >
> > Sébastien Binet has Go bindings for libhdf5 at
> > https://bitbucket.org/binet/go-hdf5/
> > I didn't try them and I don't know if it uses reflection for dataset
> > reading/writing.
>
> it does:
> https://bitbucket.org/binet/go-hdf5/src/50c6c6f0bdc4/pkg/hdf5/h5t.go#cl-386
>
> -s
>
> -- 
> #
> # Dr. Sebastien Binet
> # Laboratoire de l'Accelerateur Lineaire
> # Universite Paris-Sud XI
> # Batiment 200
> # 91898 Orsay
> #
>

-- 
You received this message because you are subscribed to the Google Groups 
"golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion visit 
https://groups.google.com/d/msgid/golang-nuts/6ffaff5a-d93d-480a-8f4e-8ebefeebf0e0n%40googlegroups.com.


[go-nuts] Re: Scientific file formats: HDF5,FITS, NetCDF? Pure Go HDF5 implementation - 9 years later

2025-10-30 Thread a.ko...@gmail.com
Hi all,

Replying to this 9-year-old thread because the original question has 
finally been answered.

Back in 2015, the consensus was that HDF5 is "so complicated that there is 
only one implementation" and too difficult for pure Go. I'm happy to report 
that's no longer true.


WHAT EXISTS NOW (2025):

A pure Go HDF5 library with full read support and beta write support:

Repository: https://github.com/scigolib/hdf5

Read: Feature-complete (superblock v0/2/3, all layouts, compression, 
attributes)

Write: Beta (v0.11.1-beta - chunked datasets, GZIP, dense groups, 
attributes)


Example - Reading:

file, _ := hdf5.Open("data.h5")
dataset := file.Datasets["/temperature"]
data := dataset.Data() // []float64, []int32, etc.


Example - Writing (beta, but functional):

file, _ := hdf5.CreateForWrite("output.h5", hdf5.Truncate)
file.CreateDataset("data", myData,
    hdf5.WithChunked([]uint64{100, 100}),
    hdf5.WithCompression(6),
)


HOW IT WAS DONE:

The C library (https://github.com/HDFGroup/hdf5) served as reference 
implementation. Instead of "figuring out" the format from scratch, we 
ported proven algorithms to Go. Format spec + reference code = solvable 
problem.

Development time: ~1 year from concept to write MVP (with AI assistance for 
rapid prototyping).


WHY IT MATTERS:

- No CGo = actual cross-compilation, no C dependencies
- Type safety = Go's compiler catches HDF5 format errors at compile time
- Standard library integration (io.ReaderAt, encoding/binary)
- Single binary deployment


CURRENT STATUS:

- Test coverage: 70-88% depending on package
- Platforms: Linux, macOS, Windows
- Recognition: HDF Group acknowledged it on their forum
  (https://forum.hdfgroup.org/t/loking-for-an-hdf5-version-compatible-with-go1-9-2/10021/7)
- Beta limitations: Some write features in progress (dense storage
read-modify-write, h5dump compatibility)


FOR THE SCIENTIFIC GO COMMUNITY:

If you're working with HDF5 files and want to avoid CGo, this is now 
viable. Looking for beta testers with real-world datasets (astronomy, 
climate, genomics, etc.).

Installation:

go get github.com/scigolib/[email protected]


The format is indeed complex, but it's been tackled. Thought this group 
might appreciate the update after 9 years.

Best,
Andrey Kolkov


P.S. NetCDF4 being "stripped down HDF5" (as mentioned in the original 
thread) means this library could potentially support it too, though that's 
not implemented yet.

On Tuesday, 28 December 2010 at 03:47:30 UTC+3 darenw wrote:

> How well would a Go program be able to read and write the major
> scientific file formats? Do libraries already exist?
>
> Although I've read that Go might not be the best choice for massive
> number crunching, many science and engineering apps today have complex
> GUIs, connect to other apps via dbus or other interconnection systems,
> and do a lot of threading for example to let the user continue playing
> with the GUI while data is saved to a file in the background. The
> thing I'm working on currently has over 600 classes in a messy
> hierarchy, many objects having pointers to other objects to access
> methods that only return pointers to yet other objects, and most of
> the real work done is done by short snips of code, one to several
> lines, scattered hither thither and yon throughout a very large
> directory tree.
>
> I'm fantasizing the whole thing rewritten in Go might be 1/4 the size,
> 1/10th the bugs, and way easier to maintain. Never mind performance
> of huge arrays of numbers - that can be dealt with in various ways.
> I might try a small demo program in Go, but it'll be far easier if I
> can at least read FITS files.

-- 
You received this message because you are subscribed to the Google Groups 
"golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion visit 
https://groups.google.com/d/msgid/golang-nuts/5a80a4e1-9b08-44dd-bd47-936977fa97aen%40googlegroups.com.


Re: [go-nuts] RegEx/string performance benchmarks

2025-11-29 Thread a.ko...@gmail.com
Hi all,

I know this thread is 10+ years old, but I wanted to follow up since
the regexp performance discussion is still highly relevant today.

TL;DR: The situation has improved slightly over the years, but the
fundamental performance characteristics haven't changed dramatically.
So I built coregex - an alternative regex engine for Go that addresses
the performance issues discussed here.


What's Changed in Go stdlib (2013-2025)
=======================================

The good:
  - Bug fixes and stability improvements
  - Better Unicode handling
  - Minor optimizations here and there

The unchanged:
  - Still uses Thompson's NFA exclusively
  - No SIMD optimizations
  - No prefilter strategies
  - Same single-engine architecture

Go's regexp prioritizes correctness and simplicity over raw performance.
That's a valid design choice - it guarantees O(n) time complexity and
prevents ReDoS attacks. But for regex-heavy workloads, the performance
gap vs other languages remains significant.


The Performance Gap Today (2025)
================================

Benchmarking against Rust's regex crate on patterns like 
.*error.*connection.*:

  - Go stdlib: 12.6ms (250KB input)
  - Rust regex: ~20µs (same input)
  - Gap: ~600x slower

This isn't a criticism of Go - it's a different set of trade-offs.
But it shows the problem hasn't gone away.


What I Built: coregex
=====================

After hitting regex bottlenecks in production, I spent 6 months building
coregex - a drop-in replacement for Go's regexp.

GitHub: https://github.com/coregx/coregex

Architecture:
  - Multi-engine strategy selection (DFA/NFA/specialized engines)
  - SIMD-accelerated prefilters (AVX2 assembly)
  - Bidirectional search for patterns like .*keyword.*
  - Zero allocations in hot paths

Performance (vs stdlib):
  - 3-3000x faster depending on pattern
  - Maintains O(n) guarantees (no backtracking)
  - Drop-in API compatibility

Real benchmarks:

  Pattern        Input   stdlib    coregex   Speedup
  --------------------------------------------------
  .*\.txt$       1MB     27ms      21µs      1,314x
  .*error.*      250KB   12.6ms    4µs       3,154x
  (?i)error      32KB    1.23ms    4.7µs     263x
  \w+@\w+\.\w+   1KB     688ns     196ns     3.5x

Status: v0.8.0 released, MIT licensed, 88% test coverage


Could This Go Into stdlib?
==========================

That's the interesting question. I've been thinking about this from
several angles:

Challenges:
  1. Complexity - Multi-engine architecture is significantly more
 complex than current implementation
  2. Maintenance burden - SIMD assembly needs platform-specific
 variants (AVX2, NEON, etc.)
  3. Binary size - Multiple engines increase compiled binary size
  4. API stability - stdlib changes need extreme care

Opportunities:
  1. Incremental adoption - Could start with just SIMD primitives
 (internal/bytealg improvements)
  2. Opt-in optimizations - Keep current implementation as default,
 offer regexp/fast package
  3. Strategy selection - Add smart path selection without breaking
 existing code
  4. Knowledge transfer - Techniques from coregex could inform stdlib
 improvements


What I'm Proposing
==================

Rather than a direct "merge coregex into stdlib" proposal, I'm suggesting:

  1. Short term: Community uses coregex for performance-critical workloads
  2. Medium term: Discuss which techniques could benefit stdlib
 (SIMD byte search, prefilters)
  3. Long term: Potential collaboration on stdlib improvements
 (if there's interest)

I'd be happy to:
  - Help with stdlib patches for incremental improvements
  - Share implementation learnings and benchmarks
  - Discuss compatibility considerations


For Those Interested
====================

Try it:
  go get github.com/coregx/[email protected]

Read more:
  - Dev.to article:

https://dev.to/kolkov/gos-regexp-is-slow-so-i-built-my-own-3000x-faster-3i6h
  - GitHub repo:
https://github.com/coregx/coregex
  - v0.8.0 release:
https://github.com/coregx/coregex/releases/tag/v0.8.0

Feedback welcome on:
  - API compatibility issues
  - Performance on your specific patterns
  - Ideas for stdlib integration


The Bottom Line
===============

The regexp performance discussion from 10+ years ago was valid then and
remains valid now. The good news: we have options today. The better news:
maybe some of these ideas will make their way into stdlib eventually.

In the meantime, coregex is production-ready and MIT-licensed. Use it if
it helps.

Cheers,
Andrey Kolkov
GitHub: https://github.com/kolkov
CoreGX (Production Go Libraries): https://github.com/coregx

On Thursday, 28 April 2011 at 18:13:21 UTC+4 Russ Cox wrote:

> > In some areas Go kann keep up with Java but when it comes to string
> > operations ("regex-dna" benchmark), Go is even much slower than Ruby
> > or Python. Is the status quo going to improve anytime soon? And why is
> > Go so terribly slow wh


[go-nuts] Re: Race detection with CGO_ENABLED=0?

2025-11-28 Thread a.ko...@gmail.com

  2025 update: This is now possible.

  I've built a Pure-Go race detector that works with CGO_ENABLED=0:
  https://github.com/kolkov/racedetector

  Works in:
  - Alpine/scratch Docker containers
  - AWS Lambda / Cloud Functions
  - Cross-compilation scenarios
  - Any CGO_ENABLED=0 environment

  Usage:
go install github.com/kolkov/racedetector/cmd/racedetector@latest
racedetector build -o myapp main.go

  It's a standalone tool (not runtime integration yet), but it detects 
races without any CGO dependency. FastTrack algorithm, 15-22% overhead, 
pure Go.

  Feedback welcome: https://github.com/kolkov/racedetector/discussions

Best regards!
On Wednesday, 18 February 2015 at 00:09:22 UTC+3 Blake Caldwell wrote:

I'm building my service without CGO, so ideally, I'd like to run my tests 
with the same settings, and I really like race detection. Is there any way 
to use the race detector with CGO_ENABLED=0?

# testmain
runtime/race(.text): __libc_malloc: not defined
runtime/race(.text): getuid: not defined
runtime/race(.text): pthread_self: not defined
runtime/race(.text): madvise: not defined
runtime/race(.text): sleep: not defined
runtime/race(.text): usleep: not defined
runtime/race(.text): abort: not defined
runtime/race(.text): isatty: not defined
runtime/race(.text): __libc_free: not defined
runtime/race(.text): getrlimit: not defined
runtime/race(.text): __libc_stack_end: not defined
runtime/race(.text): getrlimit: not defined
runtime/race(.text): setrlimit: not defined
runtime/race(.text): setrlimit: not defined
runtime/race(.text): setrlimit: not defined
runtime/race(.text): exit: not defined
runtime/race(.text.unlikely): __errno_location: not defined
runtime/race(.text): undefined: __libc_malloc
runtime/race(.text): undefined: getuid
runtime/race(.text): undefined: pthread_self
runtime/race(.text): undefined: madvise
too many errors

-- 
You received this message because you are subscribed to the Google Groups 
"golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion visit 
https://groups.google.com/d/msgid/golang-nuts/cce76d7a-e076-4d62-9c89-d9e2788d74b0n%40googlegroups.com.


Re: [go-nuts] RegEx/string performance benchmarks

2025-11-30 Thread a.ko...@gmail.com

Thanks! )
On Sunday, 30 November 2025 at 10:00:27 UTC+3 Robert Engels wrote:

> Very cool. 
>
> On Nov 30, 2025, at 12:19 AM, [email protected]  wrote:
>