From a performance standpoint there are a few things going on here.

1) I would strongly suggest storing one compiled NSRegularExpression per 
pattern. From what I can tell the code listed already does this? In general it 
is best not to re-create regexes repeatedly, since each creation has to build 
a new "compiled" engine from ICU.

2) The last time I looked at this specific sample, the cost was in bridging 
strings back and forth between NSString and String. In Swift 4 we have made 
some improvements to bridging, but I am not certain whether any of them apply 
to this context (when run on Darwin). On Linux builds we are missing the 
referencing string variants, so large strings end up being copied, which can 
cause some severe performance hits.

3) I would avoid utf8.count when measuring perf in this case (it is probably 
going to be slow for large files).

4) Regarding your commentary on the parallelized cases, I am not certain why 
they are slower. Presuming the source data is large (on the order of 
megabytes), the threads should not contend on access to the regular 
expression. So I find it odd that using all the cores of your machine does not 
do better.
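
To make points 1 and 4 concrete, here is a minimal sketch, not the benchmark's 
actual code. The patterns and the countAll helper are made up for 
illustration. It compiles each pattern once and then shares the compiled 
objects across concurrent iterations, relying on NSRegularExpression being 
documented as safe to use from multiple threads.

```swift
import Foundation
import Dispatch

// Hypothetical patterns, chosen only for illustration.
let variantPatterns = ["agggtaaa|tttaccct", "[cgt]gggtaaa|tttaccc[acg]"]

// Compile each pattern exactly once; the ICU engine is built here, not per call.
let regexes: [NSRegularExpression] = variantPatterns.map {
    try! NSRegularExpression(pattern: $0, options: [])
}

// Count the matches of every pattern, one concurrent iteration per pattern.
func countAll(in text: String) -> [Int] {
    // NSRange is expressed in UTF-16 units; compute it once and reuse it.
    let range = NSRange(location: 0, length: text.utf16.count)
    var counts = [Int](repeating: 0, count: regexes.count)
    counts.withUnsafeMutableBufferPointer { buffer in
        let slots = buffer  // local copy of the pointer for the closure below
        DispatchQueue.concurrentPerform(iterations: regexes.count) { i in
            // Each iteration writes only its own slot, so no locking is needed.
            slots[i] = regexes[i].numberOfMatches(in: text, options: [], range: range)
        }
    }
    return counts
}
```

Whether this beats the serial loop depends on the input size and on how much 
time goes into string bridging rather than matching, so it is worth timing 
both.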

I think that with some tuning we could probably give swift-corelibs-foundation 
some faster paths here, as well as fix some missteps in the code listed for 
the two tests.
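
On the benchmark side, one possible adjustment is sketched below. This is an 
assumption on my part rather than a measured fix, and the substitution table, 
sample input, and helper names are hypothetical. It keeps the text in a single 
NSMutableString for the whole chain of replacements, so the data is converted 
once on the way in rather than on every step, and it takes the input length 
from the raw bytes instead of from utf8.count.

```swift
import Foundation

// Hypothetical helper: compile a pattern once, up front.
func makeRegex(_ pattern: String) -> NSRegularExpression {
    return try! NSRegularExpression(pattern: pattern, options: [])
}

// Hypothetical substitution table: (compiled regex, replacement template).
let substitutions: [(NSRegularExpression, String)] = [
    (makeRegex("tHa[Nt]"), "<4>"),
    (makeRegex("aND|caN|Ha[DS]|WaS"), "<3>"),
]

// Apply every substitution in place on one mutable reference-type string.
func applySubstitutions(to text: String) -> NSMutableString {
    let buffer = NSMutableString(string: text)  // converted once, on the way in
    for (regex, template) in substitutions {
        let range = NSRange(location: 0, length: buffer.length)
        _ = regex.replaceMatches(in: buffer, options: [], range: range,
                                 withTemplate: template)
    }
    return buffer
}

// Stand-in for the file contents; a real run would read them from stdin.
let rawBytes = Data("xtHaNy aND z\n".utf8)
let inputLength = rawBytes.count            // byte count, no walk over a String
let text = String(decoding: rawBytes, as: UTF8.self)
let result = applySubstitutions(to: text)
let resultLength = result.length            // UTF-16 units; equals bytes only for ASCII
```

Note that resultLength matches the byte count only when the data is pure 
ASCII, which I am assuming holds for this kind of input.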

I have some branches for swift-corelibs-foundation that might reduce some 
allocation times and improve string conversions back and forth between 
reference types and structural types, but those are not fully baked yet. Keep 
in mind that swift-corelibs-foundation is still quite new in comparison to 
Foundation on Darwin, so we have been focusing on getting API coverage closer 
rather than on performance work per se. That said, pull requests are welcome 
in both cases ;)

> On Jun 29, 2017, at 10:15 AM, Francois Green via swift-corelibs-dev 
> <swift-corelibs-dev@swift.org> wrote:
> 
> I’m uncertain if I’m using the correct forum, but I asked this question on 
> the user list a few months back and no one responded.  The 
> NSRegularExpression library seems to perform poorly and I’m wondering if this 
> is a performance bug or is it being used improperly?  I’ve added links to two 
> algorithms from the Benchmark Game project that seem quite slow when compared 
> to other languages.  While I understand that direct comparisons are not 
> possible, this one benchmark really stands out. 
> 
> http://benchmarksgame.alioth.debian.org/u64q/program.php?test=regexredux&lang=swift&id=2
> 
> http://benchmarksgame.alioth.debian.org/u64q/program.php?test=regexredux&lang=swift&id=1
> _______________________________________________
> swift-corelibs-dev mailing list
> swift-corelibs-dev@swift.org
> https://lists.swift.org/mailman/listinfo/swift-corelibs-dev
