I've just released what I'm calling 'RegexKitLite'. It targets a different audience than the full-fledged RegexKit (http://regexkit.sourceforge.net/).

I put RegexKitLite together after helping some users with RegexKit problems, specifically word-breaking Thai text. After putting together a quick and dirty wrapper around the ICU regex engine, I realized I had most of the pieces for a lightweight, no-frills Objective-C regular expression system. All I needed to do was write up some docs... which always seems to take the longest.

To give you an idea of how 'lightweight' the whole package is, the tarball distribution is a scant 27004 (sic) bytes. Almost all of that is the documentation, which is packaged as a single HTML file that covers everything. The documentation file weighs in at 135544 (sic) bytes.

You can view the documentation online at: 
http://regexkit.sourceforge.net/RegexKitLite/index.html
You can download the distribution via: 
http://downloads.sourceforge.net/regexkit/RegexKitLite-1.0.tar.bz2

Highlights are:

Distributed under the terms of the BSD license.

Links to /usr/lib/libicucore so no external regex library is required.

Consists of only two files: the RegexKitLite.h header and the RegexKitLite.m source file.

Very small. The header is 4498 bytes and the source is 19625 bytes. That's it!

Thread safe.

Small, pseudo least recently used compiled regex cache.

Since ICU requires the string it operates on to be in UTF-16 format, RegexKitLite first tries to get direct access to the NSString's UTF-16 buffer. If it can't, it goes through the full conversion process. It caches the conversion result of the last string that required conversion, so subsequent matches against that string are much faster. The caching is actually a little more complicated than this; see the documentation for more details.
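To illustrate the "direct access first, convert otherwise" idea, here is a minimal sketch. It is not the RegexKitLite source: the helper name CopyOrBorrowUTF16 is made up for this example, and using CFStringGetCharactersPtr for the fast path is my assumption about one plausible way to do it. It also leaves out the conversion cache entirely.

```objective-c
#import <Foundation/Foundation.h>
#include <stdlib.h>

// Hypothetical helper, not part of RegexKitLite.
// Returns a pointer to the string's UTF-16 code units.
// Fast path: the pointer aliases the string's own storage and *bufferToFree is NULL.
// Slow path: the code units are copied into a malloc'd buffer, which is
// returned in *bufferToFree and must be released by the caller with free().
static const UniChar *CopyOrBorrowUTF16(NSString *string, UniChar **bufferToFree) {
    *bufferToFree = NULL;
    CFStringRef cfString = (__bridge CFStringRef)string; // toll-free bridged

    // May return NULL when the string is not stored as contiguous UTF-16.
    const UniChar *direct = CFStringGetCharactersPtr(cfString);
    if (direct != NULL) { return direct; }

    // Full conversion: copy every code unit into our own buffer.
    CFIndex length = CFStringGetLength(cfString);
    UniChar *copy = malloc(sizeof(UniChar) * (size_t)(length > 0 ? length : 1));
    if (copy == NULL) { return NULL; }
    CFStringGetCharacters(cfString, CFRangeMake(0, length), copy);
    *bufferToFree = copy;
    return copy;
}
```

The caller always calls free() on the second value; free(NULL) is a no-op, so the fast path needs no special casing.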

In a nutshell, it provides some glue between Cocoa and the ICU regex system. It consists of a handful of primitives that are added as a category extension to NSString. The core methods are:

+ (NSInteger)captureCountForRegex:(NSString *)regexString options:(RKLRegexOptions)options error:(NSError **)error;
- (BOOL)isMatchedByRegex:(NSString *)regexString options:(RKLRegexOptions)options inRange:(NSRange)range error:(NSError **)error;
- (NSRange)rangeOfRegex:(NSString *)regexString options:(RKLRegexOptions)options inRange:(NSRange)range capture:(NSInteger)capture error:(NSError **)error;
- (NSString *)stringByMatching:(NSString *)regexString options:(RKLRegexOptions)options inRange:(NSRange)range capture:(NSInteger)capture error:(NSError **)error;

There's also a handful of obvious 'convenience' methods that are just wrappers for the above. In reality, everything is done with the rangeOfRegex: method except for the regex capture count. The documentation builds a match enumerator as an example.
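As a rough sketch of how rangeOfRegex: can drive repeated matching, the loop below collects every match in a string. The helper AllMatches is hypothetical and is not the enumerator from the documentation. It assumes that passing 0 for the options means "no options" and that a failed match comes back with a location of NSNotFound; check both against the header and docs before relying on them.

```objective-c
#import <Foundation/Foundation.h>
#import "RegexKitLite.h"

// Hypothetical helper, not part of RegexKitLite:
// returns every substring of `string` matched by `regex`, in order.
static NSArray *AllMatches(NSString *string, NSString *regex) {
    NSMutableArray *matches = [NSMutableArray array];
    NSUInteger length = [string length];
    NSRange searchRange = NSMakeRange(0, length);
    NSError *error = nil;

    while (searchRange.length > 0) {
        // Capture 0 is the entire match. Options value 0 is an assumption.
        NSRange found = [string rangeOfRegex:regex
                                     options:(RKLRegexOptions)0
                                     inRange:searchRange
                                     capture:0
                                       error:&error];
        if (found.location == NSNotFound) { break; } // assumed "no match" result

        [matches addObject:[string substringWithRange:found]];

        // Resume after this match; step one extra unit past a zero-length
        // match so the loop always makes progress.
        NSUInteger next = NSMaxRange(found) + (found.length == 0 ? 1 : 0);
        if (next >= length) { break; }
        searchRange = NSMakeRange(next, length - next);
    }
    return matches;
}
```

For example, AllMatches(@"a1 b22 c333", @"\\d+") is intended to walk the string one match at a time, narrowing the search range after each hit.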

As you can see, it's pretty minimal. It's ideal for people who need to get a bit of regex work done and don't want to add a whole lot of cruft to do it. No new classes are added either: it's all just messages to NSString objects, and you supply the regular expressions as ordinary NSStrings:

NSString *site = [@"http://www.something.com/link/to/page.html" stringByMatching:@"http://(.*?)/(.*)" capture:1];
// site == @"www.something.com"

Just that easy.
_______________________________________________

Cocoa-dev mailing list (Cocoa-dev@lists.apple.com)
