On Sun, 05 Oct 2014 16:51:25 -0400, Sven Van Caekenberghe <s...@stfx.eu> wrote:
> [snip]
> Apart from that, the tokenisation is not very efficient, #lines is a copy of your whole contents, so is the #split: and #trimmed. The algorithm sounds a bit lazy as well, writing it 'on purpose' with an eye for performance might yield better results.

So I was reflecting on this more. If String and WideString were immutable, it'd be easy to avoid all of these copies: you could instead pass around very tiny objects with only three members (a String, a start position, a stop position) and copy almost no data. It's the fact that String and WideString are mutable that precludes that. For fun, since I know I won't mutate the strings in this example, I did a quick spike where I replaced #copyFrom:to: with a new method called #viewFrom:to: that returns a StringView. I'll post the code when I have a chance to clean it up if there's interest, but it looks like it pretty handily chops 120-150ms off that runtime (i.e., double the speed).
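
To make the idea concrete, here is a minimal sketch of what such a view could look like. Only the names StringView and #viewFrom:to: come from the spike described above; the category name, the instance variables and every other selector are assumptions made up for illustration, and this is not the actual spike code. Methods are shown in the usual "Class >> selector" display form.

```smalltalk
"Hypothetical sketch; newer images may want package: instead of category:."
Object subclass: #StringView
    instanceVariableNames: 'string start stop'
    classVariableNames: ''
    category: 'StringView-Spike'

StringView class >> on: aString from: startIndex to: stopIndex
    ^ self new setString: aString from: startIndex to: stopIndex

StringView >> setString: aString from: startIndex to: stopIndex
    "Private. Remember the backing string and the bounds; nothing is copied."
    string := aString.
    start := startIndex.
    stop := stopIndex

StringView >> size
    ^ stop - start + 1

StringView >> at: index
    "Translate a view-relative index into an index in the backing string."
    (index between: 1 and: self size)
        ifFalse: [ ^ self errorSubscriptBounds: index ].
    ^ string at: start + index - 1

StringView >> viewFrom: first to: last
    "Narrow the view further, still without copying."
    ^ self class on: string from: start + first - 1 to: start + last - 1

StringView >> asString
    "The only place that copies characters."
    ^ string copyFrom: start to: stop

String >> viewFrom: startIndex to: stopIndex
    "Extension method; WideString inherits it from String."
    ^ StringView on: self from: startIndex to: stopIndex
```

In a playground, usage would look roughly like this:

```smalltalk
| line view |
line := 'alpha,beta,gamma'.
view := line viewFrom: 7 to: 10.
view size.       "4"
view at: 1.      "$b"
view asString.   "'beta'"
```

The view is only safe as long as nobody mutates the backing string while the view is alive, which is exactly why immutability matters here.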

Has there been any thought to introducing some immutable collections? Or maybe I'm just missing them? They'd be useful not just for String and WideString, but for basically any of the collection types. The implementation in most cases would be as simple as overriding #at:put: and friends to do "self shouldNotImplement", and then providing methods/classes like the one I introduced to take advantage of the newfound immutability.
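
As a rough illustration of the "override #at:put: and friends" approach, here is a sketch using Array. The class name ImmutableArray and the category are hypothetical, and the list of overridden mutators is deliberately not exhaustive; #basicAt:put: and #become: would still get around it, so treat this as a starting point rather than a real guarantee.

```smalltalk
"Hypothetical sketch; newer images may want package: instead of category:."
Array variableSubclass: #ImmutableArray
    instanceVariableNames: ''
    classVariableNames: ''
    category: 'Immutable-Spike'

ImmutableArray class >> withAll: aCollection
    "Fill the new instance with #basicAt:put:, since #at:put: is disabled below."
    | source instance |
    source := aCollection asArray.
    instance := self basicNew: source size.
    source doWithIndex: [ :each :index | instance basicAt: index put: each ].
    ^ instance

ImmutableArray >> at: index put: anObject
    ^ self shouldNotImplement

ImmutableArray >> replaceFrom: first to: last with: aCollection startingAt: repStart
    "One of the 'friends': bulk replacement would otherwise bypass #at:put:."
    ^ self shouldNotImplement
```

Reads then behave as usual, while writes signal an error:

```smalltalk
| a |
a := ImmutableArray withAll: #(1 2 3).
a at: 2.           "2"
a at: 2 put: 99.   "signals an error via #shouldNotImplement"
```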

If there's interest, I'd be happy to submit a Slice we could use as a concrete RFC of what this could look like.

--Benjamin
