> Another thing you can do is to only preallocate a fixed size buffer > and add to a list of buffers. Every time you cross block boundary, you alloc > a new buffer and attach it at the end of your list. > > There are many ways to do this, I'd go for the simplest approach in terms of > code > readability and stop worrying about performance. > > If it is slow or memory hungry, it can be fixed later incrementally.
My 2 cents. Array of gap buffers is simple, imperative and efficient. I implemented what looks like a VI clone at this point to which I plan to add ACME like features. The data structure I used is a linked list of gap buffers, to stress the performance I use gap buffers of at most 11 runes. If these buffers are scaled up to 4k, I can enjoy real time edition in a file of 5 megs. When the file gets real big, seeking forward is nice and fast (because I use a cache that memorizes the last gap buffer accessed), but seeking backwards is slow. One solution to this is to move to a doubly linked list instead. Another solution is simply an array, and moreover, we enjoy locality (which is the most important property of a data structure with the current processors). This is why I recommend an array of gap buffers. If you want some real good inspiration I recommend reading QEmacs' source code (http://bellard.org/qemacs/). It is awesome C written by a super skilled hacker. It uses mmap to seamlessly edit files of several gigabytes, basically, the address space is the limit. Of course, this is overkill, and I do not advocate this but it shows that combined with caching (read the code to understand what I mean), an array of gap buffers is enough for all our purposes. Functional languages have ropes, I like functional languages, but in this case I find the imperative solution more pragmatic and elegant by its simplicity. -- mpu