Re: Really big files and encodings

2009-04-22 Thread Seth Willits
On Apr 22, 2009, at 8:09 AM, Michael Ash wrote: Do your files have regularly occurring newlines like most normal text files? If so, then you can just scan for a \r or \n and break it up there. Virtually every encoding you'll encounter today encodes \r and \n as \r and \n, and will not use those
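A minimal sketch of the newline-splitting idea above, in plain C. It assumes the bytes are in an ASCII-compatible encoding where 0x0A and 0x0D only ever mean \n and \r (UTF-8, Latin-1 and similar); that does not hold for UTF-16 or UTF-32. The function name chunk_end_at_newline is hypothetical, not something from the thread.

```c
#include <stddef.h>

/* Hypothetical helper: given a window of raw bytes, return the length of the
 * longest prefix that ends just after a line terminator (\n, \r, or \r\n).
 * Returns 0 if no safe break point is found in the window.
 * Assumption: 0x0A / 0x0D never occur inside a multi-byte character. */
size_t chunk_end_at_newline(const unsigned char *buf, size_t len)
{
    size_t i = len;
    while (i > 0) {
        unsigned char c = buf[i - 1];
        if (c == '\n')
            return i;               /* break right after \n (covers \r\n too) */
        if (c == '\r') {
            if (i == len) {
                /* A \r in the very last position may be the first half of a
                 * \r\n pair whose \n lies in the next window, so skip it. */
                i--;
                continue;
            }
            return i;               /* lone \r followed by ordinary text */
        }
        i--;
    }
    return 0;
}
```

The caller would convert only the first chunk_end_at_newline(...) bytes of each window to a string and carry the remaining bytes over to the start of the next window.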

Re: Really big files and encodings

2009-04-22 Thread Greg Guerin
Seth Willits wrote: In my app, I import data from potentially very large files. In the first pass, I simply mmap'd the entire file, created a string using CFStringCreateWithBytesNoCopy, and went about my business. This works great until it hits the address limit when it's running as a 32-bit

Re: Really big files and encodings

2009-04-22 Thread Michael Ash
On Wed, Apr 22, 2009 at 1:57 AM, Seth Willits wrote: So, I generally know what I should do, but the problem is that I don't know how to identify an encoding as fixed-width or variable. I could spend the time to look up each and every encoding on the internet, but there are kind of a lot of

Re: Really big files and encodings

2009-04-22 Thread Alastair Houghton
On 22 Apr 2009, at 06:57, Seth Willits wrote: In my app, I import data from potentially very large files. In the first pass, I simply mmap'd the entire file, created a string using CFStringCreateWithBytesNoCopy, and went about my business. This works great until it hits the address limit when

Really big files and encodings

2009-04-21 Thread Seth Willits
There's actually just one simple question, but there's a bit of background for context: -- In my app, I import data from potentially very large files. In the first pass, I simply mmap'd the entire file, created a string using CFStringCreateWithBytesNoCopy, and went about my business. This
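For context on the address-space problem described above, here is a rough sketch of one common alternative to mapping the whole file: mapping it one window at a time with mmap. This is an illustration under stated assumptions, not the approach the poster settled on. The names for_each_window, window_fn and count_bytes are hypothetical, and a real importer would still have to handle lines or multi-byte characters that straddle a window boundary.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical callback type: receives one mapped window of the file. */
typedef int (*window_fn)(const unsigned char *bytes, size_t len, void *ctx);

/* Map the file at `path` read-only, one window at a time, so that only
 * about `window_size` bytes of address space are in use at any moment.
 * Returns 0 on success, -1 on error, or the callback's non-zero result.
 * Note: on 32-bit Linux, large files also need _FILE_OFFSET_BITS=64. */
int for_each_window(const char *path, size_t window_size, window_fn fn, void *ctx)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) != 0) {
        close(fd);
        return -1;
    }

    /* mmap offsets must be page-aligned, so round the window size up
     * to a whole number of pages. */
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t win = ((window_size + page - 1) / page) * page;
    if (win == 0)
        win = page;

    int rc = 0;
    for (off_t off = 0; off < st.st_size && rc == 0; off += (off_t)win) {
        off_t remaining = st.st_size - off;
        size_t len = remaining < (off_t)win ? (size_t)remaining : win;

        void *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, off);
        if (p == MAP_FAILED) {
            rc = -1;
            break;
        }
        rc = fn((const unsigned char *)p, len, ctx);
        munmap(p, len);
    }

    close(fd);
    return rc;
}

/* Example callback: adds each window's length to a size_t counter. */
int count_bytes(const unsigned char *bytes, size_t len, void *ctx)
{
    (void)bytes;
    *(size_t *)ctx += len;
    return 0;
}
```

A call such as for_each_window(path, 64 * 1024 * 1024, count_bytes, &total) would walk the file in 64 MB windows; the window size is a tuning choice, not a value taken from the thread.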