Avoid massive reductions in runtime while maintaining the same API? I
did move to using ByteStrings internally for those bits later on, but
reading Strings from Data.Binary via a ByteString plus unpack went much
more quickly than reading Strings directly.
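
For anyone following along, here is a minimal sketch of the ByteString plus unpack approach. It is not the code from Yi: the helper names encodeStrings and decodeStrings are made up for illustration, it assumes the binary, bytestring and containers packages, and Data.ByteString.Char8 is only safe when every character is below 256.

```haskell
module Main where

import Data.Binary (decode, encode)
import qualified Data.ByteString.Char8 as B
import qualified Data.ByteString.Lazy as BL
import Data.Foldable (toList)
import Data.Sequence (Seq)
import qualified Data.Sequence as Seq

-- Hypothetical helper: pack each String into a strict ByteString and
-- serialise the resulting Seq with Data.Binary.
-- Assumption: all characters fit in 8 bits, since B.pack truncates others.
encodeStrings :: Seq String -> BL.ByteString
encodeStrings = encode . fmap B.pack

-- Hypothetical helper: decode a Seq of strict ByteStrings, then unpack
-- each element so callers still see a 'Seq String'.
decodeStrings :: BL.ByteString -> Seq String
decodeStrings bytes = fmap B.unpack (decode bytes :: Seq B.ByteString)

main :: IO ()
main = do
  let articles = Seq.fromList ["first article", "second article"]
      restored = decodeStrings (encodeStrings articles)
  -- Show the round-tripped data and check that nothing was lost.
  print (toList restored)
  print (restored == articles)
```

The callers keep working with 'Seq String', so the API stays the same; only the on-disk path goes through ByteString.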

On Thu, Mar 5, 2009 at 7:35 PM, Don Stewart <[email protected]> wrote:
> Avoid unpack!
>
> ndmitchell:
>> Hi Gwern,
>>
>> I get String/Data.Binary issues too. My suggestion would be to change
>> your strings to ByteString's, serialise, and then do the reverse
>> conversion when reading. Interestingly, a String and a ByteString have
>> identical Data.Binary reps, but in my experiments converting,
>> including the cost of BS.unpack, makes the reading substantially
>> cheaper.
>>
>> Thanks
>>
>> Neil
>>
>> On Thu, Mar 5, 2009 at 2:33 AM, Gwern Branwen <[email protected]> wrote:
>> > On Tue, Mar 3, 2009 at 11:50 PM, Spencer Janssen
>> > <[email protected]> wrote:
>> >> On Tue, Mar 3, 2009 at 10:30 PM, Gwern Branwen <[email protected]> wrote:
>> >>> So recently I've been having issues with Data.Binary & Data.Sequence;
>> >>> I serialize a 'Seq String'
>> >>>
>> >>> You can see the file here: http://code.haskell.org/yi/Yi/IReader.hs
>> >>>
>> >>> The relevant function seems to be:
>> >>>
>> >>> -- | Read in database from 'dbLocation' and then parse it into an 'ArticleDB'.
>> >>> readDB :: YiM ArticleDB
>> >>> readDB = io $ (dbLocation >>= r) `catch` (\_ -> return empty)
>> >>>   where r x = fmap (decode . BL.fromChunks . return) $ B.readFile x
>> >>>   -- We read in with strict bytestrings to guarantee the file is closed,
>> >>>   -- and then we convert it to the lazy bytestring data.binary expects.
>> >>>   -- This is inefficient, but alas...
>> >>>
>> >>> My current serialized file is about 9.4M. I originally thought that
>> >>> the issue might be the recent upgrade in Yi to binary 0.5, but I
>> >>> unpulled patches back to past that, and the problem still manifested.
>> >>>
>> >>> Whenever yi tries to read the articles.db file, it stack overflows. It
>> >>> actually stack-overflowed on even smaller files, but I managed to bump
>> >>> the size upwards, it seems, by the strict-Bytestring trick.
>> >>> Unfortunately, my personal file has since passed whatever that limit
>> >>> was.
>> >>>
>> >>> I've read carefully the previous threads on Data.Binary and Data.Map
>> >>> stack-overflows, but none of them seem to help; hacking some $!s or
>> >>> seqs into readDB seems to make no difference, and Seq is supposed to
>> >>> be a strict datastructure already! Doing things in GHCi has been
>> >>> tedious, and hasn't enlightened me much: sometimes things overflow and
>> >>> sometimes they don't. It's all very frustrating and I'm seriously
>> >>> considering going back to using the original read/show code unless
>> >>> anyone knows how to fix this - that approach may be many times slower,
>> >>> but I know it will work.
>> >>>
>> >>> --
>> >>> gwern
>> >>
>> >> Have you tried the darcs version of binary? It has a new instance
>> >> which looks more efficient than the old.
>> >>
>> >>
>> >> Cheers,
>> >> Spencer Janssen
>> >
>> > I have. It still stack-overflows on my 9.8 meg file. (The magic number
>> > seems to be somewhere between 9 and 10 megabytes.)
>> >
>> > --
>> > gwern
>> > _______________________________________________
>> > Haskell-Cafe mailing list
>> > [email protected]
>> > http://www.haskell.org/mailman/listinfo/haskell-cafe
