On Thu, 2007-11-29 at 16:16 +0200, Alex Mizrahi wrote:
>  AP> Am I missing something really basic here?
>
> actually it's quite strange situation that you have *many* employees with
> same name but you want just one (random one). i cannot imagine why one needs
> this in real world..
>
> or you're saying that all have different names, but it still does consing?
> this could be a bug then..
>
With respect to consing, it is important to point out that our
serializer conses heavily (for the postmodern and CL-SQL backends).
This is because I used base64 to transform the byte streams into
character strings.

Most relational databases (including Postgres) provide a way of storing
byte sequences directly. However, this is not standardized and not
portable. In fact, I spoke to Kevin Rosenberg, the author of CL-SQL,
and neither he nor CL-SQL has a good way to do it.

However, since postmodern is Postgres-specific, it could avoid this
step by using a backend-specific serializer. I suspect this would have
a huge impact on performance, both by decreasing consing (minor) and by
decreasing the amount of disk I/O that has to be done (major). (BDB
doesn't have this problem, because it natively uses byte sequences, not
character sequences.)

Please see the code below, which demonstrates that pushing 1 million
bytes through the serializer (without even going to the database)
creates about 8.7 million bytes of garbage in 0.433 seconds. (This is
on a new, fast, 2-gigabyte 64-bit machine, against postmodern.)

;; Load Elephant, the backends, and the test suite.
(asdf:operate 'asdf:load-op :elephant)
(asdf:operate 'asdf:load-op :ele-clsql)
(asdf:operate 'asdf:load-op :postmodern)
(asdf:operate 'asdf:load-op :elephant-tests)

(in-package "ELEPHANT-TESTS")

(setq *default-spec* *testpm-spec*)

(setq teststring "supercalifragiliciousexpialidocious")
(setq testint 42)

;; Number of round trips needed to push about 1 million bytes of
;; string data through the serializer.
(setq totalseriazationload (* 1000 1000))
(setq n (ceiling (/ totalseriazationload (length teststring))))

(open-store *default-spec*)
;; Run the string through the serializer n times; nothing is written
;; to the database.
(time
 (dotimes (x n)
   (in-out-value teststring)))
(close-store)

*****
Results in:

Evaluation took:
  0.433 seconds of real time
  0.172974 seconds of user run time
  0.058991 seconds of system run time
  0 calls to %EVAL
  0 page faults and
  8,731,728 bytes consed.
NIL
ELE-TESTS>

I personally think making a backend-specific serializer to avoid the
base64 encoding would make a significant performance difference.
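
To put a rough number on the base64 step described above, here is a
minimal sketch of the size arithmetic only. It is not Elephant's
serializer: the helper names base64-encoded-length and base64-expansion
are hypothetical, and the sketch assumes standard padded base64 (every
group of up to 3 bytes becomes 4 characters).

```lisp
;; Hypothetical helpers for illustration; not part of Elephant or cl-base64.

(defun base64-encoded-length (byte-count)
  "Characters needed by padded base64 to encode BYTE-COUNT bytes."
  ;; Each group of up to 3 input bytes is emitted as 4 output characters.
  (* 4 (ceiling byte-count 3)))

(defun base64-expansion (byte-count)
  "Ratio of encoded characters to raw bytes. BYTE-COUNT must be positive."
  (/ (base64-encoded-length byte-count) (float byte-count)))

;; Size of the encoded form for the 1-million-byte load used in the test.
(let ((raw (* 1000 1000)))
  (format t "~D raw bytes -> ~D base64 characters (ratio ~,2F)~%"
          raw (base64-encoded-length raw) (base64-expansion raw)))
```

For 1,000,000 raw bytes this gives 1,333,336 characters, so the encoded
form is about a third larger before it is ever written to disk. How
much memory those characters occupy depends on the Lisp implementation's
string representation, so the consing cost can be higher than the
character count alone suggests.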

This is not much of an issue for me personally, since I keep everything
cached in memory anyway.

-- 
Robert L. Read, PhD
http://konsenti.com

_______________________________________________
elephant-devel site list
elephant-devel@common-lisp.net
http://common-lisp.net/mailman/listinfo/elephant-devel