The problem is rather that whole records for already existing keys disappeared along the way, as if conj had failed to do its job.
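A minimal sketch of how such a loss could be checked, assuming a small made-up data set. The `records` vector and the `missing-from-index` helper are hypothetical names introduced for illustration only; they are not part of clojure.set. Only `clojure.set/index`, `select-keys`, `get` and `contains?` are real library functions here.

```clojure
(require '[clojure.set :as set])

;; Hypothetical sample data; the real case has several thousand records.
(def records
  [{:id 1 :dept "a" :name "x"}
   {:id 2 :dept "a" :name "y"}
   {:id 3 :dept "b" :name "z"}])

(defn missing-from-index
  "Returns the set of records in xrel that do not appear under their own
  key in (set/index xrel ks). An empty set means nothing was lost."
  [xrel ks]
  (let [idx (set/index xrel ks)]
    (set (remove (fn [x] (contains? (get idx (select-keys x ks) #{}) x))
                 xrel))))

(def idx (set/index records [:dept]))

;; Each value in the index is a set, so identical records collapse into
;; one entry; compare against the distinct records, not the raw row count.
(def indexed-count (reduce + (map count (vals idx))))
(def distinct-count (count (set records)))

;; Records that should be in the index but are not.
(def lost (missing-from-index records [:dept]))
```

If `indexed-count` and `distinct-count` differ, or `lost` is non-empty, the contents of `lost` show exactly which records vanished and under which key, which may be easier to correlate than a raw trace.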
I would be surprised if a smaller subset recreated the problem. I had other instances where the input data was small and the index was correct: all keys and all the records attached to them were present. I am not convinced that the specific key values are the root cause of the problem.

I lack the time today to dig deeper into this. The rough approach would be to take a copy of the Clojure master branch, add debugging traces to it, run the only test case I have at hand (several thousand records) and analyze the trace. I could correlate the data and the traces to see where I lose entries.

I will have more time in a week or so to investigate. That reporting tool needs to get out by Monday. I am almost there and I have another hot pie in the oven waiting for me.

Maybe in the meantime some light will go on in my head as to which approach to take to find the cause.

Luc

On Fri, 2010-04-16 at 12:09 +0700, Per Vognsen wrote:
> That sounds weird. If you know what keys weren't making it into the
> index as expected, did you try reducing the problem to a smaller test
> case involving those exceptional keys?
>
> -Per
>
> On Fri, Apr 16, 2010 at 11:50 AM, Luc Préfontaine
> <lprefonta...@softaddicts.ca> wrote:
> > Hi all,
> >
> > I tripped over something strange yesterday. I work on a tool to create
> > reports in the REPL from data in an SQL database.
> > Instead of building SQL statements with complex where clauses, the user can
> > define which tables/fields he wants to fetch
> > and then he can work on these locally.
> >
> > I implemented local joins and a number of other related operations to
> > marshal the data.
> >
> > While doing this, I worked with clojure.set and used the index function to
> > regroup records matching the same
> > keys. I then found out that I was not getting the same # of records from my
> > "local" join compared to the equivalent
> > SQL statement.
> > The input data was the same, however (I compared SQL statement
> > outputs with my local "fetch" outputs).
> > I was "losing" records: instead of getting 18000 something records, I was
> > down to 14000 something.
> > In fact some keys were missing in the output of index. I did not find a
> > pattern in the missing entries.
> >
> > While digging, I started to question the sanity of the index function from
> > clojure.set. I cloned it and after a few hours of tossing it around, decided
> > to recode it using a mutable Java HashMap.
> >
> > It then worked... I got the same number of records as in the equivalent SQL
> > statement output.
> >
> > The index function in clojure.set goes like this:
> >
> > (defn index
> >   "Returns a map of the distinct values of ks in the xrel mapped to a
> >   set of the maps in xrel with the corresponding values of ks."
> >   [xrel ks]
> >   (reduce
> >     (fn [m x]
> >       (let [ik (select-keys x ks)]
> >         (assoc m ik (conj (get m ik #{}) x))))
> >     {} xrel))
> >
> > It looks great and I suspect that the problem is deeper in the core.
> >
> > Has anyone encountered something similar? The problem can be replicated in
> > 1.0 and 1.1.
> > I have not had the time to test it again using the master branch.
> > I will investigate this in the runtime later but maybe someone has come
> > across this problem in the past.
> >
> > Luc
> >
> > --
> > You received this message because you are subscribed to the Google
> > Groups "Clojure" group.
> > To post to this group, send email to clojure@googlegroups.com
> > Note that posts from new members are moderated - please be patient with your
> > first post.
> > To unsubscribe from this group, send email to
> > clojure+unsubscr...@googlegroups.com
> > For more options, visit this group at
> > http://groups.google.com/group/clojure?hl=en