The question marks are actual question marks. I'm not sure how to find the 
"duplicate" keys in the in-memory map. As far as I can tell, there is only 
one "? 5" key in it.
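
For reference, this is roughly the kind of check I mean. It is only a small 
sketch: char-codes is a name made up for this example, and the second string 
is an invented key, not one from my data. Note that int on a char gives the 
UTF-16 code unit, so a character outside the BMP would show up as two numbers.

```clojure
;; Hypothetical helper: numeric value of each char (UTF-16 code unit) in s.
(defn char-codes
  [s]
  (mapv int s))

;; A plain ASCII question mark is 63.
(char-codes "? 5")       ;; => [63 32 53]

;; An invented key that starts with an inverted question mark (U+00BF) instead.
(char-codes "\u00bf 5")  ;; => [191 32 53]
```

If two keys look the same when printed but give different vectors here, they 
are different strings in memory.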

I thought that computing the frequencies of the hash values of the keys 
and looking for any that occur more than once would find them, but this code:

read-notes> (def dupes (filter #(> (second %) 1) (frequencies (map hash (keys phrases)))))
#'read-notes/dupes
read-notes> (count dupes)
8911

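seems to indicate 8,911 hash values that are each shared by more than one 
key. With about 8 million keys and 32-bit hash values, some collisions 
between distinct keys are to be expected, so this may not point at real 
duplicates.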
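
To see which keys actually share a hash value, rather than just counting the 
hash values, something like the following might work. It is a sketch on a tiny 
made-up map: colliding-keys and sample are names invented for this example. 
"Aa" and "BB" are used because they have the same String.hashCode, and so the 
same Clojure hash.

```clojure
;; Hypothetical helper: groups the keys of m by hash value and keeps only
;; the groups that contain more than one key.
(defn colliding-keys
  [m]
  (->> (keys m)
       (group-by hash)
       vals
       (filter #(> (count %) 1))))

;; Made-up stand-in for the real phrases map.
(def sample {"Aa" 1, "BB" 2, "? 5" 352})

;; Yields one group holding "Aa" and "BB"; "? 5" does not appear in any group.
(colliding-keys sample)
```

Each group can then be checked with = to see whether its keys are really equal 
strings or just distinct strings whose hashes collide.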

On Wednesday, November 25, 2015 at 10:27:29 PM UTC-6, Ghadi Shayban wrote:
>
> While in memory before writing, are the hash codes for the "duplicate" 
> keys the same?   You can call (hash) on the keys.  I'm thinking there is 
> perhaps an issue with unicode string serialization...  Are the question 
> marks a particular character?
>
> If you can find the similar strings in memory, before they are written, 
> call:
> (map int  the-string)
> To see the actual unicode characters for the question marks.
>
> On Wednesday, November 25, 2015 at 11:07:34 PM UTC-5, Dave Kincaid wrote:
>>
>> The number of keys in the map is 8,054,160.
>>
>> On Wednesday, November 25, 2015 at 10:04:11 PM UTC-6, Dave Kincaid wrote:
>>>
>>> I have something very strange going on when I try to write a map out to 
>>> a file and read it back in. It's a perfectly fine hash-map with ????? 
>>> key/values (so it's pretty big). When I write the map out to a file using
>>>
>>> (spit "/tmp/mednotes6153968756847768349/repl-write.edn" (pr-str phrases
>>> ))
>>>
>>> and then read it back in with
>>>
>>> (edn/read (PushbackReader. (io/reader 
>>> "/tmp/mednotes6153968756847768349/repl-write.edn")))
>>>
>>> I am getting a duplicate key exception indicating that "? 5" is 
>>> duplicated. phrases is a clojure.lang.PersistentHashMap. The keys of the 
>>> map are strings and the values are numbers. When I get the value for "? 5" 
>>> from the map it returns 352.
>>>
>>> I tried to grep the file to find the occurrences of the key "? 5" (and 
>>> the 30 characters before and after it) and it seems to return 4 of them. 
>>> The second one is the right one from the map, but I have no idea where the 
>>> other 3 are coming from.
>>>
>>> [/tmp/mednotes6153968756847768349]> egrep -o ".{30}\"\? 5\" .{30}" 
>>> repl-write.edn 
>>> hasing a toothbrush for" 160, "? 5" 32, ". ) during his /" 32, "to
>>>  "is intact with sutures" 32, "? 5" 352, "4.81 pounds" 128, "ceren
>>> udden" 32, "being up all" 32, "? 5" 32, "limited financial means" 
>>> , "count , everytime she" 32, "? 5" 32, "had a partial mandibulect
>>>
>>> Does anyone have an idea what might be happening when the map is written 
>>> out to the file? How is that key getting duplicated?
>>>
>>> I have tried a few slightly different ways of writing to the file 
>>> including
>>>
>>> (spit "/tmp/mednotes6153968756847768349/repl-write.edn" (binding 
>>> [*print-dup* true] (pr-str phrases)))
>>>
>>> and
>>>
>>> (spit "/tmp/mednotes6153968756847768349/repl-write.edn" (.toString 
>>> phrases))
>>>
>>> based on some StackOverflow answers I found. They all seem to do the 
>>> same thing.
>>>
>>> Here is the exception stack trace.
>>>
>>> 1. Caused by java.lang.IllegalArgumentException
>>>    Duplicate key: ? 5
>>>
>>>         PersistentHashMap.java:   67 
>>>  clojure.lang.PersistentHashMap/createWithCheck
>>>                        RT.java: 1538  clojure.lang.RT/map
>>>                 EdnReader.java:  631 
>>>  clojure.lang.EdnReader$MapReader/invoke
>>>                 EdnReader.java:  142  clojure.lang.EdnReader/read
>>>                 EdnReader.java:  108  clojure.lang.EdnReader/read
>>>                        edn.clj:   35  clojure.edn/read
>>>                        edn.clj:   33  clojure.edn/read
>>>                       AFn.java:  154  clojure.lang.AFn/applyToHelper
>>>                       AFn.java:  144  clojure.lang.AFn/applyTo
>>>                  Compiler.java: 3623 
>>>  clojure.lang.Compiler$InvokeExpr/eval
>>>                  Compiler.java:  439  clojure.lang.Compiler$DefExpr/eval
>>>                  Compiler.java: 6787  clojure.lang.Compiler/eval
>>>                  Compiler.java: 6745  clojure.lang.Compiler/eval
>>>                       core.clj: 3081  clojure.core/eval
>>>                       main.clj:  240 
>>>  clojure.main/repl/read-eval-print/fn
>>>                       main.clj:  240  clojure.main/repl/read-eval-print
>>>                       main.clj:  258  clojure.main/repl/fn
>>>                       main.clj:  258  clojure.main/repl
>>>                    RestFn.java: 1523  clojure.lang.RestFn/invoke
>>>         interruptible_eval.clj:   58 
>>>  clojure.tools.nrepl.middleware.interruptible-eval/evaluate/fn
>>>                       AFn.java:  152  clojure.lang.AFn/applyToHelper
>>>                       AFn.java:  144  clojure.lang.AFn/applyTo
>>>                       core.clj:  630  clojure.core/apply
>>>                       core.clj: 1868  clojure.core/with-bindings*
>>>                    RestFn.java:  425  clojure.lang.RestFn/invoke
>>>         interruptible_eval.clj:   56 
>>>  clojure.tools.nrepl.middleware.interruptible-eval/evaluate
>>>         interruptible_eval.clj:  191 
>>>  clojure.tools.nrepl.middleware.interruptible-eval/interruptible-eval/fn/fn
>>>         interruptible_eval.clj:  159 
>>>  clojure.tools.nrepl.middleware.interruptible-eval/run-next/fn
>>>                       AFn.java:   22  clojure.lang.AFn/run
>>>        ThreadPoolExecutor.java: 1142 
>>>  java.util.concurrent.ThreadPoolExecutor/runWorker
>>>        ThreadPoolExecutor.java:  617 
>>>  java.util.concurrent.ThreadPoolExecutor$Worker/run
>>>                    Thread.java:  745  java.lang.Thread/run
>>>

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en