Before you write the map out, while it is still in memory, do the 
"duplicate" keys have the same hash code? You can call (hash) on each key 
to check. I suspect an issue with Unicode string serialization. Do the 
question marks stand in for a particular character?

If you can find the similar-looking strings in memory before they are 
written, call:
(map int the-string)
to see the actual Unicode character codes behind the question marks.
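
As a rough sketch of that check: the keys below are made up for illustration, and the lone-surrogate cause is only an assumption on my part, not something confirmed from your data. Two strings can differ in memory yet come out identical once they are encoded as UTF-8, because Java substitutes "?" for a character it cannot encode, such as an unpaired surrogate.

```clojure
;; Hypothetical keys, not taken from the real data.
;; k1 starts with a lone high surrogate (U+D83D), which UTF-8 cannot encode.
(def k1 (str (char 0xD83D) " 5"))
;; k2 starts with a literal question mark.
(def k2 "? 5")

(defn code-units
  "Return the UTF-16 code units of string s as integers."
  [s]
  (mapv int s))

(defn utf8-round-trip
  "Encode s as UTF-8 bytes and decode it again, which is roughly what
  happens when the text is written to a file and read back."
  [^String s]
  (String. (.getBytes s "UTF-8") "UTF-8"))

(code-units k1)        ; [55357 32 53]
(code-units k2)        ; [63 32 53]
(= k1 k2)              ; false, so both can be keys in one map
(utf8-round-trip k1)   ; "? 5", now indistinguishable from k2
```

If two keys print the same but (code-units ...) differs, the strings really are distinct in memory and the collision is introduced when writing.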

On Wednesday, November 25, 2015 at 11:07:34 PM UTC-5, Dave Kincaid wrote:
>
> The number of keys in the map is 8,054,160.
>
> On Wednesday, November 25, 2015 at 10:04:11 PM UTC-6, Dave Kincaid wrote:
>>
>> I have something very strange going on when I try to write a map out to a 
>> file and read it back in. It's a perfectly fine hash-map with ????? 
>> key/values (so it's pretty big). When I write the map out to a file using
>>
>> (spit "/tmp/mednotes6153968756847768349/repl-write.edn" (pr-str phrases))
>>
>> and then read it back in with
>>
>> (edn/read (PushbackReader. (io/reader "/tmp/mednotes6153968756847768349/repl-write.edn")))
>>
>> I am getting a duplicate key exception indicating that "? 5" is 
>> duplicated. phrases is a clojure.lang.PersistentHashMap. The keys of the 
>> map are strings and the values are numbers. When I get the value for "? 5" 
>> from the map it returns 352.
>>
>> I tried to grep the file to find the occurrences of the key "? 5" (and 
>> the 30 characters before and after it) and it seems to return 4 of them. 
>> The second one is the right one from the map, but I have no idea where the 
>> other 3 are coming from.
>>
>> [/tmp/mednotes6153968756847768349]> egrep -o ".{30}\"\? 5\" .{30}" repl-write.edn
>> hasing a toothbrush for" 160, "? 5" 32, ". ) during his /" 32, "to
>>  "is intact with sutures" 32, "? 5" 352, "4.81 pounds" 128, "ceren
>> udden" 32, "being up all" 32, "? 5" 32, "limited financial means" 
>> , "count , everytime she" 32, "? 5" 32, "had a partial mandibulect
>>
>> Does anyone have an idea what might be happening when the map is written 
>> out to the file? How is that key getting duplicated?
>>
>> I have tried a few slightly different ways of writing to the file 
>> including
>>
>> (spit "/tmp/mednotes6153968756847768349/repl-write.edn" (binding [*print-dup* true] (pr-str phrases)))
>>
>> and
>>
>> (spit "/tmp/mednotes6153968756847768349/repl-write.edn" (.toString phrases))
>>
>> based on some StackOverflow answers I found. They all seem to do the same 
>> thing.
>>
>> Here is the exception stack trace.
>>
>> 1. Caused by java.lang.IllegalArgumentException
>>    Duplicate key: ? 5
>>
>>         PersistentHashMap.java:   67  clojure.lang.PersistentHashMap/createWithCheck
>>                        RT.java: 1538  clojure.lang.RT/map
>>                 EdnReader.java:  631  clojure.lang.EdnReader$MapReader/invoke
>>                 EdnReader.java:  142  clojure.lang.EdnReader/read
>>                 EdnReader.java:  108  clojure.lang.EdnReader/read
>>                        edn.clj:   35  clojure.edn/read
>>                        edn.clj:   33  clojure.edn/read
>>                       AFn.java:  154  clojure.lang.AFn/applyToHelper
>>                       AFn.java:  144  clojure.lang.AFn/applyTo
>>                  Compiler.java: 3623  clojure.lang.Compiler$InvokeExpr/eval
>>                  Compiler.java:  439  clojure.lang.Compiler$DefExpr/eval
>>                  Compiler.java: 6787  clojure.lang.Compiler/eval
>>                  Compiler.java: 6745  clojure.lang.Compiler/eval
>>                       core.clj: 3081  clojure.core/eval
>>                       main.clj:  240  clojure.main/repl/read-eval-print/fn
>>                       main.clj:  240  clojure.main/repl/read-eval-print
>>                       main.clj:  258  clojure.main/repl/fn
>>                       main.clj:  258  clojure.main/repl
>>                    RestFn.java: 1523  clojure.lang.RestFn/invoke
>>         interruptible_eval.clj:   58  clojure.tools.nrepl.middleware.interruptible-eval/evaluate/fn
>>                       AFn.java:  152  clojure.lang.AFn/applyToHelper
>>                       AFn.java:  144  clojure.lang.AFn/applyTo
>>                       core.clj:  630  clojure.core/apply
>>                       core.clj: 1868  clojure.core/with-bindings*
>>                    RestFn.java:  425  clojure.lang.RestFn/invoke
>>         interruptible_eval.clj:   56  clojure.tools.nrepl.middleware.interruptible-eval/evaluate
>>         interruptible_eval.clj:  191  clojure.tools.nrepl.middleware.interruptible-eval/interruptible-eval/fn/fn
>>         interruptible_eval.clj:  159  clojure.tools.nrepl.middleware.interruptible-eval/run-next/fn
>>                       AFn.java:   22  clojure.lang.AFn/run
>>        ThreadPoolExecutor.java: 1142  java.util.concurrent.ThreadPoolExecutor/runWorker
>>        ThreadPoolExecutor.java:  617  java.util.concurrent.ThreadPoolExecutor$Worker/run
>>                    Thread.java:  745  java.lang.Thread/run
