And is this intended behavior? Either this is something we need to document better (or I've just missed it) or I'll file a JIRA.
I have a <uniqueKey> field whose type is "lowercase", which is just a KeywordTokenizer followed by a LowerCaseFilter. With this definition, duplicate IDs are not detected.

I'm at a client and can't dig too much this morning, but I'm guessing that the check for whether to replace a document happens _before_ the id goes through the analysis chain, which is a surprise to me. So if the ID contains upper-case letters, the existing document is not replaced and we end up with N live docs with the same ID.

I'd argue this is a case that should be supported, on the basis of my "rule of thumb" that anything a human might enter should at least not be case-sensitive on search. Since the <uniqueKey> is very often something like a catalog number or similar, at least lowercasing should be supported. Of course, what that means if/when an analysis chain is more complex is...er...interesting.

The question, of course, is whether this is expected behavior and I have to, you know, remember it, or whether I should file a JIRA.

Thanks!
Erick
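
P.S. To make the guess above concrete, here is a toy model of what I suspect is going on. This is a sketch only, not Solr code: ToyIndex, analyze() and the analyze_before_check flag are all made-up names, and the real update path may work quite differently. The assumption is that the index stores the analyzed (lowercased) id, while the "is there already a doc with this id?" lookup uses the raw id.

```python
# Toy model only; none of these names exist in Solr or Lucene.

def analyze(raw_id):
    """Stand-in for KeywordTokenizer + lowercasing: one token, lowercased."""
    return raw_id.lower()


class ToyIndex:
    def __init__(self, analyze_before_check):
        # True  -> the replace check uses the analyzed id (what I expected)
        # False -> the replace check uses the raw id (what I suspect happens)
        self.analyze_before_check = analyze_before_check
        self.live_docs = []  # list of (indexed_term, doc) pairs

    def add(self, doc):
        indexed_term = analyze(doc["id"])
        lookup = indexed_term if self.analyze_before_check else doc["id"]
        # "Replace" means: drop any live doc whose indexed term matches the lookup.
        self.live_docs = [(t, d) for (t, d) in self.live_docs if t != lookup]
        self.live_docs.append((indexed_term, doc))

    def count(self, raw_id):
        """Number of live docs a (lowercasing) search on this id would find."""
        term = analyze(raw_id)
        return sum(1 for t, _ in self.live_docs if t == term)


suspected = ToyIndex(analyze_before_check=False)
expected = ToyIndex(analyze_before_check=True)
for _ in range(3):
    suspected.add({"id": "ABC-123"})
    expected.add({"id": "ABC-123"})

print(suspected.count("ABC-123"))  # raw "ABC-123" never matches indexed "abc-123"
print(expected.count("ABC-123"))
```

In this model an all-lowercase id is replaced correctly either way, because the raw and analyzed forms are identical. That matches the symptom of only seeing duplicates when the ID contains upper-case letters.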
