Hi all,

Goal: I want to create a check_duplicate UDA on a blob column.
Context: I have a partition of 10 million rows, about 10 GB in size (I know this is bad). I want to check whether there are duplicates in a blob column within this partition. Each blob value is at most 256 bytes.

Question: can I use map<blob, int> as the state type? Since a blob is represented as a ByteBuffer, I suspect this won't work because the .equals method would just compare references. So can I convert the blob into a hex string first? If so, how should I do that?

Thanks,
kant
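
For reference, a minimal sketch of the hex conversion asked about above, in plain Java. The class name BlobHex and the methods toHex, countBlob, and hasDuplicate are hypothetical names made up for illustration; they are not part of Cassandra, and the bodies would still need to be adapted into the UDF state and final functions. One note on the suspicion in the question: per the JDK documentation, ByteBuffer.equals compares the remaining bytes of the two buffers rather than their references, so the hex step may not be strictly required. Equality and hashCode do depend on the buffer's current position and limit, which is why the sketch reads through a duplicate() view.

```java
import java.nio.ByteBuffer;
import java.util.Map;

// Hypothetical helper class; the names here are illustrative assumptions.
final class BlobHex {
    private static final char[] HEX = "0123456789abcdef".toCharArray();

    // Hex-encode the remaining bytes of buf without moving its position.
    static String toHex(ByteBuffer buf) {
        // duplicate() shares the content but has its own position and limit.
        ByteBuffer view = buf.duplicate();
        StringBuilder sb = new StringBuilder(view.remaining() * 2);
        while (view.hasRemaining()) {
            int b = view.get() & 0xff;
            sb.append(HEX[b >>> 4]).append(HEX[b & 0x0f]);
        }
        return sb.toString();
    }

    // Sketch of a state step: count how often each blob has been seen.
    static Map<String, Integer> countBlob(Map<String, Integer> state, ByteBuffer blob) {
        if (blob != null) {
            state.merge(toHex(blob), 1, Integer::sum);
        }
        return state;
    }

    // Sketch of a final step: true if any blob was seen more than once.
    static boolean hasDuplicate(Map<String, Integer> state) {
        for (int count : state.values()) {
            if (count > 1) {
                return true;
            }
        }
        return false;
    }
}
```

With this approach the state would be a map<text, int> rather than map<blob, int>. Keep in mind that with 10 million rows the map can grow to millions of entries of up to 512 hex characters each, so the state itself may become a memory concern.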