I was wondering if anyone had a sense of performance/best practices around the 'IN' predicate.
I have a list of up to potentially ~30k keys that I want to look up in a table (typical queries will have <500, but I worry about the long tail). Most of them will not exist in the table, but roughly 10-20% will. Would it be best to:

1) Issue one query with everything:
   SELECT fields FROM table WHERE id IN (uuid1, uuid2, ...... uuid30000);

2) Split into smaller batches:
   for group_of_100 in all_30000:
       // ** Issue in parallel, or block after each one??
       SELECT fields FROM table WHERE id IN (group_of_100 uuids);

3) Something else?

My guess is that (1) is fine and that the only worry is too much data returned (which won't be a problem in this case), but I wanted to check that it isn't a C* anti-pattern first.

[Conversely, is a batch insert with up to 30k items OK?]

Thanks,
Dan
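For what it's worth, a minimal sketch of the batching in option (2) might look like the following. This is just an illustration, not a recommendation: `chunked` is a hypothetical helper, the key list is fake data, and the commented-out driver calls assume the DataStax Python driver's `session.execute_async`, which returns futures you can collect to overlap the round trips instead of blocking after each batch.

```python
from itertools import islice

def chunked(iterable, size):
    """Yield successive lists of at most `size` items from `iterable`."""
    it = iter(iterable)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch

# Hypothetical stand-in for the real ~30k key list.
keys = ["uuid%d" % i for i in range(30000)]

batches = list(chunked(keys, 100))
print(len(batches))     # 300 batches
print(len(batches[0]))  # 100 keys each

# With the Python driver, each batch could be bound to a prepared
# statement and issued asynchronously rather than serially, e.g.:
#
#   stmt = session.prepare("SELECT fields FROM table WHERE id IN ?")
#   futures = [session.execute_async(stmt, [batch]) for batch in batches]
#   rows = [row for f in futures for row in f.result()]
```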