Hi Colin,

> From: Colin Vipurs [mailto:zodiac...@gmail.com]
[...]
> I've got some data that I'm doing counts on, stored in a CF as:
>
> <lhid> {
>     <rhid1> : <count>
>     <rhid2> : <count>
>     ....
> }
[...]
> <lhid> {
>     <count-rhid1> : PLACEHOLDER
>     <count-rhid2> : PLACEHOLDER
> }
>
> would be a better way of storing the data? Does anyone know the
> relative performance differences between doing the insert in the first
> instance and a delete/insert in the second?

I can't say anything about performance differences, but I think it will not matter, as you are about to insert the same amount of data. Just keep the following in mind:

- With the second scheme, it is more difficult to delete individual columns, because you have to know both the count and the rhid to construct the column name. You can iterate over the columns to find the names, of course, but this may or may not work for you. Maybe you want to store the rhids as the column values instead of the placeholders to solve that problem.

- You will need to left-pad the counts with zeros so that lexicographical ordering matches numeric ordering.

- (may be irrelevant, but anyway) there is a limit on the size of column names, which AFAIK is lower than the limit on column values.

Cheers,
Martin
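
To illustrate the left-padding point above, here is a minimal Python sketch. It is not tied to any Cassandra client. The helper names `make_column_name` and `parse_column_name`, the `<count>-<rhid>` separator and the fixed width of 10 digits are all assumptions made for the example. It also assumes the column family sorts column names as plain strings.

```python
# Hypothetical helpers, not part of any Cassandra client API.
# Assumption: column names are compared as plain strings.
COUNT_WIDTH = 10  # assumed upper bound on the number of digits in a count


def make_column_name(count, rhid):
    """Build '<count>-<rhid>' with the count left-padded with zeros."""
    if count < 0 or count >= 10 ** COUNT_WIDTH:
        raise ValueError("count does not fit into the chosen width")
    return f"{count:0{COUNT_WIDTH}d}-{rhid}"


def parse_column_name(name):
    """Split a padded column name back into (count, rhid)."""
    count, rhid = name.split("-", 1)  # split only at the first '-'
    return int(count), rhid


# Without padding, string order does not match numeric order.
unpadded = sorted(["9-a", "10-b", "100-c"])

# With padding, string order and numeric order agree.
padded = sorted(make_column_name(c, r) for c, r in [(100, "c"), (9, "a"), (10, "b")])

print(unpadded)
print(padded)
```

The fixed width has to be chosen up front: if a count ever needs more digits than the width allows, the ordering breaks again, which is why the sketch raises an error instead of silently producing a longer name.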