On Fri, Feb 19, 2016 at 5:59 PM, Dan Blum <[email protected]> wrote:

> We are already using logical time. I can definitely change to using
> multiple mutations or more likely sum up the values myself and make a
> single put call.

That sounds like a good way to go. I created an issue.

https://issues.apache.org/jira/browse/ACCUMULO-4148

> *From:* Keith Turner [mailto:[email protected]]
> *Sent:* Friday, February 19, 2016 5:57 PM
> *To:* [email protected]
> *Subject:* Re: Bug in either InMemoryMap or NativeMap
>
> On Fri, Feb 19, 2016 at 5:14 PM, Dan Blum <[email protected]> wrote:
>
> Yes, please open an issue for this.
>
> In the meantime, as a workaround is it safe to assign an arbitrary
> increasing timestamp when calling Mutation.put()? That seems the simplest
> way to get the ColumnUpdates to be treated properly.
>
> Seems like that would work, but then you may have to keep track of the
> next timestamp across processes.
>
> A possible alternative is to configure the table to use logical time and
> multiple mutations. Logical time ensures every mutation is assigned a
> unique timestamp. The following program is an example of this.
>
>     String table = getUniqueNames(1)[0];
>     Connector c = getConnector();
>     c.tableOperations().create(table,
>         new NewTableConfiguration().setTimeType(TimeType.LOGICAL).withoutDefaultIterators());
>
>     BatchWriterConfig config = new BatchWriterConfig();
>     BatchWriter writer = c.createBatchWriter(table, config);
>
>     Mutation m = new Mutation("row");
>     m.put("cf1", "cq1", new Value("abc".getBytes()));
>     writer.addMutation(m);
>     m = new Mutation("row");
>     m.put("cf1", "cq1", new Value("xyz".getBytes()));
>     writer.addMutation(m);
>     writer.close();
>
>     Scanner scanner = c.createScanner(table, Authorizations.EMPTY);
>     for (Entry<Key,Value> entry : scanner) {
>       System.out.println(entry);
>     }
>
> This program prints
>
>     row cf1:cq1 [] 2 false=xyz
>     row cf1:cq1 [] 1 false=abc
>
> Accumulo assigned the timestamps 1 and 2. In this case Accumulo will
> keep track of the next timestamp for you.
>
> If you do not use logical time, then the two mutations would likely get
> the same timestamp because they arrived in the same millisecond.
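
The other workaround mentioned above, summing the values client-side and making a single put call per column, could look roughly like the sketch below. It is plain Java with no Accumulo dependency; `ColumnSums`, `Update`, and `sumPerColumn` are hypothetical names made up for illustration, and it assumes the Combiner on the column is a simple sum, so that adding the values beforehand gives the same result.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not part of Accumulo): collapses repeated updates to the
// same column into one summed value before any Mutation is built.
final class ColumnSums {

    // One pending update: a "cf:cq" column name and the amount to add to it.
    static final class Update {
        final String column;
        final long delta;

        Update(String column, long delta) {
            this.column = column;
            this.delta = delta;
        }
    }

    // Returns one total per column, in first-seen order.
    static Map<String, Long> sumPerColumn(List<Update> updates) {
        Map<String, Long> totals = new LinkedHashMap<>();
        for (Update u : updates) {
            totals.merge(u.column, u.delta, Long::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        List<Update> updates = Arrays.asList(
                new Update("cf1:cq1", 3L),
                new Update("cf1:cq1", 4L),
                new Update("cf1:cq2", 1L));
        // Each remaining entry would become exactly one Mutation.put(...) call,
        // so no two ColumnUpdates in the mutation share a key.
        for (Map.Entry<String, Long> e : sumPerColumn(updates).entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

With the sample input this leaves a total of 7 for cf1:cq1 and 1 for cf1:cq2. The summed value would still need to be encoded in whatever format the table's Combiner expects before it is passed to Mutation.put().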
>
> *From:* Keith Turner [mailto:[email protected]]
> *Sent:* Friday, February 19, 2016 5:11 PM
> *To:* [email protected]
> *Cc:* Jonathan Lasko; Maxwell Jordan; [email protected]
> *Subject:* Re: Bug in either InMemoryMap or NativeMap
>
> On Fri, Feb 19, 2016 at 3:34 PM, Dan Blum <[email protected]> wrote:
>
> (Resend: I forgot to actually subscribe before sending originally.)
>
> I noticed a difference in behavior between our cluster and our tests
> running on MiniCluster: when multiple put() calls are made to a Mutation
> with the same CF, CQ, and CV and no explicit timestamp, on a live cluster
> only the last one is written, whereas in Mini all of them are.
>
> Of course in most cases it wouldn't matter but if there is a Combiner set
> on the column (which is the case I am dealing with) then it does.
>
> I believe the difference in behavior is due to code in NativeMap._mutate
> and InMemoryMap.DefaultMap.mutate. In the former if there are multiple
> ColumnUpdates in a Mutation they all get written with the same
> mutationCount value; I haven't looked at the C++ map code but I assume
> that this means that entries with the same CF/CQ/CV/timestamp will
> overwrite each other. In contrast, in DefaultMap multiple ColumnUpdates
> are stored with an incrementing kvCount, so the keys will necessarily be
> distinct.
>
> You made this issue easy to track down.
>
> This seems like a bug w/ the native map. The code allocates a unique int
> for each key/value in the mutation.
>
> https://github.com/apache/accumulo/blob/rel/1.6.5/server/tserver/src/main/java/org/apache/accumulo/tserver/InMemoryMap.java#L476
>
> It seems like the native map code should increment like the DefaultMap
> code does. Specifically it seems like the following code should increment
> mutationCount (coordinating with the code that calls it)
>
> https://github.com/apache/accumulo/blob/rel/1.6.5/server/tserver/src/main/java/org/apache/accumulo/tserver/NativeMap.java#L532
>
> Would you like to open an issue in Jira?
>
> My main question is: which of these is the intended behavior? We'll
> obviously need to change our code to work with NativeMap's current
> implementation regardless (since we don't want to use the Java maps on a
> live cluster), but it would be useful to know if that change is temporary
> or permanent.
>
> My secondary question is whether there is any trick to getting native maps
> to work in MiniCluster, which would be very helpful for our testing. I
> changed the configuration XML we use and I can see that it picks up the
> change - server.Accumulo logs "tserver.memory.maps.native.enabled = true,"
> but NativeMap never logs that it tries to load the library so the setting
> seems to be dropped somewhere.
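
To make the suspected mechanism concrete, here is a small stand-alone simulation of the two behaviors discussed in this thread. It is only a sketch: `MemKeySketch`, `MemKey`, and `storedEntries` are made-up names, the key is reduced to a column, a timestamp, and a counter, and the sort order is simplified, so it does not reproduce Accumulo's actual in-memory map code. It only shows that a sorted map keeps one entry when every update in a mutation carries the same counter, and keeps all of them when the counter is incremented per update.

```java
import java.util.Comparator;
import java.util.TreeMap;

// Simplified model of an in-memory sorted map; not Accumulo code.
final class MemKeySketch {

    // Stand-in for an in-memory key: column, timestamp, and a per-entry counter.
    static final class MemKey {
        final String column;
        final long timestamp;
        final int count;

        MemKey(String column, long timestamp, int count) {
            this.column = column;
            this.timestamp = timestamp;
            this.count = count;
        }
    }

    // Keys are equal only if column, timestamp, and counter all match.
    static final Comparator<MemKey> ORDER = Comparator
            .comparing((MemKey k) -> k.column)
            .thenComparingLong(k -> k.timestamp)
            .thenComparingInt(k -> k.count);

    // Writes all values of one mutation to the same column and timestamp and
    // returns how many entries survive in the map.
    static int storedEntries(boolean incrementPerUpdate, String... values) {
        TreeMap<MemKey, String> map = new TreeMap<>(ORDER);
        int count = 1;
        for (String v : values) {
            map.put(new MemKey("cf1:cq1", 0L, count), v);
            if (incrementPerUpdate) {
                count++; // distinct counter per update, as described for DefaultMap
            }
        }
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println("shared counter:       " + storedEntries(false, "1", "2", "3"));
        System.out.println("incrementing counter: " + storedEntries(true, "1", "2", "3"));
    }
}
```

With a shared counter the three puts collapse into a single entry, so a Combiner would only ever see the last value; with an incrementing counter all three entries remain available to be combined.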
