we figured it out - i was using client_sync.AccumuloWriter.add_mutations, and eventually calling close.

add_mutations calls client.update. but some example client python code says to use updateAndFlush.

if i add a call to writer.client.flush(writer.resource_id) after the add_mutations, then the updates are visible within milliseconds

On Mon, Oct 31, 2022 at 8:59 PM sjtsp2008 <sjtsp2...@gmail.com> wrote:

> i don’t think it’s anything about the data in the table. if i run the
> same set of mutations in jython, then they are instantly available, for
> both jython and python/thrift client
>
> On Mon, Oct 31, 2022 at 6:43 PM dev1 <d...@etcoleman.com> wrote:
>
>> Could it be data dependent? For example, if you have a lot of data that
>> has passed its TTL you may be scanning across a lot of data to find data
>> that is eligible to return. Similar situations could have to do with
>> visibilities that you can’t access… Or, maybe it’s related to your scan
>> range? You are scanning across a lot of data, but most of the rows do not
>> match your scan criteria?
>>
>> If you think it could be related to age-off rather than visibilities or
>> scan range, can you run a full compaction on the table and see if that
>> improves things? That would eliminate data that has aged off and reduce
>> the amount of data that must be scanned. If you can, use hdfs to determine
>> the directory size of the table – it would be under
>> /accumulo/tables/[TABLE-ID] then run the compaction (compact -w -t
>> tablename) and when it finishes and the accumulo gc runs to remove the
>> “old” files and check the size again. That should give you an idea of how
>> much data was removed by the compaction.
>>
>> Ed Coleman
>>
>> *From:* Christopher <ctubb...@apache.org>
>> *Sent:* Monday, October 31, 2022 5:42 PM
>> *To:* accumulo-user <user@accumulo.apache.org>
>> *Subject:* Re: very-high latency updates through thrift proxy server
>>
>> That's odd. They should be available immediately. Are you using
>> replication? What kind of scan are you doing? Is it an offline scanner, or
>> isolated scanner?
>>
>> On Mon, Oct 31, 2022, 15:41 Jeff Turner <sjtsp2...@gmail.com> wrote:
>>
>> any ideas why mutations via thrift proxy server would take 120 to 150
>> seconds to show up in a scan?
>>
>> accumulo 1.9.3
>>
>> the mutations have all been submitted through thrift (from python), the
>> writer has been closed, and the writing process has exited.
>>
>> 95% of the time, the latency is between 120.1 and 120.5 seconds.
>> occasionally the latency is 150 seconds.
>>
>> there don't appear to be many configuration options for the proxy
>> server. and other people i have talked to said that they see their
>> updates through thrift proxy immediately.
>>
>> updates via java/jython have millisecond latency. (for a while i had
>> been trying to blame tservers or the main server (maybe some delay in
>> processing compactions, ...). i don't think that's the issue)
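
for anyone finding this thread later, here is a rough sketch of the fix described in the top message: call flush on the writer right after add_mutations instead of relying on close alone. the classes below are stand-ins so the snippet runs without a proxy server; RecordingClient, SketchWriter and add_mutations_and_flush are hypothetical names, and only the attribute names writer.client and writer.resource_id and the calls update/flush come from the message itself. the real client library may differ in details.

```python
class RecordingClient:
    """Hypothetical stand-in for the thrift proxy client; it only records calls."""

    def __init__(self):
        self.calls = []

    def update(self, writer_id, mutations):
        # In the thread, add_mutations ends up calling client.update.
        self.calls.append(("update", writer_id, list(mutations)))

    def flush(self, writer_id):
        # The explicit flush that made the updates visible within milliseconds.
        self.calls.append(("flush", writer_id))


class SketchWriter:
    """Hypothetical stand-in for the writer object mentioned in the thread."""

    def __init__(self, client, resource_id):
        self.client = client
        self.resource_id = resource_id

    def add_mutations(self, mutations):
        self.client.update(self.resource_id, mutations)


def add_mutations_and_flush(writer, mutations):
    """Submit mutations, then flush the writer explicitly (assumed helper)."""
    writer.add_mutations(mutations)
    writer.client.flush(writer.resource_id)


if __name__ == "__main__":
    client = RecordingClient()
    writer = SketchWriter(client, "writer-1")
    add_mutations_and_flush(writer, ["mutation-a", "mutation-b"])
    for call in client.calls:
        print(call)
```

with a real proxy connection, the same helper would be called wherever add_mutations was called before; the writer still needs to be closed at the end as usual.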