Could it be data dependent?  For example, if you have a lot of data that has 
passed its TTL, the scan may have to read across a lot of aged-off data before 
it finds anything that is eligible to return.  A similar situation can occur 
with visibilities that you can’t access…  Or maybe it’s related to your scan 
range?  You are scanning across a lot of data, but most of the rows do not 
match your scan criteria?

If you think it could be related to age-off rather than visibilities or scan 
range, can you run a full compaction on the table and see if that improves 
things?  That would eliminate data that has aged off and reduce the amount of 
data that must be scanned.  If you can, use hdfs to determine the directory 
size of the table – it will be under /accumulo/tables/[TABLE-ID].  Then run the 
compaction (compact -w -t tablename), and once it finishes and the accumulo gc 
has removed the “old” files, check the size again.  That should give you an 
idea of how much data was removed by the compaction.
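
For reference, the sequence would look roughly like this (assuming the default 
layout where table files live under /accumulo/tables in hdfs; substitute your 
actual table id and table name):

  hdfs dfs -du -s -h /accumulo/tables/[TABLE-ID]    (size before)

  compact -w -t tablename                           (in the accumulo shell)

  hdfs dfs -du -s -h /accumulo/tables/[TABLE-ID]    (size after the gc has run)

The difference between the before and after sizes is roughly how much aged-off 
data the compaction dropped.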

Ed Coleman

From: Christopher <ctubb...@apache.org>
Sent: Monday, October 31, 2022 5:42 PM
To: accumulo-user <user@accumulo.apache.org>
Subject: Re: very-high latency updates through thrift proxy server

That's odd. They should be available immediately. Are you using replication? 
What kind of scan are you doing? Is it an offline scanner, or isolated scanner?

On Mon, Oct 31, 2022, 15:41 Jeff Turner <sjtsp2...@gmail.com> wrote:
any ideas why mutations via thrift proxy server would take 120 to 150
seconds to show up in a scan?

accumulo 1.9.3

the mutations have all been submitted through thrift (from python), the
writer has been closed, and the writing process has exited.
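
roughly, the write path looks like this (a minimal sketch; i'm assuming the 
pyaccumulo client on top of the proxy here, and host/port/credentials/table 
are placeholders):

  # minimal sketch, assuming the pyaccumulo client against the thrift proxy
  from pyaccumulo import Accumulo, Mutation

  conn = Accumulo(host="proxy-host", port=42424, user="user", password="secret")

  writer = conn.create_batch_writer("mytable")
  m = Mutation("row1")
  m.put(cf="fam", cq="qual", val="value")
  writer.add_mutation(m)
  writer.close()   # writer closed, then the process exits

  # ...and a scan of "mytable" doesn't show the new row for ~120-150 seconds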

95% of the time, the latency is between 120.1 and 120.5 seconds.
occasionally the latency is 150 seconds.

there don't appear to be many configuration options for the proxy
server.  and other people i have talked to said that they see their
updates through thrift proxy immediately.

updates via java/jython have millisecond latency.  (for a while i had
been trying to blame tservers or the main server (maybe some delay in
processing compactions, ...).  i don't think that's the issue)


