Re: very-high latency updates through thrift proxy server

sjtsp2008 Tue, 01 Nov 2022 14:00:28 -0700

we figured it out - i was using client_sync.AccumuloWriter.add_mutations,
and eventually calling close.


add_mutations calls client.update.  but some example client python code
says to use updateAndFlush.

if i add a call to writer.client.flush(writer.resource_id) after the
add_mutations, then the updates are visible within milliseconds
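
for anyone who finds this later, here is a minimal sketch of the pattern. it is not the real proxy client: FakeProxyClient and Writer are hypothetical stand-ins that only mimic the behaviour described above (update buffers, flush makes the data visible), and write_and_flush is a helper name made up for illustration. the attribute names client and resource_id are taken from the call above. the ~120 second delay reported earlier in the thread looks consistent with the BatchWriter default max latency of two minutes, though that connection is an assumption and was not confirmed here.

```python
class FakeProxyClient:
    """Hypothetical stand-in for the Thrift proxy client (not the real API)."""

    def __init__(self):
        self._pending = {}   # writer id -> list of buffered mutation batches
        self.visible = []    # batches a scan would be able to see

    def update(self, writer_id, cells):
        # Buffers only; nothing becomes visible until flush is called.
        self._pending.setdefault(writer_id, []).append(cells)

    def flush(self, writer_id):
        # Pushes everything buffered for this writer out to the "table".
        self.visible.extend(self._pending.pop(writer_id, []))


class Writer:
    """Hypothetical writer wrapper exposing .client and .resource_id."""

    def __init__(self, client, resource_id):
        self.client = client
        self.resource_id = resource_id

    def add_mutations(self, mutations):
        # Mirrors the behaviour described above: this only calls update.
        self.client.update(self.resource_id, mutations)


def write_and_flush(writer, mutations):
    """Add mutations, then flush explicitly so they are visible right away."""
    writer.add_mutations(mutations)
    writer.client.flush(writer.resource_id)
```

with the real client, the equivalent is the extra writer.client.flush(writer.resource_id) line after add_mutations.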


On Mon, Oct 31, 2022 at 8:59 PM sjtsp2008 <sjtsp2...@gmail.com> wrote:

> i don’t think it’s anything about the data in the table.  if i run the
> same set of mutations in jython, then they are instantly available, for
> both jython and python/thrift client
>
>
> On Mon, Oct 31, 2022 at 6:43 PM dev1 <d...@etcoleman.com> wrote:
>
>> Could it be data dependent?  For example, if you have a lot of data that
>> has passed its TTL you may be scanning across a lot of data to find data
>> that is eligible to return.  Similar situations could have to do with
>> visibilities that you can’t access…  Or, maybe it’s related to your scan
>> range?  You are scanning across a lot of data, but most of the rows do not
>> match your scan criteria?
>>
>>
>>
>> If you think it could be related to age-off rather than visibilities or
>> scan range, can you run a full compaction on the table and see if that
>> improves things?  That would eliminate data that has aged off and reduce
>> the amount of data that must be scanned.  If you can, use hdfs to determine
>> the directory size of the table (it would be under
>> /accumulo/tables/[TABLE-ID]), then run the compaction (compact -w -t
>> tablename).  Once it finishes and the accumulo gc has run to remove the
>> “old” files, check the size again.  That should give you an idea of how
>> much data was removed by the compaction.
>>
>>
>>
>> Ed Coleman
>>
>>
>>
>> *From:* Christopher <ctubb...@apache.org>
>> *Sent:* Monday, October 31, 2022 5:42 PM
>> *To:* accumulo-user <user@accumulo.apache.org>
>> *Subject:* Re: very-high latency updates through thrift proxy server
>>
>>
>>
>> That's odd. They should be available immediately. Are you using
>> replication? What kind of scan are you doing? Is it an offline scanner, or
>> isolated scanner?
>>
>> On Mon, Oct 31, 2022, 15:41 Jeff Turner <sjtsp2...@gmail.com> wrote:
>>
>> any ideas why mutations via thrift proxy server would take 120 to 150
>> seconds to show up in a scan?
>>
>> accumulo 1.9.3
>>
>> the mutations have all been submitted through thrift (from python), the
>> writer has been closed, and the writing process has exited.
>>
>> 95% of the time, the latency is between 120.1 and 120.5 seconds.
>> occasionally the latency is 150 seconds.
>>
>> there don't appear to be many configuration options for the proxy
>> server.  and other people i have talked to said that they see their
>> updates through thrift proxy immediately.
>>
>> updates via java/jython have millisecond latency.  (for a while i had
>> been trying to blame tservers or the main server (maybe some delay in
>> processing compactions, ...).  i don't think that's the issue)
>>
>>
>>
>>
