[ 
https://issues.apache.org/jira/browse/HIVE-7990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mostafa Mokhtar updated HIVE-7990:
----------------------------------
    Description: For queries with rollup and cube   (was: When loading into an 
un-partitioned ORC table, the WriterImpl$StructTreeWriter.write method is 
synchronized.

When hive.optimize.sort.dynamic.partition is enabled, the current thread is 
the only writer, so the synchronization is not needed.

Also, checking memory on every row is overkill; this could be done once per 
1K rows or so.

{code}
  public void addRow(Object row) throws IOException {
    synchronized (this) {
      treeWriter.write(row);
      rowsInStripe += 1;
      if (buildIndex) {
        rowsInIndex += 1;

        if (rowsInIndex >= rowIndexStride) {
          createRowIndexEntry();
        }
      }
    }
    memoryManager.addedRow();
  }
{code}

This can improve ORC load performance by 7%.

{code}
Stack Trace     Sample Count    Percentage(%)
WriterImpl.addRow(Object)       5,852   65.782
   WriterImpl$StructTreeWriter.write(Object)    5,163   58.037
   MemoryManager.addedRow()     666     7.487
      MemoryManager.notifyWriters()     648     7.284
         WriterImpl.checkMemory(double) 645     7.25
            WriterImpl.flushStripe()    643     7.228
               WriterImpl$StructTreeWriter.writeStripe(OrcProto$StripeFooter$Builder, int)     584     6.565
{code}



)
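The previous description's suggestion of checking memory once per 1K rows, rather than on every row, could be sketched roughly as follows. This is a minimal illustration and not Hive's actual MemoryManager: the class name BatchedMemoryCheck, the constant ROWS_BETWEEN_CHECKS, and the counting stand-in for the memory check are all hypothetical, and the sketch assumes a single writer thread, as in the hive.optimize.sort.dynamic.partition case described above.

```java
/** Hypothetical sketch: amortize a costly per-row check over a batch of rows. */
public class BatchedMemoryCheck {
    // Assumed batch size, taken from the "per 1K rows" suggestion in the description.
    static final int ROWS_BETWEEN_CHECKS = 1000;

    private int rowsSinceCheck = 0;
    private int checksPerformed = 0;

    /** Called once per written row; runs the costly check only once per batch. */
    public void addedRow() {
        if (++rowsSinceCheck >= ROWS_BETWEEN_CHECKS) {
            rowsSinceCheck = 0;
            checkMemory();
        }
    }

    /** Stand-in for the real memory check; here it only counts invocations. */
    private void checkMemory() {
        checksPerformed++;
    }

    public int getChecksPerformed() {
        return checksPerformed;
    }

    public static void main(String[] args) {
        BatchedMemoryCheck manager = new BatchedMemoryCheck();
        for (int i = 0; i < 10_000; i++) {
            manager.addedRow();
        }
        // 10,000 rows with a batch size of 1,000 trigger 10 checks.
        System.out.println(manager.getChecksPerformed());
    }
}
```

The trade-off is that memory pressure may go unnoticed for up to ROWS_BETWEEN_CHECKS rows, so the batch size would need to be weighed against how quickly row data can grow.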

> With fetch column stats disabled number of elements in grouping set is not 
> taken into account
> ---------------------------------------------------------------------------------------------
>
>                 Key: HIVE-7990
>                 URL: https://issues.apache.org/jira/browse/HIVE-7990
>             Project: Hive
>          Issue Type: Bug
>          Components: Statistics
>    Affects Versions: 0.13.1
>            Reporter: Mostafa Mokhtar
>            Assignee: Prasanth J
>              Labels: performance
>             Fix For: 0.14.0
>
>
> For queries with rollup and cube 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
