> On Oct. 13, 2014, 10:26 p.m., Xuefu Zhang wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveBaseFunctionResultList.java,
> >  line 49
> > <https://reviews.apache.org/r/26661/diff/1/?file=719717#file719717line49>
> >
> >     Nit: could we put -1L or skip it and suppress the warning instead of a 
> > random number for this?

Sure. Will change it to -1L


> On Oct. 13, 2014, 10:26 p.m., Xuefu Zhang wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveKVResultCache.java, 
> > line 47
> > <https://reviews.apache.org/r/26661/diff/1/?file=719718#file719718line47>
> >
> >     Do you mean one thread can access hasNext() and next(), while multiple 
> > threads can concurrently access add()? The doc here is a little unclear on 
> > this.

Yes, that's what I meant. Let me enhance it a little.

Should we make the class fully thread safe instead?


> On Oct. 13, 2014, 10:26 p.m., Xuefu Zhang wrote:
> > ql/src/test/org/apache/hadoop/hive/ql/exec/spark/TestHiveKVResultCache.java,
> >  line 143
> > <https://reviews.apache.org/r/26661/diff/1/?file=719719#file719719line143>
> >
> >     Same here.

Changed to -1L


> On Oct. 13, 2014, 10:26 p.m., Xuefu Zhang wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveKVResultCache.java, 
> > line 158
> > <https://reviews.apache.org/r/26661/diff/1/?file=719718#file719718line158>
> >
> >     Same here, cursor and container are protected resources and they need 
> > to be in the synchnized block.

Same assumption as above. The add() method never changes "cursor". So if 
"cursor" is not 0, then it must be the iteration thread itself, so we don't 
need the synchronization.


> On Oct. 13, 2014, 10:26 p.m., Xuefu Zhang wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveKVResultCache.java, 
> > line 129
> > <https://reviews.apache.org/r/26661/diff/1/?file=719718#file719718line129>
> >
> >     I think the condition check should be also in the synchronized block, 
> > as it accesses protected resources: container.rowCount() and cursor.

I assume just one thread does the iteration, while there could be multiple 
threads doing add() concurrently. So if cursor is not 0, other threads won't 
touch "container". They will use the backup one.
If cursor is 0, other threads are changing rowCount, and this condition check 
doesn't return true, the next block should get the right value.  I need to make 
a little change here.

Should we make the whole class thread safe?


- Jimmy


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26661/#review56449
-----------------------------------------------------------


On Oct. 13, 2014, 9:33 p.m., Jimmy Xiang wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/26661/
> -----------------------------------------------------------
> 
> (Updated Oct. 13, 2014, 9:33 p.m.)
> 
> 
> Review request for hive and Xuefu Zhang.
> 
> 
> Bugs: HIVE-7873
>     https://issues.apache.org/jira/browse/HIVE-7873
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> Re-enabled lazy HiveBaseFunctionResultList. A separate RowContainer is used 
> to work around the no-write-after-read limitation of RowContainer. The patch 
> also fixed a concurrency issue in HiveKVResultCache. Synchronized is used 
> instead of reentrant lock since I assume there won't be many threads to 
> access the cache.
> 
> 
> Diffs
> -----
> 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveBaseFunctionResultList.java
>  0df2580 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveKVResultCache.java 
> a6b9037 
>   ql/src/test/org/apache/hadoop/hive/ql/exec/spark/TestHiveKVResultCache.java 
> 496a11f 
> 
> Diff: https://reviews.apache.org/r/26661/diff/
> 
> 
> Testing
> -------
> 
> Unit test, some simple perf test.
> 
> 
> Thanks,
> 
> Jimmy Xiang
> 
>

Reply via email to