[ 
https://issues.apache.org/jira/browse/HIVE-12307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15025157#comment-15025157
 ] 

Alan Gates commented on HIVE-12307:
-----------------------------------

HiveEndPoint.TransactionBatchImpl:
* I'm +1 on making this package level, but does it do any good to make the 
class non-private and leave the constructor private?
* Why did you make the isClosed value volatile?  Are you assuming users will 
use this in a multi-threaded way?  I think this class is thread safe in as much 
as users can have per-thread instances, but I don't think the class is 
re-entrant itself.  I see in other comments you're expecting heartbeating to be 
done in a separate thread.  Is that the only case you think this will be used?  
If so, it seems we should comment this well so people know what is and isn't 
safe to do multi-threaded.
* write()
** I don't understand why when the code gets a SerializationError it should 
still declare success and re-throw.  I see re-throwing so the user can see the 
error, but don't we still need to mark the batch a failure and abort?  How can 
the caller recover here?
** Since the logic here is getting more complex I would suggest that we 
refactor write(byte) to be simply: write(Collections.singleList(byte)).  This 
way we only have the logic in one place.
** A thought on the produceFault() calls.  Would it be better to create a 
special RecordWriter implementation that wraps other RecordWriters and throws 
when you want it to?  That way you don't have to pollute this code with 
produceFault() calls and you can eventually test errors at more locations.
* abortImpl()
** We should look at adding a call to rollback a group of transactions, so that 
large aborted batches don't cause a storm of metastore calls.  That obviously 
doesn't need to be part of this patch.



> Streaming API TransactionBatch.close() must abort any remaining transactions 
> in the batch
> -----------------------------------------------------------------------------------------
>
>                 Key: HIVE-12307
>                 URL: https://issues.apache.org/jira/browse/HIVE-12307
>             Project: Hive
>          Issue Type: Bug
>          Components: HCatalog, Transactions
>    Affects Versions: 0.14.0
>            Reporter: Eugene Koifman
>            Assignee: Eugene Koifman
>         Attachments: HIVE-12307.patch
>
>
> When the client of TransactionBatch API encounters an error it must close() 
> the batch and start a new one.  This prevents attempts to continue writing to 
> a file that may damaged in some way.
> The close() should ensure to abort the any txns that still remain in the 
> batch and close (best effort) all the files it's writing to.  The batch 
> should also put itself into a mode where any future ops on this batch fail.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to