meeting90 commented on issue #7836:
URL: https://github.com/apache/hudi/issues/7836#issuecomment-1418426234

   > The timestamp of a record comes from/or equals to the Instant timestamp.
   
   Let me specify my question with a case using Spark SQL:
   
   - Step 1: create a table at the location where the Hudi table is stored
   
   `create table hudi_trips_cow using hudi location '/user/hive/hudi/hudi_trips_cow';`
   
   - Step 2: insert one record into hudi_trips_cow with uuid = insert_1, then 
select the data after the insert. I get the inserted record with 
_hoodie_commit_time  = **20230202164941321**
   
   ```
   insert into hudi_trips_cow select 0.21624150367601136, 0.14285051259466197, 
'driver-213', 0.5890949624813784, 0.0966823831927115, 93.56018115236618, 
'rider-213',   1674831576026, 'insert_1', 
'americas/united_states/san_francisco';
   
   select * from hudi_trips_cow where uuid='insert_1';
   
    > 20230202164941321 20230202164941321_0_5   insert_1        
americas/united_states/san_francisco    
565cb547-3b0b-4c05-b4aa-7d1d2434b316-0_0-21-28_20230202164941321.parquet        
0.21624150367601136     0.14285051259466197     driver-213      
0.5890949624813784      0.0966823831927115      93.56018115236618       
rider-213       1674831576026   insert_1        
americas/united_states/san_francisco
   ```
   
   
   - Step 3: update the record (uuid equal to insert_1), changing two columns 
(rider and end_lat), then select the data after the update. I get the updated 
record with _hoodie_commit_time  = **20230202165000548**
   
   ```
   update hudi_trips_cow set rider = 'rider-213-update', end_lat = end_lat*2 
where uuid ='insert_1';
   select `_hoodie_commit_time`, rider, end_lat, uuid from hudi_trips_cow where 
uuid='insert_1';
   > 20230202165000548  rider-213-update        1.1781899249627568      insert_1
   ```
   
   
   
   **As you can see, I can only get one "_hoodie_commit_time" for the record 
with uuid="insert_1". After several updates, the previous 
"_hoodie_commit_time" = 20230202164941321 is unavailable to me unless I 
memorize the value returned after the insert. How can I get all the 
"_hoodie_commit_time" values for one record (in my case uuid="insert_1"), so 
that I can write time travel queries for both timestamps, 20230202164941321 
and 20230202165000548?**
   
   ```
   -- time travel queries
   select * from hudi_trips_cow timestamp as of '20230202164941321' where uuid 
= 'insert_1';
   >  20230202164941321 20230202164941321_0_5   insert_1        
americas/united_states/san_francisco    
565cb547-3b0b-4c05-b4aa-7d1d2434b316-0_0-21-28_20230202164941321.parquet        
0.21624150367601136     0.14285051259466197     driver-213      
0.5890949624813784      0.0966823831927115      93.56018115236618       
rider-213       1674831576026   insert_1        
americas/united_states/san_francisco
   
   
   select * from hudi_trips_cow timestamp as of '20230202165000548' where uuid 
= 'insert_1';
   >20230202165000548   20230202165000548_0_5   insert_1        
americas/united_states/san_francisco    
565cb547-3b0b-4c05-b4aa-7d1d2434b316-0_0-82-140_20230202165000548.parquet       
0.21624150367601136     0.14285051259466197     driver-213      
1.1781899249627568      0.0966823831927115      93.56018115236618       
rider-213-update        1674831576026   insert_1        
americas/united_states/san_francisco
   
   ```
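
For reference, here is a minimal Python sketch of how I would build these time travel queries once the commit times are known. The helper names `parse_instant` and `time_travel_query` are hypothetical, and the list of commit times is hard-coded from the outputs above. It assumes the 17-digit `yyyyMMddHHmmssSSS` instant format seen in this table, and it does not solve the actual problem of discovering the commit times for a record.

```python
from datetime import datetime

# Commit times copied by hand from the query outputs above (assumption:
# 17-digit instants in yyyyMMddHHmmssSSS format).
COMMIT_TIMES = ["20230202164941321", "20230202165000548"]


def parse_instant(instant: str) -> datetime:
    """Parse a 17-digit instant into a datetime with millisecond precision."""
    # %f accepts 1 to 6 digits, so the trailing 3 digits are read as milliseconds.
    return datetime.strptime(instant, "%Y%m%d%H%M%S%f")


def time_travel_query(table: str, instant: str, uuid: str) -> str:
    """Build the Spark SQL time travel query used in this comment."""
    return (
        f"select * from {table} timestamp as of '{instant}' "
        f"where uuid = '{uuid}'"
    )


if __name__ == "__main__":
    for instant in COMMIT_TIMES:
        print(parse_instant(instant).isoformat(timespec="milliseconds"))
        print(time_travel_query("hudi_trips_cow", instant, "insert_1"))
```

Each generated statement can be pasted into the Spark SQL shell as is. The values are interpolated directly into the string, so this is only suitable for trusted, hand-entered inputs.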


