OK, that is a good question; it points directly at our scenarios and intentions.


I would say data from an online environment always comes from logs or an RDBMS, 
which correspond to immutable data and mutable data. Immutable data is static 
and should not have update/delete demands. For mutable data, though, there can 
be multiple updates to one row in a short time (like an order status changing), 
while the proportion of deletions is usually small, since keeping rows gives 
better analysis; that depends on the case. For years, Hive's inability to handle 
mutable data has pushed us toward heterogeneous systems like Kudu and Druid, 
which cost more in data transfer, learning, and machines. A common data lake 
solution would be better for scenarios with latency of minutes.
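
To make the "multiple updates to one row in a short time" case concrete, here 
is a minimal sketch of one possible way to collapse the changes to each primary 
key within a single commit window, keeping only the last one. It is an 
illustration under stated assumptions, not the design from the proposal and not 
an Iceberg API: the `Change` tuple, the operation names and `collapse_changes` 
are all hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical change event: (operation, primary key, row payload).
# The payload is None for deletes. None of these names come from Iceberg.
Change = Tuple[str, str, Optional[dict]]


def collapse_changes(changes: List[Change]) -> Dict[str, Optional[dict]]:
    """Keep only the last change per primary key, in arrival order.

    A value of None means the key ends the batch in a deleted state.
    """
    net: Dict[str, Optional[dict]] = {}
    for op, pk, row in changes:
        if op == "upsert":
            net[pk] = row
        elif op == "delete":
            net[pk] = None
        else:
            raise ValueError(f"unknown operation: {op}")
    return net


# An order whose status changes several times, is deleted, then re-inserted,
# all inside one commit window.
batch = [
    ("upsert", "order-1", {"status": "created"}),
    ("upsert", "order-1", {"status": "paid"}),
    ("delete", "order-1", None),
    ("upsert", "order-1", {"status": "reopened"}),
    ("delete", "order-2", None),
]
result = collapse_changes(batch)
```

With this "last change wins" assumption, `order-1` ends the window with its 
latest payload and `order-2` ends it deleted; whether intermediate states should 
also be kept is a design decision this sketch does not address.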




I described our production scenario with an aggregation demo in my proposal; 
you can read it and talk to me whenever convenient. Thank you. 










At 2020-04-07 08:29:52, "Jaguar Xiong" <xiong.jag...@gmail.com> wrote:

Hi, 
  I'm new to the community, but have kept an eye on this project for a while now.
  Just a quick comment on insertion/deletion. Do we expect multiple 
modifications of a single row (with a primary key)?
  For example, a row with pk(x) is inserted (1st time), deleted, and inserted 
(2nd time) in a single snapshot.
  How would such a case be recorded in a snapshot?


Best!
Jaguar Xiong


马进 <majin1...@163.com> wrote on Mon, Apr 6, 2020 at 9:08 PM:

hey guys, 


I wrote two proposals about upsertion/deletion for tables with primary key:


dealing mutable data with primary key in iceberg
Write conflicts in upserting/deleting situation of iceberg

The first one focuses on the design of upsertion/deletion for tables with a pk, 
which covers most production scenarios and of course aims for good read and 
write performance. The second discusses write-conflict situations and their 
solutions for upsertion/deletion. 


I raised these two proposals to meet our production demands. I have talked with 
Opennix and made a rough plan for this, and I anticipate there will be 
discussions about the goals and roadmap of Iceberg and its relation to Hudi, 
Delta, etc. I would be grateful if we could make these things clear, and for 
your valuable opinions.




thanks.






 





--

jaguar·run for ever
