Github user fhueske commented on the pull request:

    https://github.com/apache/flink/pull/1517#issuecomment-186219363
  
    OK, then let's keep the data in one partition for now. In case of var-length 
updates, this can default to a memory usage / combine behavior which is 
somewhat similar to the sort-based strategy: filling the memory with records 
and emitting them (putting compaction aside).
    
    I'll review the PR once more and will run a few end-to-end benchmarks as well.
    What kind of benchmarks have you done so far? 
    - Did you check the combine rate (input / output ratio) compared to the 
sort-based strategy? 
    - How much memory did you use for tests (upper bound)? Did you vary the 
memory?
    - Have you checked heap memory consumption / GC activity compared to the 
sort-based strategy?
    
    It might take a few more days before I actually get to this, but it is on 
my list.
    
    Thanks,
    Fabian

