Re: [PR] COLLECTIONS-873: create PatriciaTrieSet [commons-collections]

via GitHub Sun, 23 Feb 2025 07:24:39 -0800


ekuvardin commented on PR #589:
URL: 
https://github.com/apache/commons-collections/pull/589#issuecomment-2676921928


   Summary
   
   **How it works**
   The main idea is to build logic on the above existing PatriciaTrie not 
reinventing a wheel.
   To do this, I create a stub Object and use it as a value in PatriciaTrie 
   ` static final Object PRESENT = new Object();`
   All put methods are similar to
   ` patriciaTrie .put(e, PRESENT)`
   
   All new classes are put on a picture below.
   
![image_2025-02-21_10-15-24](https://github.com/user-attachments/assets/24e708df-b581-4121-ac85-749b9bdbc293)
   
   ***TrieSet***
   New interface TrieSet consists of methods.
   `SortedSet<K> prefixSet(K key); `
   and extends
   `SortedSet`
   
   ***PatriciaTrieSet***
   New class PatriciaTrieSet 
   implements
   `TrieSet<String>`
   extends
   ```
   AbstractSet<String>
   Serializable 
   ```
   
   ***RangePatriciaTrieSortedSet***
   User will want to work with results of methods below in a SortedSet manner.
   ```
   SortedSet<E> subSet(E fromElement, E toElement);
   SortedSet<E> headSet(E toElement);
   SortedSet<E> tailSet(E fromElement);
   SortedSet<K> prefixSet(K key);
   ```
   To do so, I create an inner class RangePatriciaTrieSortedSet in 
PatriciaTrieSet 
   extends
   ` AbstractSet<String>` 
   implements 
   `SortedSet<String>`
   
   RangePatriciaTrieSortedSet doesn't immediately construct SortedSet; it's a 
wrapper under PatriciaSet.RangeEntryMap and PatriciaSet.PrefixRangeMap works in 
the same manner.
   
   Objects in this SortedMap are sorted according to lexicographical order.
   Objects in SortedSet just ignore value.
   
   **Benchmarks**
   Also write some benchmarks to prove that Set doesn't degrade dramatically.
   Make benchmarks. I don't find JMH benchmarks in the project and write on my 
own.
   https://github.com/ekuvardin/apache-common-collections-benchmarks
   
   <details>
     <summary>Benchmarks results</summary>
   
       Benchmark                                             (sizeFill)  Mode  
Cnt    Score   Error  Units
       PatriciaBenchmarkAddExistingValue.addPatriciaTrie          10000  avgt   
10  132,648 ? 1,718  ns/op
       PatriciaBenchmarkAddExistingValue.addPatriciaTrieSet       10000  avgt   
10  141,382 ? 2,160  ns/op
       
       Benchmark                                        (sizeFill)  
(sizeTryToAdd)  Mode  Cnt    Score   Error  Units
       PatriciaBenchmarkAddNewValue.addPatriciaTrie          10000          
100000  avgt    5  158,198 ? 1,014  ns/op
       PatriciaBenchmarkAddNewValue.addPatriciaTrieSet       10000          
100000  avgt    5  175,539 ? 3,196  ns/op
   
       Benchmark                                      (sizeFill)  
(sizeTryToRemove)  Mode  Cnt    Score   Error  Units
       PatriciaBenchmarkRemove.removePatriciaTrie         100000             
100000  avgt    5  133,199 ? 0,989  ns/op
       PatriciaBenchmarkRemove.removePatriciaTrieSet      100000             
100000  avgt    5  134,057 ? 0,960  ns/op
   
       Benchmark                                                            
(size)  Mode  Cnt        Score       Error  Units
       PatriciaBenchmarkSimpleOperation.containsPatriciaTrie                  
1000  avgt    5       48,596 ?     1,230  ns/op
       PatriciaBenchmarkSimpleOperation.containsPatriciaTrie                 
10000  avgt    5       51,476 ?     0,824  ns/op
       PatriciaBenchmarkSimpleOperation.containsPatriciaTrie                
100000  avgt    5       51,834 ?     0,413  ns/op
       PatriciaBenchmarkSimpleOperation.containsPatriciaTrieSet               
1000  avgt    5       48,540 ?     1,174  ns/op
       PatriciaBenchmarkSimpleOperation.containsPatriciaTrieSet              
10000  avgt    5       52,369 ?     0,398  ns/op
       PatriciaBenchmarkSimpleOperation.containsPatriciaTrieSet             
100000  avgt    5       52,190 ?     0,547  ns/op
       PatriciaBenchmarkSimpleOperation.firstPatriciaTrie                     
1000  avgt    5        6,270 ?     0,093  ns/op
       PatriciaBenchmarkSimpleOperation.firstPatriciaTrie                    
10000  avgt    5        6,262 ?     0,061  ns/op
       PatriciaBenchmarkSimpleOperation.firstPatriciaTrie                   
100000  avgt    5        6,230 ?     0,187  ns/op
       PatriciaBenchmarkSimpleOperation.firstPatriciaTrieSet                  
1000  avgt    5        6,817 ?     0,066  ns/op
       PatriciaBenchmarkSimpleOperation.firstPatriciaTrieSet                 
10000  avgt    5        6,751 ?     0,088  ns/op
       PatriciaBenchmarkSimpleOperation.firstPatriciaTrieSet                
100000  avgt    5        6,737 ?     0,085  ns/op
       PatriciaBenchmarkSimpleOperation.headMapPatriciaTrie                   
1000  avgt    5        6,687 ?     0,023  ns/op
       PatriciaBenchmarkSimpleOperation.headMapPatriciaTrie                  
10000  avgt    5        6,808 ?     0,231  ns/op
       PatriciaBenchmarkSimpleOperation.headMapPatriciaTrie                 
100000  avgt    5        6,819 ?     0,319  ns/op
       PatriciaBenchmarkSimpleOperation.headSetPatriciaTrieSet                
1000  avgt    5        9,267 ?     0,075  ns/op
       PatriciaBenchmarkSimpleOperation.headSetPatriciaTrieSet               
10000  avgt    5        9,343 ?     0,211  ns/op
       PatriciaBenchmarkSimpleOperation.headSetPatriciaTrieSet              
100000  avgt    5        9,330 ?     0,154  ns/op
       PatriciaBenchmarkSimpleOperation.isEmptyPatriciaTrie                   
1000  avgt    5        0,792 ?     0,019  ns/op
       PatriciaBenchmarkSimpleOperation.isEmptyPatriciaTrie                  
10000  avgt    5        0,799 ?     0,005  ns/op
       PatriciaBenchmarkSimpleOperation.isEmptyPatriciaTrie                 
100000  avgt    5        0,889 ?     0,060  ns/op
       PatriciaBenchmarkSimpleOperation.isEmptyPatriciaTrieSet                
1000  avgt    5        1,021 ?     0,180  ns/op
       PatriciaBenchmarkSimpleOperation.isEmptyPatriciaTrieSet               
10000  avgt    5        1,186 ?     0,195  ns/op
       PatriciaBenchmarkSimpleOperation.isEmptyPatriciaTrieSet              
100000  avgt    5        1,024 ?     0,155  ns/op
       PatriciaBenchmarkSimpleOperation.iteratorPatriciaTrie                  
1000  avgt    5    10405,896 ?   202,444  ns/op
       PatriciaBenchmarkSimpleOperation.iteratorPatriciaTrie                 
10000  avgt    5   123151,828 ? 34419,185  ns/op
       PatriciaBenchmarkSimpleOperation.iteratorPatriciaTrie                
100000  avgt    5  1727588,628 ? 25520,548  ns/op
       PatriciaBenchmarkSimpleOperation.iteratorPatriciaTrieSet               
1000  avgt    5    10049,078 ?   130,071  ns/op
       PatriciaBenchmarkSimpleOperation.iteratorPatriciaTrieSet              
10000  avgt    5   141096,432 ?  1309,961  ns/op
       PatriciaBenchmarkSimpleOperation.iteratorPatriciaTrieSet             
100000  avgt    5  1664385,856 ?  8252,919  ns/op
       PatriciaBenchmarkSimpleOperation.iteratorSubSetPatriciaTrie            
1000  avgt    5     6866,638 ?   235,368  ns/op
       PatriciaBenchmarkSimpleOperation.iteratorSubSetPatriciaTrie           
10000  avgt    5    27320,421 ?   359,976  ns/op
       PatriciaBenchmarkSimpleOperation.iteratorSubSetPatriciaTrie          
100000  avgt    5   718470,198 ? 16591,016  ns/op
       PatriciaBenchmarkSimpleOperation.iteratorSubSetPatriciaTrieSet         
1000  avgt    5     7169,227 ?    51,483  ns/op
       PatriciaBenchmarkSimpleOperation.iteratorSubSetPatriciaTrieSet        
10000  avgt    5    27232,041 ?   819,069  ns/op
       PatriciaBenchmarkSimpleOperation.iteratorSubSetPatriciaTrieSet       
100000  avgt    5   742578,217 ?  5893,876  ns/op
       PatriciaBenchmarkSimpleOperation.lastPatriciaTrie                      
1000  avgt    5       16,504 ?     0,077  ns/op
       PatriciaBenchmarkSimpleOperation.lastPatriciaTrie                     
10000  avgt    5       16,445 ?     0,250  ns/op
       PatriciaBenchmarkSimpleOperation.lastPatriciaTrie                    
100000  avgt    5       23,058 ?     0,407  ns/op
       PatriciaBenchmarkSimpleOperation.lastPatriciaTrieSet                   
1000  avgt    5       18,047 ?     0,138  ns/op
       PatriciaBenchmarkSimpleOperation.lastPatriciaTrieSet                  
10000  avgt    5       17,766 ?     0,502  ns/op
       PatriciaBenchmarkSimpleOperation.lastPatriciaTrieSet                 
100000  avgt    5       24,725 ?     0,513  ns/op
       PatriciaBenchmarkSimpleOperation.prefixMapPatriciaTrie                 
1000  avgt    5        8,907 ?     0,046  ns/op
       PatriciaBenchmarkSimpleOperation.prefixMapPatriciaTrie                
10000  avgt    5        8,943 ?     0,153  ns/op
       PatriciaBenchmarkSimpleOperation.prefixMapPatriciaTrie               
100000  avgt    5        8,887 ?     0,013  ns/op
       PatriciaBenchmarkSimpleOperation.prefixSetPatriciaTrieSet              
1000  avgt    5       11,372 ?     0,046  ns/op
       PatriciaBenchmarkSimpleOperation.prefixSetPatriciaTrieSet             
10000  avgt    5       11,398 ?     0,036  ns/op
       PatriciaBenchmarkSimpleOperation.prefixSetPatriciaTrieSet            
100000  avgt    5       11,341 ?     0,012  ns/op
       PatriciaBenchmarkSimpleOperation.sizePatriciaTrie                      
1000  avgt    5        0,948 ?     0,119  ns/op
       PatriciaBenchmarkSimpleOperation.sizePatriciaTrie                     
10000  avgt    5        0,803 ?     0,052  ns/op
       PatriciaBenchmarkSimpleOperation.sizePatriciaTrie                    
100000  avgt    5        0,889 ?     0,073  ns/op
       PatriciaBenchmarkSimpleOperation.sizePatriciaTrieSet                   
1000  avgt    5        1,038 ?     0,137  ns/op
       PatriciaBenchmarkSimpleOperation.sizePatriciaTrieSet                  
10000  avgt    5        1,102 ?     0,201  ns/op
       PatriciaBenchmarkSimpleOperation.sizePatriciaTrieSet                 
100000  avgt    5        1,250 ?     0,248  ns/op
       PatriciaBenchmarkSimpleOperation.subMapPatriciaTrie                    
1000  avgt    5        8,460 ?     0,051  ns/op
       PatriciaBenchmarkSimpleOperation.subMapPatriciaTrie                   
10000  avgt    5        8,982 ?     0,036  ns/op
       PatriciaBenchmarkSimpleOperation.subMapPatriciaTrie                  
100000  avgt    5        8,413 ?     0,044  ns/op
       PatriciaBenchmarkSimpleOperation.subMapPatriciaTrieContains            
1000  avgt    5       25,776 ?     0,592  ns/op
       PatriciaBenchmarkSimpleOperation.subMapPatriciaTrieContains           
10000  avgt    5       26,233 ?     0,247  ns/op
       PatriciaBenchmarkSimpleOperation.subMapPatriciaTrieContains          
100000  avgt    5       26,073 ?     0,307  ns/op
       PatriciaBenchmarkSimpleOperation.subMapPatriciaTrieSetContains         
1000  avgt    5       26,864 ?     0,812  ns/op
       PatriciaBenchmarkSimpleOperation.subMapPatriciaTrieSetContains        
10000  avgt    5       27,708 ?     0,398  ns/op
       PatriciaBenchmarkSimpleOperation.subMapPatriciaTrieSetContains       
100000  avgt    5       26,643 ?     0,286  ns/op
       PatriciaBenchmarkSimpleOperation.subSetPatriciaTrieSet                 
1000  avgt    5       11,157 ?     0,400  ns/op
       PatriciaBenchmarkSimpleOperation.subSetPatriciaTrieSet                
10000  avgt    5       11,364 ?     0,049  ns/op
       PatriciaBenchmarkSimpleOperation.subSetPatriciaTrieSet               
100000  avgt    5       11,009 ?     0,293  ns/op
       PatriciaBenchmarkSimpleOperation.tailMapPatriciaTrie                   
1000  avgt    5        6,842 ?     0,057  ns/op
       PatriciaBenchmarkSimpleOperation.tailMapPatriciaTrie                  
10000  avgt    5        6,839 ?     0,034  ns/op
       PatriciaBenchmarkSimpleOperation.tailMapPatriciaTrie                 
100000  avgt    5        6,850 ?     0,029  ns/op
       PatriciaBenchmarkSimpleOperation.tailSetPatriciaTrieSet                
1000  avgt    5        9,305 ?     0,007  ns/op
       PatriciaBenchmarkSimpleOperation.tailSetPatriciaTrieSet               
10000  avgt    5        9,508 ?     0,196  ns/op
       PatriciaBenchmarkSimpleOperation.tailSetPatriciaTrieSet              
100000  avgt    5        9,338 ?     0,036  ns/op
   </details>
   
   
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@commons.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Re: [PR] COLLECTIONS-873: create PatriciaTrieSet [commons-collections]

Reply via email to