yangxk1 opened a new pull request, #697:
URL: https://github.com/apache/incubator-graphar/pull/697

   ### Reason for this PR
   
   This PR addresses the feature request in #397 to support reading a specific set of properties.
   
   ### What changes are included in this PR?
   
   I have modified `VertexPropertyArrowChunkReader::Make` so that when a single `property_name` is provided, it no longer loads every property in the corresponding `propertyGroup`; it reads only the **internal ID** and the specified **property column**.
   Additionally, a `std::vector<std::string>` of `property_names` can be passed to read a specific set of properties, provided they all belong to the same `propertyGroup`.
   
   Essentially, these `property_names` are added to `FilterOptions.columns`, so the column selection is pushed down to the reader (a projection pushdown, as opposed to a row-level predicate).
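
   The column selection described above can be sketched as follows. This is a minimal, library-independent illustration, not the code in this PR: `PropertyGroupSketch`, `SelectColumns`, `SelectAllColumns`, and the `internal_id` column name are hypothetical stand-ins, and the real reader may name or validate columns differently.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Placeholder name for the internal ID column (an assumption; the real
// column name is defined by the library, not by this sketch).
const std::string kInternalIdColumn = "internal_id";

// Hypothetical stand-in for a property group: only its property names.
struct PropertyGroupSketch {
  std::vector<std::string> property_names;

  bool Contains(const std::string& name) const {
    return std::find(property_names.begin(), property_names.end(), name) !=
           property_names.end();
  }
};

// Builds the column list for a selective read: the internal ID followed by
// the requested properties. Returns false and leaves *columns untouched if
// any requested property is not in the group, since all requested names
// must belong to the same property group.
bool SelectColumns(const PropertyGroupSketch& group,
                   const std::vector<std::string>& requested,
                   std::vector<std::string>* columns) {
  std::vector<std::string> result{kInternalIdColumn};
  for (const auto& name : requested) {
    if (!group.Contains(name)) {
      return false;
    }
    result.push_back(name);
  }
  *columns = result;
  return true;
}

// Reading by property group selects every property in it plus the internal ID.
std::vector<std::string> SelectAllColumns(const PropertyGroupSketch& group) {
  std::vector<std::string> result{kInternalIdColumn};
  result.insert(result.end(), group.property_names.begin(),
                group.property_names.end());
  return result;
}
```

   With a group holding three illustrative properties such as `firstName`, `lastName`, and `gender`, requesting only `firstName` yields two columns (the internal ID and `firstName`), while selecting the whole group yields four. A request naming a property outside the group is rejected instead of silently reading from another group.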
   
   ### Are these changes tested?
   
   Yes, I have added:
   
   - Examples
   - Unit tests
   - Benchmarks
   
   The benchmark results show performance improvements, especially with large chunk sizes.
   
   - `ReadChunkSelectAllColumnsIn*`: reads using a property group (3 properties + internal ID)
   - `ReadChunkSelectOneColumnIn*`: reads only the internal ID and a single property
   - `ReadChunkSelectTwoColumnIn*`: reads only the internal ID and 2 properties
   
   - `*FirstGraph`: reads the person vertex in ldbc_sample (chunk_size: 100)
   - `*SecondGraph`: reads the organisation vertex in ldbc (chunk_size: 4096)
   
   
![image](https://github.com/user-attachments/assets/330f36c8-c266-4b99-b792-b390f47d7c2d)
   
   ### Are there any user-facing changes?
   
   Yes, there is a breaking change:
   
   Previously, users could pass a `property_name` directly to read all the data in its `propertyGroup`.
   Now, passing a `property_name` reads only the internal ID and that property. To read an entire `propertyGroup`, users must pass the corresponding `propertyGroup` explicitly.
   This change improves clarity and ensures consistency when filtering properties.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@graphar.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

