waynexia commented on issue #9929: URL: https://github.com/apache/datafusion/issues/9929#issuecomment-2073949931
Related code is here: https://github.com/GreptimeTeam/greptimedb/commit/9e1e4a518143236371b76ecb6f1da5c694eb867b#diff-ac43dc13456cf41e4fabb9d577101e245366687d49064aff99bf10aab20b9cd0R429-R480

First, get the exact row numbers to read (at the file level):

```rust
let mut selected_row = applier.apply(file_id).unwrap();
```

Then translate the file-level row numbers into a per-row-group selection:

```rust
// translate `selected_row` into row groups selection
selected_row.sort_unstable();
let mut row_groups_selected = BTreeMap::new();
for row_id in selected_row.iter() {
    // split the file-level row id into (row group index, offset within that row group)
    let row_group_id = row_id / row_group_size;
    let rg_row_id = row_id % row_group_size;

    row_groups_selected
        .entry(row_group_id)
        .or_insert_with(Vec::new)
        .push(rg_row_id);
}
let row_group = row_groups_selected
    .into_iter()
    .map(|(row_group_id, row_ids)| {
        // `current_row` is the next unvisited row within this row group
        let mut current_row = 0;
        let mut selection = vec![];
        for row_id in row_ids {
            selection.push(RowSelector::skip(row_id - current_row));
            selection.push(RowSelector::select(1));
            current_row = row_id + 1;
        }
        (row_group_id, Some(RowSelection::from(selection)))
    })
    .collect();
```

The result, a `BTreeMap<usize, Option<RowSelection>>`, maps each row group's index to the selection within that row group.
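
For anyone who wants to try the translation without pulling in the parquet crate, here is a minimal standalone sketch of the same idea. `Selector` is a hypothetical local stand-in for parquet's `RowSelector`, and `to_row_group_selections` is an illustrative name, not something from the linked commit. It also assumes every row group except possibly the last holds exactly `row_group_size` rows. Unlike the snippet above, it merges consecutive rows into one select run itself instead of emitting zero-length skips.

```rust
use std::collections::BTreeMap;

/// Stand-in for parquet's `RowSelector` (hypothetical, for illustration only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Selector {
    Skip(usize),
    Select(usize),
}

/// Translate file-level row ids into per-row-group skip/select runs.
/// Assumes fixed-size row groups of `row_group_size` rows.
fn to_row_group_selections(
    mut rows: Vec<usize>,
    row_group_size: usize,
) -> BTreeMap<usize, Vec<Selector>> {
    rows.sort_unstable();
    rows.dedup();

    // Bucket rows by row group, converting to row-group-local offsets.
    let mut grouped: BTreeMap<usize, Vec<usize>> = BTreeMap::new();
    for row_id in rows {
        grouped
            .entry(row_id / row_group_size)
            .or_default()
            .push(row_id % row_group_size);
    }

    grouped
        .into_iter()
        .map(|(rg, offsets)| {
            let mut selection: Vec<Selector> = Vec::new();
            let mut current = 0; // next unvisited offset in this row group
            for off in offsets {
                if off > current {
                    // Gap before this row: skip it, then select the row.
                    selection.push(Selector::Skip(off - current));
                    selection.push(Selector::Select(1));
                } else if let Some(Selector::Select(n)) = selection.last_mut() {
                    // Contiguous with the previous selected row: extend the run.
                    *n += 1;
                } else {
                    // First row of the group is offset 0.
                    selection.push(Selector::Select(1));
                }
                current = off + 1;
            }
            (rg, selection)
        })
        .collect()
}

fn main() {
    let selections = to_row_group_selections(vec![7, 0, 1, 5, 12], 5);
    for (rg, sel) in &selections {
        println!("row group {rg}: {sel:?}");
    }
}
```

With a row group size of 5, file-level rows 0 and 1 land in row group 0, rows 5 and 7 in row group 1 (offsets 0 and 2), and row 12 in row group 2 (offset 2). Row groups with no selected rows do not appear in the map at all.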
