tisonkun commented on code in PR #92:
URL: https://github.com/apache/datasketches-rust/pull/92#discussion_r3469470644
##########
tools/generate_serialization_test_data.py:
##########
@@ -166,13 +168,22 @@ def generate_cpp_files(workspace_dir, project_root):
output_dir.mkdir(parents=True, exist_ok=True)
files_copied = 0
- # Search recursively in build directory for *_cpp.sk
+
+ # Search recursively in build directory for standard C++ compatibility
snapshots.
for file_path in build_dir.rglob("*_cpp.sk"):
- # Avoid copying from CMakeFiles or other intermediate dirs if
possible, but the pattern is specific enough
shutil.copy2(file_path, output_dir)
print(f"Copied: {file_path.name}")
files_copied += 1
+ # Count-Min test binaries are produced as `count_min-*.bin`.
+ # Normalize names to match the repository snapshot convention.
+ for file_path in build_dir.rglob("count_min-*.bin"):
+ base_name = file_path.stem[len("count_min-"):].replace("-", "_")
+ output_name = f"countmin_{base_name}_cpp.sk"
+ shutil.copy2(file_path, output_dir / output_name)
+ print(f"Copied: {output_name} (from {file_path.name})")
+ files_copied += 1
Review Comment:
I don't think we need this hack. We should use `count_min-*.bin` in the Rust
test code.
It doesn't matter that we just follow what cpp uses. If we want to unify the
name conventions, we should unify from the source side, not arbitrarily
patching within downstream.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]