tisonkun commented on code in PR #92:
URL: https://github.com/apache/datasketches-rust/pull/92#discussion_r3469470644


##########
tools/generate_serialization_test_data.py:
##########
@@ -166,13 +168,22 @@ def generate_cpp_files(workspace_dir, project_root):
     output_dir.mkdir(parents=True, exist_ok=True)
 
     files_copied = 0
-    # Search recursively in build directory for *_cpp.sk
+
+    # Search recursively in build directory for standard C++ compatibility 
snapshots.
     for file_path in build_dir.rglob("*_cpp.sk"):
-        # Avoid copying from CMakeFiles or other intermediate dirs if 
possible, but the pattern is specific enough
         shutil.copy2(file_path, output_dir)
         print(f"Copied: {file_path.name}")
         files_copied += 1
 
+    # Count-Min test binaries are produced as `count_min-*.bin`.
+    # Normalize names to match the repository snapshot convention.
+    for file_path in build_dir.rglob("count_min-*.bin"):
+        base_name = file_path.stem[len("count_min-"):].replace("-", "_")
+        output_name = f"countmin_{base_name}_cpp.sk"
+        shutil.copy2(file_path, output_dir / output_name)
+        print(f"Copied: {output_name} (from {file_path.name})")
+        files_copied += 1

Review Comment:
   I don't think we need this hack. We should use `count_min-*.bin` in the Rust 
test code.
   
   It doesn't matter that we just follow what cpp uses. If we want to unify the 
name conventions, we should unify from the source side, not arbitrarily 
patching within downstream.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to