MonkeyCanCode commented on code in PR #4075:
URL: https://github.com/apache/polaris/pull/4075#discussion_r3024328272


##########
client/python/apache_polaris/cli/command/utils.py:
##########
@@ -64,3 +69,140 @@ def format_timestamp(ms_since_epoch: int) -> str:
         ms_since_epoch / 1000, tz=datetime.timezone.utc
     )
     return dt.strftime("%Y-%m-%d %H:%M:%S UTC")
+
+
+def is_fuzzy_match(query: str, target: str, threshold: float = 0.85) -> bool:
+    """
+    Determine if a query matches a target using multi-stage fuzzy strategies and case-insensitive.
+    """
+    if not query:
+        return False
+    q = query.lower()
+    t = target.lower()
+    query_len = len(q)
+    # Exact match
+    if q == t:
+        return True
+    # Prefix match
+    if t.startswith(q):
+        return True
+    # Substring match: enabled for length > 1
+    if query_len > 1 and q in t:
+        return True
+    # Subsequence match: enabled for length > 2
+    if query_len > 2:
+        iterator = iter(t)
+        if all(char in iterator for char in q):

Review Comment:
   I am more than happy to take feedback on how to handle this better, and on 
whether a minimum of 3 characters is too many to require before triggering a 
fuzzy search. This requirement came from @flyrain; any thoughts on this approach?
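
For anyone weighing the threshold, here is a minimal standalone sketch of the subsequence stage on its own. `is_subsequence` and its `min_len` parameter are hypothetical names used only for illustration and are not part of the PR. The sketch assumes the stage is gated on query length the way the `query_len > 2` check in the hunk suggests.

```python
def is_subsequence(query: str, target: str, min_len: int = 3) -> bool:
    """Hypothetical helper: case-insensitive, in-order subsequence test.

    Queries shorter than `min_len` never match, mirroring the
    `query_len > 2` gate in the diff (an assumption about intent).
    """
    q = query.lower()
    t = target.lower()
    if len(q) < min_len:
        return False
    iterator = iter(t)
    # `char in iterator` consumes the iterator up to and including the first
    # occurrence of `char`, so later characters can only match further right.
    return all(char in iterator for char in q)


if __name__ == "__main__":
    for query in ("ct", "ctl", "lc"):
        print(query, is_subsequence(query, "catalog"), is_subsequence(query, "catalog", min_len=2))
```

With the default gate, a two-character query such as `ct` is rejected outright, while lowering `min_len` to 2 lets it match any target containing `c` followed later by `t`. That is the trade-off in question: shorter queries match many more targets.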



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
