MonetDB: Dec2023 - Hash blurb.
Changeset: 8ba0afd0e566 for MonetDB
URL: https://dev.monetdb.org/hg/MonetDB/rev/8ba0afd0e566
Modified Files:
	gdk/ChangeLog.Dec2023
Branch: Dec2023
Log Message:

Hash blurb.

diffs (13 lines):

diff --git a/gdk/ChangeLog.Dec2023 b/gdk/ChangeLog.Dec2023
--- a/gdk/ChangeLog.Dec2023
+++ b/gdk/ChangeLog.Dec2023
@@ -1,3 +1,9 @@
 # ChangeLog file for GDK
 # This file is updated with Maddlog
 
+* Fri Mar 8 2024 Sjoerd Mullender
+- The internal hash function for floating point types has been changed.
+  It is now no longer based on the bit representation, but on the value,
+  meaning that +0 and -0 (yes, they both exist in floating point) now
+  hash to the same value.
+
___
checkin-list mailing list -- checkin-list@monetdb.org
To unsubscribe send an email to checkin-list-le...@monetdb.org
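To illustrate the ChangeLog entry above, here is a minimal sketch of the difference between hashing a double by its bit pattern and hashing it by value. The names `mix64`, `hash_bits` and `hash_value` are hypothetical and are not MonetDB's `hash_flt`/`hash_dbl`, whose implementation is not shown in this changeset. The mixing function is a generic 64-bit finalizer chosen only for the example, and the sketch assumes a 64-bit IEEE 754 `double`.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Generic 64-bit mixer (assumption: any reasonable integer hash works here). */
static uint64_t
mix64(uint64_t x)
{
	x ^= x >> 33;
	x *= UINT64_C(0xff51afd7ed558ccd);
	x ^= x >> 33;
	x *= UINT64_C(0xc4ceb9fe1a85ec53);
	x ^= x >> 33;
	return x;
}

/* Hash the raw bit representation: +0.0 and -0.0 differ in the sign bit,
 * so they end up with different hash values. */
static uint64_t
hash_bits(double d)
{
	uint64_t u;

	assert(sizeof(u) == sizeof(d));
	memcpy(&u, &d, sizeof(u));
	return mix64(u);
}

/* Hash by value: -0.0 == 0 compares true, so both zeros are normalized
 * to +0.0 before the bits are hashed. */
static uint64_t
hash_value(double d)
{
	if (d == 0)
		d = 0.0;
	return hash_bits(d);
}
```

With this sketch, `hash_value(0.0)` and `hash_value(-0.0)` agree while `hash_bits(0.0)` and `hash_bits(-0.0)` do not, which matters because the two zeros compare equal and so should land in the same hash bucket.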
MonetDB: Dec2023 - Use correct hash function for floats.
Changeset: 0db7e1e99bf8 for MonetDB
URL: https://dev.monetdb.org/hg/MonetDB/rev/0db7e1e99bf8
Modified Files:
	gdk/gdk_hash.c
Branch: Dec2023
Log Message:

Use correct hash function for floats.

diffs (23 lines):

diff --git a/gdk/gdk_hash.c b/gdk/gdk_hash.c
--- a/gdk/gdk_hash.c
+++ b/gdk/gdk_hash.c
@@ -1082,15 +1082,17 @@ HASHprobe(const Hash *h, const void *v)
 	case TYPE_sht:
 		return hash_sht(h, v);
 	case TYPE_int:
-	case TYPE_flt:
 		return hash_int(h, v);
-	case TYPE_dbl:
 	case TYPE_lng:
 		return hash_lng(h, v);
 #ifdef HAVE_HGE
 	case TYPE_hge:
 		return hash_hge(h, v);
 #endif
+	case TYPE_flt:
+		return hash_flt(h, v);
+	case TYPE_dbl:
+		return hash_dbl(h, v);
 	case TYPE_uuid:
 		return hash_uuid(h, v);
 	default:
MonetDB: default - Merge with Dec2023 branch.
Changeset: e703616b2ba5 for MonetDB URL: https://dev.monetdb.org/hg/MonetDB/rev/e703616b2ba5 Removed Files: sql/test/strimps/Tests/strimps_stable_counts2.test Modified Files: clients/Tests/MAL-signatures-hge.test clients/Tests/MAL-signatures.test clients/odbc/tests/ODBCmetadata.c gdk/gdk_hash.c monetdb5/modules/atoms/str.c monetdb5/modules/atoms/str.h monetdb5/modules/kernel/batstr.c sql/test/BugTracker-2023/Tests/misc-crashes-7390.test Branch: default Log Message: Merge with Dec2023 branch. diffs (truncated from 5202 to 300 lines): diff --git a/clients/Tests/MAL-signatures-hge.test b/clients/Tests/MAL-signatures-hge.test --- a/clients/Tests/MAL-signatures-hge.test +++ b/clients/Tests/MAL-signatures-hge.test @@ -50695,25 +50695,25 @@ STRcontains; Check if string chaystack contains string needle, icase flag. str containsjoin -pattern str.containsjoin(X_0:bat[:str], X_1:bat[:str], X_2:bat[:bit], X_3:bat[:oid], X_4:bat[:oid], X_5:bit, X_6:lng, X_7:bit):bat[:oid] -STRcontainsjoin1; -The same as STRcontainsjoin, but only produce one output + icase. -str -containsjoin -pattern str.containsjoin(X_0:bat[:str], X_1:bat[:str], X_2:bat[:oid], X_3:bat[:oid], X_4:bit, X_5:lng, X_6:bit):bat[:oid] -STRcontainsjoin1; -The same as STRcontainsjoin, but only produce one output. -str -containsjoin pattern str.containsjoin(X_0:bat[:str], X_1:bat[:str], X_2:bat[:bit], X_3:bat[:oid], X_4:bat[:oid], X_5:bit, X_6:lng, X_7:bit) (X_8:bat[:oid], X_9:bat[:oid]) STRcontainsjoin; Join the string bat L with the bat R if L contains the string of R@with optional candidate lists SL and SR@The result is two aligned bats with oids of matching rows + icase. str containsjoin +pattern str.containsjoin(X_0:bat[:str], X_1:bat[:str], X_2:bat[:bit], X_3:bat[:oid], X_4:bat[:oid], X_5:bit, X_6:lng, X_7:bit):bat[:oid] +STRcontainsjoin; +The same as STRcontainsjoin, but only produce one output + icase. 
+str +containsjoin pattern str.containsjoin(X_0:bat[:str], X_1:bat[:str], X_2:bat[:oid], X_3:bat[:oid], X_4:bit, X_5:lng, X_6:bit) (X_7:bat[:oid], X_8:bat[:oid]) STRcontainsjoin; Join the string bat L with the bat R if L contains the string of R@with optional candidate lists SL and SR@The result is two aligned bats with oids of matching rows. str +containsjoin +pattern str.containsjoin(X_0:bat[:str], X_1:bat[:str], X_2:bat[:oid], X_3:bat[:oid], X_4:bit, X_5:lng, X_6:bit):bat[:oid] +STRcontainsjoin; +The same as STRcontainsjoin, but only produce one output. +str containsselect pattern str.containsselect(X_0:bat[:str], X_1:bat[:oid], X_2:str, X_3:bit):bat[:oid] STRcontainsselect; @@ -50735,25 +50735,25 @@ STRendswith; Check if string ends with substring, icase flag. str endswithjoin -pattern str.endswithjoin(X_0:bat[:str], X_1:bat[:str], X_2:bat[:bit], X_3:bat[:oid], X_4:bat[:oid], X_5:bit, X_6:lng, X_7:bit):bat[:oid] -STRendswithjoin1; -The same as STRendswithjoin, but only produce one output + icase. -str -endswithjoin -pattern str.endswithjoin(X_0:bat[:str], X_1:bat[:str], X_2:bat[:oid], X_3:bat[:oid], X_4:bit, X_5:lng, X_6:bit):bat[:oid] -STRendswithjoin1; -The same as STRendswithjoin, but only produce one output. -str -endswithjoin pattern str.endswithjoin(X_0:bat[:str], X_1:bat[:str], X_2:bat[:bit], X_3:bat[:oid], X_4:bat[:oid], X_5:bit, X_6:lng, X_7:bit) (X_8:bat[:oid], X_9:bat[:oid]) STRendswithjoin; Join the string bat L with the suffix bat R@with optional candidate lists SL and SR@The result is two aligned bats with oids of matching rows + icase. str endswithjoin +pattern str.endswithjoin(X_0:bat[:str], X_1:bat[:str], X_2:bat[:bit], X_3:bat[:oid], X_4:bat[:oid], X_5:bit, X_6:lng, X_7:bit):bat[:oid] +STRendswithjoin; +The same as STRendswithjoin, but only produce one output + icase. 
+str +endswithjoin pattern str.endswithjoin(X_0:bat[:str], X_1:bat[:str], X_2:bat[:oid], X_3:bat[:oid], X_4:bit, X_5:lng, X_6:bit) (X_7:bat[:oid], X_8:bat[:oid]) STRendswithjoin; Join the string bat L with the suffix bat R@with optional candidate lists SL and SR@The result is two aligned bats with oids of matching rows. str +endswithjoin +pattern str.endswithjoin(X_0:bat[:str], X_1:bat[:str], X_2:bat[:oid], X_3:bat[:oid], X_4:bit, X_5:lng, X_6:bit):bat[:oid] +STRendswithjoin; +The same as STRendswithjoin, but only produce one output. +str endswithselect pattern str.endswithselect(X_0:bat[:str], X_1:bat[:oid], X_2:str, X_3:bit):bat[:oid] STRendswithselect; @@ -50905,25 +50905,25 @@ STRstartswith; Check if string starts with substring, icase flag. str startswithjoin -pattern str.startswithjoin(X_0:bat[:str], X_1:bat[:str], X_2:bat[:bit], X_3:bat[:oid], X_4:bat[:oid], X_5:bit, X_6:lng, X_7:bit):bat[:oid] -STRstartswithjoin1; -The same as STRstartswithjoin, but only produce one output + icase. -str -startswithjoin -pattern str.startswithjoin(X_0:bat[:str], X_1:bat[:str], X_2:bat[:oid], X_3:bat[:oid], X_4:bit, X_5:lng, X_6:bit):
MonetDB: Dec2023 - Typo.
Changeset: f72b06f0c54b for MonetDB
URL: https://dev.monetdb.org/hg/MonetDB/rev/f72b06f0c54b
Modified Files:
	clients/Tests/MAL-signatures-hge.test
	clients/Tests/MAL-signatures.test
	monetdb5/modules/atoms/str.c
Branch: Dec2023
Log Message:

Typo.

diffs (36 lines):

diff --git a/clients/Tests/MAL-signatures-hge.test b/clients/Tests/MAL-signatures-hge.test
--- a/clients/Tests/MAL-signatures-hge.test
+++ b/clients/Tests/MAL-signatures-hge.test
@@ -50592,7 +50592,7 @@ str
 contains
 pattern str.contains(X_0:str, X_1:str, X_2:bit):bit
 STRcontains;
-Check if string chaystack contains string needle, icase flag.
+Check if string haystack contains string needle, icase flag.
 str
 containsjoin
 pattern str.containsjoin(X_0:bat[:str], X_1:bat[:str], X_2:bat[:bit], X_3:bat[:oid], X_4:bat[:oid], X_5:bit, X_6:lng, X_7:bit) (X_8:bat[:oid], X_9:bat[:oid])
diff --git a/clients/Tests/MAL-signatures.test b/clients/Tests/MAL-signatures.test
--- a/clients/Tests/MAL-signatures.test
+++ b/clients/Tests/MAL-signatures.test
@@ -38917,7 +38917,7 @@ str
 contains
 pattern str.contains(X_0:str, X_1:str, X_2:bit):bit
 STRcontains;
-Check if string chaystack contains string needle, icase flag.
+Check if string haystack contains string needle, icase flag.
 str
 containsjoin
 pattern str.containsjoin(X_0:bat[:str], X_1:bat[:str], X_2:bat[:bit], X_3:bat[:oid], X_4:bat[:oid], X_5:bit, X_6:lng, X_7:bit) (X_8:bat[:oid], X_9:bat[:oid])
diff --git a/monetdb5/modules/atoms/str.c b/monetdb5/modules/atoms/str.c
--- a/monetdb5/modules/atoms/str.c
+++ b/monetdb5/modules/atoms/str.c
@@ -6433,7 +6433,7 @@ mel_func str_init_funcs[] = {
  pattern("str", "endswith", STRendswith, false, "Check if string ends with substring.", args(1,3, arg("",bit),arg("s",str),arg("suffix",str))),
  pattern("str", "endswith", STRendswith, false, "Check if string ends with substring, icase flag.", args(1,4, arg("",bit),arg("s",str),arg("suffix",str),arg("icase",bit))),
  pattern("str", "contains", STRcontains, false, "Check if string haystack contains string needle.", args(1,3, arg("",bit),arg("haystack",str),arg("needle",str))),
- pattern("str", "contains", STRcontains, false, "Check if string chaystack contains string needle, icase flag.", args(1,4, arg("",bit),arg("haystack",str),arg("needle",str),arg("icase",bit))),
+ pattern("str", "contains", STRcontains, false, "Check if string haystack contains string needle, icase flag.", args(1,4, arg("",bit),arg("haystack",str),arg("needle",str),arg("icase",bit))),
  command("str", "toLower", STRlower, false, "Convert a string to lower case.", args(1,2, arg("",str),arg("s",str))),
  command("str", "toUpper", STRupper, false, "Convert a string to upper case.", args(1,2, arg("",str),arg("s",str))),
  pattern("str", "search", STRstr_search, false, "Search for a substring. Returns\nposition, -1 if not found.", args(1,3, arg("",int),arg("s",str),arg("c",str))),
MonetDB: Dec2023 - Reenable test.
Changeset: 0c600defebd3 for MonetDB
URL: https://dev.monetdb.org/hg/MonetDB/rev/0c600defebd3
Modified Files:
	sql/test/strimps/Tests/All
Branch: Dec2023
Log Message:

Reenable test.

diffs (10 lines):

diff --git a/sql/test/strimps/Tests/All b/sql/test/strimps/Tests/All
--- a/sql/test/strimps/Tests/All
+++ b/sql/test/strimps/Tests/All
@@ -1,5 +1,5 @@
 strimps_stable_counts
-# strimps_stable_counts_starts_ends_contains
+strimps_stable_counts_starts_ends_contains
 persisted_strimp
 strimps_not_like
 small_string_crash
MonetDB: Dec2023 - fixed atom_cmp on null's with different types
Changeset: aa690f802055 for MonetDB
URL: https://dev.monetdb.org/hg/MonetDB/rev/aa690f802055
Modified Files:
	sql/server/sql_atom.c
Branch: Dec2023
Log Message:

fixed atom_cmp on null's with different types

diffs (12 lines):

diff --git a/sql/server/sql_atom.c b/sql/server/sql_atom.c
--- a/sql/server/sql_atom.c
+++ b/sql/server/sql_atom.c
@@ -822,7 +822,7 @@ atom_cmp(atom *a1, atom *a2)
 	if (a1->isnull != a2->isnull)
 		return -1;
 	if ( a1->isnull)
-		return 0;
+		return !(a1->tpe.type->localtype == a2->tpe.type->localtype);
 	if ( a1->tpe.type->localtype != a2->tpe.type->localtype) {
 		switch (ATOMstorage(a1->tpe.type->localtype)) {
 		case TYPE_bte:
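The effect of this one-line change can be sketched with a simplified stand-in for the atom structure. Everything named `toy_*` below is hypothetical and only mirrors the `isnull` and `localtype` fields visible in the diff. The real `atom_cmp` also converts between differing non-null types in a `switch`, which this sketch replaces with a plain "not equal" result.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for MonetDB's atom. */
struct toy_atom {
	bool isnull;
	int localtype;		/* type tag, e.g. 1 for int, 2 for lng */
	long value;
};

/* Returns 0 when the atoms are considered equal, non-zero otherwise. */
static int
toy_atom_cmp(const struct toy_atom *a1, const struct toy_atom *a2)
{
	assert(a1 != NULL && a2 != NULL);
	if (a1->isnull != a2->isnull)
		return -1;
	if (a1->isnull)
		/* after the fix: two nulls are only equal if their types match */
		return a1->localtype != a2->localtype;
	if (a1->localtype != a2->localtype)
		return -1;	/* simplification: no cross-type conversion here */
	return (a1->value > a2->value) - (a1->value < a2->value);
}
```

Before the fix, the null branch returned 0 unconditionally, so a null of one type compared equal to a null of any other type.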
MonetDB: Dec2023 - Move starts ends with tests to proper place
Changeset: 2361b6e2bb65 for MonetDB URL: https://dev.monetdb.org/hg/MonetDB/rev/2361b6e2bb65 Modified Files: monetdb5/modules/atoms/Tests/startswith_join.test sql/test/strimps/Tests/strimps_stable_counts_starts_ends_contains.test Branch: Dec2023 Log Message: Move starts ends with tests to proper place diffs (truncated from 605 to 300 lines): diff --git a/monetdb5/modules/atoms/Tests/startswith_join.test b/monetdb5/modules/atoms/Tests/startswith_join.test --- a/monetdb5/modules/atoms/Tests/startswith_join.test +++ b/monetdb5/modules/atoms/Tests/startswith_join.test @@ -99,3 +99,333 @@ drop table foo statement ok drop table bar + +statement ok +CREATE TABLE fal(x STRING) + +statement ok +CREATE TABLE f(y STRING) + +statement ok +COPY 100 RECORDS INTO fal FROM STDIN + +Mary Garcia +James Ballard +Alexandria Harris +Dakota Howell +Tracy Glover +Mark Cook +James Woodard +Sophia Stone +Jeffrey Ramirez +Ryan Knight +Taylor Lane +Christopher Russell +Daniel Sims +Tony Watts +Dwayne Johnson +Jason Dunlap +Abigail Burton +Maria Lewis +Ashley Taylor +Emma Abbott +James Whitney +Philip Maldonado +Rachel Taylor +Tina Singleton +Ricky Johnson +Anthony Peterson +Eugene Mata +Tyler Terry +Thomas Morales +Kathy Moore +William Franco +Christopher Williams +David Carter +Andrew Alvarado +John Jenkins +Anthony Charles +Jose Tran +Amy Stafford +Vincent Malone +Ashley Waters +Cindy Huffman +Anthony Hernandez +Brett Hardy +Lisa Matthews +Jeffrey Ingram +Jessica Miller +Karen Jones +Terry Sanders +Aaron Rodriguez +Kyle Ortega +David Clark +Brent Garrett +Scott Young +Shannon Edwards +Tiffany Macias +Ricky Gonzalez +Devin Logan +Russell Walker +Michael Nguyen +Heather Robinson +April Lawrence +Christopher Williams +Laura Gonzalez +Patrick Ortiz +Sylvia Phillips +Cynthia Kemp +Stephanie Gillespie +Elizabeth Joseph +Jay Collins +Johnny Gibson +Dr. 
Audrey Sellers MD +Desiree Li +Heather Brown +Shelly Bauer +Donna Anderson +Amy Sharp +Olivia Howell +Margaret Tran +Alexandra Jarvis +Glen Ray +Michael Mendoza +Sarah Hall +Dennis Moss +Wanda Brooks +Debra Powers +Shannon Nguyen +Daisy Mcdonald +Donna Rivera +Samuel Jackson +Wendy Howe +Connor Howell +Jeffrey Newman +Daniel Sullivan +Megan Dunn +Laura Holland +Brendan Bates +Mary Miller +Thomas Ramirez +Leah Holland +Megan Warren + +statement ok +COPY 100 RECORDS INTO f FROM STDIN + +Noah +Ronald +Mary +Jennifer +Tanya +Ivan +Randy +Erin +Ryan +Scott +Kathryn +Brandi +Rebecca +Katie +Diane +Stephen +Michael +Jeremiah +Timothy +James +Mark +Thomas +Leslie +Robert +Joel +James +Anna +Alan +Janet +Samuel +Tanya +Russell +Alexis +Scott +Jenna +Eric +Andrew +Sandra +Stephanie +Jeremy +Don +Lisa +Jacqueline +Melissa +Patricia +Ana +Danielle +Cheryl +Justin +Karen +Pamela +Beverly +Becky +Caitlin +Michael +Emma +Darlene +Darrell +David +Wanda +Sydney +Susan +Louis +Brittany +William +Daniel +Laura +Kevin +Jonathon +James +Robert +Denise +Cassandra +Stephanie +Samuel +Kaitlyn +David +Katrina +Nathan +Jessica +Michelle +Veronica +Rachel +Andrew +Jennifer +William +Melanie +Larry +Ronald +Sally +Joshua +Chelsea +Ashley +Johnny +Chad +Nicole +Joshua +Michele +Joseph +Carolyn + +query TT rowsort +SELECT * FROM fal,f WHERE [fal.x] startswith [f.y] + +Andrew Alvarado +Andrew +Andrew Alvarado +Andrew +Ashley Taylor +Ashley +Ashley Waters +Ashley +Daniel Sims +Daniel +Daniel Sullivan +Daniel +David Carter +David +David Carter +David +David Clark +David +David Clark +David +Donna Anderson +Don +Donna Rivera +Don +Emma Abbott +Emma +James Ballard +James +James Ballard +James +James Ballard +James +James Whitney +James +James Whitney +James +James Whitney +James +James Woodard +James +James Woodard +James +James Woodard +James +Jessica Miller +Jessica +Johnny Gibson +Johnny +Karen Jones +Karen +Laura Gonzalez +Laura +Laura Holland +Laura +Lisa Matthews +Lisa +Mark Cook +Mark +Mary 
Garcia +Mary +Mary Miller +Mary +Michael Mendoza +Michael +Michael Mendoza +Michael +Michael Nguyen +Michael +Michael Nguyen +Michael +Rachel Taylor +Rachel +Russell Walker +Russell +Ryan Knight ___ checkin-list mailing list -- checkin-list@monetdb.org To unsubscribe send an email to checkin-list-le...@monetdb.org
MonetDB: Dec2023 - Rename test with proper description
Changeset: 7a9aa58fc680 for MonetDB
URL: https://dev.monetdb.org/hg/MonetDB/rev/7a9aa58fc680
Added Files:
	sql/test/strimps/Tests/strimps_stable_counts_contains.test
Removed Files:
	sql/test/strimps/Tests/strimps_stable_counts_starts_ends_contains.test
Branch: Dec2023
Log Message:

Rename test with proper description

diffs (3 lines):

diff --git a/sql/test/strimps/Tests/strimps_stable_counts_starts_ends_contains.test b/sql/test/strimps/Tests/strimps_stable_counts_contains.test
rename from sql/test/strimps/Tests/strimps_stable_counts_starts_ends_contains.test
rename to sql/test/strimps/Tests/strimps_stable_counts_contains.test
MonetDB: Dec2023 - rename test in All file
Changeset: f9d34e78e779 for MonetDB
URL: https://dev.monetdb.org/hg/MonetDB/rev/f9d34e78e779
Modified Files:
	sql/test/strimps/Tests/All
Branch: Dec2023
Log Message:

rename test in All file

diffs (10 lines):

diff --git a/sql/test/strimps/Tests/All b/sql/test/strimps/Tests/All
--- a/sql/test/strimps/Tests/All
+++ b/sql/test/strimps/Tests/All
@@ -1,5 +1,5 @@
 strimps_stable_counts
-strimps_stable_counts_starts_ends_contains
+strimps_stable_counts_contains
 persisted_strimp
 strimps_not_like
 small_string_crash
MonetDB: Dec2023 - Fixes for missing tests.
Changeset: fbb12277abdc for MonetDB
URL: https://dev.monetdb.org/hg/MonetDB/rev/fbb12277abdc
Added Files:
	sql/test/rel-optimizers/Tests/join-merge-remote-replica-plan.test
Removed Files:
	sql/test/rel-optimizers/Tests/join-merge-remote-replica.test
Modified Files:
	monetdb5/mal/Tests/All
Branch: Dec2023
Log Message:

Fixes for missing tests.

diffs (14 lines):

diff --git a/monetdb5/mal/Tests/All b/monetdb5/mal/Tests/All
--- a/monetdb5/mal/Tests/All
+++ b/monetdb5/mal/Tests/All
@@ -7,7 +7,6 @@ tst005
 tst006
 tst007
 tst008
-tst009
 tst010
 tst011
 tst012
diff --git a/sql/test/rel-optimizers/Tests/join-merge-remote-replica.test b/sql/test/rel-optimizers/Tests/join-merge-remote-replica-plan.test
rename from sql/test/rel-optimizers/Tests/join-merge-remote-replica.test
rename to sql/test/rel-optimizers/Tests/join-merge-remote-replica-plan.test
MonetDB: Dec2023 - Backport some changes from the default branch.
Changeset: 0cd7b7d1e433 for MonetDB URL: https://dev.monetdb.org/hg/MonetDB/rev/0cd7b7d1e433 Modified Files: monetdb5/modules/atoms/str.c Branch: Dec2023 Log Message: Backport some changes from the default branch. diffs (truncated from 334 to 300 lines): diff --git a/monetdb5/modules/atoms/str.c b/monetdb5/modules/atoms/str.c --- a/monetdb5/modules/atoms/str.c +++ b/monetdb5/modules/atoms/str.c @@ -108,7 +108,7 @@ */ /* These tables were generated from the Unicode 13.0.0 spec. */ -const struct UTF8_lower_upper { +static const struct UTF8_lower_upper { const unsigned int from, to; } UTF8_toUpper[] = { /* code points with non-null uppercase conversion */ {0x0061, 0x0041,}, @@ -3102,55 +3102,29 @@ STRepilogue(void *ret) } #ifndef NDEBUG -static void -UTF8_assert(const char *restrict s) +static inline void +UTF8_assert(const char *s) { - int c; - - if (s == NULL) - return; - if (*s == '\200' && s[1] == '\0') - return; /* str_nil */ - while ((c = *s++) != '\0') { - if ((c & 0x80) == 0) - continue; - if ((*s++ & 0xC0) != 0x80) - assert(0); - if ((c & 0xE0) == 0xC0) - continue; - if ((*s++ & 0xC0) != 0x80) - assert(0); - if ((c & 0xF0) == 0xE0) - continue; - if ((*s++ & 0xC0) != 0x80) - assert(0); - if ((c & 0xF8) == 0xF0) - continue; - assert(0); - } + assert(strNil(s) || utf8valid(s) == 0); } #else #define UTF8_assert(s) ((void) 0) #endif +/* return how many codepoints in the substring end in s starts */ static inline int UTF8_strpos(const char *s, const char *end) { - int pos = 0; - UTF8_assert(s); if (s > end) { return -1; } - while (s < end) { - /* just count leading bytes of encoded code points; only works -* for correctly encoded UTF-8 */ - pos += (*s++ & 0xC0) != 0x80; - } - return pos; + return (int) utf8nlen(s, (size_t) (end - s)); } +/* return a pointer to the byte that starts the pos'th (0-based) + * codepoint in s */ static inline str UTF8_strtail(const char *s, int pos) { @@ -3166,6 +3140,7 @@ UTF8_strtail(const char *s, int pos) return (str) s; } +/* copy n 
Unicode codepoints from s to dst, return pointer to new end */ static inline str UTF8_strncpy(char *restrict dst, const char *restrict s, int n) { @@ -3196,56 +3171,29 @@ UTF8_strncpy(char *restrict dst, const c return dst; } -static inline str -UTF8_offset(char *restrict s, int n) -{ - UTF8_assert(s); - while (*s && n) { - if ((*s & 0xF8) == 0xF0) { - /* 4 byte UTF-8 sequence */ - s += 4; - } else if ((*s & 0xF0) == 0xE0) { - /* 3 byte UTF-8 sequence */ - s += 3; - } else if ((*s & 0xE0) == 0xC0) { - /* 2 byte UTF-8 sequence */ - s += 2; - } else { - /* 1 byte UTF-8 "sequence" */ - s++; - } - n--; - } - return s; -} - +/* return number of Unicode codepoints in s; s is not nil */ int -UTF8_strlen(const char *restrict s) -{ /* This function assumes, s is never nil */ - size_t pos = 0; - +UTF8_strlen(const char *s) +{ /* This function assumes s is never nil */ UTF8_assert(s); assert(!strNil(s)); - while (*s) { - /* just count leading bytes of encoded code points; only works -* for correctly encoded UTF-8 */ - pos += (*s++ & 0xC0) != 0x80; - } - assert(pos < INT_MAX); - return (int) pos; + return (int) utf8len(s); } +/* return (int) strlen(s); s is not nil */ int -str_strlen(const char *restrict s) -{ /* This function assumes, s is never nil */ - size_t pos = strlen(s); - assert(pos < INT_MAX); - return (int) pos; +str_strlen(const char *s) +{ /* This function assumes s is never nil */ + UTF8_assert(s); + assert(!strNil(s)); + + return (int) strlen(s); } +/* return the display width of s */ int -UTF8_strwidth(const char *restrict s) +UTF8_strwidth(const char *s) { int len = 0; int c; @@ -3632,6 +3580,10 @@ STRTail(str *res, const str *arg1, const return msg; } +/* copy the substring s[o
MonetDB: default - Merge with Dec2023 branch.
Changeset: be41c8cdf35d for MonetDB URL: https://dev.monetdb.org/hg/MonetDB/rev/be41c8cdf35d Modified Files: clients/Tests/MAL-signatures-hge.test clients/Tests/MAL-signatures.test monetdb5/modules/atoms/str.c sql/server/sql_atom.c Branch: default Log Message: Merge with Dec2023 branch. diffs (truncated from 676 to 300 lines): diff --git a/clients/Tests/MAL-signatures-hge.test b/clients/Tests/MAL-signatures-hge.test --- a/clients/Tests/MAL-signatures-hge.test +++ b/clients/Tests/MAL-signatures-hge.test @@ -50692,7 +50692,7 @@ str contains pattern str.contains(X_0:str, X_1:str, X_2:bit):bit STRcontains; -Check if string chaystack contains string needle, icase flag. +Check if string haystack contains string needle, icase flag. str containsjoin pattern str.containsjoin(X_0:bat[:str], X_1:bat[:str], X_2:bat[:bit], X_3:bat[:oid], X_4:bat[:oid], X_5:bit, X_6:lng, X_7:bit) (X_8:bat[:oid], X_9:bat[:oid]) diff --git a/clients/Tests/MAL-signatures.test b/clients/Tests/MAL-signatures.test --- a/clients/Tests/MAL-signatures.test +++ b/clients/Tests/MAL-signatures.test @@ -39017,7 +39017,7 @@ str contains pattern str.contains(X_0:str, X_1:str, X_2:bit):bit STRcontains; -Check if string chaystack contains string needle, icase flag. +Check if string haystack contains string needle, icase flag. 
str containsjoin pattern str.containsjoin(X_0:bat[:str], X_1:bat[:str], X_2:bat[:bit], X_3:bat[:oid], X_4:bat[:oid], X_5:bit, X_6:lng, X_7:bit) (X_8:bat[:oid], X_9:bat[:oid]) diff --git a/monetdb5/mal/Tests/All b/monetdb5/mal/Tests/All --- a/monetdb5/mal/Tests/All +++ b/monetdb5/mal/Tests/All @@ -7,7 +7,6 @@ tst005 tst006 tst007 tst008 -tst009 tst010 tst011 tst012 diff --git a/monetdb5/modules/atoms/Tests/startswith_join.test b/monetdb5/modules/atoms/Tests/startswith_join.test --- a/monetdb5/modules/atoms/Tests/startswith_join.test +++ b/monetdb5/modules/atoms/Tests/startswith_join.test @@ -99,3 +99,333 @@ drop table foo statement ok drop table bar + +statement ok +CREATE TABLE fal(x STRING) + +statement ok +CREATE TABLE f(y STRING) + +statement ok +COPY 100 RECORDS INTO fal FROM STDIN + +Mary Garcia +James Ballard +Alexandria Harris +Dakota Howell +Tracy Glover +Mark Cook +James Woodard +Sophia Stone +Jeffrey Ramirez +Ryan Knight +Taylor Lane +Christopher Russell +Daniel Sims +Tony Watts +Dwayne Johnson +Jason Dunlap +Abigail Burton +Maria Lewis +Ashley Taylor +Emma Abbott +James Whitney +Philip Maldonado +Rachel Taylor +Tina Singleton +Ricky Johnson +Anthony Peterson +Eugene Mata +Tyler Terry +Thomas Morales +Kathy Moore +William Franco +Christopher Williams +David Carter +Andrew Alvarado +John Jenkins +Anthony Charles +Jose Tran +Amy Stafford +Vincent Malone +Ashley Waters +Cindy Huffman +Anthony Hernandez +Brett Hardy +Lisa Matthews +Jeffrey Ingram +Jessica Miller +Karen Jones +Terry Sanders +Aaron Rodriguez +Kyle Ortega +David Clark +Brent Garrett +Scott Young +Shannon Edwards +Tiffany Macias +Ricky Gonzalez +Devin Logan +Russell Walker +Michael Nguyen +Heather Robinson +April Lawrence +Christopher Williams +Laura Gonzalez +Patrick Ortiz +Sylvia Phillips +Cynthia Kemp +Stephanie Gillespie +Elizabeth Joseph +Jay Collins +Johnny Gibson +Dr. 
Audrey Sellers MD +Desiree Li +Heather Brown +Shelly Bauer +Donna Anderson +Amy Sharp +Olivia Howell +Margaret Tran +Alexandra Jarvis +Glen Ray +Michael Mendoza +Sarah Hall +Dennis Moss +Wanda Brooks +Debra Powers +Shannon Nguyen +Daisy Mcdonald +Donna Rivera +Samuel Jackson +Wendy Howe +Connor Howell +Jeffrey Newman +Daniel Sullivan +Megan Dunn +Laura Holland +Brendan Bates +Mary Miller +Thomas Ramirez +Leah Holland +Megan Warren + +statement ok +COPY 100 RECORDS INTO f FROM STDIN + +Noah +Ronald +Mary +Jennifer +Tanya +Ivan +Randy +Erin +Ryan +Scott +Kathryn +Brandi +Rebecca +Katie +Diane +Stephen +Michael +Jeremiah +Timothy +James +Mark +Thomas +Leslie +Robert +Joel +James +Anna +Alan +Janet +Samuel +Tanya +Russell +Alexis +Scott +Jenna +Eric +Andrew +Sandra +Stephanie +Jeremy +Don +Lisa +Jacqueline +Melissa +Patricia +Ana +Danielle +Cheryl +Justin +Karen +Pamela +Beverly +Becky +Caitlin +Michael +Emma +Darlene +Darrell +David +Wanda +Sydney +Susan +Louis +Brittany +William +Daniel +Laura +Kevin +Jonathon +James +Robert +Denise +Cassandra +Stephanie +Samuel +Kaitlyn +David +Katrina +Nathan +Jessica +Michelle +Veronica +Rachel +Andrew +Jennifer +William +Melanie +Larry +Ronald +Sally +Joshua +Chelsea +Ashley +Johnny +Chad +Nicole +Joshua +Michele +Joseph +Carolyn + +query TT rowsort +SELECT * FROM fal,f WHERE [fal.x] startswith [f.y] + +Andrew Alvarado +Andrew +Andrew Alvarado +Andrew +Ashley Taylor +Ashley +Ashley Waters +Ashley +Daniel Sims +Daniel +Daniel Sullivan +Daniel +David Carter +David +David Carter +David +David Clark +David +David Clark +David +Donna Anderson +Don +Donna Rivera +Don +Emma Abbott +Emma +James Ballard +James +James Ballard +James +James Ballard +James +James Whitney +James +James Whitney +Jam
MonetDB: ascii-flag - Merge with default branch.
Changeset: d9691b2657a4 for MonetDB URL: https://dev.monetdb.org/hg/MonetDB/rev/d9691b2657a4 Modified Files: clients/Tests/MAL-signatures-hge.test clients/Tests/MAL-signatures.test monetdb5/modules/atoms/str.c monetdb5/modules/atoms/str.h monetdb5/modules/kernel/batstr.c Branch: ascii-flag Log Message: Merge with default branch. diffs (truncated from 5629 to 300 lines): diff --git a/clients/ChangeLog.Dec2023 b/clients/ChangeLog.Dec2023 --- a/clients/ChangeLog.Dec2023 +++ b/clients/ChangeLog.Dec2023 @@ -1,3 +1,8 @@ # ChangeLog file for clients # This file is updated with Maddlog +* Wed Mar 6 2024 Sjoerd Mullender +- Fixed an issue where mclient wouldn't exit if the server it had + connected to exited for whatever reason while the client was waiting + for a query result. + diff --git a/clients/Tests/MAL-signatures-hge.test b/clients/Tests/MAL-signatures-hge.test --- a/clients/Tests/MAL-signatures-hge.test +++ b/clients/Tests/MAL-signatures-hge.test @@ -50692,17 +50692,7 @@ str contains pattern str.contains(X_0:str, X_1:str, X_2:bit):bit STRcontains; -Check if string chaystack contains string needle, icase flag. -str -containsjoin -pattern str.containsjoin(X_0:bat[:str], X_1:bat[:str], X_2:bat[:bit], X_3:bat[:oid], X_4:bat[:oid], X_5:bit, X_6:lng, X_7:bit):bat[:oid] -STRcontainsjoin1; -The same as STRcontainsjoin, but only produce one output + icase. -str -containsjoin -pattern str.containsjoin(X_0:bat[:str], X_1:bat[:str], X_2:bat[:oid], X_3:bat[:oid], X_4:bit, X_5:lng, X_6:bit):bat[:oid] -STRcontainsjoin1; -The same as STRcontainsjoin, but only produce one output. +Check if string haystack contains string needle, icase flag. 
str containsjoin pattern str.containsjoin(X_0:bat[:str], X_1:bat[:str], X_2:bat[:bit], X_3:bat[:oid], X_4:bat[:oid], X_5:bit, X_6:lng, X_7:bit) (X_8:bat[:oid], X_9:bat[:oid]) @@ -50710,10 +50700,20 @@ STRcontainsjoin; Join the string bat L with the bat R if L contains the string of R@with optional candidate lists SL and SR@The result is two aligned bats with oids of matching rows + icase. str containsjoin +pattern str.containsjoin(X_0:bat[:str], X_1:bat[:str], X_2:bat[:bit], X_3:bat[:oid], X_4:bat[:oid], X_5:bit, X_6:lng, X_7:bit):bat[:oid] +STRcontainsjoin; +The same as STRcontainsjoin, but only produce one output + icase. +str +containsjoin pattern str.containsjoin(X_0:bat[:str], X_1:bat[:str], X_2:bat[:oid], X_3:bat[:oid], X_4:bit, X_5:lng, X_6:bit) (X_7:bat[:oid], X_8:bat[:oid]) STRcontainsjoin; Join the string bat L with the bat R if L contains the string of R@with optional candidate lists SL and SR@The result is two aligned bats with oids of matching rows. str +containsjoin +pattern str.containsjoin(X_0:bat[:str], X_1:bat[:str], X_2:bat[:oid], X_3:bat[:oid], X_4:bit, X_5:lng, X_6:bit):bat[:oid] +STRcontainsjoin; +The same as STRcontainsjoin, but only produce one output. +str containsselect pattern str.containsselect(X_0:bat[:str], X_1:bat[:oid], X_2:str, X_3:bit):bat[:oid] STRcontainsselect; @@ -50735,25 +50735,25 @@ STRendswith; Check if string ends with substring, icase flag. str endswithjoin -pattern str.endswithjoin(X_0:bat[:str], X_1:bat[:str], X_2:bat[:bit], X_3:bat[:oid], X_4:bat[:oid], X_5:bit, X_6:lng, X_7:bit):bat[:oid] -STRendswithjoin1; -The same as STRendswithjoin, but only produce one output + icase. -str -endswithjoin -pattern str.endswithjoin(X_0:bat[:str], X_1:bat[:str], X_2:bat[:oid], X_3:bat[:oid], X_4:bit, X_5:lng, X_6:bit):bat[:oid] -STRendswithjoin1; -The same as STRendswithjoin, but only produce one output. 
-str -endswithjoin pattern str.endswithjoin(X_0:bat[:str], X_1:bat[:str], X_2:bat[:bit], X_3:bat[:oid], X_4:bat[:oid], X_5:bit, X_6:lng, X_7:bit) (X_8:bat[:oid], X_9:bat[:oid]) STRendswithjoin; Join the string bat L with the suffix bat R@with optional candidate lists SL and SR@The result is two aligned bats with oids of matching rows + icase. str endswithjoin +pattern str.endswithjoin(X_0:bat[:str], X_1:bat[:str], X_2:bat[:bit], X_3:bat[:oid], X_4:bat[:oid], X_5:bit, X_6:lng, X_7:bit):bat[:oid] +STRendswithjoin; +The same as STRendswithjoin, but only produce one output + icase. +str +endswithjoin pattern str.endswithjoin(X_0:bat[:str], X_1:bat[:str], X_2:bat[:oid], X_3:bat[:oid], X_4:bit, X_5:lng, X_6:bit) (X_7:bat[:oid], X_8:bat[:oid]) STRendswithjoin; Join the string bat L with the suffix bat R@with optional candidate lists SL and SR@The result is two aligned bats with oids of matching rows. str +endswithjoin +pattern str.endswithjoin(X_0:bat[:str], X_1:bat[:str], X_2:bat[:oid], X_3:bat[:oid], X_4:bit, X_5:lng, X_6:bit):bat[:oid] +STRendswithjoin; +The same as STRendswithjoin, but only produce one output. +str endswithselect pattern str.endswithselect(X_0:bat[:str], X_1:bat[:oid], X_2:str, X_3:bit):bat[:oid] STRendswithselect; @@ -50900,25 +50900,25 @@ STRstartswith; Check if string starts with substring, icase flag. str startswith
MonetDB: Dec2023 - Uppercase and lowercase versions aren't neces...
Changeset: fab41aa10761 for MonetDB
URL: https://dev.monetdb.org/hg/MonetDB/rev/fab41aa10761
Modified Files:
	monetdb5/modules/atoms/str.c
Branch: Dec2023
Log Message:

Uppercase and lowercase versions aren't necessarily the same length.

diffs (33 lines):

diff --git a/monetdb5/modules/atoms/str.c b/monetdb5/modules/atoms/str.c
--- a/monetdb5/modules/atoms/str.c
+++ b/monetdb5/modules/atoms/str.c
@@ -3809,15 +3809,23 @@ str_is_suffix(const char *s, const char
 	return strcmp(s + sl - sul, suffix);
 }
 
+/* case insensitive endswith check */
 int
 str_is_isuffix(const char *s, const char *suffix, int sul)
 {
-	int sl = str_strlen(s);
-
-	if (sl < sul)
-		return -1;
-	else
-		return utf8casecmp(s + sl - sul, suffix);
+	const char *e = s + strlen(s);
+	const char *sf;
+
+	(void) sul;
+	/* note that the uppercase and lowercase forms of a character aren't
+	 * necessarily the same length in their UTF-8 encodings */
+	for (sf = suffix; *sf && e > s; sf++) {
+		if ((*sf & 0xC0) != 0x80) {
+			while ((*--e & 0xC0) == 0x80)
+				;
+		}
+	}
+	return *sf != 0 || utf8casecmp(e, suffix) != 0;
 }
 
 int
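The idea behind the new loop (step back one codepoint in `s` for every codepoint in `suffix`, instead of subtracting byte lengths) can be sketched in isolation. The helpers `cp_count` and `cp_tail` below are hypothetical names for illustration and are not part of `str.c`. They assume valid UTF-8 input, and the case-insensitive comparison itself (`utf8casecmp` in the diff) is left out.

```c
#include <stddef.h>
#include <string.h>

/* Count codepoints by counting bytes that are not continuation bytes
 * (10xxxxxx); only meaningful for correctly encoded UTF-8. */
static size_t
cp_count(const char *s)
{
	size_t n = 0;

	for (; *s; s++)
		n += (*s & 0xC0) != 0x80;
	return n;
}

/* Return a pointer to the start of the last n codepoints of s,
 * or NULL if s contains fewer than n codepoints. */
static const char *
cp_tail(const char *s, size_t n)
{
	const char *e = s + strlen(s);

	while (n > 0 && e > s) {
		/* move back to the lead byte of the previous codepoint */
		while ((*--e & 0xC0) == 0x80 && e > s)
			;
		n--;
	}
	return n == 0 ? e : NULL;
}
```

As an example of why byte offsets are not enough: the Kelvin sign U+212A occupies three bytes in UTF-8 (`"\xE2\x84\xAA"`) while its lowercase form `k` occupies one, so a suffix check for `"k"` against a string ending in the Kelvin sign has to start one codepoint, not one byte, before the end. `cp_tail(s, cp_count(suffix))` gives that starting point.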
MonetDB: Dec2023 - A little cleanup.
Changeset: 9f76d438c740 for MonetDB
URL: https://dev.monetdb.org/hg/MonetDB/rev/9f76d438c740
Modified Files:
	monetdb5/modules/atoms/str.c
Branch: Dec2023
Log Message:

A little cleanup.

diffs (40 lines):

diff --git a/monetdb5/modules/atoms/str.c b/monetdb5/modules/atoms/str.c
--- a/monetdb5/modules/atoms/str.c
+++ b/monetdb5/modules/atoms/str.c
@@ -3832,7 +3832,6 @@ int
 str_contains(const char *h, const char *n, int nlen)
 {
 	(void) nlen;
-	/* 64bit: should return lng */
 	return strstr(h, n) ? 0 : 1;
 }
 
@@ -3840,7 +3839,6 @@ int
 str_icontains(const char *h, const char *n, int nlen)
 {
 	(void) nlen;
-	/* 64bit: should return lng */
 	return utf8casestr(h, n) ? 0 : 1;
 }
 
@@ -5237,9 +5235,9 @@ str_select(BAT *bn, BAT *b, BAT *s, stru
 		qry_ctx->querytimeout) : 0;
 
 	if (anti)	/* keep nulls ? (use false for now) */
-		scanloop_anti(v && *v != '\200' && str_cmp(v, key, klen) != 0, keep_nulls);
+		scanloop_anti(!strNil(v) && str_cmp(v, key, klen) != 0, keep_nulls);
 	else
-		scanloop(v && *v != '\200' && str_cmp(v, key, klen) == 0, keep_nulls);
+		scanloop(!strNil(v) && str_cmp(v, key, klen) == 0, keep_nulls);
 
 bailout:
 	bat_iterator_end(&bi);
@@ -6097,7 +6095,7 @@ startswith_join(BAT **rl_ptr, BAT **rr_p
 	size_t counter = 0;
 
 	if (anti)
-		STR_JOIN_NESTED_LOOP((str_cmp(vl, vr, vr_len) != 0), str_strlen(vr), fname);
+		STR_JOIN_NESTED_LOOP(str_cmp(vl, vr, vr_len) != 0, str_strlen(vr), fname);
 	else
 		STARTSWITH_SORTED_LOOP(str_cmp(vl, vr, vr_len), str_strlen(vr), fname);
 
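The scan-loop change above swaps the inline test `v && *v != '\200'` for `!strNil(v)`. That suggests strNil treats both a NULL pointer and a string starting with the byte '\200' as nil. A minimal sketch under that assumption follows; `is_nil_str` and `keep_row` are hypothetical names for illustration, not GDK functions, and the real strNil may differ in detail.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: the negation of the inline test the diff removes,
 * i.e. nil means "NULL pointer or first byte is '\200'". */
static bool
is_nil_str(const char *v)
{
	return v == NULL || *v == '\200';
}

/* Hypothetical predicate in the shape of the non-anti scan loop:
 * keep a row only when the value is not nil and equals the key. */
static bool
keep_row(const char *v, const char *key)
{
	return !is_nil_str(v) && strcmp(v, key) == 0;
}
```

Written this way, the nil test is evaluated first, so the comparison never sees a NULL pointer.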
MonetDB: Dec2023 - A little more cleanup.
Changeset: eeec21074587 for MonetDB URL: https://dev.monetdb.org/hg/MonetDB/rev/eeec21074587 Modified Files: monetdb5/modules/atoms/str.c Branch: Dec2023 Log Message: A little more cleanup. diffs (155 lines): diff --git a/monetdb5/modules/atoms/str.c b/monetdb5/modules/atoms/str.c --- a/monetdb5/modules/atoms/str.c +++ b/monetdb5/modules/atoms/str.c @@ -3832,24 +3832,22 @@ int str_contains(const char *h, const char *n, int nlen) { (void) nlen; - return strstr(h, n) ? 0 : 1; + return strstr(h, n) == NULL; } int str_icontains(const char *h, const char *n, int nlen) { (void) nlen; - return utf8casestr(h, n) ? 0 : 1; + return utf8casestr(h, n) == NULL; } -#define STR_MAPARGS(STK, PCI, R, S1, S2, ICASE) \ - do{ \ - R = getArgReference(STK, PCI, 0); \ - S1 = *getArgReference_str(STK, PCI, 1); \ - S2 = *getArgReference_str(STK, PCI, 2); \ - icase = PCI->argc == 4 && \ - *getArgReference_bit(STK, PCI, 3) ? true : false; \ - \ +#define STR_MAPARGS(STK, PCI, R, S1, S2, ICASE) \ + do{ \ + R = getArgReference(STK, PCI, 0); \ + S1 = *getArgReference_str(STK, PCI, 1); \ + S2 = *getArgReference_str(STK, PCI, 2); \ + icase = PCI->argc == 4 && *getArgReference_bit(STK, PCI, 3); \ } while(0) static str @@ -3950,8 +3948,7 @@ STRstr_search(Client cntxt, MalBlkPtr mb bit *res = getArgReference(stk, pci, 0); const str *haystack = getArgReference(stk, pci, 1), *needle = getArgReference(stk, pci, 2); - bit icase = pci->argc == 4 - && *getArgReference_bit(stk, pci, 3) ? true : false; + bit icase = pci->argc == 4 && *getArgReference_bit(stk, pci, 3); str s = *haystack, h = *needle, msg = MAL_SUCCEED; if (strNil(s) || strNil(h)) { *res = bit_nil; @@ -4012,8 +4009,7 @@ STRrevstr_search(Client cntxt, MalBlkPtr bit *res = getArgReference(stk, pci, 0); const str *haystack = getArgReference(stk, pci, 1); const str *needle = getArgReference(stk, pci, 2); - bit icase = pci->argc == 4 - && *getArgReference_bit(stk, pci, 3) ? 
true : false; + bit icase = pci->argc == 4 && *getArgReference_bit(stk, pci, 3); str s = *haystack, h = *needle, msg = MAL_SUCCEED; if (strNil(s) || strNil(h)) { *res = bit_nil; @@ -5364,8 +5360,8 @@ STRselect(bat *r_id, const bat *b_id, co B_ID = getArgReference(STK, PCI, 1); \ CB_ID = getArgReference(STK, PCI, 2); \ KEY = *getArgReference_str(STK, PCI, 3); \ - ICASE = PCI->argc == 5 ? false : true; \ - ANTI = PCI->argc == 5 ? *getArgReference_bit(STK, PCI, 4) : \ + ICASE = PCI->argc != 5; \ + ANTI = PCI->argc == 5 ? *getArgReference_bit(STK, PCI, 4) : \ *getArgReference_bit(STK, PCI, 5); \ } while (0) @@ -5596,7 +5592,7 @@ STRcontainsselect(Client cntxt, MalBlkPt } \ } while (0) -#define STARTSWITH_SORTED_LOOP(STR_CMP, STR_LEN, FNAME) \ +#define STARTSWITH_SORTED_LOOP(STR_CMP, STR_LEN, FNAME) \ do { \ canditer_init(&rci, sorted_r, sorted_cr); \ canditer_init(&lci, sorted_l, sorted_cl); \
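The str_contains change in this diff keeps the strcmp-like convention in which 0 means "found". A small sketch of the two forms side by side, using only the standard strstr; `contains_old` and `contains_new` are hypothetical names for illustration:

```c
#include <string.h>

/* Form removed by the diff: 0 when n occurs in h, 1 otherwise. */
static int
contains_old(const char *h, const char *n)
{
	return strstr(h, n) ? 0 : 1;
}

/* Form added by the diff: the comparison with NULL already yields
 * 1 (not found) or 0 (found), so the conditional is redundant. */
static int
contains_new(const char *h, const char *n)
{
	return strstr(h, n) == NULL;
}
```

Both functions return the same value for every pair of valid C strings, so callers that test the result against 0 are unaffected.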
MonetDB: default - Merge with Dec2023 branch.
Changeset: 13c81f68c45d for MonetDB URL: https://dev.monetdb.org/hg/MonetDB/rev/13c81f68c45d Modified Files: monetdb5/modules/atoms/str.c Branch: default Log Message: Merge with Dec2023 branch. diffs (207 lines): diff --git a/monetdb5/modules/atoms/str.c b/monetdb5/modules/atoms/str.c --- a/monetdb5/modules/atoms/str.c +++ b/monetdb5/modules/atoms/str.c @@ -3835,41 +3835,45 @@ str_is_suffix(const char *s, const char return strcmp(s + sl - sul, suffix); } +/* case insensitive endswith check */ int str_is_isuffix(const char *s, const char *suffix, int sul) { - int sl = str_strlen(s); - - if (sl < sul) - return -1; - else - return utf8casecmp(s + sl - sul, suffix); + const char *e = s + strlen(s); + const char *sf; + + (void) sul; + /* note that the uppercase and lowercase forms of a character aren't +* necessarily the same length in their UTF-8 encodings */ + for (sf = suffix; *sf && e > s; sf++) { + if ((*sf & 0xC0) != 0x80) { + while ((*--e & 0xC0) == 0x80) + ; + } + } + return *sf != 0 || utf8casecmp(e, suffix) != 0; } int str_contains(const char *h, const char *n, int nlen) { (void) nlen; - /* 64bit: should return lng */ - return strstr(h, n) ? 0 : 1; + return strstr(h, n) == NULL; } int str_icontains(const char *h, const char *n, int nlen) { (void) nlen; - /* 64bit: should return lng */ - return utf8casestr(h, n) ? 0 : 1; + return utf8casestr(h, n) == NULL; } -#define STR_MAPARGS(STK, PCI, R, S1, S2, ICASE) \ - do{ \ - R = getArgReference(STK, PCI, 0); \ - S1 = *getArgReference_str(STK, PCI, 1); \ - S2 = *getArgReference_str(STK, PCI, 2); \ - icase = PCI->argc == 4 && \ - *getArgReference_bit(STK, PCI, 3) ? 
true : false; \ - \ +#define STR_MAPARGS(STK, PCI, R, S1, S2, ICASE) \ + do{ \ + R = getArgReference(STK, PCI, 0); \ + S1 = *getArgReference_str(STK, PCI, 1); \ + S2 = *getArgReference_str(STK, PCI, 2); \ + icase = PCI->argc == 4 && *getArgReference_bit(STK, PCI, 3); \ } while(0) static str @@ -3970,8 +3974,7 @@ STRstr_search(Client cntxt, MalBlkPtr mb bit *res = getArgReference(stk, pci, 0); const str *haystack = getArgReference(stk, pci, 1), *needle = getArgReference(stk, pci, 2); - bit icase = pci->argc == 4 - && *getArgReference_bit(stk, pci, 3) ? true : false; + bit icase = pci->argc == 4 && *getArgReference_bit(stk, pci, 3); str s = *haystack, h = *needle, msg = MAL_SUCCEED; if (strNil(s) || strNil(h)) { *res = bit_nil; @@ -4032,8 +4035,7 @@ STRrevstr_search(Client cntxt, MalBlkPtr bit *res = getArgReference(stk, pci, 0); const str *haystack = getArgReference(stk, pci, 1); const str *needle = getArgReference(stk, pci, 2); - bit icase = pci->argc == 4 - && *getArgReference_bit(stk, pci, 3) ? true : false; + bit icase = pci->argc == 4 && *getArgReference_bit(stk, pci, 3); str s = *haystack, h = *needle, msg = MAL_SUCCEED; if (strNil(s) || strNil(h)) { *res = bit_nil; @@ -5251,9 +5253,9 @@ str_select(BAT *bn, BAT *b, BAT *s, stru qry_ctx = qry_ctx ? qry_ctx : &(QryCtx) {.endtime = 0}; if (anti) /* keep nulls ? (use false for now) */ - scanloop_anti(v && *v != '\200' && str_cmp(v, key, klen) != 0, keep_nulls); + scanloop_anti(!strNil(v) && str_cmp(v, key, klen) != 0, keep_nulls); else - scanloop(v && *v != '\200' && str_cmp(v, key, klen) == 0, keep_nulls); + scanloop(!strNil(v) && str_cmp(v, key, klen) == 0, keep_nulls); bailout: bat_iterator_end(&bi); @@ -5380,8 +5382,8 @@ STRselect(bat *r_id, const bat *b_id, co B_ID = getArgReference(STK, PCI, 1);
MonetDB: ascii-flag - Merge with default branch.
Changeset: f4a4f1e76e62 for MonetDB URL: https://dev.monetdb.org/hg/MonetDB/rev/f4a4f1e76e62 Modified Files: monetdb5/modules/atoms/str.c Branch: ascii-flag Log Message: Merge with default branch. diffs (207 lines): diff --git a/monetdb5/modules/atoms/str.c b/monetdb5/modules/atoms/str.c --- a/monetdb5/modules/atoms/str.c +++ b/monetdb5/modules/atoms/str.c @@ -799,41 +799,45 @@ str_is_suffix(const char *s, const char return strcmp(s + sl - sul, suffix); } +/* case insensitive endswith check */ int str_is_isuffix(const char *s, const char *suffix, int sul) { - int sl = str_strlen(s); - - if (sl < sul) - return -1; - else - return GDKstrcasecmp(s + sl - sul, suffix); + const char *e = s + strlen(s); + const char *sf; + + (void) sul; + /* note that the uppercase and lowercase forms of a character aren't +* necessarily the same length in their UTF-8 encodings */ + for (sf = suffix; *sf && e > s; sf++) { + if ((*sf & 0xC0) != 0x80) { + while ((*--e & 0xC0) == 0x80) + ; + } + } + return *sf != 0 || GDKstrcasecmp(e, suffix) != 0; } int str_contains(const char *h, const char *n, int nlen) { (void) nlen; - /* 64bit: should return lng */ - return strstr(h, n) ? 0 : 1; + return strstr(h, n) == NULL; } int str_icontains(const char *h, const char *n, int nlen) { (void) nlen; - /* 64bit: should return lng */ - return GDKstrcasestr(h, n) ? 0 : 1; + return GDKstrcasestr(h, n) == NULL; } -#define STR_MAPARGS(STK, PCI, R, S1, S2, ICASE) \ - do{ \ - R = getArgReference(STK, PCI, 0); \ - S1 = *getArgReference_str(STK, PCI, 1); \ - S2 = *getArgReference_str(STK, PCI, 2); \ - icase = PCI->argc == 4 && \ - *getArgReference_bit(STK, PCI, 3) ? 
true : false; \ - \ +#define STR_MAPARGS(STK, PCI, R, S1, S2, ICASE) \ + do{ \ + R = getArgReference(STK, PCI, 0); \ + S1 = *getArgReference_str(STK, PCI, 1); \ + S2 = *getArgReference_str(STK, PCI, 2); \ + icase = PCI->argc == 4 && *getArgReference_bit(STK, PCI, 3); \ } while(0) static str @@ -932,8 +936,7 @@ STRstr_search(Client cntxt, MalBlkPtr mb bit *res = getArgReference(stk, pci, 0); const str *haystack = getArgReference(stk, pci, 1), *needle = getArgReference(stk, pci, 2); - bit icase = pci->argc == 4 - && *getArgReference_bit(stk, pci, 3) ? true : false; + bit icase = pci->argc == 4 && *getArgReference_bit(stk, pci, 3); str s = *haystack, h = *needle, msg = MAL_SUCCEED; if (strNil(s) || strNil(h)) { *res = bit_nil; @@ -988,8 +991,7 @@ STRrevstr_search(Client cntxt, MalBlkPtr int *res = getArgReference_int(stk, pci, 0); const str haystack = *getArgReference_str(stk, pci, 1); const str needle = *getArgReference_str(stk, pci, 2); - bit icase = pci->argc == 4 - && *getArgReference_bit(stk, pci, 3) ? true : false; + bit icase = pci->argc == 4 && *getArgReference_bit(stk, pci, 3); if (strNil(haystack) || strNil(needle)) { *res = bit_nil; @@ -2205,9 +2207,9 @@ str_select(BAT *bn, BAT *b, BAT *s, stru qry_ctx = qry_ctx ? qry_ctx : &(QryCtx) {.endtime = 0}; if (anti) /* keep nulls ? (use false for now) */ - scanloop_anti(v && *v != '\200' && str_cmp(v, key, klen) != 0, keep_nulls); + scanloop_anti(!strNil(v) && str_cmp(v, key, klen) != 0, keep_nulls); else - scanloop(v && *v != '\200' && str_cmp(v, key, klen) == 0, keep_nulls); + scanloop(!strNil(v) && str_cmp(v, key, klen) == 0, keep_nulls); bailout: bat_iterator_end(&bi); @@ -2334,8 +2336,8 @@ STRselect(bat *r_id, const bat *b_id, co B_ID = getArgReference(STK, PCI, 1);
MonetDB: ascii-flag - Add test for case insensitive endswith wit...
Changeset: 14e8bd981063 for MonetDB
URL: https://dev.monetdb.org/hg/MonetDB/rev/14e8bd981063
Modified Files:
	monetdb5/modules/atoms/Tests/endswith.test
	monetdb5/modules/atoms/str.c
Branch: ascii-flag
Log Message:

Add test for case insensitive endswith with unequal length upper/lower case.

diffs (53 lines):

diff --git a/monetdb5/modules/atoms/Tests/endswith.test b/monetdb5/modules/atoms/Tests/endswith.test
--- a/monetdb5/modules/atoms/Tests/endswith.test
+++ b/monetdb5/modules/atoms/Tests/endswith.test
@@ -42,3 +42,25 @@ query I nosort
 SELECT endswith('Thomas Müller', 'müller', false)
 ----
 0
+
+# Ⱥ and ⱥ are not the same length in UTF-8 encoding
+query I nosort
+SELECT endswith('XXXȺ', 'ⱥ', false)
+----
+0
+
+query I nosort
+SELECT endswith('XXXȺ', 'ⱥ', true)
+----
+1
+
+query I nosort
+SELECT endswith('xxxⱥ', 'Ⱥ', false)
+----
+0
+
+query I nosort
+SELECT endswith('xxxⱥ', 'Ⱥ', true)
+----
+1
+
diff --git a/monetdb5/modules/atoms/str.c b/monetdb5/modules/atoms/str.c
--- a/monetdb5/modules/atoms/str.c
+++ b/monetdb5/modules/atoms/str.c
@@ -3234,15 +3234,15 @@ STRjoin(bat *rl_id, bat *rr_id, const ba
 	do { \
 		RL_ID = getArgReference(STK, PCI, 0); \
 		RR_ID = PCI->retc == 1 ? 0 : getArgReference(STK, PCI, 1); \
-		int i = PCI->retc == 1 ? 1 : 2; \
+		int i = PCI->retc == 1 ? 1 : 2; \
 		L_ID = getArgReference(STK, PCI, i++); \
 		R_ID = getArgReference(STK, PCI, i++); \
 		IC_ID = PCI->argc - PCI->retc == 7 ? \
 			NULL : getArgReference(stk, pci, i++); \
-		CL_ID = getArgReference(STK, PCI, i++); \
-		CR_ID = getArgReference(STK, PCI, i++); \
-		ANTI = PCI->argc - PCI->retc == 7 ? \
-			getArgReference(STK, PCI, 8) : getArgReference(STK, PCI, 9);\
+		CL_ID = getArgReference(STK, PCI, i++); \
+		CR_ID = getArgReference(STK, PCI, i++); \
+		ANTI = PCI->argc - PCI->retc == 7 ? \
+			getArgReference(STK, PCI, 8) : getArgReference(STK, PCI, 9); \
 	} while (0)
 
 static inline str
MonetDB: Dec2023 - We do need to proceed to the end of the curre...
Changeset: db68aa5c865f for MonetDB
URL: https://dev.monetdb.org/hg/MonetDB/rev/db68aa5c865f
Modified Files:
	monetdb5/modules/atoms/str.c
Branch: Dec2023
Log Message:

We do need to proceed to the end of the current character before checking.

diffs (12 lines):

diff --git a/monetdb5/modules/atoms/str.c b/monetdb5/modules/atoms/str.c
--- a/monetdb5/modules/atoms/str.c
+++ b/monetdb5/modules/atoms/str.c
@@ -3825,6 +3825,8 @@ str_is_isuffix(const char *s, const char
 				;
 		}
 	}
+	while ((*sf & 0xC0) == 0x80)
+		sf++;
 	return *sf != 0 || utf8casecmp(e, suffix) != 0;
 }
 
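The suffix check from changeset fab41aa10761, together with the two lines added here, can be sketched as a standalone fragment. This is a minimal sketch and not the MonetDB code: `toy_decode`, `toy_fold`, `toy_casecmp` and `is_isuffix` are hypothetical names, and the toy case fold stands in for utf8casecmp while only knowing ASCII letters plus the pair Ⱥ (U+023A, two bytes in UTF-8) and ⱥ (U+2C65, three bytes). Input is assumed to be valid UTF-8.

```c
#include <stdint.h>
#include <string.h>

/* Decode one code point at *p and advance *p past it (valid UTF-8 assumed). */
static uint32_t
toy_decode(const char **p)
{
	const unsigned char *s = (const unsigned char *) *p;
	uint32_t c;
	int n;			/* number of continuation bytes to read */

	if (s[0] < 0x80) {
		c = s[0];
		n = 0;
	} else if ((s[0] & 0xE0) == 0xC0) {
		c = s[0] & 0x1F;
		n = 1;
	} else if ((s[0] & 0xF0) == 0xE0) {
		c = s[0] & 0x0F;
		n = 2;
	} else {
		c = s[0] & 0x07;
		n = 3;
	}
	s++;
	while (n-- > 0)
		c = (c << 6) | (*s++ & 0x3F);
	*p = (const char *) s;
	return c;
}

/* Toy case fold (assumption): ASCII letters and U+023A -> U+2C65 only. */
static uint32_t
toy_fold(uint32_t c)
{
	if (c >= 'A' && c <= 'Z')
		return c + ('a' - 'A');
	if (c == 0x023A)
		return 0x2C65;
	return c;
}

/* Stand-in for utf8casecmp: 0 when equal ignoring case, nonzero otherwise. */
static int
toy_casecmp(const char *a, const char *b)
{
	while (*a && *b) {
		if (toy_fold(toy_decode(&a)) != toy_fold(toy_decode(&b)))
			return 1;
	}
	return *a != *b;	/* equal only if both strings ended together */
}

/* Returns 0 when s ends with suffix ignoring case, nonzero otherwise. */
static int
is_isuffix(const char *s, const char *suffix)
{
	const char *e = s + strlen(s);
	const char *sf;

	/* for every character (lead byte) in suffix, move e back by one
	 * whole character in s; byte lengths may differ between the two */
	for (sf = suffix; *sf && e > s; sf++) {
		if ((*sf & 0xC0) != 0x80) {
			while ((*--e & 0xC0) == 0x80)
				;
		}
	}
	/* finish the current suffix character before testing for the end */
	while ((*sf & 0xC0) == 0x80)
		sf++;
	return *sf != 0 || toy_casecmp(e, suffix) != 0;
}
```

With this sketch, is_isuffix("Ⱥ", "ⱥ") returns 0. Without the trailing while loop it would return nonzero: the for loop stops as soon as e reaches s, which leaves sf on a continuation byte of the three-byte suffix character, so `*sf != 0` would wrongly signal that the suffix is longer than the string.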
MonetDB: default - Merge with Dec2023 branch.
Changeset: c53196ba9bbc for MonetDB
URL: https://dev.monetdb.org/hg/MonetDB/rev/c53196ba9bbc
Modified Files:
	monetdb5/modules/atoms/str.c
Branch: default
Log Message:

Merge with Dec2023 branch.

diffs (12 lines):

diff --git a/monetdb5/modules/atoms/str.c b/monetdb5/modules/atoms/str.c
--- a/monetdb5/modules/atoms/str.c
+++ b/monetdb5/modules/atoms/str.c
@@ -3851,6 +3851,8 @@ str_is_isuffix(const char *s, const char
 				;
 		}
 	}
+	while ((*sf & 0xC0) == 0x80)
+		sf++;
 	return *sf != 0 || utf8casecmp(e, suffix) != 0;
 }
 
MonetDB: ascii-flag - Merge with default branch.
Changeset: cb725a3ee930 for MonetDB
URL: https://dev.monetdb.org/hg/MonetDB/rev/cb725a3ee930
Modified Files:
	monetdb5/modules/atoms/str.c
Branch: ascii-flag
Log Message:

Merge with default branch.

diffs (12 lines):

diff --git a/monetdb5/modules/atoms/str.c b/monetdb5/modules/atoms/str.c
--- a/monetdb5/modules/atoms/str.c
+++ b/monetdb5/modules/atoms/str.c
@@ -815,6 +815,8 @@ str_is_isuffix(const char *s, const char
 				;
 		}
 	}
+	while ((*sf & 0xC0) == 0x80)
+		sf++;
 	return *sf != 0 || GDKstrcasecmp(e, suffix) != 0;
 }
 