Bruce Momjian <br...@momjian.us> writes: > This web page explains the feature: > > https://stackoverflow.com/questions/33462061/sql-server-substring-position-negative-value > but also asks: > now the only question that remains is, "why would anyone need it > to behave this way?"
Yeah. I believe our implementation adheres to the SQL spec, which says this for <character substring function> (in SQL:2021 6.3.2): a) If the character encoding form of <character value expression> is UTF8, UTF16, or UTF32, then, in the remainder of this General Rule, the term “character” shall be taken to mean “unit specified by <char length units>”. b) Let C be the value of the <character value expression>, let LC be the length in characters of C, and let S be the value of the <start position>. c) If <string length> is specified, then let L be the value of <string length> and let E be S+L. Otherwise, let E be the larger of LC+1 and S. d) If at least one of C, S, and L is the null value, then the result of the <character substring function> is the null value. e) If E is less than S, then an exception condition is raised: data exception — substring error (22011). [tgl note: given c), this happens if and only if a negative <string length> is provided.] f) Case: i) If S is greater than LC or if E is less than 1 (one), then the result of the <character substring function> is the zero-length character string. ii) Otherwise, 1) Let S1 be the larger of S and 1 (one). Let E1 be the smaller of E and LC+1. Let L1 be E1–S1. 2) The result of the <character substring function> is a character string containing the L1 characters of C starting at character number S1 in the same order that the characters appear in C. That's a pretty sterling example of standards-ese that is both unreadable and devoid of any justification. But if you trace through the possible effects of a negative S value, it looks like (1) if L >= 0 is specified and S+L (E) is less than one, the result is an empty string per rule f)i). (2) if L >= 0 is specified and S+L (E) is at least one but less than LC+1, then E is the substring end+1 position. (3) otherwise, a negative S is disregarded and replaced by 1 so far as the substring end calculation is concerned. (4) in any case, a negative S is disregarded and replaced by 1 so far as the substring start calculation is concerned. I'm kind of inclined to not document this weirdness. I especially don't think it's worth giving an example that neither explains the "disregarded" bit nor highlights the dependency on L being given. regards, tom lane