IMO, these can all be replaced by IllegalArgumentException (IAE): as a caller, there is nothing I would do differently if I caught one of these custom exceptions rather than another. It's all the same issue, probably bad user input. The only reason to create a custom exception would be to carry additional information such as a location (line number, column number), but that's not what you describe here. For example, you can imagine an editor catching a syntax error exception, extracting the line and column number, and changing the style for that area of the text.
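To make the distinction concrete, here is a rough sketch of the one situation where a custom type might earn its keep. All names in it (LocatedExceptionSketch, InvalidInputException, requireKebab, underline) are made up for illustration and are not part of the proposal or of commons-text; the character check is only an assumption loosely based on the kebab-case rule in the spec.

```java
public class LocatedExceptionSketch {

    /** Hypothetical exception: still an IllegalArgumentException, but it records where the input went wrong. */
    public static final class InvalidInputException extends IllegalArgumentException {
        private final int index;

        public InvalidInputException(String message, int index) {
            super(message + " at index " + index);
            this.index = index;
        }

        public int getIndex() {
            return index;
        }
    }

    /** Assumed rule for this sketch only: letters, digits and '-' are allowed, anything else is rejected. */
    public static void requireKebab(String input) {
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c != '-' && !Character.isLetterOrDigit(c)) {
                throw new InvalidInputException("Illegal character '" + c + "'", i);
            }
        }
    }

    /** What an editor-like caller could do with the location: mark the offending position under the input. */
    public static String underline(String input) {
        try {
            requireKebab(input);
            return input;
        } catch (InvalidInputException e) {
            StringBuilder marker = new StringBuilder();
            for (int i = 0; i < e.getIndex(); i++) {
                marker.append(' ');
            }
            return input + "\n" + marker + "^";
        }
    }

    public static void main(String[] args) {
        System.out.println(underline("my-kebab_string"));
    }
}
```

A caller that does not care about the position can still catch plain IllegalArgumentException, so the extra type only matters to code that actually reads getIndex().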
Gary

On Wed, Aug 9, 2023 at 7:30 PM Daniel Watson <dcwatso...@gmail.com> wrote:
>
> Currently I'm planning a set of exceptions that are thrown for various
> reasons. I created multiple classes to allow for clearer testing.
>
> ReservedCharacterException (extends InvalidCharacterException below) -
> thrown specifically when a reserved character is encountered within a token.
>
> InvalidCharacterException (extends IllegalArgumentException) thrown
> directly any time an illegal character is encountered.
>
> ZeroLengthTokenException (extends Illegal arg excep) - thrown when a zero
> length token is encountered and Case does not support it.
>
> There are a few other error cases I believe. I'm not looking at the code
> right this moment but I'm fairly certain about the need for the above 3.
>
>
> On Wed, Aug 9, 2023, 6:08 PM Elliotte Rusty Harold <elh...@ibiblio.org>
> wrote:
>
> > What happens when a token contains an unpermitted character?
> >
> > On Wed, Aug 9, 2023 at 8:30 PM Daniel Watson <dcwatso...@gmail.com> wrote:
> > >
> > > Here's my stab at a spec. Wanted to clarify some parts of the Case
> > > interface first before jumping into the implementations. Wondering what a
> > > good package name for this stuff is, given that "case" is a reserved word?
> > >
> > > Case (interface)
> > > The Case interface defines two methods:
> > > * String format(Iterable<String> tokens)
> > > The format method accepts an Iterable of String tokens and returns a single
> > > String formatted according to the implementation. The format method is
> > > intended to handle transforming between cases, thus tokens passed to the
> > > format() method need not be properly formatted for the given Case instance,
> > > though they must still respect any reserve character restrictions.
> > > * List<String> parse(String string)
> > > The parse method accepts a single string and returns a List of string
> > > tokens that abide by the Case implementation.
> > > Note: format() and parse() methods must be fully reciprocal. ie. On a
> > > single Case instance, when calling parse() with a valid string, and passing
> > > the resulting tokens into format(), a matching string should be returned.
> > >
> > > DelimitedCase (base class for kebab and snake)
> > > Defines a Case where all tokens are separated by a single character
> > > delimiter. The delimiter is considered a reserved character and is not
> > > allowed to appear within tokens when formatting. No further restrictions
> > > are placed on token contents by this base implementation. Tokens can
> > > contain any valid Java String character. DelimitedCases can support
> > > zero-length tokens, which can occur if there are no characters between two
> > > instances of the delimiter or if the parsed string begins or ends with the
> > > delimiter.
> > > Note: Other Case implementations may not support zero-length tokens, and
> > > attempts to call format(...) with empty tokens may fail.
> > >
> > > KebabCase
> > > Extends DelimitedCase and initializes the delimiter as the hyphen '-'
> > > character. This case allows only alphanumeric characters within tokens.
> > >
> > > SnakeCase
> > > Extends DelimitedCase and initializes the delimiter as the underscore '_'
> > > character. This case allows only alphanumeric characters within tokens.
> > >
> > > PascalCase
> > > Defines a Case where tokens begin with an uppercase alpha character. All
> > > subsequent token characters must be lowercase alpha or numeric characters.
> > > Whenever an uppercase alpha character is encountered, the previous token is
> > > considered complete and a new token begins, with the uppercase character
> > > being the first character of the new token. PascalCase does not allow
> > > zero-length tokens when formatting, as it would violate the reciprocal
> > > contract of format() and parse().
> > >
> > > CamelCase
> > > Extends PascalCase and sets one additional restriction - that the first
> > > character of the first token (ie the first character of the full string)
> > > must be a lowercase alpha character (rather than the uppercase requirement
> > > of PascalCase). All other restrictions of PascalCase apply.
> > >
> > >
> > > On Tue, Aug 8, 2023 at 8:55 PM Daniel Watson <dcwatso...@gmail.com> wrote:
> > > >
> > > > Kebab case is extremely common for web identifiers, eg html element ids,
> > > > classes, attributes, etc.
> > > >
> > > > In regards to PascalCase, i agree that most people won't understand the
> > > > reasoning behind the name, but it is nevertheless a widely accepted term
> > > > for that case style. If an alternative is deemed necessary then
> > > > "ProperCase" might work - since that is also how English proper nouns are
> > > > cased. Understanding that name just depends on your knowledge of English
> > > > grammar.
> > > >
> > > > A spec can definitely be written for the 4 provided concrete
> > > > implementations. And... I may eat these words but... the spec should not be
> > > > all that complex. I will take a stab at it.
> > > >
> > > > Thanks for the feedback!
> > > > Any other thoughts or comments are welcome!
> > > >
> > > > Dan
> > > >
> > > >
> > > > On Tue, Aug 8, 2023, 7:45 PM Elliotte Rusty Harold <elh...@ibiblio.org>
> > > > wrote:
> > > >
> > > >> This is a good idea and seems like useful functionality. In order to
> > > >> accept it into commons, it needs solid documentation and excellent
> > > >> test coverage. I've worked on code like this in another language (not
> > > >> Java) and the production bugs were bad. E.g. what happens when a
> > > >> string contains numbers as well as letters?
> > > >>
> > > >> I'd like to see a full spec that unambiguously defines how every
> > > >> Unicode string is converted into camel/snake/kebab case.
> > > >> The spec should be independent of the code. That's not easy to write
> > > >> but it's essential.
> > > >>
> > > >> I don't want any loose/strict modes. It should all be strict according
> > > >> to spec.
> > > >>
> > > >> I've never heard of kebab cases before. Is that a common name? I'd
> > > >> also like to rename Pascal case. How many programmers under 40 have
> > > >> even heard of Pascal, much less are familiar with its case
> > > >> conventions?
> > > >>
> > > >> Long story short - a PR is premature until there's an agreed upon spec.
> > > >>
> > > >> On Tue, Aug 8, 2023 at 8:04 PM Daniel Watson <dcwatso...@gmail.com>
> > > >> wrote:
> > > >> >
> > > >> > I have a bit of code that adds the ability to parse and format strings
> > > >> > into various case patterns. Wanted to check if it's of worth and
> > > >> > in-scope for commons-text...
> > > >> >
> > > >> > Its a bit broader than the existing CaseUtils.toCamelCase(...) method.
> > > >> > Rather than simply formatting tokens into the case, this API adds the
> > > >> > additional goal of being able to transform one case to another. e.g.
> > > >> >
> > > >> > SnakeCase.format(PascalCase.parse("MyPascalString")); // returns My_Pascal_String
> > > >> > CamelCase.format(SnakeCase.parse("my_snake_string")); // returns mySnakeString
> > > >> > KebabCase.format(CamelCase.parse("myCamelString")); // returns my-Camel-String
> > > >> > //Note that kebab and snake do not alter the alphabetic case of the tokens,
> > > >> > as they are essentially case agnostic joining, according to this
> > > >> > implementation. Though this can be overridden by end users.
> > > >> >
> > > >> > The API has one core interface: Case, which has format and parse methods.
> > > >> > There is a single abstract implementation of it - AbstractConfigurableCase
> > > >> > - which is a configuration driven way to create a case pattern.
> > > >> > It has enough options to accommodate the 4 popular cases, and thus the
> > > >> > subclasses just have to configure these options rather than implement
> > > >> > them directly.
> > > >> > Any further extensions can override or extend the api as necessary.
> > > >> >
> > > >> > There are five core concrete implementations:
> > > >> >
> > > >> > PascalCase
> > > >> > CamelCase (extends PascalCase)
> > > >> > DelimitedCase
> > > >> > KebabCase (extends DelimitedCase)
> > > >> > SnakeCase (extends DelimitedCase)
> > > >> >
> > > >> > Each has a static INSTANCE field to avoid redundant instantiation.
> > > >> >
> > > >> > Some of my reasoning / concerns...
> > > >> >
> > > >> > * I considered bundling all of this logic into static methods, similar
> > > >> > to CaseUtils, but that prevents the user from truly customizing or
> > > >> > extending the code for odd cases. This approach is, in my opinion, far
> > > >> > easier to understand, extend, and debug.
> > > >> > * I believe the parsing side should potentially have a loose / strict
> > > >> > mode, in that the logic can ignore non-critical rules on the parsing
> > > >> > side. e.g. the command CamelCase.parse("MyString") should work, even
> > > >> > though the input is not strictly camel case. Strict parsing would
> > > >> > ensure (if possible) that the input abides by all elements of the
> > > >> > format.
> > > >> > * I'm still unsure about how best to handle reserved characters when
> > > >> > translating. e.g. How should
> > > >> > KebabCase.format(PascalCase.parse("MyPascal-String")) handle the hyphen?
> > > >> > Should the kebab case strip the reserved character from the token
> > > >> > values?
> > > >> >
> > > >> > Long story short - is this worth pursuing in the form of a pull request
> > > >> > for review? Or is it out of scope for commons-text?
> > > >> >
> > > >> > Dan
> > > >>
> > > >> --
> > > >> Elliotte Rusty Harold
> > > >> elh...@ibiblio.org
> > > >>
> > > >> ---------------------------------------------------------------------
> > > >> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> > > >> For additional commands, e-mail: dev-h...@commons.apache.org
> >
> > --
> > Elliotte Rusty Harold
> > elh...@ibiblio.org