I've create the ticket for implementing this [1] [1] https://issues.apache.org/jira/browse/IGNITE-14611
пт, 16 апр. 2021 г. в 16:30, Alexei Scherbakov <alexey.scherbak...@gmail.com >: > > > чт, 15 апр. 2021 г. в 18:21, Andrey Mashenkov <andrey.mashen...@gmail.com > >: > >> Hi Alexey, >> I like the idea. >> >> 1. >> >> > TBL-0001 is a *string representation* of the error. It is built from 2 >> > byte scope id (mapped to name TBL) and 2 byte number (0001). Both >> > internally packed in int. No any kind of parsing will be necessary to >> read >> > scope/category. >> >> I think Alexey mean if it will be possible to make smth like that >> >> catch (IgniteException e) { >> if (e.getScope() == "TBL" && e.getCode() == 1234) >> continue; // E.g. retry my TX >> } >> >> It looks useful to me. >> > > I have in mind something like this: > > public class IgniteException extends RuntimeException { > private int errorCode; > > public IgniteException(ErrorScope scope, int code, String message, > Throwable cause) { > super(message, cause); > this.errorCode = make(scope, code); > } > > public boolean matches(ErrorScope scope, int code) { > return errorCode == make(scope, code); > } > > private int make(ErrorScope scope, int code) { > return ((scope.ordinal() << 16) | code); > } > > public ErrorScope scope() { > return ErrorScope.values()[errorCode >> 16]; > } > > public int code() { > return 0xFFFF & errorCode; > } > > public static void main(String[] args) { > IgniteException e = new IgniteException(ErrorScope.RAFT, 1, > "test", null); > > System.out.println(e.matches(ErrorScope.RAFT, 2)); > System.out.println(e.scope()); > System.out.println(e.code()); > > try { > throw e; > } > catch (IgniteException ee) { > System.out.println(ee.matches(ErrorScope.RAFT, 1)); > } > } > } > > >> >> 2. How you see a cross-module exception throwing? >> Assume, user call -> module A, which recursively call -> module B, which >> fails. >> So, module A component calls a module B component and got an Exception >> with >> "B-1234" exception. >> Module A do not expect any exception here and doesn't take care of "B-xxx" >> error codes, but only "A-yyyy. >> Should it rethrow exception with "A-unknown" (maybe "UNK-0001") code >> or reuse "B-xxxx" code with the own message, pointing original exception >> as >> a cause for both cases? >> >> The first approach may looks confusing, while the second one produces too >> many "UNK-" codes. >> What code should get the user in that case? >> >> >> > The method should always report root cause, in your example it will be > B-xxxx, no matter which module API is called. > > >> >> >> >> On Thu, Apr 15, 2021 at 5:36 PM Alexei Scherbakov < >> alexey.scherbak...@gmail.com> wrote: >> >> > чт, 15 апр. 2021 г. в 14:26, Ilya Kasnacheev <ilya.kasnach...@gmail.com >> >: >> > >> > > Hello! >> > > >> > > > All public methods throw only unchecked >> > > org.apache.ignite.lang.IgniteException containing aforementioned >> fields. >> > > > Each public method must have a section in the javadoc with a list of >> > all >> > > possible error codes for this method. >> > > >> > > I don't think this is feasible at all. >> > > Imagine javadoc for cache.get() method featuring three pages of >> possible >> > > error codes thrown by this method. >> > > >> > >> > Of course there is no need to write 3 pages of error codes, this makes >> no >> > sense. >> > I think we can use error ranges here, or, probably, document most >> important >> > error scenarios. >> > The point here is to try to document errors as much as possible. >> > >> > >> > > Also, updated every two weeks to account for changes in >> > > underlying mechanisms. >> > > >> > > There is still a confusion between "error code for any error in logs" >> and >> > > "error code for any pair of method & exception". Which one are we >> > > discussing really? >> > > >> > > This is impossible to track or test, but is susceptible for common >> > > error-hiding antipattern where all exceptions are caught in >> cache.get() >> > and >> > > discarded, and instead a brand new CH-0001 "error in cache.get()" is >> > thrown >> > > to the user. >> > > >> > > Which is certainly not something that anybody wants. >> > > >> > >> > Certainly not. We are talking here about root cause - what is exactly >> the >> > reason for method call failure. >> > >> > >> > > >> > > Regards, >> > > -- >> > > Ilya Kasnacheev >> > > >> > > >> > > чт, 15 апр. 2021 г. в 13:03, Vladislav Pyatkov <vldpyat...@gmail.com >> >: >> > > >> > > > Hi Alexei, >> > > > >> > > > > Each public method *must *have a section in the javadoc with a >> list >> > of >> > > > all possible error codes for this method. >> > > > >> > > > I consider it is redundant, because any public exception can be >> thrown >> > > from >> > > > public API. >> > > > If it not happens today, it may change tomorrow: one will be >> removed, >> > > > another one will be added. >> > > > >> > > > >Nested exceptions are not forbidden to use. They can provide >> > additional >> > > > details on the error for debug purposes >> > > > >> > > > I see another issue, which is in the Ignite 2.x, but not attend >> here. >> > We >> > > > can have a deep nested exception and use it for handling. >> > > > In the result, all time when we are handling an exception we use >> > > > pattern like this: >> > > > try{ >> > > > ... >> > > > } >> > > > catch (Exception e) { >> > > > if (X.hasCause(e, TimeoutException.class)) >> > > > throw e; >> > > > >> > > > if (X.hasCause(e, ConnectException.class, EOFException.class)) >> > > > continue; >> > > > >> > > > if (X.hasCause(e, InterruptedException.class)) >> > > > return false; >> > > > } >> > > > >> > > > If we have a pant to make only one exception to the client side, we >> can >> > > > also do it for an internal exception. >> > > > >> > > > On Wed, Apr 14, 2021 at 11:42 AM Alexei Scherbakov < >> > > > alexey.scherbak...@gmail.com> wrote: >> > > > >> > > > > Alexey, >> > > > > >> > > > > ср, 14 апр. 2021 г. в 01:52, Alexey Kukushkin < >> > > kukushkinale...@gmail.com >> > > > >: >> > > > > >> > > > > > Just some points looking questionable to me, although I realize >> the >> > > > error >> > > > > > handling style may be very opinionated: >> > > > > > >> > > > > > - Would it make sense splitting the proposed composite error >> > code >> > > > > > (TBL-0001) into separate numeric code (0001) and >> scope/category >> > > > > ("TBL") >> > > > > > to >> > > > > > avoid parsing the code when an error handler needs to analyze >> > only >> > > > the >> > > > > > category or the code? >> > > > > > >> > > > > >> > > > > TBL-0001 is a *string representation* of the error. It is built >> > from >> > > 2 >> > > > > byte scope id (mapped to name TBL) and 2 byte number (0001). Both >> > > > > internally packed in int. No any kind of parsing will be >> necessary to >> > > > read >> > > > > scope/category. >> > > > > >> > > > > >> > > > > > - "*The cause - short string description of an issue, >> readable >> > by >> > > > > > user.*". >> > > > > > This terminology sounds confusing to me since that "cause" >> > sounds >> > > > like >> > > > > > Java >> > > > > > Throwable's Message to me and the "Cause" is a lower level >> > > > exception. >> > > > > > >> > > > > >> > > > > The string describes the cause of error, so the name. I'm ok to >> > rename >> > > it >> > > > > to a message. It will be stored in IgniteException.message field >> > > anyway. >> > > > > >> > > > > >> > > > > > - "*The action - steps for a user to resolve error...*". The >> > > action >> > > > is >> > > > > > very important but do we want to make it part of the >> > > > IgniteException? >> > > > > I >> > > > > > do >> > > > > > not think the recovery action text should be part of the >> > > exception. >> > > > > > IgniteException may include a URL pointing to the >> corresponding >> > > > > > documentation - this is discussable. >> > > > > > >> > > > > >> > > > > This will not be the part of the exception. A user should visit >> the >> > > > > documentation page and read the action section by corresponding >> error >> > > > code. >> > > > > >> > > > > >> > > > > > - "*All public methods throw only unchecked >> IgniteException*" - >> > I >> > > > > assume >> > > > > > we still respect JCache specification and prefer using >> standard >> > > Java >> > > > > > exception to signal about invalid parameters. >> > > > > > >> > > > > >> > > > > Using standard java exceptions whenever possible makes sense. >> > > > > >> > > > > >> > > > > > - Why we do not follow the "classic" structured exception >> > handling >> > > > > > practices in Ignite: >> > > > > > >> > > > > >> > > > > Ignite 3 will be multi language, and other languages use other >> error >> > > > > processing models. SQL for example uses error codes. >> > > > > The single exception approach simplifies and unifies error >> handling >> > > > across >> > > > > platforms for me. >> > > > > >> > > > > >> > > > > > - Why do we not allow using checked exceptions? It seems >> to >> > me >> > > > > > sometimes forcing the user to handle an error serves as a >> > hint >> > > > and >> > > > > > thus >> > > > > > improves usability. For example, handling an >> > > > optimistic/pessimistic >> > > > > > transaction conflict/deadlock. Or handling a timeout for >> > > > operations >> > > > > > with >> > > > > > timeouts. >> > > > > > >> > > > > >> > > > > A valid point. Checked exceptions must be used for whose methods, >> > where >> > > > > error handling is enforced, for example tx optimistic failure. >> > > > > Such errors will also have corresponding error codes. >> > > > > >> > > > > >> > > > > > - Why single public IgniteException and no exception >> > hierarchy? >> > > > > Java >> > > > > > is optimized for structured exception handling instead of >> > using >> > > > > > IF-ELSE to >> > > > > > analyze the codes. >> > > > > > >> > > > > >> > > > > Exception hierarchy is not required when using error codes and >> > > applicable >> > > > > only to java API, so I would avoid spending efforts on it. >> > > > > >> > > > > >> > > > > > - Why no nested exceptions? Sometimes an error handler is >> > > > > interested >> > > > > > only in high level exceptions (like Invalid Configuration) >> > and >> > > > > > sometimes >> > > > > > details are needed (like specific configuration parser >> > > > exceptions). >> > > > > > >> > > > > >> > > > > Nested exceptions are not forbidden to use. They can provide >> > additional >> > > > > details on the error for debug purposes, but not strictly >> required, >> > > > because >> > > > > error code + message should provide enough information to the >> user. >> > > > > >> > > > > >> > > > > > - For async methods returning a Future we may have a >> universal >> > > rule >> > > > on >> > > > > > how to handle exceptions. For example, we may specify that >> any >> > > async >> > > > > > method >> > > > > > can throw only invalid argument exceptions. All other errors >> are >> > > > > > reported >> > > > > > via the exceptionally(IgniteException -> {}) callback even if >> > the >> > > > > async >> > > > > > method was executed synchronously. >> > > > > > >> > > > > >> > > > > This is ok to me. >> > > > > >> > > > > >> > > > > > >> > > > > > >> > > > > > вт, 13 апр. 2021 г. в 12:08, Alexei Scherbakov < >> > > > > > alexey.scherbak...@gmail.com >> > > > > > >: >> > > > > > >> > > > > > > Igniters, >> > > > > > > >> > > > > > > I would like to start the discussion about error handling in >> > > Ignite 3 >> > > > > and >> > > > > > > how we can improve it compared to Ignite 2. >> > > > > > > >> > > > > > > The error handling in Ignite 2 was not very good because of >> > generic >> > > > > > > CacheException thrown on almost any occasion, having deeply >> > nested >> > > > root >> > > > > > > cause and often containing no useful information on further >> steps >> > > to >> > > > > fix >> > > > > > > the issue. >> > > > > > > >> > > > > > > I aim to fix it by introducing some rules on error handling. >> > > > > > > >> > > > > > > *Public exception structure.* >> > > > > > > >> > > > > > > A public exception must have an error code, a cause, and an >> > action. >> > > > > > > >> > > > > > > * The code - the combination of 2 byte scope id and 2 byte >> error >> > > > number >> > > > > > > within the module. This allows up to 2^16 errors for each >> scope, >> > > > which >> > > > > > > should be enough. The error code string representation can >> look >> > > like >> > > > > > > RFT-0001 or TBL-0001 >> > > > > > > * The cause - short string description of an issue, readable >> by >> > > user. >> > > > > > This >> > > > > > > can have dynamic parameters depending on the error type for >> > better >> > > > user >> > > > > > > experience, like "Can't write a snapshot, no space left on >> device >> > > > {0}" >> > > > > > > * The action - steps for a user to resolve error situation >> > > described >> > > > in >> > > > > > the >> > > > > > > documentation in the corresponding error section, for example >> > > "Clean >> > > > up >> > > > > > > disk space and retry the operation". >> > > > > > > >> > > > > > > Common errors should have their own scope, for example >> IGN-0001 >> > > > > > > >> > > > > > > All public methods throw only unchecked >> > > > > > > org.apache.ignite.lang.IgniteException containing >> aforementioned >> > > > > fields. >> > > > > > > Each public method must have a section in the javadoc with a >> list >> > > of >> > > > > all >> > > > > > > possible error codes for this method. >> > > > > > > >> > > > > > > A good example with similar structure can be found here [1] >> > > > > > > >> > > > > > > *Async timeouts.* >> > > > > > > >> > > > > > > Because almost all API methods in Ignite 3 are async, they all >> > will >> > > > > have >> > > > > > a >> > > > > > > configurable default timeout and can complete with timeout >> error >> > > if a >> > > > > > > computation is not finished in time, for example if a response >> > has >> > > > not >> > > > > > been >> > > > > > > yet received. >> > > > > > > I suggest to complete the async op future with >> TimeoutException >> > in >> > > > this >> > > > > > > case to make it on par with synchronous execution using >> > future.get, >> > > > > which >> > > > > > > will throw java.util.concurrent.TimeoutException on timeout. >> > > > > > > For reference, see >> > java.util.concurrent.CompletableFuture#orTimeout >> > > > > > > No special error code should be used for this scenario. >> > > > > > > >> > > > > > > *Internal exceptions hierarchy.* >> > > > > > > >> > > > > > > All internal exceptions should extend >> > > > > > > org.apache.ignite.internal.lang.IgniteInternalException for >> > checked >> > > > > > > exceptions and >> > > > > > > org.apache.ignite.internal.lang.IgniteInternalCheckedException >> > for >> > > > > > > unchecked exceptions. >> > > > > > > >> > > > > > > Thoughts ? >> > > > > > > >> > > > > > > [1] >> > > > https://docs.oracle.com/cd/B10501_01/server.920/a96525/preface.htm >> > > > > > > >> > > > > > > -- >> > > > > > > >> > > > > > > Best regards, >> > > > > > > Alexei Scherbakov >> > > > > > > >> > > > > > >> > > > > > >> > > > > > -- >> > > > > > Best regards, >> > > > > > Alexey >> > > > > > >> > > > > >> > > > > >> > > > > -- >> > > > > >> > > > > Best regards, >> > > > > Alexei Scherbakov >> > > > > >> > > > >> > > > >> > > > -- >> > > > Vladislav Pyatkov >> > > > >> > > >> > >> > >> > -- >> > >> > Best regards, >> > Alexei Scherbakov >> > >> >> >> -- >> Best regards, >> Andrey V. Mashenkov >> > > > -- > > Best regards, > Alexei Scherbakov > -- Best regards, Alexei Scherbakov