Ivan, thanks for this effort, as for me - looks good.
>Hi everyone, > >I'd like to continue the discussion. I created IEP with the attempt to >summarize >all the information above, you can find it here [1]. What do you think? > >[1] >https://cwiki.apache.org/confluence/display/IGNITE/IEP-84%3A+Error+handling > >ср, 21 апр. 2021 г. в 20:46, Alexei Scherbakov < alexey.scherbak...@gmail.com >>: > >> Alexei, >> >> I think it's ok to do error conversion if it makes sense, but better to >> preserve the root cause whenever possible. >> Another way to solve the described scenario is to introduce something like >> checked IgniteRetryAgainException, which forces the user to retry or ignore >> it. >> It's difficult to foresee exactly at this point what is the best solution >> without knowing the exact scenario. >> >> Andrey, >> >> I've just realized your proposal to add IGN to each string code. This is ok >> to me, for example IGN-TBL-0001 >> >> ср, 21 апр. 2021 г. в 17:49, Alexey Goncharuk < alexey.goncha...@gmail.com >> >: >> >> > Aleksei, >> > >> > > The method should always report root cause, in your example it will be >> > > B-xxxx, no matter which module API is called >> > >> > I may be wrong, but I doubt this will be usable for an end-user. Let's >> > imagine that the same root exception was raised in different contexts >> > resulting in two outcomes. The first one is safe to retry (say, the root >> > cause led to a transaction prepare failure), but the second outcome may >> be >> > a state in which no matter how many retries will be attempted, the >> > operation will never succeed. Only the upper-level layer can tell the >> > difference and return a proper message to the user, so I would say that >> > some error conversion/wrapping will be required no matter what. >> > >> > --AG >> > >> > пт, 16 апр. 2021 г. в 16:31, Alexei Scherbakov < >> > alexey.scherbak...@gmail.com >> > >: >> > >> > > чт, 15 апр. 2021 г. в 18:21, Andrey Mashenkov < >> > andrey.mashen...@gmail.com >> > > >: >> > > >> > > > Hi Alexey, >> > > > I like the idea. >> > > > >> > > > 1. >> > > > >> > > > > TBL-0001 is a *string representation* of the error. It is built >> > from >> > > 2 >> > > > > byte scope id (mapped to name TBL) and 2 byte number (0001). Both >> > > > > internally packed in int. No any kind of parsing will be necessary >> to >> > > > read >> > > > > scope/category. >> > > > >> > > > I think Alexey mean if it will be possible to make smth like that >> > > > >> > > > catch (IgniteException e) { >> > > > if (e.getScope() == "TBL" && e.getCode() == 1234) >> > > > continue; // E.g. retry my TX >> > > > } >> > > > >> > > > It looks useful to me. >> > > > >> > > >> > > I have in mind something like this: >> > > >> > > public class IgniteException extends RuntimeException { >> > > private int errorCode; >> > > >> > > public IgniteException(ErrorScope scope, int code, String message, >> > > Throwable cause) { >> > > super(message, cause); >> > > this.errorCode = make(scope, code); >> > > } >> > > >> > > public boolean matches(ErrorScope scope, int code) { >> > > return errorCode == make(scope, code); >> > > } >> > > >> > > private int make(ErrorScope scope, int code) { >> > > return ((scope.ordinal() << 16) | code); >> > > } >> > > >> > > public ErrorScope scope() { >> > > return ErrorScope.values()[errorCode >> 16]; >> > > } >> > > >> > > public int code() { >> > > return 0xFFFF & errorCode; >> > > } >> > > >> > > public static void main(String[] args) { >> > > IgniteException e = new IgniteException(ErrorScope.RAFT, 1, >> > "test", >> > > null); >> > > >> > > System.out.println(e.matches(ErrorScope.RAFT, 2)); >> > > System.out.println(e.scope()); >> > > System.out.println(e.code()); >> > > >> > > try { >> > > throw e; >> > > } >> > > catch (IgniteException ee) { >> > > System.out.println(ee.matches(ErrorScope.RAFT, 1)); >> > > } >> > > } >> > > } >> > > >> > > >> > > > >> > > > 2. How you see a cross-module exception throwing? >> > > > Assume, user call -> module A, which recursively call -> module B, >> > which >> > > > fails. >> > > > So, module A component calls a module B component and got an >> Exception >> > > with >> > > > "B-1234" exception. >> > > > Module A do not expect any exception here and doesn't take care of >> > > "B-xxx" >> > > > error codes, but only "A-yyyy. >> > > > Should it rethrow exception with "A-unknown" (maybe "UNK-0001") code >> > > > or reuse "B-xxxx" code with the own message, pointing original >> > exception >> > > as >> > > > a cause for both cases? >> > > > >> > > > The first approach may looks confusing, while the second one produces >> > too >> > > > many "UNK-" codes. >> > > > What code should get the user in that case? >> > > > >> > > > >> > > > >> > > The method should always report root cause, in your example it will be >> > > B-xxxx, no matter which module API is called. >> > > >> > > >> > > > >> > > > >> > > > >> > > > On Thu, Apr 15, 2021 at 5:36 PM Alexei Scherbakov < >> > > > alexey.scherbak...@gmail.com > wrote: >> > > > >> > > > > чт, 15 апр. 2021 г. в 14:26, Ilya Kasnacheev < >> > > ilya.kasnach...@gmail.com >> > > > >: >> > > > > >> > > > > > Hello! >> > > > > > >> > > > > > > All public methods throw only unchecked >> > > > > > org.apache.ignite.lang.IgniteException containing aforementioned >> > > > fields. >> > > > > > > Each public method must have a section in the javadoc with a >> list >> > > of >> > > > > all >> > > > > > possible error codes for this method. >> > > > > > >> > > > > > I don't think this is feasible at all. >> > > > > > Imagine javadoc for cache.get() method featuring three pages of >> > > > possible >> > > > > > error codes thrown by this method. >> > > > > > >> > > > > >> > > > > Of course there is no need to write 3 pages of error codes, this >> > makes >> > > no >> > > > > sense. >> > > > > I think we can use error ranges here, or, probably, document most >> > > > important >> > > > > error scenarios. >> > > > > The point here is to try to document errors as much as possible. >> > > > > >> > > > > >> > > > > > Also, updated every two weeks to account for changes in >> > > > > > underlying mechanisms. >> > > > > > >> > > > > > There is still a confusion between "error code for any error in >> > logs" >> > > > and >> > > > > > "error code for any pair of method & exception". Which one are we >> > > > > > discussing really? >> > > > > > >> > > > > > This is impossible to track or test, but is susceptible for >> common >> > > > > > error-hiding antipattern where all exceptions are caught in >> > > cache.get() >> > > > > and >> > > > > > discarded, and instead a brand new CH-0001 "error in cache.get()" >> > is >> > > > > thrown >> > > > > > to the user. >> > > > > > >> > > > > > Which is certainly not something that anybody wants. >> > > > > > >> > > > > >> > > > > Certainly not. We are talking here about root cause - what is >> exactly >> > > the >> > > > > reason for method call failure. >> > > > > >> > > > > >> > > > > > >> > > > > > Regards, >> > > > > > -- >> > > > > > Ilya Kasnacheev >> > > > > > >> > > > > > >> > > > > > чт, 15 апр. 2021 г. в 13:03, Vladislav Pyatkov < >> > vldpyat...@gmail.com >> > > >: >> > > > > > >> > > > > > > Hi Alexei, >> > > > > > > >> > > > > > > > Each public method *must *have a section in the javadoc with >> a >> > > list >> > > > > of >> > > > > > > all possible error codes for this method. >> > > > > > > >> > > > > > > I consider it is redundant, because any public exception can be >> > > > thrown >> > > > > > from >> > > > > > > public API. >> > > > > > > If it not happens today, it may change tomorrow: one will be >> > > removed, >> > > > > > > another one will be added. >> > > > > > > >> > > > > > > >Nested exceptions are not forbidden to use. They can provide >> > > > > additional >> > > > > > > details on the error for debug purposes >> > > > > > > >> > > > > > > I see another issue, which is in the Ignite 2.x, but not attend >> > > here. >> > > > > We >> > > > > > > can have a deep nested exception and use it for handling. >> > > > > > > In the result, all time when we are handling an exception we >> use >> > > > > > > pattern like this: >> > > > > > > try{ >> > > > > > > ... >> > > > > > > } >> > > > > > > catch (Exception e) { >> > > > > > > if (X.hasCause(e, TimeoutException.class)) >> > > > > > > throw e; >> > > > > > > >> > > > > > > if (X.hasCause(e, ConnectException.class, >> > EOFException.class)) >> > > > > > > continue; >> > > > > > > >> > > > > > > if (X.hasCause(e, InterruptedException.class)) >> > > > > > > return false; >> > > > > > > } >> > > > > > > >> > > > > > > If we have a pant to make only one exception to the client >> side, >> > we >> > > > can >> > > > > > > also do it for an internal exception. >> > > > > > > >> > > > > > > On Wed, Apr 14, 2021 at 11:42 AM Alexei Scherbakov < >> > > > > > > alexey.scherbak...@gmail.com > wrote: >> > > > > > > >> > > > > > > > Alexey, >> > > > > > > > >> > > > > > > > ср, 14 апр. 2021 г. в 01:52, Alexey Kukushkin < >> > > > > > kukushkinale...@gmail.com >> > > > > > > >: >> > > > > > > > >> > > > > > > > > Just some points looking questionable to me, although I >> > realize >> > > > the >> > > > > > > error >> > > > > > > > > handling style may be very opinionated: >> > > > > > > > > >> > > > > > > > > - Would it make sense splitting the proposed composite >> > error >> > > > > code >> > > > > > > > > (TBL-0001) into separate numeric code (0001) and >> > > > scope/category >> > > > > > > > ("TBL") >> > > > > > > > > to >> > > > > > > > > avoid parsing the code when an error handler needs to >> > > analyze >> > > > > only >> > > > > > > the >> > > > > > > > > category or the code? >> > > > > > > > > >> > > > > > > > >> > > > > > > > TBL-0001 is a *string representation* of the error. It is >> > built >> > > > > from >> > > > > > 2 >> > > > > > > > byte scope id (mapped to name TBL) and 2 byte number (0001). >> > Both >> > > > > > > > internally packed in int. No any kind of parsing will be >> > > necessary >> > > > to >> > > > > > > read >> > > > > > > > scope/category. >> > > > > > > > >> > > > > > > > >> > > > > > > > > - "*The cause - short string description of an issue, >> > > readable >> > > > > by >> > > > > > > > > user.*". >> > > > > > > > > This terminology sounds confusing to me since that >> "cause" >> > > > > sounds >> > > > > > > like >> > > > > > > > > Java >> > > > > > > > > Throwable's Message to me and the "Cause" is a lower >> level >> > > > > > > exception. >> > > > > > > > > >> > > > > > > > >> > > > > > > > The string describes the cause of error, so the name. I'm ok >> to >> > > > > rename >> > > > > > it >> > > > > > > > to a message. It will be stored in IgniteException.message >> > field >> > > > > > anyway. >> > > > > > > > >> > > > > > > > >> > > > > > > > > - "*The action - steps for a user to resolve error...*". >> > The >> > > > > > action >> > > > > > > is >> > > > > > > > > very important but do we want to make it part of the >> > > > > > > IgniteException? >> > > > > > > > I >> > > > > > > > > do >> > > > > > > > > not think the recovery action text should be part of the >> > > > > > exception. >> > > > > > > > > IgniteException may include a URL pointing to the >> > > > corresponding >> > > > > > > > > documentation - this is discussable. >> > > > > > > > > >> > > > > > > > >> > > > > > > > This will not be the part of the exception. A user should >> visit >> > > the >> > > > > > > > documentation page and read the action section by >> corresponding >> > > > error >> > > > > > > code. >> > > > > > > > >> > > > > > > > >> > > > > > > > > - "*All public methods throw only unchecked >> > > IgniteException*" >> > > > - >> > > > > I >> > > > > > > > assume >> > > > > > > > > we still respect JCache specification and prefer using >> > > > standard >> > > > > > Java >> > > > > > > > > exception to signal about invalid parameters. >> > > > > > > > > >> > > > > > > > >> > > > > > > > Using standard java exceptions whenever possible makes sense. >> > > > > > > > >> > > > > > > > >> > > > > > > > > - Why we do not follow the "classic" structured >> exception >> > > > > handling >> > > > > > > > > practices in Ignite: >> > > > > > > > > >> > > > > > > > >> > > > > > > > Ignite 3 will be multi language, and other languages use >> other >> > > > error >> > > > > > > > processing models. SQL for example uses error codes. >> > > > > > > > The single exception approach simplifies and unifies error >> > > handling >> > > > > > > across >> > > > > > > > platforms for me. >> > > > > > > > >> > > > > > > > >> > > > > > > > > - Why do we not allow using checked exceptions? It >> > seems >> > > to >> > > > > me >> > > > > > > > > sometimes forcing the user to handle an error serves >> > as a >> > > > > hint >> > > > > > > and >> > > > > > > > > thus >> > > > > > > > > improves usability. For example, handling an >> > > > > > > optimistic/pessimistic >> > > > > > > > > transaction conflict/deadlock. Or handling a timeout >> > for >> > > > > > > operations >> > > > > > > > > with >> > > > > > > > > timeouts. >> > > > > > > > > >> > > > > > > > >> > > > > > > > A valid point. Checked exceptions must be used for whose >> > methods, >> > > > > where >> > > > > > > > error handling is enforced, for example tx optimistic >> failure. >> > > > > > > > Such errors will also have corresponding error codes. >> > > > > > > > >> > > > > > > > >> > > > > > > > > - Why single public IgniteException and no exception >> > > > > hierarchy? >> > > > > > > > Java >> > > > > > > > > is optimized for structured exception handling >> instead >> > of >> > > > > using >> > > > > > > > > IF-ELSE to >> > > > > > > > > analyze the codes. >> > > > > > > > > >> > > > > > > > >> > > > > > > > Exception hierarchy is not required when using error codes >> and >> > > > > > applicable >> > > > > > > > only to java API, so I would avoid spending efforts on it. >> > > > > > > > >> > > > > > > > >> > > > > > > > > - Why no nested exceptions? Sometimes an error >> handler >> > is >> > > > > > > > interested >> > > > > > > > > only in high level exceptions (like Invalid >> > > Configuration) >> > > > > and >> > > > > > > > > sometimes >> > > > > > > > > details are needed (like specific configuration >> parser >> > > > > > > exceptions). >> > > > > > > > > >> > > > > > > > >> > > > > > > > Nested exceptions are not forbidden to use. They can provide >> > > > > additional >> > > > > > > > details on the error for debug purposes, but not strictly >> > > required, >> > > > > > > because >> > > > > > > > error code + message should provide enough information to the >> > > user. >> > > > > > > > >> > > > > > > > >> > > > > > > > > - For async methods returning a Future we may have a >> > > universal >> > > > > > rule >> > > > > > > on >> > > > > > > > > how to handle exceptions. For example, we may specify >> that >> > > any >> > > > > > async >> > > > > > > > > method >> > > > > > > > > can throw only invalid argument exceptions. All other >> > errors >> > > > are >> > > > > > > > > reported >> > > > > > > > > via the exceptionally(IgniteException -> {}) callback >> even >> > > if >> > > > > the >> > > > > > > > async >> > > > > > > > > method was executed synchronously. >> > > > > > > > > >> > > > > > > > >> > > > > > > > This is ok to me. >> > > > > > > > >> > > > > > > > >> > > > > > > > > >> > > > > > > > > >> > > > > > > > > вт, 13 апр. 2021 г. в 12:08, Alexei Scherbakov < >> > > > > > > > > alexey.scherbak...@gmail.com >> > > > > > > > > >: >> > > > > > > > > >> > > > > > > > > > Igniters, >> > > > > > > > > > >> > > > > > > > > > I would like to start the discussion about error handling >> > in >> > > > > > Ignite 3 >> > > > > > > > and >> > > > > > > > > > how we can improve it compared to Ignite 2. >> > > > > > > > > > >> > > > > > > > > > The error handling in Ignite 2 was not very good because >> of >> > > > > generic >> > > > > > > > > > CacheException thrown on almost any occasion, having >> deeply >> > > > > nested >> > > > > > > root >> > > > > > > > > > cause and often containing no useful information on >> further >> > > > steps >> > > > > > to >> > > > > > > > fix >> > > > > > > > > > the issue. >> > > > > > > > > > >> > > > > > > > > > I aim to fix it by introducing some rules on error >> > handling. >> > > > > > > > > > >> > > > > > > > > > *Public exception structure.* >> > > > > > > > > > >> > > > > > > > > > A public exception must have an error code, a cause, and >> an >> > > > > action. >> > > > > > > > > > >> > > > > > > > > > * The code - the combination of 2 byte scope id and 2 >> byte >> > > > error >> > > > > > > number >> > > > > > > > > > within the module. This allows up to 2^16 errors for each >> > > > scope, >> > > > > > > which >> > > > > > > > > > should be enough. The error code string representation >> can >> > > look >> > > > > > like >> > > > > > > > > > RFT-0001 or TBL-0001 >> > > > > > > > > > * The cause - short string description of an issue, >> > readable >> > > by >> > > > > > user. >> > > > > > > > > This >> > > > > > > > > > can have dynamic parameters depending on the error type >> for >> > > > > better >> > > > > > > user >> > > > > > > > > > experience, like "Can't write a snapshot, no space left >> on >> > > > device >> > > > > > > {0}" >> > > > > > > > > > * The action - steps for a user to resolve error >> situation >> > > > > > described >> > > > > > > in >> > > > > > > > > the >> > > > > > > > > > documentation in the corresponding error section, for >> > example >> > > > > > "Clean >> > > > > > > up >> > > > > > > > > > disk space and retry the operation". >> > > > > > > > > > >> > > > > > > > > > Common errors should have their own scope, for example >> > > IGN-0001 >> > > > > > > > > > >> > > > > > > > > > All public methods throw only unchecked >> > > > > > > > > > org.apache.ignite.lang.IgniteException containing >> > > > aforementioned >> > > > > > > > fields. >> > > > > > > > > > Each public method must have a section in the javadoc >> with >> > a >> > > > list >> > > > > > of >> > > > > > > > all >> > > > > > > > > > possible error codes for this method. >> > > > > > > > > > >> > > > > > > > > > A good example with similar structure can be found here >> [1] >> > > > > > > > > > >> > > > > > > > > > *Async timeouts.* >> > > > > > > > > > >> > > > > > > > > > Because almost all API methods in Ignite 3 are async, >> they >> > > all >> > > > > will >> > > > > > > > have >> > > > > > > > > a >> > > > > > > > > > configurable default timeout and can complete with >> timeout >> > > > error >> > > > > > if a >> > > > > > > > > > computation is not finished in time, for example if a >> > > response >> > > > > has >> > > > > > > not >> > > > > > > > > been >> > > > > > > > > > yet received. >> > > > > > > > > > I suggest to complete the async op future with >> > > TimeoutException >> > > > > in >> > > > > > > this >> > > > > > > > > > case to make it on par with synchronous execution using >> > > > > future.get, >> > > > > > > > which >> > > > > > > > > > will throw java.util.concurrent.TimeoutException on >> > timeout. >> > > > > > > > > > For reference, see >> > > > > java.util.concurrent.CompletableFuture#orTimeout >> > > > > > > > > > No special error code should be used for this scenario. >> > > > > > > > > > >> > > > > > > > > > *Internal exceptions hierarchy.* >> > > > > > > > > > >> > > > > > > > > > All internal exceptions should extend >> > > > > > > > > > org.apache.ignite.internal.lang.IgniteInternalException >> for >> > > > > checked >> > > > > > > > > > exceptions and >> > > > > > > > > > >> > > org.apache.ignite.internal.lang.IgniteInternalCheckedException >> > > > > for >> > > > > > > > > > unchecked exceptions. >> > > > > > > > > > >> > > > > > > > > > Thoughts ? >> > > > > > > > > > >> > > > > > > > > > [1] >> > > > > > > >> > https://docs.oracle.com/cd/B10501_01/server.920/a96525/preface.htm >> > > > > > > > > > >> > > > > > > > > > -- >> > > > > > > > > > >> > > > > > > > > > Best regards, >> > > > > > > > > > Alexei Scherbakov >> > > > > > > > > > >> > > > > > > > > >> > > > > > > > > >> > > > > > > > > -- >> > > > > > > > > Best regards, >> > > > > > > > > Alexey >> > > > > > > > > >> > > > > > > > >> > > > > > > > >> > > > > > > > -- >> > > > > > > > >> > > > > > > > Best regards, >> > > > > > > > Alexei Scherbakov >> > > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > -- >> > > > > > > Vladislav Pyatkov >> > > > > > > >> > > > > > >> > > > > >> > > > > >> > > > > -- >> > > > > >> > > > > Best regards, >> > > > > Alexei Scherbakov >> > > > > >> > > > >> > > > >> > > > -- >> > > > Best regards, >> > > > Andrey V. Mashenkov >> > > > >> > > >> > > >> > > -- >> > > >> > > Best regards, >> > > Alexei Scherbakov >> > > >> > >> >> >> -- >> >> Best regards, >> Alexei Scherbakov >> > > >-- >Sincerely yours, >Ivan Bessonov