to Benedict:
Well ... I was not around when the decision to use Chronicle Queue was made. I think that at the time it was the most obvious candidate given the features and capabilities it had, so taking something off the shelf instead of reinventing the wheel was a natural conclusion.
Josh / Jordan:
Not only FQL but Audit as well; these are two separate things. There is also quite a "rich" ecosystem around them.
1) nodetool commands like
enableauditlog
enablefullquerylog
disableauditlog
disablefullquerylog
getauditlog
getfullquerylog
Also, because the files it produces are binary, we need special tooling to inspect them. It lives in tools/fqltool with a bunch of classes, and there is also an AuditLogViewer for reviewing audit logs.
There are MBean methods backing the nodetool commands.
We have also shipped this in two major releases (4.0 and now 5.0), so the community is quite used to it and has processes set up around it.
I mention all this because it would not be easy to replace with something else, if somebody wanted that. How would we even go about deprecating this if we are indeed going to replace it?
On the release process they have in place: I think you are right that the latest ea is as close as possible to, if not the same as, what they release privately. But if we want to stick to the rule that we upgrade only to the latest ea release before their next minor, then
1) we will always be at least one minor behind
2) we do not know when they will decide to move to a new minor, which is what we would need to know in order to upgrade to the latest ea of the previous minor
3) if something is broken and we need a fix while we are on ea, then what we get to update to is the latest ea at that time. It might fix the issue, but it will also bring in new changes which might open the door to instability. So we update to fix bugs but might unknowingly pick up new ones.
Anyway, I don't think there is a silver bullet here; we might just stick to the latest "ea" and be done with it. I do not expect this project to evolve wildly and unpredictably. It solves just "one problem" and there is basically nothing new coming in.
Brandon:
I understand your concerns about phoning home but
1) we already resolved this by setting the respective property
2) I do not think that Chronicle will mess with this now that they have introduced it. There is nothing to "improve" or "change" there. It is either phoning home or not, and it is driven by one property. If they made a change so that we could not turn it off then we would really be in trouble, but for now we are not, and practically speaking I don't expect this to change.
I know that this might sound like wishful thinking, but in practical terms I really just don't expect this phoning home thing to ever come back.
Speaking of alternatives, I think the primary reason Chronicle was used is this (1).
"It's goal is good enough performance, predictable footprint, simplicity in terms of implementation and configuration and most importantly minimal impact on producers of log records."
While I understand English (I guess, well enough :D), I just don't understand what "good enough performance" is. How is this measured? What is a "predictable footprint"? Was that measured too? How did we quantify that?
" Performance safety is accomplished by feeding items to the binary log using a weighted queue and dropping records if the binary log falls sufficiently far behind."
This is interesting. If I understand correctly, the messages are weighted, and the heavier they are, the more probable it is that they will be dropped when the log is overloaded? Or vice versa, the lighter ones are dropped first?
Have we _ever_ experienced in production that some log events were really dropped? Has anybody ever hit that?
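To make the question concrete, here is a minimal sketch of how I read the "weighted queue" idea. It assumes that the weight of a record is its size in bytes, that the queue has a fixed byte budget, and that a new record is simply dropped when it does not fit. The class and method names are hypothetical and this is not the actual BinLog code, so treat it as an illustration of one possible policy only.

```java
import java.util.ArrayDeque;
import java.util.Queue;

/**
 * Hypothetical sketch of a byte-weighted, bounded queue that drops on overflow.
 * This is NOT Cassandra's BinLog implementation; the names and the drop policy
 * are assumptions made for illustration.
 */
final class WeightedDropQueue {
    private final Queue<byte[]> records = new ArrayDeque<>();
    private final long maxWeightBytes;   // total byte budget of the queue
    private long currentWeightBytes;     // sum of sizes of queued records
    private long dropped;                // number of records rejected so far

    WeightedDropQueue(long maxWeightBytes) {
        this.maxWeightBytes = maxWeightBytes;
    }

    /** Producer side: never blocks; returns false if the record was dropped. */
    synchronized boolean offer(byte[] record) {
        if (currentWeightBytes + record.length > maxWeightBytes) {
            dropped++;                   // consumer has fallen too far behind
            return false;
        }
        records.add(record);
        currentWeightBytes += record.length;
        return true;
    }

    /** Consumer side: returns the oldest record, or null if the queue is empty. */
    synchronized byte[] poll() {
        byte[] record = records.poll();
        if (record != null) {
            currentWeightBytes -= record.length;
        }
        return record;
    }

    synchronized long droppedCount() {
        return dropped;
    }

    synchronized long weightBytes() {
        return currentWeightBytes;
    }
}
```

Under this assumed policy nothing is picked out for dropping by weight as such: whatever arrives while the budget is exhausted gets dropped, and a heavier record is just less likely to fit into the remaining budget than a lighter one. Whether BinLog behaves like this is exactly what I am asking.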
When it comes to alternatives, what about logback + slf4j? It has appenders for wherever we want to write, it can be sync or async, and we could code an NIO appender too, I guess. It logs as text into a file so we do not need any special tooling to review it. For tailing, which Chronicle also offers, I guess "tail -f that.log" just does the job? logback also rolls the files after a configured period / size, the same way Chronicle does (it even compresses the logs).
Do we log so much that battle-tested logback is just not enough for us? Come on, this is not rocket science; we do not need a library from the realm of "high frequency trading" just to append queries and audit logs as they are executed. logback can handle the load we have just fine imo ...
Or maybe I am completely wrong and we just HAVE TO use Chronicle?
(1)
https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/utils/binlog/BinLog.java#L58-L69