I strongly echo Josh’s sentiment. Imagine losing audit entries because C* is 
overloaded. Storing them in C* is fine only if you don’t care about losing audit entries.

Dinesh

> On Feb 28, 2019, at 6:41 AM, Joshua McKenzie <jmcken...@apache.org> wrote:
> 
> One of the things we've run into historically, on a *lot* of axes, is that
> "just put it in C*" for various functionality looks great from a user and
> usability perspective, and proves to be something of a nightmare from an
> admin / cluster behavior perspective.
> 
> i.e. - cluster suffering so you're writing hints? Write them to C* tables
> and watch the cluster suffer more! :)
> Same thing probably holds true for audit logging - at a time frame when
> things are getting hairy w/a cluster, if you're writing that audit logging
> into C* proper (and dealing with ser/deser, compaction pressure, flushing
> pressure, etc) from that, there's a compounding effect of pressure and pain
> on the cluster.
> 
> So the TL;DR we as a project kind of philosophically have been moving
> towards (I think that's valid to say?) is: use C* for the things it's
> absolutely great at, and try to side-channel other recovery operations as
> much as you can (see: file-based hints) to stay out of its way.
> 
> Same thing held true w/design of CDC - I debated "materialize in memory for
> consumer to take over socket", and "keep the data in another C* table", but
> the ramifications to perf and core I/O operations in C* the moment things
> start to go badly were significant enough that the route we went was "do no
> harm". For better or for worse, as there are obvious tradeoffs there.
> 
>> On Thu, Feb 28, 2019 at 7:46 AM Sagar <sagarmeansoc...@gmail.com> wrote:
>> 
>> Thanks all for the pointers.
>> 
>> @Joseph,
>> 
>> I have gone through the links shared by you. Also, I have been looking at
>> the code base.
>> 
>> I understand that pushing the logs to ES or Solr is a lot easier to do.
>> Having said that, the reason I thought something like this might help is
>> that it avoids adding more pieces while still providing a central,
>> queryable audit log within Cassandra itself.
>> 
>> In terms of usages, one of them could definitely be CDC related use cases.
>> With data being stored in tables and being queryable, it can become a lot
>> easier to expose this data to external systems like Kafka Connect or
>> Debezium, which have the ability to push data to Kafka for example. Note
>> that pushing data to Kafka is just an example, but what I mean is, if we
>> can have data in tables, then instead of everyone writing custom
>> loggers, they can hook into this table info and take action.
>> 
>> Regarding the infinite loop question, I have done some analysis, and in my
>> opinion, instead of tweaking the behaviour of Binlog and the way it
>> functions currently, we can actually spin up another tailer thread to the
>> same Chronicle Queue which can do the needful. This way the config options
>> etc all remain the same (apart from the logger, of course).
>> 
>> Let me know if any of it makes sense :D
>> 
>> Thanks!
>> Sagar.
>> 
>> 
>> On Thu, Feb 28, 2019 at 1:09 AM Dinesh Joshi <djos...@icloud.com.invalid>
>> wrote:
>> 
>>> 
>>> 
>>>> On Feb 27, 2019, at 10:41 AM, Joseph Lynch <joe.e.ly...@gmail.com> wrote:
>>>> 
>>>> Vinay can confirm, but as far as I am aware we have no current plans to
>>>> implement audit logging to a table directly, but the implementation is
>>>> fully pluggable (like compaction, compression, etc ...). Check out the
>>>> blog post [1] and documentation [2] Vinay wrote for more details, but the
>>>> short
>>> 
>>> +1. I am still curious as to why you'd want to store audit log entries
>>> back in Cassandra? Depending on the scale it can generate a lot of load and
>>> I think you'd end up in an infinite loop because as you're inserting the
>>> audit log entry you'll generate a new one and so on unless you black list
>>> audits to that table / keyspace.
>>> 
>>> Ideally you'd insert this data into ElasticSearch / Solr or some other
>>> place that can be then used for analytics or search.
>>> 
>>> Dinesh
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
>>> For additional commands, e-mail: dev-h...@cassandra.apache.org
>>> 
>>> 
>> 
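
The infinite-loop concern quoted above can be illustrated with a small sketch. This is not Cassandra code and does not use any Cassandra API; the class, the method names, and the keyspace name `audit_ks` are all hypothetical. It only shows why a table-backed audit logger would need to exclude its own keyspace: writing the audit row is itself an operation that the logger observes.

```python
from dataclasses import dataclass

# Hypothetical keyspace name chosen for this sketch; not a Cassandra default.
AUDIT_KEYSPACE = "audit_ks"


@dataclass(frozen=True)
class AuditEntry:
    keyspace: str
    table: str
    operation: str


class TableAuditLogger:
    """Toy logger that records each audit entry as a write to its own table."""

    def __init__(self, excluded_keyspaces=frozenset({AUDIT_KEYSPACE})):
        self.excluded_keyspaces = excluded_keyspaces
        self.rows = []  # stands in for the audit table

    def on_operation(self, keyspace, table, operation):
        # Skip operations against excluded keyspaces. Without this check,
        # every audit write would generate another audit entry, and so on.
        if keyspace in self.excluded_keyspaces:
            return
        self.rows.append(AuditEntry(keyspace, table, operation))
        # Storing the audit row is itself an operation the logger observes.
        self.on_operation(AUDIT_KEYSPACE, "audit_log", "INSERT")


if __name__ == "__main__":
    guarded = TableAuditLogger()
    guarded.on_operation("app", "users", "SELECT")
    print(len(guarded.rows))  # one entry: the audit write itself was skipped

    unguarded = TableAuditLogger(excluded_keyspaces=frozenset())
    try:
        unguarded.on_operation("app", "users", "SELECT")
    except RecursionError:
        print("audit writes kept auditing themselves")
```

In the sketch the runaway case ends in a `RecursionError`; in a real cluster the equivalent would presumably show up as ever-growing write load rather than a clean failure, which is the load concern raised in the thread.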


