I totally agree, Christopher. I have also run into a few situations where it would have been nice to have something like a mutation listener hook, particularly when generating indexing and stats records.
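
For anyone who wants to experiment with the constraint idea from the quoted thread in the meantime, here is a minimal sketch of the pattern. It does not compile against Accumulo: RowUpdate and UpdateConstraint are hypothetical stand-ins that only mirror the general shape of a table constraint (inspect an update, return a list of violation codes), and NotifyingConstraint is an illustrative name. A real version would implement org.apache.accumulo.core.constraints.Constraint instead.

```java
import java.util.Collections;
import java.util.List;
import java.util.concurrent.BlockingQueue;

/** Hypothetical stand-in for a mutation: only the row id is kept. */
final class RowUpdate {
    final String row;

    RowUpdate(String row) {
        this.row = row;
    }
}

/** Hypothetical stand-in for a constraint: an empty list means "accept". */
interface UpdateConstraint {
    List<Short> check(RowUpdate update);
}

/**
 * Never rejects an update. It only records the row id in a queue that a
 * separate notifier thread (not shown) could drain and forward elsewhere.
 */
final class NotifyingConstraint implements UpdateConstraint {
    private final BlockingQueue<String> pending;

    NotifyingConstraint(BlockingQueue<String> pending) {
        this.pending = pending;
    }

    @Override
    public List<Short> check(RowUpdate update) {
        // offer() does not block: if the queue is full the notification is
        // dropped, so a slow consumer cannot stall ingest.
        pending.offer(update.row);
        // No violation codes: the update is always accepted.
        return Collections.emptyList();
    }
}
```

As Keith points out below, constraints run before the data is written, so a consumer should treat each queued row id as a hint and re-read the table rather than assume the write succeeded.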
Adam

On Tue, Dec 8, 2015 at 5:59 PM, Christopher <[email protected]> wrote:

> In the future, it might be useful to provide a supported API hook here. It
> certainly would've made implementing replication easier, but could also be
> useful as a notification system.
>
> On Tue, Dec 8, 2015 at 4:51 PM Keith Turner <[email protected]> wrote:
>
>> Constraints are checked before data is written. In the case of failures,
>> a constraint may see data that's never successfully written.
>>
>> On Tue, Dec 8, 2015 at 4:18 PM, Christopher <[email protected]> wrote:
>>
>>> Look at org.apache.accumulo.core.constraints.Constraint for a
>>> description and
>>> org.apache.accumulo.core.constraints.DefaultKeySizeConstraint as an
>>> example.
>>>
>>> In short, Mutations which are live-ingested into a tablet server are
>>> validated against constraints you specify on the table. That means that
>>> all Mutations written to a table go through this bit of user-provided
>>> code at least once. You could use that fact to your advantage. However,
>>> this would be highly experimental and might have some caveats to
>>> consider.
>>>
>>> You can configure a constraint on a table with
>>> connector.tableOperations().addConstraint(...)
>>>
>>> On Sun, Dec 6, 2015 at 10:49 PM Thai Ngo <[email protected]> wrote:
>>>
>>>> Christopher,
>>>>
>>>> This is interesting! Could you please give me more details about this?
>>>>
>>>> Thanks,
>>>> Thai
>>>>
>>>> On Thu, Dec 3, 2015 at 12:17 PM, Christopher <[email protected]>
>>>> wrote:
>>>>
>>>>> You could also implement a constraint to notify an external system
>>>>> when a row is updated.
>>>>>
>>>>> On Wed, Dec 2, 2015, 22:54 Josh Elser <[email protected]> wrote:
>>>>>
>>>>>> oops :)
>>>>>>
>>>>>> [1] http://fluo.io/
>>>>>>
>>>>>> Josh Elser wrote:
>>>>>> > Hi Thai,
>>>>>> >
>>>>>> > There is no out-of-the-box feature provided with Accumulo that
>>>>>> > does what you're asking for. Accumulo doesn't provide any
>>>>>> > functionality to push notifications to other systems. You could
>>>>>> > potentially maintain other tables/columns in which you maintain
>>>>>> > the last time a row was updated, but the onus is on your "other
>>>>>> > services" to read the table to find out when a change occurred
>>>>>> > (which is probably not scalable at "real time").
>>>>>> >
>>>>>> > There are other systems you could likely leverage to solve this,
>>>>>> > depending on the durability and scalability that your application
>>>>>> > needs.
>>>>>> >
>>>>>> > For a system "close" to Accumulo, you could take a look at Fluo
>>>>>> > [1] which is an implementation of Google's "Percolator" system.
>>>>>> > This is a system based on throughput rather than low-latency, so
>>>>>> > it may not be a good fit for your needs. There are probably other
>>>>>> > systems in the Apache ecosystem (Kafka, Storm, Flink or Spark
>>>>>> > Streaming maybe?) that could be helpful to your problem. I'm not
>>>>>> > an expert on these to recommend one (nor do I think I understand
>>>>>> > your entire architecture well enough).
>>>>>> >
>>>>>> > Thai Ngo wrote:
>>>>>> >> Hi list,
>>>>>> >>
>>>>>> >> I have a use case where existing rows in a table will be updated
>>>>>> >> by an internal service. Data in a row of this table is composed
>>>>>> >> of 2 parts: the 1st part is immutable and the 2nd one will be
>>>>>> >> updated (filled in) a little later.
>>>>>> >>
>>>>>> >> Currently, I need to know when and which rows will be updated in
>>>>>> >> the table so that other services can wisely start consuming the
>>>>>> >> data. It will make more sense when I need to consume the data in
>>>>>> >> near realtime. So developing a notification function, or simpler,
>>>>>> >> a trigger, is what I really want to do now.
>>>>>> >>
>>>>>> >> I am curious to know if someone has done a similar job or if
>>>>>> >> there are features, APIs, or best practices available for
>>>>>> >> Accumulo so far. I'm thinking of letting the internal service
>>>>>> >> which updates the data notify us whenever it updates the data.
>>>>>> >>
>>>>>> >> What do you think?
>>>>>> >>
>>>>>> >> Thanks,
>>>>>> >> Thai
