Hi Yan,

We are looking for a solution for a similar problem.

Currently, we have 2 input streams, one transactions stream (5 partitions)
and one rules stream (one partition). I have a custom
SystemStreamPartitionGrouper that assigns the SSPs like so:.

taskName0: Trx stream partition 0, Rules stream partition 0
taskName1: Trx stream partition 1, Rules stream partition 0
taskName2: Trx stream partition 2, Rules stream partition 0
taskName3: Trx stream partition 3, Rules stream partition 0
taskName4: Trx stream partition 4, Rules stream partition 0

However, it looks like messages from the Rules stream are going to only one
task instance, and not to all of them like I hoped. I have them running in
the same container. Is it because there is only one offset value for the
container for the Rules stream?

Can you please expand on what you mean by "get the rule when starting
up StreamTasks
and then localize it."? Do you mean, loading messages into a changelog
stream using a bootstrap job?

Thanks in advance for your help!

Susan




On Tue, May 5, 2015 at 6:02 PM, Yan Fang <yanfang...@gmail.com> wrote:

> If the rule does not change, we can get the rule when starting up
> StreamTasks and then localize it.
>
> Cheers,
>
> Fang, Yan
> yanfang...@gmail.com
>
> On Tue, May 5, 2015 at 2:41 PM, Yan Fang <yanfang...@gmail.com> wrote:
>
> > "If I understand it correctly the only viable solution at the moment is
> to
> > create a new stream for the  rules messages with as many partitions as
> the
> > data stream and write each rules update message to all partitions of the
> > new rules stream."
> >
> > If the data is constantly changing, yes, AFAIK, this is the only viable
> > solution before we provide "shared store".
> >
> > Cheers,
> >
> > Fang, Yan
> > yanfang...@gmail.com
> >
> > On Tue, May 5, 2015 at 12:34 PM, Ueli Gallizzi <ueli.galli...@gmail.com>
> > wrote:
> >
> >> Hi Yan,
> >>
> >> Thanks for your quick response.
> >>
> >> After I read the discussion on SAMZA-353 I think the best solution for
> my
> >> use case is a "shared state" store among StreamTasks described in
> >> SAMZA-402. To give you some background I have a stream with rules which
> >> are
> >> constantly changing and a data stream on which I apply the rules. The
> >> rules
> >> set is very small if you compare it with the data stream.
> >>
> >> If I understand it correctly the only viable solution at the moment is
> to
> >> create a new stream for the  rules messages with as many partitions as
> the
> >> data stream and write each rules update message to all partitions of the
> >> new rules stream.
> >>
> >> Cheers,
> >> - ueli
> >>
> >> On Tue, May 5, 2015 at 12:06 PM, Yan Fang <yanfang...@gmail.com> wrote:
> >>
> >> > Hi Ueli,
> >> >
> >> > This feature currently is not supported by Samza. There was some
> >> > discussions in the JIRA - SAMZA-353
> >> > <https://issues.apache.org/jira/browse/SAMZA-353>.
> >> >
> >> > But there are some workaround for this, depends on what you want to
> >> > achieve. If you can specify what your requirement is, we can help
> think
> >> of
> >> > the solution.
> >> >
> >> > In another thread
> >> > <
> >> >
> >>
> http://mail-archives.apache.org/mod_mbox/samza-dev/201504.mbox/%3c552410e2.7000...@tivo.com%3E
> >> > >,
> >> > Tommy Becker has similar requirement and he maybe helpful as well.
> >> >
> >> > Cheers,
> >> > Fang, Yan
> >> > yanfang...@gmail.com
> >> >
> >> > On Tue, May 5, 2015 at 7:42 AM, Ueli Gallizzi <
> ueli.galli...@gmail.com>
> >> > wrote:
> >> >
> >> > > Hi,
> >> > >
> >> > > Is it possible that multiple tasks read from the same input stream
> >> > > partition?
> >> > >
> >> > > example:
> >> > > task 0 stream A partition 0, stream B partition 0
> >> > > task 1 stream A partition 1, stream B partition 0
> >> > > task 2 stream A partition 3, stream B partition 0
> >> > >
> >> > > In this example all messages in stream B partition 0 would be
> >> processed
> >> > by
> >> > > all 3 tasks.
> >> > >
> >> > > Cheers,
> >> > > - ueli
> >> > >
> >> >
> >> >
> >>
> >
> >
>

Reply via email to