Are you referring to broadcast streams?
> On Mon, May 2, 2016 at 7:58 PM, Bae, Jae Hyeon wrote:
>
> > Hi Samza devs and users
> >
> > Is it possible to implement broadcasted KeyValueStorage with 0.9? I know
> > 0.10 supports broadcast but we're not yet to 0.10.
Hi Samza devs and users
Is it possible to implement a broadcast KeyValueStore with 0.9? I know
0.10 supports broadcast streams, but we're not on 0.10 yet.
Thank you
Best, Jae
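For when the job does move to 0.10: the broadcast stream feature there is configured with a single property. A sketch, assuming a system named kafka and a topic named config-stream (both placeholders); the #0 suffix pins the partition that every task consumes:

```properties
# Every task in the job additionally consumes partition 0 of config-stream
task.broadcast.inputs=kafka.config-stream#0
```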
Hi Samza devs and users
I know this will be tricky in Samza because Samza's Kafka consumers are not
externally coordinated, but do you have any idea how to deploy Samza jobs
with zero downtime?
Thank you
Best, Jae
Hi Aram
May I know why you're moving from Spark Streaming to Samza?
On Tue, Nov 10, 2015 at 11:04 PM, Aram Mkrtchyan <
aram.mkrtch...@picsart.com.invalid> wrote:
> Hi Yi,
> Many thanks! :)
> It's wonderful! We really need the samza-hdfs module.
> Thanks to Samza being very simple, we have been a
be log-compacted. Then, the next question is: what's the size of the
> data in the checkpoint topic, and how many input topic partitions do you have
> in the job?
>
> It would also be helpful if you can share which version of samza you are
> using.
>
> Thanks!
>
> -Yi
Also, I am using Samza 0.9.1.
On Wed, Nov 4, 2015 at 12:18 AM, Bae, Jae Hyeon wrote:
> Hi Yi
>
> There are 8 partitions in the input topic. The size of checkpoint topic is
> 26 MB.
>
> segment.bytes=26214400, cleanup.policy=compact
>
> On Tue, Nov 3, 2015 at 11:12 PM, Yi Pan wrote:
A couple of questions to check whether your checkpoint topic is log-compact:
> 1. How many system stream partitions you have as the input? And how many
> tasks are there?
> 2. Did you set your checkpoint topic as log-compact topic in Kafka? The
> topic size would be much smaller if log compaction is turned on.
>
> Regards
>
> -Yi
>
> On Tue, Nov 3, 2015 at 3:59 PM, Bae, Jae Hyeon wrote:
>
> > Hi Samza Dev
> >
> > Do you know why the following job is taking too long?
Hi Samza Dev
Do you know why the following job is taking too long?
2015-11-03 23:58:17 KafkaCheckpointManager [INFO] Get latest offset 3386930
for topic __samza_checkpoint_ver_1_for_xxx_1 and partition 0.
This is seriously slowing down development. How can I fix this problem?
Thank you
Best, Jae
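If compaction turns out to be off, one way to enable it on the existing checkpoint topic is a topic-level config change (a sketch for the Kafka tooling of that era; the ZooKeeper address localhost:2181 is a placeholder for your cluster):

```
kafka-topics.sh --zookeeper localhost:2181 --alter \
  --topic __samza_checkpoint_ver_1_for_xxx_1 \
  --config cleanup.policy=compact
```

With compaction on, Kafka keeps only the latest checkpoint per key, so the topic stays small and startup reads far fewer messages.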
the store is restored from the changelog during initialization (again,
> assuming you have a changelog configured)
>
> > implement the StreamTask job to consume a Kafka topic and call add()
> method?
> Why wouldn't you want to use a changelog?
>
>
> On Fri, Oct 2, 2015 at 3:09 PM, Bae, Jae Hyeon wrote:
>
Thanks Yi Pan, I have one more question.
Does the KV-store consume automatically from a Kafka topic? Does it consume
only on restore()? If so, do I have to implement a StreamTask job to
consume a Kafka topic and call the add() method?
On Fri, Oct 2, 2015 at 2:01 PM, Yi Pan wrote:
> Hi, Jae Hyeon,
>
>
On Fri, Oct 2, 2015 at 1:43 PM, Bae, Jae Hyeon wrote:
> Hi Samza devs and users
>
> This is my first try with KeyValueStore and I am really excited!
>
> I glanced through the TaskStorageManager source code; it looks like it
> creates consumers for stores, and I am wondering how Kafka cleanup will be
> propagated to the KeyValueStore.
Hi Samza devs and users
This is my first try with KeyValueStore and I am really excited!
I glanced through the TaskStorageManager source code; it looks like it
creates consumers for the stores, and I am wondering how Kafka cleanup will be
propagated to the KeyValueStore.
My KeyValueStore usage is a little bit d
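For context, a changelog-backed store of the kind discussed in this thread is configured roughly like this (a sketch; the store name my-store, the serde names, and the changelog topic name are placeholders):

```properties
# Local RocksDB-backed KV store
stores.my-store.factory=org.apache.samza.storage.kv.RocksDbKeyValueStorageEngineFactory
stores.my-store.key.serde=string
stores.my-store.msg.serde=string
# Every write is also sent to this compacted Kafka topic; on restart,
# TaskStorageManager replays it to rebuild the local store.
stores.my-store.changelog=kafka.my-store-changelog
```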
> https://cwiki.apache.org/confluence/display/KAFKA/System+Tools#SystemTools-ConsumerOffsetChecker
> >
>
> Thanks,
>
> Fang, Yan
> yanfang...@gmail.com
>
> On Tue, Jul 14, 2015 at 10:33 AM, Bae, Jae Hyeon
> wrote:
>
> > Hi Yan
> >
> > Thanks for
In that case, you can use
>
> systems.system-name.samza.offset.default=upcoming
> systems.system-name.streams.stream-name.samza.reset.offset=true
>
> What is the reason that you do not want to use
> systems.system-name.streams.stream-name.samza.reset.offset
> ?
>
> Thanks,
>
> Fang, Yan
>
Hi
I want to reset the offset for the job to the latest one, which means I
want to skip the backlogged messages without using the
systems.system-name.streams.stream-name.samza.reset.offset option.
If I use the checkpoint tool and reset the offset to -1 or Long.MAX_VALUE, in
my theory, the kafka consumer will throw an exception
messages in the queue in one partition.
>
> Thanks,
>
> Fang, Yan
> yanfang...@gmail.com
>
> On Fri, Jul 10, 2015 at 12:36 PM, Bae, Jae Hyeon
> wrote:
>
> > Hi Samza devs and users
> >
> > I wrote customized Samza S3 consumer which downloads files from S3
Hi Samza devs and users
I wrote a customized Samza S3 consumer which downloads files from S3 and puts
messages in BlockingEnvelopeMap. It was straightforward because there's a
nice example, filereader. I tried a small optimization with the
newBlockingQueue() method, because I guess that a single queue shared
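The pattern BlockingEnvelopeMap implements can be modeled without any Samza dependency: one bounded queue per partition that a fetch thread fills and poll() drains. This is a hedged sketch of that pattern only, not the Samza API; the class and method names here are illustrative.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hedged model of the BlockingEnvelopeMap pattern: bounded per-partition
// queues give back-pressure per partition instead of one shared queue.
public class PartitionQueues {
    private final Map<Integer, BlockingQueue<String>> queues = new ConcurrentHashMap<>();

    public BlockingQueue<String> queueFor(int partition) {
        // Bound of 1000 is an arbitrary illustrative capacity.
        return queues.computeIfAbsent(partition, p -> new LinkedBlockingQueue<>(1000));
    }

    public void put(int partition, String msg) throws InterruptedException {
        queueFor(partition).put(msg); // blocks when this partition's queue is full
    }

    public List<String> poll(int partition, long timeoutMs) throws InterruptedException {
        List<String> out = new ArrayList<>();
        String first = queueFor(partition).poll(timeoutMs, TimeUnit.MILLISECONDS);
        if (first != null) {
            out.add(first);
            queueFor(partition).drainTo(out); // grab whatever else is ready
        }
        return out;
    }
}
```

Because each partition has its own bounded queue, one slow partition cannot starve or flood the others, which is the advantage over a single shared queue.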
> the hello-samza, it fails with the same error you have. I have opened a bug to
> track this here:
> https://issues.apache.org/jira/browse/SAMZA-609
>
> Thanks,
> Naveen
>
> On Mar 19, 2015, at 3:44 PM, Bae, Jae Hyeon metac...@gmail.com>> wrote:
>
> Double quote characte
\\"uiStartup.ended\\") or xpath(\\"category\\") = \\"uiIntent\\"\",\"job.name\":\"clevent-kafka\",
Anyway, I will use URL encoding/decoding to make the JSON parsing more robust.
On Thu, Mar 19, 2015 at 3:21 PM, Bae, Jae Hyeon wrote:
> The problem i
you'll have to check your AM container
> logs. Can you find those? They're usually linked to from the YARN RM UI.
>
> Cheers,
> Chris
>
> On Thu, Mar 19, 2015 at 2:32 PM, Bae, Jae Hyeon
> wrote:
>
> > Hi Samza Devs
> >
> > I want to pass the quoted string
Hi Samza Devs
I want to pass the quoted string like
filter.map.filter1.property=xpath("name") in ("uiBrowseStartup.ended",
"subscription.ended", "uiStartup.ended") or xpath("category") = "uiIntent"
through the configuration to the container, but the AM keeps failing:
Application application_142309072
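The URL encode/decode workaround mentioned elsewhere in this thread can be sketched in plain Java: encode the quoted filter expression before it goes into the config, decode it inside the task. The class and variable names here are illustrative, not Samza API.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hedged sketch: round-trip a config value full of quotes and parentheses
// through URL encoding so it survives shell/JSON quoting on the way to the AM.
public class ConfigEscape {
    public static String encode(String raw) throws Exception {
        return URLEncoder.encode(raw, "UTF-8");
    }

    public static String decode(String enc) throws Exception {
        return URLDecoder.decode(enc, "UTF-8");
    }

    public static void main(String[] args) throws Exception {
        String raw = "xpath(\"name\") in (\"uiStartup.ended\") or xpath(\"category\") = \"uiIntent\"";
        String enc = encode(raw);   // contains no quotes, safe to pass through the config
        System.out.println(decode(enc).equals(raw)); // round-trips losslessly
    }
}
```

The encoded string contains only unreserved characters plus `%xx` escapes, so no layer of shell, YAML, or JSON quoting can mangle it.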
ENV_CONTAINER_ID does not seem to exist in 0.8.0. I will try
ENV_CONTAINER_NAME.
On Thu, Mar 5, 2015 at 5:09 PM, Chris Riccomini
wrote:
> Hey Jae,
>
> Also, this should work:
>
> System.getenv(ShellCommandConfig.ENV_CONTAINER_ID).toInt
>
> This is actually how SamzaContainer gets its container ID.
Hi Samza devs
How can I get container id such as 0, 1, 2, ... up to yarn.container.count?
I tried reading some environment variables, but that didn't work well.
If it's not easy to get the container id, I have to use the process id to
distinguish containers from the same job running on the same machine b
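Chris's suggestion above can be wrapped defensively in plain Java. A hedged sketch: the environment variable name SAMZA_CONTAINER_ID is an assumption about what ShellCommandConfig.ENV_CONTAINER_ID resolves to in that era; check it against your Samza version.

```java
// Hedged sketch: derive the container id (0 .. yarn.container.count-1) from
// an environment variable, with a safe fallback when it is absent or garbled.
public class ContainerId {
    public static int containerId(String raw, int fallback) {
        if (raw == null || raw.trim().isEmpty()) {
            return fallback;
        }
        try {
            return Integer.parseInt(raw.trim());
        } catch (NumberFormatException e) {
            return fallback; // variable set but not a number
        }
    }

    public static void main(String[] args) {
        // Variable name is an assumption; 0.8.0 may not set it at all.
        int id = containerId(System.getenv("SAMZA_CONTAINER_ID"), 0);
        System.out.println("container id: " + id);
    }
}
```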
I am reading JobCoordinator, and now I understand why multiple
containers were not launched: I need to create multiple tasks, which are
then grouped based on containerCount.
On Fri, Feb 6, 2015 at 1:26 PM, Bae, Jae Hyeon wrote:
> Our current main purpose of samza is for data pipeline, so
Our main use of Samza right now is as a data pipeline, so we don't want to
create multiple tasks in a single SamzaContainer. As I read the Samza
implementation, it will create as many tasks as the number of partitions
assigned to the container, right?
The problem with that approach is, each task will
> longer. The longer the container sleeps, the more latency that's introduced
> when there *is* a message available. 10ms is what we use by default.
>
> Cheers,
> Chris
>
> On Fri, Feb 6, 2015 at 11:11 AM, Bae, Jae Hyeon
> wrote:
>
Could you explain why consumerMultiplexer.choose returns null?
Can it happen when there's no message in the kafka topic?
If my theory is correct, its frequency is too high: in my test
environment it's more than 50 times per second.
Thank you
Best, Jae
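The behavior Chris describes can be modeled with a plain BlockingQueue and no Samza dependency. A hedged sketch: an idle-input poll with a short timeout (10 ms here, the default mentioned in the reply) simply times out and yields null, so dozens of nulls per second on a quiet topic is expected, not a bug. Class and method names are illustrative.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hedged model of consumerMultiplexer.choose(): polling an empty queue with
// a timeout returns null, which the container treats as "no message yet".
public class ChooseModel {
    public static String choose(BlockingQueue<String> q, long timeoutMs)
            throws InterruptedException {
        return q.poll(timeoutMs, TimeUnit.MILLISECONDS); // null on timeout
    }

    public static void main(String[] args) throws Exception {
        BlockingQueue<String> input = new LinkedBlockingQueue<>();
        System.out.println(choose(input, 10) == null); // idle topic -> null
        input.put("msg");
        System.out.println(choose(input, 10));          // message available
    }
}
```

With a 10 ms timeout, an idle input yields up to ~100 nulls per second per poll loop, which is consistent with the "more than 50 per second" observation above.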
I will file a jira and submit a PR.
Let me know. If this is not acceptable, I should switch my SystemProducer
implementation to StreamTask again.
On Thu, Jan 29, 2015 at 9:29 AM, Bae, Jae Hyeon wrote:
> Hi Chris
>
>
>
> On Thu, Jan 29, 2015 at 9:10 AM, Chris Riccomini
> w
switched it to SystemProducer as Metamx Druid Tranquility did. But I
overlooked the duplicate data that can be caused by a flush & commit mismatch.
Do you have any idea?
>
> Could you elaborate on this?
>
> Cheers,
> Chris
>
> On Thu, Jan 29, 2015 at 12:27 AM, Bae, Jae
Never mind. I found a solution. Flush should be synced with commit.
On Thu, Jan 29, 2015 at 12:15 AM, Bae, Jae Hyeon wrote:
> Hi Samza Devs
>
> StreamTask can control SamzaContainer.commit() through task coordinator.
> Can we make SystemProducer control commit after flush? With
Hi Samza Devs
A StreamTask can control SamzaContainer.commit() through the task coordinator.
Can we make SystemProducer control commit after flush? With this feature,
we can prevent any duplicate data on SamzaContainer failure.
For example, if we set the commit interval to 2 minutes, before commit time
int
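The resolution stated earlier in the thread ("flush should be synced with commit") can be modeled in a few lines. A hedged sketch, not Samza API: commit() must drain the producer buffer before recording the checkpoint, otherwise a crash between the two either replays already-sent messages (duplicates) or drops buffered ones. All names here are illustrative.

```java
import java.util.ArrayList;
import java.util.List;

// Hedged model of "flush synced with commit": the ordering inside commit()
// is what prevents duplicates on container failure.
public class CommitModel {
    public final List<String> events = new ArrayList<>(); // observable ordering
    public final List<String> buffer = new ArrayList<>(); // pending producer sends

    public void send(String msg) {
        buffer.add(msg); // buffered, not yet durable
    }

    public void flush() {
        events.add("flush:" + buffer.size()); // make buffered sends durable
        buffer.clear();
    }

    public void commit() {
        flush();                  // 1. drain the producer first
        events.add("checkpoint"); // 2. only then record the input offsets
    }
}
```

If the order were reversed (checkpoint, then flush), a crash in between would advance the offsets past messages whose outputs were never flushed, losing data; the order shown loses nothing and at worst re-emits already-flushed output.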
got it.
On Fri, Jan 23, 2015 at 9:58 AM, TJ Giuli wrote:
> Echo
> —T
> > On Jan 23, 2015, at 9:49 AM, Chris Riccomini
> wrote:
> >
> > Hey all,
> >
> > Could you please confirm that you're seeing this? I'm trying to verify
> the
> > TLP migration for:
> >
> > https://issues.apache.org/jira/bro