from:"Malcolm McFarland"

Running w/ multiple CPUs/container on YARN

2019-04-01 Thread Malcolm McFarland

ssage about how the container is requesting more resources than it can allocate. With 1 core, everything is fine. Is there another Samza option I need to set? Cheers, Malcolm -- Malcolm McFarland Cavulus

Re: Running w/ multiple CPUs/container on YARN

2019-04-01 Thread Malcolm McFarland

d be sufficient. We haven't seen this > issue before. What Samza/YARN versions are you using? Can you also include > the logs from where you get the error and your yarn configuration? > > - Prateek > > On Mon, Apr 1, 2019 at 2:33 AM Malcolm McFarland > wrote: > > >

Re: Running w/ multiple CPUs/container on YARN

2019-04-01 Thread Malcolm McFarland

this? Looking at the Samza source on Github, it appears to be information that's passed back to the AM when it starts up. Cheers, Malcolm On Mon, Apr 1, 2019 at 10:44 AM Malcolm McFarland wrote: > > Hi Prateek, > > Sorry, meant to include these versions with my email; I'm

Re: Running w/ multiple CPUs/container on YARN

2019-04-01 Thread Malcolm McFarland

etected this and decided to default to 1? Can you > try setting maximum-allocation-vcores lower? > > - Prateek > > On Mon, Apr 1, 2019 at 11:59 AM Malcolm McFarland > wrote: > > > One other detail: I'm running YARN on ECS in AWS. Has anybody seen > > issues with

Re: Running w/ multiple CPUs/container on YARN

2019-04-01 Thread Malcolm McFarland

yScheduler.html> > and DominantResourceCalculator to account for vcore allocations in > scheduling. > > - Prateek > > On Mon, Apr 1, 2019 at 3:00 PM Malcolm McFarland > wrote: > > > Hi Prateek, > > > > This still seems to be manifesting with the same problem. Sinc

Re: Running w/ multiple CPUs/container on YARN

2019-04-01 Thread Malcolm McFarland

One more thing -- fwiw, I actually also came across the possibility that I would need to use the DominantResourceCalculator, but as you point out, this doesn't seem to be available in Hadoop 2.6. On Mon, Apr 1, 2019 at 5:27 PM Malcolm McFarland wrote: > That's quite helpfu

Re: Running w/ multiple CPUs/container on YARN

2019-04-02 Thread Malcolm McFarland

=org.apache.hadoop.yarn.util.resource.DominantResourceCalculator And on the Samza side, I'm setting: cluster-manager.container.cpu.cores=2 However, YARN is still telling me that the running task has 1 vcore assigned. Do you have any other suggestions for options to tweak? Cheers, Malcolm On Mon, Apr 1, 2019 at 5:28 PM Malcolm McFarland wrote:

Re: Running w/ multiple CPUs/container on YARN

2019-04-02 Thread Malcolm McFarland

esponse. The YARN RM supports container requests with max-mem: > > 14336, max-cpu: 1" > > > > On Tue, Apr 2, 2019 at 12:09 AM Malcolm McFarland > > wrote: > > > >> Hey Prateek, > >> > >> The upgrade to Hadoop 2.7.6 went fine; everything seems

Re: Running w/ multiple CPUs/container on YARN

2019-04-02 Thread Malcolm McFarland

tion shown on-demand, as opposed to preemptive? Cheers, Malcolm Cheers, Malcolm On Tue, Apr 2, 2019 at 12:54 PM Malcolm McFarland wrote: > Hi Prateek, > > I'm not getting an error now, just an unyielding vcore allotment of 1. > I just verified that we

Re: Running w/ multiple CPUs/container on YARN

2019-04-02 Thread Malcolm McFarland

from the v2.6.1 docs (which I was initially using because of its inclusion in the hello-samza project) to mean that this was a per-container setting. Thanks again for the help, and for the tip on upgrading to Yarn 2.7.6! Cheers, Malcolm On Tue, Apr 2, 2019 at 1:47 PM Malcolm McFarland wrote

Re: Samza 1.1.0 on AWS EMR (emr - 5.13.0, amazon 2.8.3, zookeeper 3.4.10)

2019-04-18 Thread Malcolm McFarland

situation here: > > > > > https://stackoverflow.com/questions/55737123/samza-1-1-0-run-app-sh-does-not-work-during-deployment-of-hello-samza > > > > Can someone on your team please help? > > > > Many thanks, > > > > Majd > > > > >

Samza tasks aren't starting in YARN containers

2019-05-07 Thread Malcolm McFarland

container to try and deduce what's happening? Cheers, Malcolm -- Malcolm McFarland Cavulus This correspondence is from HealthPlanCRM, LLC, d/b/a Cavulus. Any unauthorized or improper disclosure, copying, distribution, or use of the contents of this message is prohibited. The information con

Re: Samza tasks aren't starting in YARN containers

2019-05-07 Thread Malcolm McFarland

e, May 7, 2019 at 3:22 PM Malcolm McFarland wrote: > Hey folks, > > We're having some trouble running Samza under YARN. The YARN > containers are launching fully into the RUNNING state, and I can see > in the node manager logs that the containers are running, but my logs > are sh

Re: Samza tasks aren't starting in YARN containers

2019-05-10 Thread Malcolm McFarland

n-version? > > Can you roll-back to a known-good version to better isolate the issue? > > Best, > Jagadish > > On Tue, May 7, 2019 at 3:54 PM Malcolm McFarland > wrote: > > > As a followup to this, here's what I see when the Samza app tries to > start; &

Re: Samza tasks aren't starting in YARN containers

2019-05-20 Thread Malcolm McFarland

https://confluence.atlassian.com/doc/generating-a-thread-dump-externally-182158040.html#GeneratingaThreadDumpExternally-GeneratingthreaddumpsonLinux > > java.net.ConnectException Since this is a ConnectException, can you rule out network

Tracing the Samza+YARN startup process

2019-05-21 Thread Malcolm McFarland

process on a YARN cluster, from Accepted status, to localization, to the application master startup, to the actual application's startup? Cheers, Malcolm -- Malcolm McFarland Cavulus

Re: Samza tasks aren't starting in YARN containers

2019-05-22 Thread Malcolm McFarland

astructure is configured. I think I need to find which logs to look at in order to trace the startup/localization steps. If anybody has advice on this, I'd appreciate it! Cheers, Malcolm On Mon, May 20, 2019 at 9:49 PM Malcolm McFarland wrote: > > Hey Jagadish, > > Thanks for the ti

Re: Tracing the Samza+YARN startup process

2019-05-22 Thread Malcolm McFarland

ething you were looking for? > > Also, by "don't fully start up" do you mean that > applications are missing some containers (but the ApplicationMaster is > running)? > Or the application is missing entirely. > > -- > thanks > rayman

AM resource needs

2019-05-23 Thread Malcolm McFarland

Hey folks, Are there any guidelines for how to provision an Application Master in relation to the number of StreamTask instances it will be managing? Ie, are there different memory, CPU, and thread-count figures for 100S StreamTasks vs 1000, vs 1? Cheers, Malcolm McFarland Cavulus

Re: AM resource needs

2019-05-24 Thread Malcolm McFarland

nd the container (ie, yarn.am.container.memory.mb=1536, yarn.am.opts=-Xmx1024m); does that sound reasonable? Cheers, Malcolm McFarland Cavulus
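One way to sanity-check a pairing like the one quoted above is to compare the JVM heap ceiling with the container limit. This is a minimal sketch and not part of the original thread. The function name `heap_headroom_mb` is hypothetical, and it only looks at `-Xmx`; metaspace, thread stacks, and other off-heap usage also count against the YARN container and are ignored here.

```python
import re


def heap_headroom_mb(container_mb: int, jvm_opts: str) -> int:
    """Return container memory minus the -Xmx heap ceiling, in MB.

    Hypothetical helper for illustration only; not a Samza or YARN API.
    """
    match = re.search(r"-Xmx(\d+)([kKmMgG]?)", jvm_opts)
    if match is None:
        raise ValueError("no -Xmx flag found in JVM options")
    value, unit = int(match.group(1)), match.group(2).lower()
    # A bare -Xmx number is in bytes; k/m/g suffixes scale it up.
    to_mb = {"": 1 / (1024 * 1024), "k": 1 / 1024, "m": 1, "g": 1024}
    heap_mb = int(value * to_mb[unit])
    return container_mb - heap_mb


# Values taken from the message: yarn.am.container.memory.mb=1536, yarn.am.opts=-Xmx1024m
print(heap_headroom_mb(1536, "-Xmx1024m"))
```

With the values from the message this leaves 512 MB between the heap ceiling and the container limit. Whether that is enough depends on the job and is not something the snippet can decide.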

Re: Tracing the Samza+YARN startup process

2019-05-30 Thread Malcolm McFarland

at's interfering with this, or maybe just isn't white-listing a port correctly, and if I could identify where the application is stalling, it'd probably help to narrow down the possibilities. Cheers, Malcolm McFarland Cavulus

Re: Tracing the Samza+YARN startup process

2019-05-31 Thread Malcolm McFarland

other than what's returned from /bin/hostname (I'm guessing it uses gethostname() on Ubuntu, could be wrong). Anybody have ideas? Cheers, Malcolm McFarland Cavulus

Limiting the job coordinator port range

2019-06-04 Thread Malcolm McFarland

Hey folks, Is there any way to specify which ports the tasks communicate with the job coordinator on? Right now it looks like it's 3+. Is there any way to narrow this? Cheers, Malcolm McFarland Cavulus

Re: Tracing the Samza+YARN startup process

2019-06-18 Thread Malcolm McFarland

te only allows 10GB of storage (this can be extended a small amount via an ephemeral mounted volume but seemingly not enough to satisfy YARN's VM requirements). Hth, and thanks for everybody's patience, Malcolm McFarland Cavulus

Re: Tracing the Samza+YARN startup process

2019-06-20 Thread Malcolm McFarland

No problem -- I'm happy that we finally figured this out and could share our results. ECS could actually be a good choice for Node Managers; it's easy in ECS to scale node counts up and down and to cycle out unhealthy servers. Malcolm McFarland Cavulus

Where does task.class actually matter?

2019-07-17 Thread Malcolm McFarland

re if the bundle is built locally or not? Cheers, Malcolm McFarland Cavulus

Re: Where does task.class actually matter?

2019-07-18 Thread Malcolm McFarland

* the task from affect the box that *runs* the task? Cheers, Malcolm McFarland Cavulus

Using Kafka's ProducerInterceptor with Samza

2019-09-10 Thread Malcolm McFarland

as anybody attempted this sort of combination? Cheers, Malcolm McFarland Cavulus [0] https://kafka.apache.org/0110/javadoc/org/apache/kafka/clients/producer/ProducerInterceptor.html [1] https://cwiki.apache.org/confluence/display/KAFKA/KIP-42%3A+Add+Producer+and+Consumer+Interceptors

Questions about using custom groupers

2019-09-25 Thread Malcolm McFarland

* topics to be removed. I have two questions: 1) Where within these queues is the grouper configuration stored? 2) Would a Kafka topic cleanup.policy of "compact" cause trouble here? Cheers, Malcolm McFarland Cavulus

Re: Using Kafka's ProducerInterceptor with Samza

2019-10-07 Thread Malcolm McFarland

when Samza picks up the message for processing. Are there any ideas out there about how to do this? Cheers, Malcolm McFarland Cavulus

Re: Using Kafka's ProducerInterceptor with Samza

2019-10-07 Thread Malcolm McFarland

io silence on a YARN deploy (we've had no other code-related issues transitioning to YARN). I'll take a look through the metrics and see if any of those could fill this role. Right now we're looking at per-partition consumption, and maybe "process-calls" will help with th

Occasional checkpoint mismatch on Samza task reload

2019-10-31 Thread Malcolm McFarland

mechanism. Are there any best practices or gotchas surrounding restarting Samza applications on YARN that could help here? Cheers, Malcolm McFarland Cavulus

Samza compaction policy

2019-11-05 Thread Malcolm McFarland

Hey folks, We have cleanup.policy=compact set on our checkpoint topics. Even with this, we have almost 3 billion messages in some of these topics, and this is causing huge startup times. Are there any other settings we should set to optimize our startup times? Cheers, Malcolm McFarland Cavulus

Re: Occasional checkpoint mismatch on Samza task reload

2019-11-07 Thread Malcolm McFarland

s and yarn.nodemanager.process-kill-wait.ms YARN values. Would this give Samza more time to shutdown, perhaps allowing unpersisted checkpoints to be written out? Cheers, Malcolm McFarland Cavulus

Re: Samza compaction policy

2019-11-07 Thread Malcolm McFarland

=compact? Cheers, Malcolm McFarland Cavulus

Re: Samza compaction policy

2019-11-07 Thread Malcolm McFarland

Actually, do you have an example of some appropriate settings for Kafka to ensure that compaction is behaving correctly for the Samza checkpoint topics? Cheers, Malcolm McFarland Cavulus

Re: Occasional checkpoint mismatch on Samza task reload

2019-11-07 Thread Malcolm McFarland

Also, is there a way to produce this error, ie if we added extra messages to the __checkpoint topics? Malcolm McFarland Cavulus

Resource allocation on YARN

2020-02-23 Thread Malcolm McFarland

so, do you recommend scaling up in box YARN node processing capability, or out in YARN node count? Thanks, Malcolm McFarland Cavulus

Re: Resource allocation on YARN

2020-02-24 Thread Malcolm McFarland

What would the effect be on a container that was only allowed one CPU core? Would it be ok to trade that off for more containers? Cheers, Malcolm McFarland Cavulus

Re: Resource allocation on YARN

2020-02-25 Thread Malcolm McFarland

about good rule-of-thumb values for each of these parameters. Cheers, Malcolm McFarland Cavulus

Streamtasks not instantiating consumers on startup

2020-04-09 Thread Malcolm McFarland

ver? We're running Samza 0.14.1 and using AWS MSK which is running version Kafka 2.2.1. Thanks so much, Malcolm McFarland Cavulus

Re: Streamtasks not instantiating consumers on startup

2020-04-11 Thread Malcolm McFarland

ation. Fwiw, we're using Kafka 0.11.0.2 with Samza 0.14.1; my understanding is that there should be version compatibility between Kafka 0.11.0.x-2.x. If you have any other ideas, I'd be interested in hearing them. Cheers, Malcolm McFarland Cavulus On Fri, Apr 10, 2020 at 8:36 PM Yi Pa

Upgrading from 0.14 -> 1.5.1

2020-12-14 Thread Malcolm McFarland

in pid=a3e86ddf-8d18-40c9-8063-1efd588cec56. Stopping this processor. New JobModel: JobModel [..] At this point the ThreadJob shuts down cleanly. Afaict, the legacy configuration is set up correctly, and mirrors our functional build under 0.14.1. Any thoughts? Cheers, Malcolm McFarland Cavulus

Re: Upgrading from 0.14 -> 1.5.1

2020-12-18 Thread Malcolm McFarland

t down for session: 0x18fae2900a0 2020-12-17 17:04:32.309 [Samza Debounce Thread-81ac6a4e-3d5e-479c-9a6c-2f2d9b4372d3] StreamProcessor [INFO] Shutting down the executor service of the stream processor: 81ac6a4e-3d5e-479c-9a6c-2f2d9b4372d3. Does this help? Cheers, Malcolm McFarland Cav

Re: Upgrading from 0.14 -> 1.5.1

2020-12-18 Thread Malcolm McFarland

ones using coordination service) one must use org.apache.samza.container.grouper.task.GroupByContainerIdsFactory". I added that in and everything is starting up smoothly. Cheers, Malcolm McFarland Cavulus

yarn-site.xml setup

2020-12-28 Thread Malcolm McFarland

ress) correctly: HADOOP_YARN_HOME YARN_HOME HADOOP_COMMON_HOME HADOOP_HOME HADOOP_PREFIX HADOOP_CONF_DIR My yarn-site.xml is at $YARN_HOME/etc/hadoop/yarn-site.xml, and contains the following configuration: yarn.resourcemanager.address ${RM_IP_ADDRESS} This worked fine in 0.14.1. Cheers, Ma

resourcemanager setting on YARN

2021-03-11 Thread Malcolm McFarland

e the local yarn-site.xml accurately. When it starts on yarn, though, it seems to be resolving the resource manager to localhost. The final exception information is at the end of this email. Any ideas? Cheers, Malcolm McFarland Cavulus Failed to connect to server: localhost/127.0.0.1:8030: retries get f

Re: [VOTE] Apache Samza 1.7.0 RC1

2022-05-19 Thread Malcolm McFarland

Hey all, Has Samza 1.7.0 been officially released? I've been following the discussion here, and it seems like it was cleared in March, but I haven't seen any announcements or updates to the docs. Not trying to be pushy here, just curious about the status of release 1.7.0. Cheer

Re: Java 11 Pull requests

2022-08-26 Thread Malcolm McFarland

this PR and see if we can integrate it into a local Java 11-based build. Thanks Jamie! Cheers, Malcolm McFarland Cavulus On Wed, Aug 24, 2022 at 1:47 PM James DeMichele wrote: > Hi, I'm not sure if my previous email went through so thought that I would > try again. > > I know

Running v1.7.0 locally

2022-08-29 Thread Malcolm McFarland

/simple-legacy-task-1-2.0-coordinationData/jobModelGeneration/jobModelVersion At which point the application silently exits. Thanks in advance for any advice, ideas, things to check, etc. Cheers, Malcolm McFarland Cavulus

Re: Java 11 Checkin again

2022-09-02 Thread Malcolm McFarland

.0 ( https://hadoop.apache.org/docs/r3.3.0/index.html). Are there any unit tests in Samza that verify compatibility against a YARN cluster? If so, that could be a place to validate YARN v2.10/v3.3 cross-compatibility. Just throwing my 2 cents out there, Malcolm McFarland Cavulus On Fri, Sep 2, 2022

Samza partition hashing relative to other clients

2022-12-14 Thread Malcolm McFarland

nybody know a) if it's possible to define a custom key-to-partition hashing algorithm in Samza, or b) if there is a reliable general-purpose algorithm that can create the same results as Samza's algorithm? Cheers, Malcolm McFarland Cavulus [0] https://github.com/apache/samz
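For anyone trying to reproduce a key-to-partition mapping from a non-JVM client, here is a rough sketch of one candidate scheme. It rests on an assumption the snippet above does not confirm: that the partition key is a Java `String` and the partition is derived as `abs(key.hashCode()) % numPartitions`. The function names `java_string_hash` and `assumed_partition` are hypothetical; check the linked Samza source before relying on this.

```python
def java_string_hash(s: str) -> int:
    """Emulate Java's String.hashCode(): a 31-based polynomial over UTF-16
    code units, wrapped to a signed 32-bit integer."""
    h = 0
    data = s.encode("utf-16-be")
    for i in range(0, len(data), 2):
        unit = (data[i] << 8) | data[i + 1]
        h = (31 * h + unit) & 0xFFFFFFFF
    # Reinterpret the unsigned 32-bit value as signed, as Java would.
    return h - (1 << 32) if h >= (1 << 31) else h


def assumed_partition(key: str, num_partitions: int) -> int:
    """ASSUMPTION: partition = abs(hashCode) % num_partitions, using Java
    semantics. Not confirmed by the thread; illustration only."""
    h = java_string_hash(key)
    # In Java, Math.abs(Integer.MIN_VALUE) overflows and stays negative.
    a = h if h == -(1 << 31) else abs(h)
    # Java's % takes the sign of the dividend, unlike Python's.
    r = abs(a) % num_partitions
    return -r if a < 0 else r


print(java_string_hash("hello"), assumed_partition("hello", 8))
```

Kafka's own default partitioner hashes the serialized key bytes with murmur2 instead, so even if the assumption holds, the two schemes would not generally agree on a partition.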

Re: Samza partition hashing relative to other clients

2022-12-15 Thread Malcolm McFarland

pecific about my reasoning! Cheers, Malcolm McFarland Cavulus [0] https://github.com/apache/samza/blob/1.7.0/samza-kafka/src/main/scala/org/apache/samza/system/kafka/KafkaSystemProducer.scala#L97-L101 [1] https://github.com/apache/kafka/blob/2.4/clients/src/main/java/org/apache/kafka/clients/produce

Missing information for release 1.7.0

2023-04-27 Thread Malcolm McFarland

Hey all, I just noticed that, in the blog post for Apache 1.8.0, there is mention of following instructions for the 1.7.0 upgrade. However, there is no blog post about 1.7.0, nor is there a version of the documentation for 1.7.0. Did that get accidentally dropped? Cheers, Malcolm McFarland

Re: Missing information for release 1.7.0

2023-04-27 Thread Malcolm McFarland

Just found this under the "Releases" tab. A little confusing, but that works! Cheers, Malcolm McFarland Cavulus On Thu, Apr 27, 2023 at 4:51 PM Malcolm McFarland wrote: > Hey all, > > I just noticed that, in the blog post for Apache 1.8.0, there is mention > of followin

ProcessJobFactory/ThreadJobFactory still viable runtime options?

2023-05-04 Thread Malcolm McFarland

what is the preferred way to run a single instance of a streamtask locally? I'm using Samza 1.6.0, Kafka 2.2.2, and ZooKeeper 3.4.14. Thanks in advance for the help! Cheers, Malcolm McFarland Cavulus Here's a sample of the last few log messages, log level set to TRACE: [INFO] Metad

YARN job resizing

2023-05-09 Thread Malcolm McFarland

an autoscaling module (removed in version 1.4.0), but no actual documentation or examples. Cheers, Malcolm McFarland Cavulus

57 matches
