Re: Best way for reading all messages and close

2018-09-17 Thread David Espinosa
partitions from the beginning, and keep polling until all partitions have reached EOF. Though, if you have concurrent writers, new messages may be appended after you observe EOF on a partition, so you are never guaranteed to have read all messages at the time
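
A minimal sketch of the approach described above, assuming kafka-clients 2.0+ and placeholder names ("localhost:9092", "my-topic", the group id): assign all partitions explicitly, seek to the beginning, snapshot the end offsets once, and poll until every partition's position has caught up with its snapshot.

    import org.apache.kafka.clients.consumer.ConsumerConfig;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.PartitionInfo;
    import org.apache.kafka.common.TopicPartition;
    import java.time.Duration;
    import java.util.*;

    public class ReadAllAndClose {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            props.put(ConsumerConfig.GROUP_ID_CONFIG, "drain-topic");
            props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
                    "org.apache.kafka.common.serialization.StringDeserializer");
            props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
                    "org.apache.kafka.common.serialization.StringDeserializer");
            props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                // Assign all partitions explicitly so we control the starting position.
                List<TopicPartition> partitions = new ArrayList<>();
                for (PartitionInfo p : consumer.partitionsFor("my-topic")) {
                    partitions.add(new TopicPartition(p.topic(), p.partition()));
                }
                consumer.assign(partitions);
                consumer.seekToBeginning(partitions);

                // Snapshot the end offsets once; anything appended afterwards is
                // out of scope, per the caveat about concurrent writers above.
                Map<TopicPartition, Long> endOffsets = consumer.endOffsets(partitions);

                Set<TopicPartition> remaining = new HashSet<>(partitions);
                while (!remaining.isEmpty()) {
                    for (ConsumerRecord<String, String> record : consumer.poll(Duration.ofMillis(500))) {
                        // process the record here
                    }
                    remaining.removeIf(tp -> consumer.position(tp) >= endOffsets.get(tp));
                }
            } // try-with-resources closes the consumer once every partition hit EOF
        }
    }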

Best way for reading all messages and close

2018-09-14 Thread David Espinosa
Hi all, Although the usage of Kafka is stream oriented, for a concrete use case I need to read all the messages existing in a topic and, once all of them have been read, close the consumer. What's the best way or framework for doing this? Thanks in advance, David

Include Keys in FileStreamSink connector

2018-08-08 Thread David Espinosa
Hi all, I'm trying to back up the whole content of an Avro topic into a file, and later restore the Kafka topic from the file using Kafka Connect. I'm using Avro converters for both key and value, but the key is not included in the dump file. Does somebody know how to include keys using Kafka Connect?
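
A sketch of a standalone connector configuration along those lines (connector name, topic, file path and Schema Registry URL are placeholders). Note that, as far as I can tell, the stock FileStreamSink task writes only the string form of the record value, so the key will not reach the file no matter which converters are configured; including keys typically needs a transformation that copies the key into the value, or a different sink.

    name=avro-topic-backup
    connector.class=org.apache.kafka.connect.file.FileStreamSinkConnector
    tasks.max=1
    topics=my-avro-topic
    file=/tmp/my-avro-topic.dump
    # Converters can be overridden per connector; the Schema Registry URL is an assumption.
    key.converter=io.confluent.connect.avro.AvroConverter
    key.converter.schema.registry.url=http://localhost:8081
    value.converter=io.confluent.connect.avro.AvroConverter
    value.converter.schema.registry.url=http://localhost:8081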

Re: How set properly infinite retention

2018-07-30 Thread David Espinosa
Thanks a lot! I will try that! David. On Mon, Jul 30, 2018 at 13:26, Kamal Chandraprakash (<kamal.chandraprak...@gmail.com>) wrote: > log.retention.ms = 9223372036854775807 (Long.MAX_VALUE) > On Mon, Jul 30, 2018 at 3:04 PM David Espinosa wrote: > > Hi
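
Spelled out as it would look in server.properties, using the value from the reply above (the size-based setting is added here only as a reminder; -1 is already its default):

    # server.properties
    log.retention.ms=9223372036854775807   # Long.MAX_VALUE, effectively infinite time-based retention
    log.retention.bytes=-1                 # disables size-based retention (this is already the default)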

Re: How set properly infinite retention

2018-07-30 Thread David Espinosa
log cleaner. On Mon, 30 Jul 2018, 10:07 David Espinosa wrote: > Hi all, I would like to set infinite retention for all topics created in the cluster by default. I have tried with: *log.retention.ms*

How set properly infinite retention

2018-07-30 Thread David Espinosa
Hi all, I would like to set infinite retention for all topics created in the cluster by default. I have tried with *log.retention.ms=-1* in *server.properties*, but messages get deleted after approximately 10 days. Which configuration at broker level should I use for infinite

Is possible to have an infinite delete.retention.ms?

2018-06-25 Thread David Espinosa
Hi all, I would like to set up a compaction policy on a topic where a message can be deleted (GDPR...) using a tombstone with the same key as the message to be removed. My problem is that I would like to use empty-payload messages also for identifying that an entity has been deleted, but these las
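
A minimal sketch of the topic-level knobs involved, using the standard kafka-topics.sh flags of that era (topic name and partition/replication counts are placeholders; whether a Long.MAX_VALUE delete.retention.ms really keeps tombstones around indefinitely is exactly the open question of this thread):

    bin/kafka-topics.sh --zookeeper localhost:2181 --create --topic customer-events \
      --partitions 3 --replication-factor 3 \
      --config cleanup.policy=compact \
      --config delete.retention.ms=9223372036854775807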

How set log compaction policies at cluster level

2018-06-14 Thread David Espinosa
Hi all, I would like to apply log compaction configuration to every topic in my Kafka cluster, as default properties. These configuration properties are: - cleanup.policy - delete.retention.ms - segment.ms - min.cleanable.dirty.ratio. I have tried to place them in the server.properties
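
For what it's worth, those topic-level names are not recognized in server.properties; a sketch of their broker-level counterparts (names taken from the broker configuration docs, values shown are only examples):

    # server.properties
    log.cleanup.policy=compact
    log.cleaner.enable=true                   # required for compaction; default since 0.9.0.1
    log.cleaner.delete.retention.ms=86400000  # broker-level counterpart of delete.retention.ms
    log.roll.ms=604800000                     # broker-level counterpart of segment.ms
    log.cleaner.min.cleanable.ratio=0.5       # broker-level counterpart of min.cleanable.dirty.ratio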

Log Compaction configuration over all topics in cluster

2018-05-09 Thread David Espinosa
Hi all, I would like to apply log compaction configuration to every topic in my Kafka cluster, as default properties. These configuration properties are: - cleanup.policy - delete.retention.ms - segment.ms - min.cleanable.dirty.ratio. I have tried to place them in the server.properties

Re: Recommended max number of topics (and data separation)

2018-01-31 Thread David Espinosa
I used -Djute.maxbuffer=50111000, and the gain was that I could increase the number of topics from 70k to 100k :P 2018-01-30 23:25 GMT+01:00 Andrey Falko: > On Tue, Jan 30, 2018 at 1:38 PM, David Espinosa wrote: > > Hi Andrey, > > My topics are replicated with a replication f
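
In case it helps others, this is roughly where that property has to go; as far as I understand, jute.maxbuffer is enforced by both the ZooKeeper server and its clients (the Kafka brokers), so both JVMs need the flag (the 50111000 value is the one from this thread):

    # ZooKeeper server side (zkServer.sh picks JVMFLAGS up via zkEnv.sh)
    export JVMFLAGS="-Djute.maxbuffer=50111000"
    bin/zkServer.sh restart

    # Kafka broker side (kafka-run-class.sh passes KAFKA_OPTS to the broker JVM)
    export KAFKA_OPTS="-Djute.maxbuffer=50111000"
    bin/kafka-server-start.sh config/server.properties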

Re: Recommended max number of topics (and data separation)

2018-01-30 Thread David Espinosa
way around about topic naming: I think the longer the topic names are, the sooner jute.maxbuffer overflows. David 2018-01-30 4:40 GMT+01:00 Andrey Falko: > On Sun, Jan 28, 2018 at 8:45 AM, David Espinosa wrote: > > Hi Monty, > > I'm also planning to use a big amou

Re: Recommended max number of topics (and data separation)

2018-01-28 Thread David Espinosa
Hi Monty, I'm also planning to use a large number of topics in Kafka, so recently I ran a test on a 3-node Kafka cluster where I created 100k topics with one partition each and sent 1M messages in total. These are my conclusions: - There is no limitation in Kafka itself regarding the number of topic

Confluent REST proxy and Kafka Headers

2018-01-26 Thread David Espinosa
Hi all, Does somebody know if it's possible to retrieve Kafka message headers (introduced in 0.11) using the Confluent REST proxy? Thanks in advance,

Re: GDPR appliance

2018-01-26 Thread David Espinosa
> Regards, > Lars Albertsson > Data engineering consultant > www.mapflat.com > https://twitter.com/lalleal > +46 70 7687109

Re: GDPR appliance

2017-11-23 Thread David Espinosa
access control we just configured SimpleAclAuthorizer. The net result is that some consumers can only read the redacted topic and very few consumers can read the unredacted one. > On Wed, Nov 22, 2017 at 10:47 AM David Espinosa wrote: > > Hi all, > > I would like to double c
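
For completeness, a sketch of how that kind of setup usually looks (principal, topic and group names are hypothetical; the authorizer class is the one mentioned above):

    # server.properties
    authorizer.class.name=kafka.security.auth.SimpleAclAuthorizer

    # Allow a consumer principal to read only the redacted copy of the topic
    bin/kafka-acls.sh --authorizer-properties zookeeper.connect=localhost:2181 \
      --add --allow-principal User:analytics-app \
      --consumer --topic customers-redacted --group analytics-group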

GDPR appliance

2017-11-22 Thread David Espinosa
Hi all, I would like to double check with you how to apply some GDPR requirements to my Kafka topics, specifically the "right to be forgotten", which forces us to delete some data contained in the messages. So not deleting the message, but editing it. To do that, my intention is to replicate the top

Kafka, Data Lake and Event Sourcing

2017-08-21 Thread David Espinosa
Hi, at my company we are currently planning to create a Data Lake. As we have also started to use Kafka as our Event Store, and therefore to implement some Event Sourcing on it, we are wondering if it would be a good idea to use the same approach to create the Data Lake. So, one of the ideas in our mind

java.io.IOException: Packet len4194320 is out of range!

2017-08-07 Thread David Espinosa
Hi, I'm getting this error when trying to connect to ZooKeeper once I have created 70k+ topics. I have played with the Java property jute.maxbuffer with no success. Has anybody seen this error before? Thanks in advance, David

jute maxbuffer in Zookeeper

2017-08-07 Thread David Espinosa
Hi all, in fact this is a question about ZooKeeper, but it is very related to Kafka. I'm doing a test in order to check that we can create up to 100k topics in a Kafka cluster, so we can manage multitenancy this way. After a proper setup, I have managed to create those 100k topics in Kafka, but now I

Re: Spring release using apache clients 11

2017-07-21 Thread David Espinosa
framework 5.0 and Java 8. > On 2017-07-20 15:58 (-0400), David Espinosa wrote: > > Thanks Rajini! > > On Jul 20, 2017, 18:41, "Rajini Sivaram" wrote: > > > David, > > > The release plans are here: h

Re: Spring release using apache clients 11

2017-07-20 Thread David Espinosa
which is planned just after the next SF 5.0 RC3, which is expected tomorrow. > Regards, > Rajini > On Thu, Jul 20, 2017 at 5:01 PM, David Espinosa wrote: > > Hi, does somebody know if we will have any spring-integration/kafka release soon using Apache clients 0.11?

Spring release using apache clients 11

2017-07-20 Thread David Espinosa
Hi, does somebody know if we will have any spring-integration/kafka release soon using Apache clients 0.11?

Bridging activeMQ and Kafka

2017-05-25 Thread David Espinosa
Hi All, I want to migrate our system, which is using ActiveMQ, to Kafka. In order to do a gradual migration, I would like to create a bridge between ActiveMQ and Kafka, so a producer and consumer could be working on different message brokers until the migration is complete and all my servic
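
One way to sketch such a bridge, assuming a JMS queue on the ActiveMQ side and plain string payloads (broker URLs, queue and topic names are placeholders; delivery guarantees, error handling and shutdown are deliberately left out):

    import javax.jms.*;
    import org.apache.activemq.ActiveMQConnectionFactory;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerConfig;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.serialization.StringSerializer;
    import java.util.Properties;

    public class ActiveMqToKafkaBridge {
        public static void main(String[] args) throws JMSException {
            // ActiveMQ (JMS) side: broker URL and queue name are placeholders.
            ConnectionFactory factory = new ActiveMQConnectionFactory("tcp://localhost:61616");
            Connection connection = factory.createConnection();
            connection.start();
            Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
            MessageConsumer consumer = session.createConsumer(session.createQueue("orders"));

            // Kafka side: bootstrap servers and topic are placeholders.
            Properties props = new Properties();
            props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
            props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
            KafkaProducer<String, String> producer = new KafkaProducer<>(props);

            // Forward every text message from the JMS queue to the Kafka topic.
            while (true) {
                Message m = consumer.receive();
                if (m instanceof TextMessage) {
                    producer.send(new ProducerRecord<>("orders", ((TextMessage) m).getText()));
                }
            }
        }
    }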

Re: Partitions as mechanism to keep multitenant segregated data

2017-05-23 Thread David Espinosa
small customer numbers (hundreds at most, ever). Instead, use a hash function and a key, as recommended, to land customers on the same partition. > Thanks > Tom Crayford > Heroku Kafka > On Tue, May 23, 2017 at 9:46 AM, David Espinosa wrote: > > Hi, > >

Partitions as mechanism to keep multitenant segregated data

2017-05-23 Thread David Espinosa
Hi, In order to keep the data from different customers separated (physically) in our application, we are using a custom partitioner to drive messages to a concrete partition of a topic. We know that we are losing parallelism per topic this way, but our requirements regarding multitenancy are high
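
For reference, a custom partitioner of the kind described here is a small class registered on the producer via partitioner.class; a sketch assuming the record key carries the tenant/customer id, which also matches the keyed-hashing advice in the reply above:

    import org.apache.kafka.clients.producer.Partitioner;
    import org.apache.kafka.common.Cluster;
    import org.apache.kafka.common.utils.Utils;
    import java.util.Map;

    public class TenantPartitioner implements Partitioner {
        @Override
        public void configure(Map<String, ?> configs) {}

        @Override
        public int partition(String topic, Object key, byte[] keyBytes,
                             Object value, byte[] valueBytes, Cluster cluster) {
            if (keyBytes == null) {
                // The tenant id is expected as the record key; without it we cannot route deterministically.
                throw new IllegalArgumentException("A tenant key is required for partitioning");
            }
            int numPartitions = cluster.partitionsForTopic(topic).size();
            // Same behaviour the default partitioner applies to keyed records: murmur2 hash of the key,
            // so every record of a given tenant lands on a stable partition.
            return Utils.toPositive(Utils.murmur2(keyBytes)) % numPartitions;
        }

        @Override
        public void close() {}
    }

It would be registered with props.put(ProducerConfig.PARTITIONER_CLASS_CONFIG, TenantPartitioner.class.getName()); note that with a stable tenant key, the default partitioner alone already gives this kind of co-location.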