+1 on including a reasonable default for log.dirs that points to data/ in
the installation directory. This is followed by some other systems and is
probably a little better compared to /tmp.

Also, I actually think we should default to production ready settings for
most configs, at least those related to performance. In most cases, this
works just fine and doesn't hurt the development environment much. Same
goes for GC settings. There is no value in shipping with a smaller heap or
non standard GC configs.

I ran into a bunch of this while writing docs for Kafka and thinking from
the user's perspective. And I think doing both of these things will improve
user experience.

Thanks,
Neha

On Sun, Feb 1, 2015 at 8:43 AM, Jay Kreps <boredandr...@gmail.com> wrote:

> Yeah I second Jaikiran's take: I think we should stick to something that
> works "out of the box" without editing config. I think the experience here
> is important, and adding one manual configuration step quickly cascades
> into a bunch of these steps (since it is always easier to make something
> manual than to think through how to implement a reasonable default).
>
> Of course when going to production you have to think, and I don't have a
> ton of sympathy for people who are doing production deploys with their data
> in /tmp, but I agree it would be better to make this a little safer even in
> that case if we can come up with a way to give a local data dir.
>
> Another approach I just thought of would be to just ship two configs: a
> developer.properties and production.properties each of which would have
> reasonable "starting point" configurations. This might actually be better
> since many of the settings we default to are a little anemic for a
> production install (e.g. default number of partitions, default replication
> factor, thread count, etc). I suspect this might solve a lot of support
> problems, actually, since most people don't think about that stuff.
>
> -Jay
>
> On Sat, Jan 31, 2015 at 7:27 AM, Jaikiran Pai <jai.forums2...@gmail.com>
> wrote:
>
> >  Hi Jeff,
> >
> > I guess you are :)
> >
> > Personally, whenever I download and try a new project *in development
> > environment* I always just want to get it up and running without having
> to
> > fiddle with configurations. Of course, I do a bit of reading the docs,
> > before trying it out, but I like to have the install and run to be
> > straightforward without having to change/add configurations. Having
> > sensible defaults helps in development environments and in getting
> started.
> > IMO, this param belongs to that category.
> >
> > -Jaikiran
> >
> >
> > On Thursday 29 January 2015 08:00 PM, Jeff Holoman wrote:
> >
> > Maybe I'm in the minority here, but I actually don't think there should
> be
> > a default for this param and you should be required to explicitly set
> this.
> >
> > On Thu, Jan 29, 2015 at 5:43 AM, Jay Kreps <boredandr...@gmail.com>
> wrote:
> >
> >>
> >>
> >> > On Jan. 29, 2015, 6:50 a.m., Gwen Shapira wrote:
> >> > > We added --override option to KafkaServer that allows overriding
> >> default configuration from commandline.
> >> > > I believe that just changing the shell script to include --override
> >> log.dir=${KAFKA_HOME}/data
> >> > > may be enough?
> >> > >
> >> > > overriding configuration from server.properties in code can be very
> >> unintuitive.
> >> >
> >> > Jaikiran Pai wrote:
> >> >     That sounds a good idea. I wasn't aware of the --override option.
> >> I'll give that a try and if it works then the change will be limited to
> >> just the scripts.
> >> >
> >> > Jaikiran Pai wrote:
> >> >     Hi Gwen,
> >> >
> >> >     I had a look at the JIRA
> >> https://issues.apache.org/jira/browse/KAFKA-1654 which added support
> for
> >> --override and also the purpose of that option. From what I see it
> won't be
> >> useful in this case, because in this current task, we don't want to
> >> override a value that has been explicitly set (via server.properties for
> >> example). Instead we want to handle a case where no explicit value is
> >> specified for the data log directory and in such cases default it to a
> path
> >> which resides under the Kafka install directory.
> >> >
> >> >     If we use the --override option in our (startup) scripts to set
> >> log.dir=${KAFKA_HOME}/data, we will end up forcing this value as the
> >> log.dir even when the user has intentionally specified a different path
> for
> >> the data logs.
> >> >
> >> >     Let me know if I misunderstood your suggestion.
> >> >
> >> > Jay Kreps wrote:
> >> >     I think you are right that --override won't work but maybe this is
> >> still a good suggestion?
> >> >
> >> >     Something seems odd about force overriding the working directory
> of
> >> the process just to set the log directory.
> >> >
> >> >     Another option might be to add --default. This would work like
> >> --override but would provide a default value only if none is specified.
> I
> >> think this might be possible because the java.util.Properties we use for
> >> config supports a hierarchy of defaults. E.g. you can say new
> >> Properties(defaultProperties). Not sure if this is better or worse.
> >> >
> >> >     Thoughts?
> >> >
> >> > Jaikiran Pai wrote:
> >> >     Hi Jay,
> >> >
> >> >     > Another option might be to add --default. This would work like
> >> --override but would provide a default value only if none is specified.
> I
> >> think this might be possible because the java.util.Properties we use for
> >> config supports a hierarchy of defaults. E.g. you can say new
> >> Properties(defaultProperties). Not sure if this is better or worse.
> >> >
> >> >     I think --default sounds like a good idea which could help us use
> >> it for other properties too (if we need to). It does look better than
> the
> >> current change that I have done, because the Java code then doesn't
> have to
> >> worry about how that default value is sourced. We can then just update
> the
> >> scripts to set up the default for the log.dir appropriately.
> >> >
> >> >     I can work towards adding support for it and will update this
> patch
> >> once it's ready.
> >> >
> >> >     As for:
> >> >
> >> >     > Something seems odd about force overriding the working directory
> >> of the process just to set the log directory.
> >> >
> >> >     Sorry, I don't understand what you meant there. Is it something
> >> about the change that was done to the scripts?
> >>
> >> I guess what I mean is: is there any other reason you might care about
> >> the working directory of the process? If so we probably don't want to
> force
> >> it to be the Kafka directory. If not it may actually be fine and in that
> >> case I think having relative paths work is nice. I don't personally know
> >> the answer to this, what is "good practice" for a server process?
> >>
> >>
> >> - Jay
> >>
> >>
> >> -----------------------------------------------------------
> >> This is an automatically generated e-mail. To reply, visit:
> >> https://reviews.apache.org/r/30403/#review70168
> >> -----------------------------------------------------------
> >>
> >>
> >>  On Jan. 29, 2015, 6:24 a.m., Jaikiran Pai wrote:
> >> >
> >> > -----------------------------------------------------------
> >> > This is an automatically generated e-mail. To reply, visit:
> >> > https://reviews.apache.org/r/30403/
> >> > -----------------------------------------------------------
> >> >
> >> > (Updated Jan. 29, 2015, 6:24 a.m.)
> >> >
> >> >
> >> > Review request for kafka.
> >> >
> >> >
> >> > Bugs: KAFKA-1906
> >> >     https://issues.apache.org/jira/browse/KAFKA-1906
> >> >
> >> >
> >> > Repository: kafka
> >> >
> >> >
> >> > Description
> >> > -------
> >> >
> >> > KAFKA-1906 Default the Kafka data log directory to
> >> $KAFKA_HOME/data/kafka-logs directory, where KAFKA_HOME is the Kafka
> >> installation directory
> >> >
> >> >
> >> > Diffs
> >> > -----
> >> >
> >> >   bin/kafka-run-class.sh 881f578a8f5c796fe23415b978c1ad35869af76e
> >> >   bin/windows/kafka-run-class.bat
> >> 9df3d2b45236b4f06d55a89c84afcf0ab9f5d0f2
> >> >   config/server.properties 1614260b71a658b405bb24157c8f12b1f1031aa5
> >> >   core/src/main/scala/kafka/server/KafkaConfig.scala
> >> 6d74983472249eac808d361344c58cc2858ec971
> >> >   core/src/test/scala/unit/kafka/server/KafkaConfigTest.scala
> >> 82dce80d553957d8b5776a9e140c346d4e07f766
> >> >
> >> > Diff: https://reviews.apache.org/r/30403/diff/
> >> >
> >> >
> >> > Testing
> >> > -------
> >> >
> >> > The change here involves updating the Kafka scripts (for Windows and *
> >> nix) to infer and setup KAFKA_HOME environment variable. This value is
> then
> >> used by the KafkaConfig to decide what path to default to for the Kafka
> >> data logs, in the absence of any explicitly set log.dirs (or log.dir)
> >> properties.
> >> >
> >> > Care has been taken to ensure that other mechanism which might
> >> presently be bypassing the Kafka scripts, will still continue to
> function,
> >> since in the absence of KAFKA_HOME environment property value, we fall
> back
> >> to /tmp/kafka-logs (the present default) as the default data log
> directory
> >> >
> >> > Existing tests have been run to ensure that this change maintains
> >> backward compatibility (i.e. doesn't fail when KAFKA_HOME isn't
> >> available/set) and 2 new test methods have been added to the
> >> KafkaConfigTest to ensure that this change works.
> >> >
> >> > Although the change has been made to both .sh and .bat files, to
> >> support this, I haven't actually tested this change on a Windows OS and
> >> would appreciate if someone can test this there and let me know if they
> run
> >> into any issues.
> >> >
> >> >
> >> > Thanks,
> >> >
> >> > Jaikiran Pai
> >> >
> >> >
> >>
> >>
> >
> >
> >  --
> >  Jeff Holoman
> > Systems Engineer
> > 678-612-9519
> >
> >
> >
> >
> >
>



-- 
Thanks,
Neha

Reply via email to