I would try to ping legal again and see if they respond. If not, I think we
will need to come up with a simpler approach that does not require legal
approval.

D.

On Jul 18, 2017, at 2:23 PM, Nikita Ivanov <nivano...@gmail.com> wrote:
>Igniters,
>Just a quick update. I haven't gotten a response from ASF Legal on this
>thread and I frankly don't know how to proceed here. What's the process
>to arrive at a decision point here?
>
>Thanks!
>--
>Nikita Ivanov
>
>
>On Mon, Jul 10, 2017 at 3:11 PM, Konstantin Boudnik <c...@apache.org>
>wrote:
>
>> On Sat, Jul 08, 2017 at 11:04AM, Nikita Ivanov wrote:
>> > Cos,
>> > Based on my experience, having it off by default negates the entire
>> > purpose... We need a statistically meaningful data set to make any
>> > inferences from it. Moreover, if we are going to ask folks to turn it
>> > on, it will significantly skew the resulting data set anyway and not
>> > show the full picture. I think "on" by default is the better option if
>> > we are to collect usage stats to begin with.
>>
>> Yes, sure. But having this "on" by default is likely to expose us to
>> another shit-storm down the road. An interesting dilemma to have indeed.
>> In my experience, whenever I install something like a browser or an
>> operating system, it would ask if I want to make the particular piece of
>> software better by sending back some anonymized stats. Basically, I am
>> given a way to explicitly opt out if I wish.
>>
>> Turning the feature "on" by default is like saying: "we'll be collecting
>> some stats, but if you don't want us to, you can go here and there and
>> disable the collection. Oh, and by the way - you need to go and figure
>> out the exact steps to disable it."
>>
>> > Also, I want to reiterate to avoid misunderstanding: there is no
>> > proposal, nor will there be a technical way, to attribute collected
>> > data back to a certain company. That's not what this is all about. We
>> > should only be interested in aggregated stats (community size, geo
>> > information, language information, components usage).
>>
>> Yes, I think it is clear, but never hurts to re-iterate.
>>
>> Cos
>>
>> > Thoughts?
>> >
>> > --
>> > Nikita Ivanov
>> > Founder & CTO
>> > GridGain Systems
>> >
>> > On Fri, Jul 7, 2017 at 8:17 PM, Konstantin Boudnik <c...@apache.org>
>> wrote:
>> >
>> > > Actually, that should be OFF by default. It sounds like this reduces
>> > > the amount of the data collected, but this would address the concerns
>> > > of companies like Roman's. I know for sure that a few of my clients
>> > > would sue my ass out of existence if I gave them the platform
>> > > collecting their data-centers info.
>> > >
>> > > Let's have it, set it off by default and document an easy way to turn
>> > > it on.
>> > > Then start making rounds asking our user base to share _some_ of the
>> > > stats with the community, so we can track the growth of the install
>> > > base, etc.
>> > >
>> > > Cos
>> > >
>> > > On Thu, Jul 06, 2017 at 08:20AM, Nikita Ivanov wrote:
>> > > > The idea so far is to have a single system property in configuration
>> > > > that turns this off completely. I envision that this will be
>> > > > prominently featured on the Ignite website so that everyone who
>> > > > would like to disable it can do it in seconds.
>> > > >
>> > > > Thoughts?
>> > > >
>> > > > --
>> > > > Nikita Ivanov
>> > > > Founder & CTO
>> > > > GridGain Systems
>> > > >
>> > > > On Wed, Jul 5, 2017 at 9:27 PM, Roman Shtykh
><rsht...@yahoo.com>
>> wrote:
>> > > >
>> > > > > Nikita,
>> > > > >
>> > > > > Sending and storing (somewhere the company cannot securely handle)
>> > > > > any information (OS version, IP addresses, etc.) that can be used
>> > > > > to compromise the services would be unacceptable.
>> > > > > Turning it off might be ok (possibly through the cluster settings,
>> > > > > not via a globally-accessible site), but the fact that there's a
>> > > > > risk some information can leak outside (for any reason, starting
>> > > > > from a human mistake) is scary.
>> > > > >
>> > > > > -- Roman
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > > On Thursday, July 6, 2017 12:38 PM, Nikita Ivanov <
>> > > niva...@gridgain.com>
>> > > > > wrote:
>> > > > >
>> > > > >
>> > > > > Roman,
>> > > > > Thanks for the feedback. What are those questions specifically?
>> > > > > Are IP addresses and OS what is causing it?
>> > > > >
>> > > > > Thanks!
>> > > > >
>> > > > > --
>> > > > > Nikita Ivanov
>> > > > > Founder & CTO
>> > > > > GridGain Systems
>> > > > >
>> > > > > On Wed, Jul 5, 2017 at 6:15 PM, Roman Shtykh
>> <rsht...@yahoo.com.invalid
>> > > >
>> > > > > wrote:
>> > > > >
>> > > > > Nikita,
>> > > > >
>> > > > > While this will help improve Ignite, it will prevent its adoption
>> > > > > by many projects -- sending and retaining IP addresses, OS
>> > > > > versions, etc. raises tons of questions when considering using
>> > > > > Ignite, even if it can be opted out of.
>> > > > > -- Roman
>> > > > >
>> > > > >
>> > > > >     On Thursday, July 6, 2017 5:38 AM, Nikita Ivanov <
>> > > nivano...@gmail.com>
>> > > > > wrote:
>> > > > >
>> > > > >
>> > > > >  Igniters,
>> > > > > I would like to kick off the discussion on the idea of collecting
>> > > > > Ignite usage statistics. The basic idea behind this is to better
>> > > > > understand general and anonymous Ignite usage information to better
>> > > > > calibrate community efforts in developing new features, improving
>> > > > > existing ones, delivering better documentation - and in every other
>> > > > > way to make our project a better software solution.
>> > > > >
>> > > > > Although such instrumentation is standard practice in commercially
>> > > > > developed software, for an ASF project this could be a sensitive
>> > > > > issue. Therefore I would like to initiate a full community
>> > > > > discussion on how best to implement such a practice for the benefit
>> > > > > of the project while ensuring the privacy protection of Ignite
>> > > > > users.
>> > > > >
>> > > > > To ignite (pun intended) the discussion I'll outline below some of
>> > > > > the basic thoughts that I have on this subject. They are here only
>> > > > > to give an idea of what such instrumentation may potentially look
>> > > > > like so that we can discuss the merits of this idea in a tangible
>> > > > > context.
>> > > > >
>> > > > > Overview
>> > > > > -------------
>> > > > > Upon start and every hour thereafter each Ignite node will collect,
>> > > > > encrypt and send usage statistics over HTTPS to the ASF-hosted
>> > > > > server. That server will accept such HTTPS packets, decrypt them
>> > > > > and store them in a time-series DB. A web interface will be
>> > > > > provided to view the usage information.
>> > > > >
>> > > > > Opt-In or Opt-out
>> > > > > -------------------------
>> > > > > Opt-out. The Ignite website will offer simple instructions (system
>> > > > > property) on how to disable this instrumentation.
>> > > > >
>> > > > > Code, Infra, Access
>> > > > > ---------------------------
>> > > > > Ignite instrumentation will be part of the Ignite code base. The
>> > > > > collection server will be a separate module in the Ignite code base
>> > > > > (released separately from Ignite). The collection server will be
>> > > > > hosted by ASF Infra.
>> > > > >
>> > > > > Usage statistics will be publicly accessible by anyone in the
>> > > > > community.
>> > > > >
>> > > > > Private, Personal Data
>> > > > > ------------------------------
>> > > > > No private or personal data will ever be transferred. No emails,
>> > > > > usernames, company names, grid names, etc.
>> > > > >
>> > > > > Data Retention
>> > > > > --------------------
>> > > > > All data will be retained for 1 year and deleted permanently
>> > > > > thereafter.
>> > > > >
>> > > > > Usage Data
>> > > > > ----------------
>> > > > > The following data will be collected in each packet sent to the
>> > > > > collection server:
>> > > > > - GRID_SIZE (to align our testing environment with the most
>> > > > > common cluster sizes)
>> > > > > - IP_ADDR (for general geo-tracking as well as to know what
>> > > > > documentation language should be a priority)
>> > > > > - SES_ID (to track continuous uptime vs. re-starts)
>> > > > > - USERNAME_TYPE (privileged username vs. standard, to track
>> > > > > production vs. dev/testing usage; note - this is not an actual
>> > > > > username)
>> > > > > - OS_NAME
>> > > > > - OS_VER
>> > > > > - OS_ARCH
>> > > > > - JAVA_VER
>> > > > > - JAVA_VENDOR
>> > > > > - COMP_SQL (whether or not this feature was used)
>> > > > > - COMP_COMPUTE (whether or not this feature was used)
>> > > > > - COMP_DATAGRID (whether or not this feature was used)
>> > > > > - COMP_STREAMING (whether or not this feature was used)
>> > > > > - COMP_IGFS (whether or not this feature was used)
>> > > > > - COMP_SERVICE (whether or not this feature was used)
>> > > > > - COMP_PERSISTENCE (whether or not this feature was used)
>> > > > >
>> > > > > Please let's discuss this idea. Everyone's comments and suggestions
>> > > > > are *extremely* welcome.
>> > > > >
>> > > > > Thanks,
>> > > > > Nikita Ivanov.
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > >
>>
>>
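
For illustration only, the opt-out system property and the usage packet
described in the quoted proposal might look roughly like the sketch below.
The property name EXAMPLE_USAGE_STATS_DISABLED, the class name, and both
method names are hypothetical; nothing in this thread names them, and this is
not Ignite code. Only the fields that standard JVM system properties can
supply are filled in, plus GRID_SIZE and one COMP_* flag passed in by the
caller.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of the proposal in this thread; not part of Ignite. */
class UsageStatsSketch {
    /** Assumed property name; the thread only says "a single system property". */
    static final String DISABLE_PROP = "EXAMPLE_USAGE_STATS_DISABLED";

    /** Opt-out semantics: collection stays on unless the property is set to "true". */
    static boolean isCollectionEnabled() {
        // Boolean.getBoolean returns true only if the named system property equals "true".
        return !Boolean.getBoolean(DISABLE_PROP);
    }

    /** Builds a subset of the proposed fields; no usernames, emails or grid names. */
    static Map<String, String> buildPacket(int gridSize, boolean sqlUsed) {
        Map<String, String> packet = new LinkedHashMap<>();
        packet.put("GRID_SIZE", Integer.toString(gridSize));
        packet.put("OS_NAME", System.getProperty("os.name"));
        packet.put("OS_VER", System.getProperty("os.version"));
        packet.put("OS_ARCH", System.getProperty("os.arch"));
        packet.put("JAVA_VER", System.getProperty("java.version"));
        packet.put("JAVA_VENDOR", System.getProperty("java.vendor"));
        packet.put("COMP_SQL", Boolean.toString(sqlUsed));
        return packet;
    }

    public static void main(String[] args) {
        if (!isCollectionEnabled()) {
            System.out.println("Usage statistics disabled; nothing is collected.");
            return;
        }
        // A real node would encrypt this and send it over HTTPS; here it is only printed.
        System.out.println(buildPacket(4, true));
    }
}
```

Under these assumptions, starting the JVM with
-DEXAMPLE_USAGE_STATS_DISABLED=true would switch collection off, which is the
opt-out model proposed; an opt-in variant would simply invert the default.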
