Re: [VOTE] Release Apache Guacamole 0.9.13-incubating (RC1)

2017-07-24 Thread Jean-Baptiste Onofré

+1 (binding)

Regards
JB

On 07/24/2017 07:32 AM, Mike Jumper wrote:

Hello Incubator PMC,

The Apache Guacamole community has voted on and approved a proposal to
release Apache Guacamole 0.9.13-incubating.

We now kindly request that the Incubator PMC members review and vote on
this incubator release.

The VOTE RESULT is here:

http://mail-archives.apache.org/mod_mbox/incubator-guacamole-dev/201707.mbox/%3CCALKeL-Oc-xjbeKr_RiLqp_m4_7sYdF5ZoK9T4q%3DN%2B0J7q-Hk9g%40mail.gmail.com%3E

The draft release notes (along with links to artifacts,
signatures/checksums, and updated documentation) can be found here:

http://guacamole.incubator.apache.org/releases/0.9.13-incubating/

The git tag for all relevant repositories is "0.9.13-incubating-RC1":

https://github.com/apache/incubator-guacamole-client/tree/0.9.13-incubating-RC1
(commit de12b683d746129ddc8b34425ed6e40b618c91d6)
https://github.com/apache/incubator-guacamole-server/tree/0.9.13-incubating-RC1
(commit ca3563a38e6416a6cde7f208dc9827c512227888)
https://github.com/apache/incubator-guacamole-manual/tree/0.9.13-incubating-RC1
(commit a1a4ee64dd00217905288ac1d902713b0298c03b)

Build instructions are included in the manual, which is part of the updated
documentation referenced above. For convenience:

http://guacamole.incubator.apache.org/doc/0.9.13-incubating/gug/installing-guacamole.html

Maven artifacts for guacamole-common, guacamole-common-js, and
guacamole-ext can be found in the following staging repository:

https://repository.apache.org/content/repositories/orgapacheguacamole-1007

Source and binary distributions (also linked within the release notes):

https://dist.apache.org/repos/dist/dev/incubator/guacamole/0.9.13-incubating-RC1/

Artifacts have been signed with the "mjum...@apache.org" key listed in:

https://dist.apache.org/repos/dist/dev/incubator/guacamole/KEYS

Please review and vote:

[ ] +1 Approve the release
[ ] -1 Don't approve the release (please provide specific comments)

This vote will be open for at least 72 hours.

Thanks,

- Mike



--
Jean-Baptiste Onofré
jbono...@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com

-
To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
For additional commands, e-mail: general-h...@incubator.apache.org



Edit access to wiki for proposal

2017-07-24 Thread Steve Lawrence
I would like to request edit access to the incubator wiki so that I can
create a new proposal.

My wiki username is: SteveLawrence

Thanks,
- Steve




Re: Edit access to wiki for proposal

2017-07-24 Thread John D. Ament
Done, happy editing!

On Mon, Jul 24, 2017 at 8:59 AM Steve Lawrence wrote:

> I would like to request edit access to the incubator wiki so that I can
> create a new proposal.
>
> My wiki username is: SteveLawrence
>
> Thanks,
> - Steve


[DISCUSS] Daffodil Incubation Proposal

2017-07-24 Thread Steve Lawrence
Dear Apache Incubator Community,

We would like to start a discussion around a proposal to bring Daffodil
into the Apache Incubator. Daffodil is an implementation of the DFDL
specification used to convert between fixed format data and XML/JSON.

The draft proposal can be found in the wiki at the following URL:

https://wiki.apache.org/incubator/DaffodilProposal

We do not yet have a champion or mentors, but it was recommended that we
create a proposal and send it to this list to potentially find those
that might be interested. The text for the draft proposal is found
below. We look forward to your input.

Thanks,
-Steve


= Daffodil Proposal =

== Abstract ==

Daffodil is an implementation of the Data Format Description Language
(DFDL) used to convert between fixed format data and XML/JSON.

== Proposal ==

The Data Format Description Language (DFDL) is a specification,
developed by the Open Grid Forum, capable of describing many data
formats, including both textual and binary, scientific and numeric,
legacy and modern, commercial record-oriented, and many industry and
military standards. It defines a language that is a subset of W3C XML
schema to describe the logical format of the data, and annotations
within the schema to describe the physical representation.

Daffodil is an open source implementation of the DFDL specification that
uses these DFDL schemas to parse fixed format data into an infoset,
which is most commonly represented as either XML or JSON. This allows
the use of well-established XML or JSON technologies and libraries to
consume, inspect, and manipulate fixed format data in existing
solutions. Daffodil is also capable of the reverse by serializing or
"unparsing" an XML or JSON infoset back to the original data format.

== Background ==

Many different software solutions need to consume and manage data,
including data-directed routing, databases, data analysis, data
cleansing, data visualization, and more. A key aspect of such solutions is
the need to transform the data into an easily consumable format.
Usually, this means that for each unique data format, one develops a
tool that can read and extract the necessary information, often leading
to ad-hoc and data-format-specific description systems. Such systems are
often proprietary, not well tested, and incompatible, leading to vendor
lock-in, flawed software, and increased training costs. DFDL is a new
standard, with version 1.0 completed in October of 2016, that solves
these problems by defining an open standard to describe many different
data formats and how to parse and unparse between the data and XML/JSON.

Two closed source implementations of DFDL currently exist. The first was
created by IBM and is now part of their IBM® Integration Bus product.
The second, called DFDL4S or "DFDL for Space", was created by the
European Space Agency and targets the challenges of their satellite data
processing.

Around 2005, Pacific Northwest National Lab created Defuddle, built as
an open source implementation and proof of concept of the draft DFDL
specification and a test bed to feed new concepts into specification
development. Primary development of Defuddle was eventually taken over
by the National Center for Supercomputing Applications (NCSA). However,
due to evolution of the DFDL specification and architectural and
performance issues with Defuddle, around 2009, NCSA restarted the
project with the new name of Daffodil, with a goal of implementing the
complete DFDL specification. Daffodil development continued at NCSA
until around 2012, at which point development slowed due to budget
limitations. Shortly thereafter, primary development was picked up by
Tresys Technology where it continues today, with contributions from
other entities such as the Navy Research Lab, the Air Force Research
Lab, MITRE, and Booz Allen Hamilton. In February of 2015, Daffodil
version 1.0.0 was released, including support for the DFDL features
needed to parse many common file formats. Daffodil version 2.0.0 is
expected to be released in August of 2017, which will include unparse
support with one-to-one parsing feature parity.

Entities including IBM, MITRE, NATO NCI Agency, Northrop-Grumman, Quark
Security, Raytheon, and Tresys Technology have developed DFDL schemas
for many data formats from varying technology domains, including PNG,
GIF, BMP, PCAP, HL7, EDIFACT, NACHA, vCard, iCalendar, and MIL-STD-2045,
many of which are publicly available on the DFDL Schemas GitHub. There
are also a number of military-application data formats, the
specifications of which are not public, which have historically been
very difficult and expensive to process, and for which DFDL schemas have
been created or are actively in development; these include
MIL-STD-6040/USMTF ATO, MIL-STD-6017/VMF, and MIL-STD-6016/NATO STANAG
5516 (aka "Link16").

== Rationale ==

Numerous software solutions exist that consume, inspect, analyze, and
transform data, many of which can be found in the Apache Software
Founda

Re: [DISCUSS] Daffodil Incubation Proposal

2017-07-24 Thread Mike Drob
What is the relationship between Daffodil and something like Apache Tika's
extraction engine?


Re: [DISCUSS] Daffodil Incubation Proposal

2017-07-24 Thread Steve Lawrence
I'll preface this by saying that I don't have a ton of experience with
Apache Tika. But based on my understanding, Tika and Daffodil do have
somewhat similar goals, but reach them in different ways. For example,
Tika requires that one writes /code/ to perform data extraction, usually
relying on existing Java libraries to extract the desired metadata. The
downside to this is that code can be buggy, and libraries might not even
exist for formats of interest (especially common with legacy and
military data).

Daffodil, on the other hand, does not require one to write any code.
Instead, one writes a DFDL Schema (similar to XML Schema, with DFDL
annotations) that fully describes the data, which Daffodil then uses to
convert the data to XML/JSON for extraction. So adding support for a new
format means writing a new schema rather than new code. And less code
generally means fewer bugs. Also, for secure systems that require
certification, generally speaking, it is easier to certify a schema as
compared to code.
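
To illustrate the schema-versus-code distinction, here is a toy sketch in
Python. It is not Daffodil and not DFDL syntax: the record layout, the field
names, and the parse/unparse helpers are all hypothetical, chosen only to show
how a declarative description can drive both parsing fixed format data into a
JSON-style infoset and the reverse direction (what the proposal calls
unparsing).

```python
import json
import struct

# Hypothetical declarative description of a fixed-format record:
# (field name, struct format code). This is NOT DFDL syntax.
RECORD_SCHEMA = [
    ("version", "B"),  # 1-byte unsigned integer
    ("length", "H"),   # 2-byte unsigned integer
    ("tag", "4s"),     # 4 ASCII bytes
]


def _layout(schema):
    # Big-endian, no padding between fields.
    return struct.Struct(">" + "".join(code for _, code in schema))


def parse(schema, data):
    """Turn raw bytes into a dict (the 'infoset') using only the schema."""
    values = _layout(schema).unpack(data)
    return {
        name: (value.decode("ascii") if isinstance(value, bytes) else value)
        for (name, _), value in zip(schema, values)
    }


def unparse(schema, infoset):
    """Serialize an infoset back to the original byte layout."""
    values = [
        infoset[name].encode("ascii") if code.endswith("s") else infoset[name]
        for name, code in schema
    ]
    return _layout(schema).pack(*values)


raw = bytes([1, 0, 16]) + b"DATA"
infoset = parse(RECORD_SCHEMA, raw)
print(json.dumps(infoset))  # {"version": 1, "length": 16, "tag": "DATA"}
assert unparse(RECORD_SCHEMA, infoset) == raw  # round trip is lossless
```

In this sketch, supporting a new record layout means editing RECORD_SCHEMA
rather than writing new parsing code. A real DFDL schema is far more
expressive than this list of tuples.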

We certainly don't believe that Daffodil could replace Tika, but it does
have the potential to add new functionality to Tika for formats that do
not have existing libraries. One of our goals is to look into
integrating Daffodil support into tools like Tika. We'd love to hear
from Tika devs if this is something they'd be interested in.

I'll also add that whereas Tika tends to focus primarily on metadata,
DFDL schemas usually describe an entire file format down to the byte, so
one can extract more than just metadata, including text and binary
data. Further differentiating, Daffodil has support for serializing data
(called unparse) from the XML/JSON representation, allowing one to
transform or filter data as well. We don't believe this feature is all
that applicable to Tika, but it may be useful to other technologies such as
filtering or data fuzzing tools.

- Steve


On 07/24/2017 10:59 AM, Mike Drob wrote:
> What is the relationship between Daffodil and something like Apache Tika's
> extraction engine?

Re: [DISCUSS] Daffodil Incubation Proposal

2017-07-24 Thread John D. Ament
Hi,

This looks like a very interesting proposal.  It's a bit worrisome though
that you have no champion or mentor.  Have you been in contact with anyone
at the ASF on this?

I see that the existing code appears to have 3 different copyright holders,
and all code is derived from the BSD-3-clause license.  It appears that all
of the initial developers are from a single holder, Tresys.  Is there any
interest in granting committership to the other contributors?

John


Re: [DISCUSS] Daffodil Incubation Proposal

2017-07-24 Thread Steve Lawrence
I've emailed two Apache NiFi committers, but I did not realize that they
were not ASF members and so could not be champions. One of them reached out
to an IPMC member who said that it was fine to write a proposal and send
it to the gen@incubator list without having a champion, and that this could
be a good way to find a champion/mentors.

The code is actually licensed under the University of Illinois/NCSA
license [1], which is a combination of the MIT and 3-clause BSD
licenses. But we do plan on relicensing to Apache v2 upon entering
incubation. Tresys and NCSA (who contributed the bulk of the code) are
both open to relicensing to Apache v2. We have not yet reached out to
the other contributors, but we do not expect issues with this. Either we
think they will have no objections, or their contributions (which are
relatively small) can be removed and/or replaced.

And we're absolutely open to granting committership to others. We have
granted committership to others in the past, but over time, they have
moved on to other projects/companies and did not continue contributions.
One of our main goals in joining incubation is to grow the developer
community beyond just Tresys.

- Steve

[1] https://opensource.org/licenses/NCSA

On 07/24/2017 01:06 PM, John D. Ament wrote:
> Hi,
> 
> This looks like a very interesting proposal.  It's a bit worrisome though
> that you have no champion or mentor.  Have you been in contact with anyone
> at the ASF on this?
> 
> I see that the existing code appears to have 3 different copyright holders,
> and all code is derived from the BSD-3-clause license.  It appears that all
> of the initial developers are from a single holder, Tresys.  Is there any
> interest in granting committership to the other contributors?
> 
> John

Re: [DISCUSS] Daffodil Incubation Proposal

2017-07-24 Thread McHenry, Kenton Guadron
Yes, DFDL and its open source implementation Daffodil are more about file
formats and getting access to the entirety of a file's contents in a consistent
way through machine-readable specifications. The work has implications in the
area of digital preservation, allowing one to preserve these machine-readable
specifications rather than all the tools needed to open/save a file in order to
work with it. Imagine someone developing graphics software to work with 3D
models and not having to worry about the hundreds of formats out there for 3D
meshes (whether there are tools for opening the files and whether they can get
access to those tools, whether the spec is available and how complex that spec
is to implement, etc.), and simply building their code around the contents
(e.g. vertices, faces, etc.). One could come up with similar scenarios for
other data types (documents, images, videos, audio, depth data, numeric data).
Ideally, tools built to support DFDL could someday support any format for that
type without the developer having to worry about the details of how that data
is represented within a file.

Kenton McHenry, Ph.D.
Principal Research Scientist, Adjunct Assistant Professor of Computer Science
Deputy Director of the Scientific Software & Applications Division
National Center for Supercomputing Applications, University of Illinois at 
Urbana-Champaign

On Jul 24, 2017, at 10:30 AM, Steve Lawrence 
mailto:stephen.d.lawre...@gmail.com>> wrote:

I'll preface this saying that I don't have a ton of experience with
Apache Tika. But based on my understanding, Tika and Daffodil do have
somewhat similar goals, but reach them in different ways. For example,
Tika requires that one writes /code/ to perform data extraction, usually
relying on existing Java libraries to extract the desired metadata. The
downside to this is that code can be buggy, and libraries might not even
exist for formats of interest (especially common with legacy and
military data).

Daffodil, on the other hand, does not require one to write any code.
Instead, one writes a DFDL Schema (similar to XML Schema, with DFDL
annotations) that fully describes the data, which Daffodil then uses to
convert the data to XML/JSON for extraction. So adding support for a new
format means writing a new schema rather than new code. And less code
generally means less bugs. Also, for secure systems that require
certification, generally speaking, it is easier to certify a schema as
compared to code.

We certainly don't believe that Daffodil could replace Tika, but it does
have the potential to add new functionality to Tika for formats that do
not have existing libraries. One of our goals is to look into
integrating Daffodil support into tools like Tika. We'd love to hear
from Tika devs if this is something they'd be interested in.

I'll also add that whereas Tika tends to focus primarily on metadata,
DFDL schemas usually describe an entire file format down to the byte, so
one can extract more than just meta data, including text and binary
data. Further differentiating, Daffodil has support for serializing data
(called unparse) from the XML/JSON representation, allowing one to
transform or filter data as well. We don't believe this feature is all
that applicable to Tika, but may be useful to other technologies such as
filtering or data fuzzing technologies.

- Steve


On 07/24/2017 10:59 AM, Mike Drob wrote:
What is the relationship between Daffodil and something like Apache Tika's
extraction engine?

On Mon, Jul 24, 2017 at 9:53 AM, Steve Lawrence <
stephen.d.lawre...@gmail.com> wrote:

Dear Apache Incubator Community,

We would like to start a discussion around a proposal to bring Daffodil
into the Apache Incubator. Daffodil is an implementation of the DFDL
specification used to convert between fixed format data and XML/JSON.

The draft proposal can be found in the wiki at the following URL:

https://wiki.apache.org/incubator/DaffodilProposal

We do not yet have a champion or mentors, but it was recommended that we
create a proposal and send it to this list to potentially find those
that might be interested. The text for the draft proposal is found
below. We look forward to your input.

Thanks,
-Steve


= Daffodil Proposal =

== Abstract ==

Daffodil is an implementation of the Data Format Description Language
(DFDL) used to convert between fixed format data and XML/JSON.

== Proposal ==

The Data Format Description Language (DFDL) is a specification,
developed by the Open Grid Forum, capable of describing many data
formats, including both textual and binary, scientific and numeric,
legacy and modern, commercial record-oriented, and many industry and
military standards. It defines a language that is a subset of W3C XML
schema to describe the logical format of the data, and annotations
within the schema to describe the physical representation.

Daffodil is an open source implementation of the DFD

Re: [DISCUSS] Daffodil Incubation Proposal

2017-07-24 Thread Dave Fisher
Hi Kenton,

Is there any reason that you and others from the NCSA are not Initial 
Committers? That would make this proposal stronger.

Regarding Apache Tika - it relies on other projects including Apache POI and 
Apache PDFBox. They are pragmatic about what is used. If Daffodil works to 
expand then I think that there would be good synergy between the projects. I 
know as a POI PMC member that the POI community has significantly benefited 
from the Tika community, some of whom are from MITRE.

To date Tika has not emphasized structured data, although they do extract 
content from Excel and OpenOffice.

I am intrigued.

Regards,
Dave

> On Jul 24, 2017, at 10:55 AM, McHenry, Kenton Guadron wrote:
> 
> Yes, DFDL and its open source implementation Daffodil are more about file 
> formats and getting access to the entirety of a file's contents in a 
> consistent way through machine-readable specifications.  The work has 
> implications in the area of digital preservation, allowing one to preserve 
> these machine-readable specifications rather than all the tools needed to 
> open/save a file in order to work with it.  Imagine someone developing 
> graphics software to work with 3D models and not having to worry about the 
> hundreds of formats out there for 3D meshes (whether there are tools for 
> opening the files and whether they can get access to those tools, whether the 
> spec is available and how complex that spec is to implement, 
> etc.), and simply building their code around the contents (e.g. vertices, 
> faces, etc.).  One could come up with similar scenarios for other data types 
> (documents, images, videos, audio, depth data, numeric data).  Ideally, tools 
> built to support DFDL could someday support any format for that type 
> without the developer having to worry about the details of how that data is 
> represented within a file.
> 
> Kenton McHenry, Ph.D.
> Principal Research Scientist, Adjunct Assistant Professor of Computer Science
> Deputy Director of the Scientific Software & Applications Division
> National Center for Supercomputing Applications, University of Illinois at 
> Urbana-Champaign
> 

Re: [VOTE] Release Apache Tamaya Extensions 0.3-incubating

2017-07-24 Thread Oliver B. Fischer

Dear all,

our vote is now running for more than a week. Should I close the vote 
and start it again? I am not sure what is missing to get enough positive 
(or even negative) votes on our release.


Oliver

On 16.07.17 at 23:38, Oliver B. Fischer wrote:

Dear all,

The Apache Tamaya team would like to release version 0.3-incubating of
Apache Tamaya Extensions.

The new release of our extensions uses the new 0.3 release of 
Apache Tamaya API & Core.


The source distribution has been deployed to [1] and the Maven 
artefacts are available at [2].


The full release notes are available at [3].

The project vote has been passed with

5 binding +1 votes (pmc):

0 non-binding +1

0 -1 vote

The vote thread can be found at [4].

Please note:
This vote is a "majority approval" with a minimum of three +1 votes and
no -1’s (see [5]).


[ ] +1 for community members who have reviewed the bits
[ ] +0
[ ] -1 for fatal flaws that should cause these bits not to be
released, and why.


[1] https://s.apache.org/NA8w
[2] https://s.apache.org/79Jw
[3] https://s.apache.org/SSYr
[4] https://s.apache.org/vyST
[5] http://www.apache.org/foundation/voting.html#ReleaseVotes




--
N Oliver B. Fischer
A Schönhauser Allee 64, 10437 Berlin, Deutschland/Germany
P +49 30 44793251
M +49 178 7903538
E o.b.fisc...@swe-blog.net
S oliver.b.fischer
J oliver.b.fisc...@jabber.org
X http://xing.to/obf


-
To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
For additional commands, e-mail: general-h...@incubator.apache.org



INFRA guide to graduation?

2017-07-24 Thread Christopher
Is there any documentation detailing how to file a proper and complete JIRA
for graduation?
For example, which components should be put in the JIRA, issue type, etc.

It seems to me that good docs in this regard could save INFRA a lot of
headaches, and could help projects understand which infrastructure bits
they need to think about as they move from the Incubator infrastructure to
the TLP infrastructure.

Perhaps these exist already and I just haven't found them?


Re: INFRA guide to graduation?

2017-07-24 Thread John D. Ament
Christopher,

Please review [1].

We just did a review [2], so if something seems off, please feel free to
comment, or have Infra knock on our door.


John

[1]:
https://incubator.apache.org/guides/transferring.html#first_steps_outside_the_incubator
[2]:
https://lists.apache.org/thread.html/13ac64b765bf6810586dbfac283f18dc916f4c07e583a17d5ee5178c@%3Cgeneral.incubator.apache.org%3E




Re: [VOTE] Release Apache Tamaya Extensions 0.3-incubating

2017-07-24 Thread John D. Ament
Oliver,

No, please leave the current vote open.  People will vote, just give them
some time (summer holidays have everyone a bit sluggish to respond).

You have 5 +1's on the dev list, 1 of which is binding, so you'll be fine
:-)

John



Re: [VOTE] Release Apache Tamaya Extensions 0.3-incubating

2017-07-24 Thread Suneel Marthi
+1 binding

1. Verified sigs and hashes
2. Verified the DISCLAIMER, LICENSE and NOTICE
3. Ran a clean {src} build and all tests pass.



Re: [VOTE] Release Apache Tamaya Extensions 0.3-incubating

2017-07-24 Thread Justin Mclean
Hi,

+1 binding

I checked:
- name includes incubating
- signatures and md5 hashes correct
- DISCLAIMER exists
- LICENSE and NOTICE good
- no unexpected binary files
- all source files have headers
- can compile from source
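
For readers who want to repeat the hash part of a check like this, here is a minimal Python sketch. It assumes the checksum file's first whitespace-separated token is the hex digest, which is not true of every checksum format, and the artifact name is made up for the demo. It does not cover the signature check, which needs a tool such as GnuPG and the project's KEYS file.

```python
import hashlib
import os
import tempfile


def file_digest(path, algorithm="md5", chunk_size=65536):
    """Hash a file in chunks so large artifacts need not fit in memory."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def matches_checksum(path, checksum_text, algorithm="md5"):
    """Compare a file against the text of its checksum file.

    Assumption: the first whitespace-separated token is the hex digest.
    """
    expected = checksum_text.split()[0].lower()
    return file_digest(path, algorithm) == expected


# Demo with a throwaway file; the name "example-src.zip" is hypothetical.
with tempfile.TemporaryDirectory() as tmp:
    artifact = os.path.join(tmp, "example-src.zip")
    with open(artifact, "wb") as f:
        f.write(b"hello")
    good = matches_checksum(
        artifact, "5d41402abc4b2a76b9719d911017c592  example-src.zip")
    bad = matches_checksum(artifact, "0" * 32)
```

Passing `algorithm="sha512"` works the same way for releases that publish SHA checksums instead of MD5.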

Thanks,
Justin




Re: [VOTE] Release Apache Guacamole 0.9.13-incubating (RC1)

2017-07-24 Thread Justin Mclean
Hi,

+1 binding

I checked:
- incubating in name
- signatures and hashes correct
- DISCLAIMER exists
- server and client's LICENSE and NOTICE good (although Autoconf had me 
concerned for a second there)
- no unexpected binary files
- all source files have ASF headers

I didn't compile from source, as it looks like it requires some setup and OS X 
doesn’t seem to be a supported platform?

Thanks,
Justin

