Re: [PROPOSAL] Any23 to join the incubator
Thanks a lot Chris, have a nice day!
Simo

http://people.apache.org/~simonetripodi/
http://www.99soft.org/

On Mon, Sep 26, 2011 at 8:33 AM, Mattmann, Chris A (388J) wrote:
> Hi All,
>
> OK, since the chatter about this proposal has died down and since
> I've agreed to champion it, I'll call a formal VOTE tomorrow afternoon
> and let it run through the rest of the week. The Tika PMC has not
> registered any objections to sponsoring the proposal, so I will go
> ahead and update it to reflect Tika PMC as the sponsor and we will
> look forward to helping to shepherd and mentor Any23 through
> the Incubator.
>
> Thanks for your input!
>
> Cheers,
> Chris
>
> On Sep 22, 2011, at 7:43 AM, Simone Tripodi wrote:
>
>> Hi Lewis!
>> thanks a lot for your interest in Any23 and welcome aboard!! I'm going
>> to put you in the initial committers list!
>> All the best, have a nice day!
>> Simo
>>
>> http://people.apache.org/~simonetripodi/
>> http://www.99soft.org/
>>
>> On Thu, Sep 22, 2011 at 4:39 PM, lewis john mcgibbney wrote:
>>> Hi everyone,
>>>
>>> Further to the previous threads on this topic, I would like to express my
>>> interest in becoming a committer for the project. Coming from an academic
>>> background I am working extensively with the mapping of static legislative
>>> document resources to RDF datasets and then using these datasets across
>>> platforms such as Kasabi [1], and various projects closely linked to Jena,
>>> e.g. Joseki and Fuseki. Also I've found other tools such as eyeball really
>>> helpful during my journey.
>>>
>>> I was voted in by the Apache Nutch PMC around three months ago as PMC member
>>> and Committer, and was thankfully directed to this thread by Chris Mattmann.
>>> The idea of extending the functionality of Any23 as a Nutch plugin is
>>> something which interests me, and which could also benefit academic/research
>>> users of Nutch such as myself. At this stage I don't have a strong opinion
>>> on whether Any23 should be a sub-project of Tika, but think it is very
>>> encouraging that it seems like a probable direction the project is/could
>>> move towards.
>>>
>>> Thanks very much.
>>>
>>> Lewis
>>>
>>> [1]
>>> http://beta.kasabi.com/dataset/wombra-scottish-technical-standards-section-6-energy
>>>
>>> --
>>> *Lewis*
>
> ++
> Chris Mattmann, Ph.D.
> Senior Computer Scientist
> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> Office: 171-266B, Mailstop: 171-246
> Email: chris.a.mattm...@nasa.gov
> WWW: http://sunset.usc.edu/~mattmann/
> ++
> Adjunct Assistant Professor, Computer Science Department
> University of Southern California, Los Angeles, CA 90089 USA
> ++

-
To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
For additional commands, e-mail: general-h...@incubator.apache.org
Re: [PROPOSAL] Any23 to join the incubator
Thanks Chris,

I agree. Things are mature enough to call a vote now.
Thanks to everyone for the help in improving the proposal so far.

cheers,

Davide

--
Davide Palmisano
http://davidepalmisano.com
http://twitter.com/dpalmisano
Re: [DISCUSS] DirectMemory to join the Apache Incubator
I just saw the plan is to use Confluence for Websites. I think this
should not be done anymore; the Apache CMS is preferred instead, or
one could use Maven et al. Was the intention to write "Wiki" instead
of "Website"?

Cheers

On Tue, Sep 20, 2011 at 11:53 AM, Raffaele P. Guidi wrote:
> Thanks, Simone for your introduction and support. To help evaluation I would
> add that there's also some more information in the project wiki at
> https://github.com/raffaeleguidi/DirectMemory/wiki and that I'm here to
> answer all of your questions in detail as well.
>
> Thanks for your,
> Raffaele
>
> On Tue, Sep 20, 2011 at 11:48 AM, Simone Tripodi wrote:
>
>> Hi all guys,
>> I would like to propose DirectMemory, a Java OpenSource multi-layered
>> cache implementation featuring off-heap memory storage (a-la
>> Terracotta BigMemory) originally developed by Raffaele P. Guidi on
>> GitHub[1], to be an Apache Incubator project. For those interested in
>> knowing more about DirectMemory, you can read Raffaele's related
>> blog[2].
>>
>> Here's a link to the proposal in the Incubator wiki[3] where we
>> started collecting all needed info.
>>
>> As you will note, the list of mentors is in need of some volunteers,
>> so if you find this interesting, feel free to sign up or let us know
>> you are interested :).
>>
>> Hope to read from you soon, thanks in advance and have a nice day!
>> All the best,
>> Simo
>>
>> [1] https://github.com/raffaeleguidi/DirectMemory
>> [2] http://raffaeleguidi.wordpress.com/
>> [3] http://wiki.apache.org/incubator/DirectMemoryProposal
>>
>> http://people.apache.org/~simonetripodi/
>> http://www.99soft.org/

--
http://www.grobmeier.de

-
To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
For additional commands, e-mail: general-h...@incubator.apache.org
Re: [DISCUSS] DirectMemory to join the Apache Incubator
Well, I currently use the GitHub wiki for documentation and don't have a
specific preference for Confluence, so I probably should have written "Wiki";
in any case I'm open to whatever the current Apache standard is. The only
exception, maybe, is for SVN: I heard that someone at Apache is
experimenting with Git and I would like DirectMemory to be part of that
experiment as well, should the rest of the team agree.

Cheers,
Raffaele
Re: [DISCUSS] DirectMemory to join the Apache Incubator
On Mon, Sep 26, 2011 at 1:12 PM, Raffaele P. Guidi wrote:
> Well, I currently use github wiki for documentation and don't have a
> specific preference for confluence so I probably should have written Wiki;

Let's change the term to wiki and discuss the website itself after the
vote. This is usually very straightforward.

> in any case I'm open to whatever the current apache standard is. The only
> exception, maybe, is for SVN - I heard that someone in apache is
> experimenting with GIT and I would like DirectMemory to be part of that
> experiment as well - should the rest of the team agree.

Let's say the status is currently some kind of closed beta. It's
currently under heavy discussion, so it might take a while until people
can actually use it. If you like, you can express your interest in Git
in the proposal (at the subversion url), but you should not expect this
to happen too soon.

Anyway, let's wait for a few days until the git discussion has settled
down before we vote.

Cheers
Christian

--
http://www.grobmeier.de
Re: [PROPOSAL] Any23 to join the incubator
Thanks to everybody!
I'm very happy to see that things are going forward!

The best
Mic

--
Michele Mostarda
Senior Software Engineer
mail: m...@michelemostarda.com
skype: michele.mostarda
twitter: micmos
fbk : http://wed.fbk.eu/en/people
deri: https://dev.deri.ie/confluence/display/~mmostarda
site: http://www.michelemostarda.com
Re: [DISCUSS] DirectMemory to join the Apache Incubator
Ok I will change it this evening. Git is not a priority in any way.

Cheers,
Raffaele
Re: [VOTE] S4 to join the Incubator
This passes, with 16 +1 votes, plenty of them binding, and no -1 votes.

Thanks to all who voted!

We can now get started creating the Apache S4 podling.

Patrick

On Tue, Sep 20, 2011 at 1:56 PM, Patrick Hunt wrote:
> It's been nearly a week since the S4 proposal was submitted for
> discussion. A few questions were asked, and the proposal was clarified
> in response. Sufficient mentors have volunteered. I thus feel we are
> now ready for a vote.
>
> The latest proposal can be found at the end of this email and at:
>
> http://wiki.apache.org/incubator/S4Proposal
>
> The discussion regarding the proposal can be found at:
>
> http://s.apache.org/RMU
>
> Please cast your votes:
>
> [ ] +1 Accept S4 for incubation
> [ ] +0 Indifferent to S4 incubation
> [ ] -1 Reject S4 for incubation
>
> This vote will close 72 hours from now.
>
> Thanks,
>
> Patrick
>
> --
> = S4 Proposal =
>
> == Abstract ==
>
> S4 (Simple Scalable Streaming System) is a general-purpose,
> distributed, scalable, partially fault-tolerant, pluggable platform
> that allows programmers to easily develop applications for processing
> continuous, unbounded streams of data.
>
> == Proposal ==
>
> S4 is a software platform written in Java. Clients that send and
> receive events can be written in any programming language. S4 also
> includes a collection of modules called Processing Elements (or PEs
> for short) that implement basic functionality and can be used by
> application developers. In S4, keyed data events are routed with
> affinity to Processing Elements (PEs), which consume the events and do
> one or both of the following: (1) ''emit'' one or more events which
> may be consumed by other PEs, (2) ''publish'' results. The
> architecture resembles the Actors model, providing semantics of
> encapsulation and location transparency, thus allowing applications to
> be massively concurrent while exposing a simple programming interface
> to application developers.
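[Editor's illustration, not part of the original message.] The keyed routing described in the paragraph above can be sketched roughly as follows. This is a minimal toy in Python, not S4's actual Java API: the names `WordCountPE`, `Router` and `run_demo` are hypothetical, and "publishing" is assumed here to mean simply collecting results in a list.

```python
class WordCountPE:
    """Hypothetical Processing Element: one instance exists per key value."""

    def __init__(self, key):
        self.key = key
        self.count = 0

    def process(self, event):
        # Consume the event, update local state, and emit a derived event.
        self.count += 1
        return [("count", self.key, self.count)]


class Router:
    """Routes keyed events so that equal keys always reach the same PE instance."""

    def __init__(self, pe_factory):
        self.pe_factory = pe_factory
        self.instances = {}   # key -> PE instance (the "affinity" table)
        self.published = []   # results published by the PEs

    def dispatch(self, key, event):
        pe = self.instances.get(key)
        if pe is None:
            # First event for this key: create the PE that will own it.
            pe = self.instances[key] = self.pe_factory(key)
        for emitted in pe.process(event):
            self.published.append(emitted)


def run_demo():
    router = Router(WordCountPE)
    for word in ["s4", "kafka", "s4"]:
        router.dispatch(word, {"word": word})
    return router.published
```

In this sketch both "s4" events reach the same `WordCountPE` instance, so its counter advances to 2, while "kafka" gets its own instance. A real deployment would spread the instances across nodes; that part is deliberately left out.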
>
> To drive adoption and increase the number of contributors to the
> project, we may need to prioritize the focus based on feedback from
> the community. We believe that one of the top priorities and driving
> design principle for the S4 project is to provide a simple API that
> hides most of the complexity associated with distributed systems and
> concurrency. The project grew out of the need to provide a flexible
> platform for application developers and scientists that can be used
> for quick experimentation and production.
>
> S4 differs from existing Apache projects in a number of fundamental
> ways. Flume is an Incubator project that focuses on log processing,
> performing lightweight processing in a distributed fashion and
> accumulating log data in a centralized repository for batch
> processing. S4 instead performs all stream processing in a distributed
> fashion and enables applications to form arbitrary graphs to process
> streams of events. We see Flume as a complementary project. We also
> expect S4 to complement Hadoop processing and in some cases to
> supersede it. Kafka is another Incubator project that focuses on
> processing large amounts of stream data. The design of Kafka, however,
> follows the pub-sub paradigm, which focuses on delivering messages
> containing arbitrary data from source processes (publishers) to
> consumer processes (subscribers). Compared to S4, Kafka is an
> intermediate step between data generation and processing, while S4 is
> itself a platform for processing streams of events.
>
> S4 overall addresses a need of existing applications to process
> streams of events beyond moving data to a centralized repository for
> batch processing. It complements the features of existing Apache
> projects, such as Hadoop, Flume, and Kafka, by providing a flexible
> platform for distributed event processing.
>
> == Background ==
>
> S4 was initially developed at Yahoo! Labs starting in 2008 to process
> user feedback in the context of search advertising. The project was
> licensed under the Apache License version 2.0 in October 2010. The
> project documentation is currently available at http://s4.io .
>
> == Rationale ==
>
> Stream computing has been growing steadily over the last 20 years.
> However, recently there has been an explosion in real-time data
> sources including the Web, sensor networks, financial securities
> analysis and trading, traffic monitoring, natural language processing
> of news and social data, and much more.
>
> As Hadoop evolved as a standard open source solution for batch
> processing of massive data sets, there is no equivalent community
> supported open source platform for processing data streams in
> real-time. While various research projects have evolved into
> proprietary commercial products, S4 has the potential to fill the gap.
> Many projects that require a scalable stream processing architecture
> currently use Hadoop by segmenting the input stream into data batches.
> This solution is not efficie
Re: [VOTE] S4 to join the Incubator
Thank you all for your support, looking forward to working with the Apache community.

-leo
[VOTE] Release HCatalog 0.2-incubating (RC1)
Hi all,

The HCatalog community is excited to share that the RC for the HCatalog release has been +1'd over at hcatalog-user@incubator. Please try it out and vote for the Apache HCatalog 0.2-incubating release.

Vote thread: http://markmail.org/thread/s7b53a4a2xd35jad
Artifact and signatures: http://people.apache.org/~hashutosh/hcatalog-0.2.0-incubating-candidate-1/
SVN Tag: https://svn.apache.org/repos/asf/incubator/hcatalog/tags/release-0.2.0-rc1/
PGP release keys: https://svn.apache.org/repos/asf/incubator/hcatalog/trunk/KEYS

[ ] +1 Release the packages as Apache HCatalog 0.2-incubating
[ ] -1 Do not release the packages because...

Thanks,
Ashutosh
[VOTE] Add Any23 to the Apache Incubator
Hi Folks,

OK, discussion of the proposal has died down, so I'm now calling a formal VOTE on the Any23 proposal located here:

http://wiki.apache.org/incubator/Any23Proposal

Proposal text copied at the bottom of this email. I'll leave the VOTE open through the rest of the week, and close it around Saturday, October 1, early AM PDT.

Please VOTE:

[ ] +1 Accept Any23 into the Apache Incubator
[ ] +0 Don't care
[ ] -1 Don't Accept Any23 into the Apache Incubator because...

Thanks!

Cheers,
Chris

P.S. Here's my +1

Proposal Text:

= Any23 =
== Abstract ==
The following proposal is about ''Anything To Triples'' (Any23 for short), defined as a Java library, a Web service and a set of command line tools to extract and validate structured data in [[http://www.w3.org/RDF/|RDF]] format from a variety of Web documents and markup formats. Any23 is what is informally called an ''RDF Distiller''.

== Proposal ==
Any23 "Anything to Triples" is a library written in Java 6 and released under the Apache 2.0 License. It provides a set of extractors for scraping semantic markup (such as [[http://microformats.org/|Microformats]], [[http://www.w3.org/TR/rdfa-syntax/|RDFa]] and [[http://www.w3.org/TR/microdata/|Microdata]]) from several sources (HTML4, XHTML5, CSV), a set of data validations, and a set of parsers and writers to handle the main RDF transport formats (RDFXML, Ntriples, NQuads, Turtle). The library provides a command line tool for dealing with data extraction, conversion and validation, and a REST service implementation. The library is plugin based, allowing the hot loading of new extractors and validators. Any23 enables third-party developers to access structured data from Web pages without the need to implement ad-hoc scraping techniques. In this sense, Any23 will relieve developers from building complex solutions when developing data acquisition pipelines and processes targeted to semantically marked-up Web data.

== Background ==
Any23 was initially developed at [[http://www.deri.ie/|DERI (Digital Enterprise Research Institute)]] as the main component of the RDF extraction pipeline used in [[http://sindice.com/|Sindice (the Semantic Web Index)]], and is now evolving as a joint effort with [[http://www.fbk.eu/|FBK (Fondazione Bruno Kessler)]]. At present the official Any23 [[http://developers.any23.org|developers page]] contains all the documentation, while the code is maintained on [[http://code.google.com/p/any23/|Google Code]]. An official up-to-date showcase [[http://any23.org|demo]] is also available.

== Rationale ==
Providing and maintaining a robust, standard and updated library for extracting and validating semantic markup from heterogeneous sources would bring large benefits to the entire Open Source Community. Researchers and academic projects have been adopting RDF-related technologies for years, while industry is now moving toward Semantic Web technologies more concretely. Several industry initiatives related to the [[http://en.wikipedia.org/wiki/Semantic_Web|Web of Data]] are taking place in these months. [[http://schema.org|Schema.org]], for example, is an initiative sponsored by [[http://www.google.com/about/corporate/company/|Google Inc]], [[http://info.yahoo.com/center/us/yahoo/|Yahoo Inc]] and [[http://www.microsoft.com/about/companyinformation/en/us/default.aspx|Microsoft Corporation]] to structure the data in a harmonized way on [[http://dev.w3.org/html5/spec/Overview.html|HTML5]] pages. [[http://schema.org|Schema.org]] leverages the native [[http://dev.w3.org/html5/md/|HTML5 Microdata]] specification. [[http://ogp.me/|OpenGraphProtocol]] is the open standard sponsored by [[https://www.facebook.com/pages/Facebooking/114721225206500|Facebook Inc]] to include metadata in HTML page headers.
[[http://ogp.me/|OpenGraphProtocol]], initially based on [[http://www.w3.org/TR/xhtml-rdfa-primer/|RDFa]], allows describing the content of a Web page, and its underlying vocabulary can be represented directly in RDF.

= Current Status =
== Meritocracy ==
The historical Any23 team believes in meritocracy and has always acted as a community. A mailing list, an open issue tracker and other communication channels have been in use since its first release. Adoption into a larger community, such as Apache, is the natural evolution for Any23. Moreover, the Apache standards will enforce the existing Any23 community practices and will be a foundation for future committer involvement.

== Core Developers ==
In alphabetical order:

 * Davide Palmisano
 * Giovanni Tummarello
 * Michele Mostarda
 * Richard Cyganiak
 * Reto Bachmann-Gmuer
 * Simone Tripodi
 * Szymon Danielczyk
 * Tommaso Teofili

== Alignment ==
The main aim of the project is to develop and maintain a full-featured semantic markup distiller that can be used by other Apache projects that need an RDF extraction tool. The Any23 library core is written using the followi
Re: [VOTE] Add Any23 to the Apache Incubator
On Tue, Sep 27, 2011 at 7:18 AM, Mattmann, Chris A (388J) wrote:
> Please VOTE:
>
> [X] +1 Accept Any23 into the Apache Incubator
> [ ] +0 Don't care
> [ ] -1 Don't Accept Any23 into the Apache Incubator because...

Cheers,
Christian