On Monday, August 10, 2015, Amol Kekre <a...@datatorrent.com> wrote:
> Roman,
> It is a single proposal. Ted and I talked at length on this. The main difference is the release frequency and need for Malhar to work off a stable release of Apex. The proposal is for a single community not two, and to share all the community aspects (emails, committers, etc.)
This makes perfect sense to me, and having one community delivering 2 different releases is not a problem, at most some logistical issues.

Look forward to the vote.

rgds
jan i

ps. I too would recommend to start with a smaller community, it simplifies the start phase, but it is totally perfect as it is.

> Thks,
> Amol
>
> On Sun, Aug 9, 2015 at 8:18 PM, Ted Dunning <ted.dunn...@gmail.com> wrote:
>
> > I had some long talks with Amol on exactly this point when he was getting started with this proposal.
> >
> > At the current time, it appears that the developer communities for these two systems are indistinguishable. Moreover, neither project is actually much use without the other. Malhar requires Apex to run and Apex provides low-level capabilities that are not particularly useful without Malhar. These two are even more strongly linked than, say, Lucene and Solr. Or HDFS and Yarn.
> >
> > The primary difference between these is that Malhar is expected to be released much more frequently than Apex as new functions are added. Also, as a platform, there is much more burden on Apex to be stable, thus decreasing the desirable release frequency. Over time, it is likely that newcomers will find it easier to contribute on the Malhar side initially, but the existing community is fine with keeping the committer pool uniform.
> >
> > On Sun, Aug 9, 2015 at 8:09 PM, Roman Shaposhnik <ro...@shaposhnik.org> wrote:
> >
> > > I'm confused about whether this is one proposal or two proposals rolled into one. On one hand it seems like Apex and Malhar are independent. On the other hand the proposal covers Apex in great detail but not so much Malhar.
> > >
> > > As usual with ASF, the real question here is the community for the two.
> > > If we're talking about the same initial community for both -- I think it makes sense to treat them as a single project, not two.
> > >
> > > Thanks,
> > > Roman.
> > >
> > > On Wed, Aug 5, 2015 at 5:23 PM, Amol Kekre <a...@datatorrent.com> wrote:
> > > > I would like to start a discussion on DataTorrent's core engine and its operators joining the ASF as an incubating project under the name Apex.
> > > >
> > > > The proposal is available on the wiki here:
> > > > https://wiki.apache.org/incubator/ApexProposal
> > > >
> > > > The text of the proposal is also available at the end of this email
> > > >
> > > > Apex is an enterprise grade native YARN big data-in-motion platform that unifies batch and stream processing. Apex is a highly distributed, performant, fault tolerant, stateful and easily operable platform.
> > > >
> > > > Thanks in advance for your time and help.
> > > >
> > > > Thks,
> > > > Amol
> > > >
> > > > --------------------------------------------------------------------------------------------
> > > >
> > > > == Abstract ==
> > > > Apex is an enterprise grade native YARN big data-in-motion platform that unifies stream processing as well as batch processing. Apex processes big data in-motion in a highly scalable, highly performant, fault tolerant, stateful, secure, distributed, and easily operable way. It provides a simple API that enables users to write or re-use generic Java code, thereby lowering the expertise needed to write big data applications.
> > > >
> > > > Functional and operational specifications are separated. Apex is designed in a way to enable users to write their own code (aka user defined functions) as is and leave all operability to the platform. The API is very simple and is designed to allow users to drop in their code as is.
> > > > The platform mainly deals with operability and treats functional code as a black box. Operability includes fault tolerance, scalability, security, ease of use, metrics API, webservices, etc. In other words there is no separation of UDF (user defined functions), as all functional code is UDF. This frees users to focus on functional development, and lets the platform provide operability support. The same code runs as is with different operability attributes. The data-in-motion architecture of Apex unifies stream as well as batch processing in a single platform. Since Apex is a native YARN application, it leverages all the components of YARN without duplication. Apex was developed with YARN in mind and has no overlapping components/functionality with YARN.
> > > >
> > > > The Apex platform is supplemented by project Malhar, which is a library of operators that implement common business logic functions needed by customers who want to quickly develop applications. These operators provide access to HDFS, S3, NFS, FTP, and other file systems; Kafka, ActiveMQ, RabbitMQ, JMS, and other message systems; MySQL, Cassandra, MongoDB, Redis, HBase, CouchDB and other databases along with JDBC connectors. The Malhar library also includes a host of other common business logic patterns that help users to significantly reduce the time it takes to go into production. Ease of integration with all other big data technologies is one of the primary missions of Malhar.
> > > >
> > > > == Proposal ==
> > > > The goal of this proposal is to establish the core engine of the DataTorrent RTS product as an Apache Software Foundation (ASF) project in order to build a vibrant, diverse, and self-governed open source community around the technology.
> > > > DataTorrent will continue to sell management tools, application building tools, easy to use big data applications, and custom high end business logic operators. This proposal covers the Apex source code (written in Java), Apex documentation and other materials currently available on https://github.com/DataTorrent/Apex. This proposal also covers the Malhar source code (written in Java), Malhar documentation, and other materials currently available on https://github.com/DataTorrent/Malhar. We have done a trademark check on the name Apex, and have concluded that the Apex name is likely to be a suitable project name.
> > > >
> > > > == Background ==
> > > > DataTorrent RTS is a mature and robust product developed as a native YARN application. RTS 1.0 was launched in the summer of 2014; RTS 2.0 was launched in Jan 2015. Both were well received by customers. RTS 3.0 was launched at the end of July 2015. RTS is among the first enterprise grade platforms developed from the ground up as a native YARN application. DataTorrent RTS is currently maintained by engineers as a closed source project. Even though the engineers behind RTS are experienced software engineers and are knowledge leaders in data-in-motion platforms, they have had little exposure to the open source governance process. Customers are currently running applications based on DataTorrent RTS in production.
> > > >
> > > > == Rationale ==
> > > > Big data applications written for non-Hadoop platforms typically require major rewrites to get them to work with Hadoop. This rewriting creates a significant bottleneck in terms of resources (expertise), which in turn jeopardizes the viability of such an endeavour.
> > > > It is hard enough to acquire big data expertise; demanding additional expertise to do a major code conversion makes it a very hard problem for projects to successfully migrate to Hadoop. Also, due to the batch processing nature of Hadoop’s MapReduce paradigm, users often have to wait tens of minutes to see results and act on them due to various delays in data flow. DataTorrent’s RTS data-in-motion architecture is designed to address this problem. It enables even the non big data developer to write code and operate it in a scalable, fault tolerant manner. The big data-in-motion architecture of DataTorrent’s RTS enables ease of integration into current enterprise infrastructure. This goal was achieved by keeping the API simple and empowering users to put in the connector code as is (or with minimal changes).
> > > >
> > > > Malhar is a manifestation of this reality, and we or the customer engineers were able to create these connectors within a day or so if not within a week. Connectors include those to integrate with message bus(es), file systems, databases, and other protocols, and more continue to be added. Over a period of time we expect users to simply pick a connector that already exists in Malhar and quickly begin integrating with their current enterprise infrastructure. Within the data-in-motion architecture a stream application is one with connector(s) to, say, Kafka, JMS, or Flume, while a batch application is one with connector(s) to HDFS, HBase, FTP, NFS, S3n etc. This allows usage of the platform for both stream as well as batch processing with the same business logic.
> > > > Complete separation of user written application code from all operational aspects of the system, as well as support code for YARN, significantly expands the potential use cases that can migrate to use Hadoop.
> > > >
> > > > Apex will enable the Hadoop eco-system to migrate a lot more use cases. It will enable the Hadoop eco-system to deliver on a promise to rapidly transform current IT infrastructure. Apex will help in significantly increasing productization of big data projects. One of the main barometers of success in the Hadoop eco-system is significant reduction of time to market for big data applications migrating to Hadoop. We believe that Apex will be one of the platforms that will enable users to extract value from big data, by reducing time to market. This rapid innovation can be optimally achieved through a vibrant, diverse, self-governed community collectively innovating around Apex and the Malhar library, while at the same time cross-pollinating with various other big data platforms. ASF is an ideal place to meet this goal.
> > > >
> > > > == Initial Goals ==
> > > > Our initial goals are to bring Apex and Malhar repositories into the ASF, adapt internal engineering processes to open development, and foster a collaborative development model in accordance with the "Apache Way." DataTorrent plans to develop new functionality in an open, community-driven way. To get there, the existing internal build, test and release processes will be refactored to support open development. We already have an active user community on Google Groups that we intend to migrate to Apache.
> > > >
> > > > == Current Status ==
> > > > Currently, the project Apex code base is available under the Apache 2.0 license (https://github.com/DataTorrent/Apex). Project Malhar code base is available under the Apache 2.0 license (https://github.com/DataTorrent/Malhar). Project Malhar was open sourced 2 years ago, which should make it easy for the project Malhar team to adapt to an open, collaborative, and meritocratic environment. Contributors of Malhar are employees of DataTorrent or have agreed to the shift to Apache. Project Apex, in contrast, was developed as a proprietary, closed-source product, but the internal engineering practices adopted by the development team were common to Malhar, and should lend themselves well to an open environment. DataTorrent plans to execute a software grant agreement as part of the launch of the incubation of Apex as an Apache project.
> > > >
> > > > The DataTorrent team has always focused on building a robust end user community of paying and non-paying customers. We think that the existing community centered around the existing Google Groups mailing list should be relatively easy to transform into an Apache-style community including both users and developers.
> > > >
> > > > === Meritocracy ===
> > > > Our proposed list of initial committers includes the current RTS R&D team and our existing customers. This group will form a base for the broader community we will invite to collaborate on the codebase. We intend to radically expand the initial developer and user community by running the project in accordance with the "Apache Way". Users and new contributors will be treated with respect and welcomed.
> > > > By participating in the community and providing quality patches/support that move the project forward, they will earn merit. They also will be encouraged to provide non-code contributions (documentation, events, presentations, community management, etc.) and will gain merit for doing so. Those with a proven support and quality track record will be encouraged to become committers.
> > > >
> > > > === Community ===
> > > > If Apex is accepted for incubation, the primary initial goal will be transitioning the core community towards embracing the Apache Way of project governance. We will solicit major existing contributors to become committers on the project from the start. It should be noted that the existing community is already more diverse in many ways than some top-level Apache projects. We expect that we can encourage even more diversity.
> > > >
> > > > === Core Developers ===
> > > > While a few core developers are skilled in working in openly governed Apache communities, most of the core developers are currently NOT affiliated with the ASF and would require new ICLAs before committing to the project. There would also be a learning curve associated with this on-boarding. Changing current development practices to be more open will be an important step.
> > > >
> > > > === Alignment ===
> > > > The following existing ASF projects provide functionality related to that provided by Apex and should be considered when reviewing the Apex proposal:
> > > >
> > > > Apache Hadoop® is a distributed storage and processing framework for very large datasets focusing primarily on batch processing for analytic purposes. Apex is a native YARN application.
> > > > The Apex and Malhar roadmap includes plans to continue to leverage YARN, and help the YARN community develop the ability to support long running applications. Apex uses the DFS interface for its core checkpoint/commit. Malhar has a large number of operators that leverage HDFS and other Apache projects. Our roadmap includes plans to continue to deepen the currently close integration with HDFS.
> > > >
> > > > Apache HBase offers tabular data stored in Hadoop based on the Google Bigtable model. Malhar has HBase connectors to ease integration with HBase. The Malhar roadmap includes plans to continue to enhance integration with Apache HBase.
> > > >
> > > > Apache Kafka offers distributed and durable publish-subscribe messaging. Malhar integrates Kafka with Hadoop through feature rich connectors and supports ingest as well as analytical functions on incoming data. Raw data can be ingested from Kafka and results can be written to Kafka. The Malhar roadmap includes plans to continue to enhance integration with Apache Kafka.
> > > >
> > > > Apache Flume is a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of log data. Malhar has Flume connectors to ease integration with Flume. These connectors ensure that ingestion with Flume is fault tolerant and thus can be done in real-time with the same SLA as Flume’s HDFS connectors. The Malhar roadmap includes plans to continue to enhance integration with Apache Flume.
> > > >
> > > > Apache Cassandra is a highly scalable, distributed key-value store that focuses on eventual consistency. Malhar has connectors to ease integration with Cassandra. The Malhar roadmap includes plans to continue to enhance integration with Apache Cassandra.
> > > >
> > > > Apache Accumulo is a distributed key-value store based on Google’s BigTable design. Malhar has connectors to ease integration with Accumulo. The Malhar roadmap includes plans to continue to enhance integration with Apache Accumulo.
> > > >
> > > > Apache Tez is aimed at building an application framework which allows for a complex DAG of tasks for processing data. The Apex and Malhar roadmaps include plans to integrate with Apache Tez but this is not currently supported.
> > > >
> > > > Apache ActiveMQ and its sub project Apache Apollo offer a powerful message queue framework. Malhar has ActiveMQ connectors that ease integration with ActiveMQ.
> > > >
> > > > Apache Spark is an engine for processing large datasets, typically in a Hadoop cluster. The Malhar project makes it easy for users to integrate with Spark. The Malhar roadmap includes plans to continue to enhance integration with Apache Spark.
> > > >
> > > > Apache Flink is an engine for scalable batch and stream data processing. The Malhar project makes it easy for users to integrate with Flink. There is overlap in how Flink leverages data-in-motion architecture for both stream and batch processing, and it does subscribe to our thought process that data-in-motion can handle both stream and batch, while a batch-only engine will find it harder to manage streams. We differ in terms of how we handle operability, user defined code, metrics, webservices etc. Apex is very operationally oriented, while Flink has much more focus on functional elements. Malhar and rapid availability of common business logic is another differentiator. We believe both these approaches are valid and the community and innovation will gain through cross-pollination.
> > > > We plan to integrate with Apache Flink via HDFS for now.
> > > >
> > > > Apache Hive software facilitates querying and managing large datasets residing in distributed storage. The Malhar project makes it easy for users to integrate with Apache Hive. The Malhar roadmap includes plans to continue to enhance integration with Apache Hive.
> > > >
> > > > Apache Pig is a platform for analyzing large data sets. Pig consists of a high-level language for expressing data analysis programs, coupled with infrastructure for evaluating these programs. The Apex and Malhar roadmaps include plans to integrate with Apache Pig.
> > > >
> > > > Apache Storm is a distributed realtime computation system. Malhar makes it easy for users to integrate with Apache Storm. We plan to integrate with Apache Storm via HDFS for now. The Malhar roadmap includes plans to continue to support mechanisms for integration with Apache Storm.
> > > >
> > > > Apache Samza is a distributed stream processing framework. Malhar makes it easy for users to integrate with Apache Samza. We plan to integrate with Apache Samza via HDFS or Apache Kafka for now. The Malhar roadmap includes plans to continue to support mechanisms for integration with Apache Samza.
> > > >
> > > > Apache Slider is a YARN application to deploy existing distributed applications on YARN, monitor them, and make them larger or smaller as desired even when the application is running. Once Slider matures, we will take a look at close integration of Apex with Slider.
> > > >
> > > > Project Malhar and Apex are aligned to many more Apache projects and other open source projects, as ease of integration with other technologies is one of the primary goals of this project.
> > > > These include Apache Solr, ElasticSearch, MongoDB, Aerospike, ZeroMQ, CouchDB, CouchBase, MemCache, Redis, RabbitMQ, and Apache Derby.
> > > >
> > > > == Known Risks ==
> > > > Development has been sponsored mostly by a single company (DataTorrent, Inc.) thus far and coordinated mainly by the core DataTorrent RTS and Malhar team, with active participation from our current customers.
> > > >
> > > > For the project to fully transition to the Apache Way governance model, development must shift towards the merit-centric model of growing a community of contributors balanced with the needs for extreme stability and core implementation coherency.
> > > >
> > > > The tools and development practices in place for the DataTorrent RTS and Malhar products are compatible with the ASF infrastructure and thus we do not anticipate any on-boarding pains. Migration from the current GitHub repository is also expected to be straightforward.
> > > >
> > > > === Orphaned products ===
> > > > DataTorrent is fully committed to DataTorrent Apex and Malhar and the product will continue to be based on the Apex project. Moreover, DataTorrent has a vested interest in making Apex succeed by driving its close integration with sister ASF projects. We expect this to further reduce the risk of orphaning the product.
> > > >
> > > > === Inexperience with Open Source ===
> > > > DataTorrent has embraced open source software by open sourcing the Malhar project under the Apache 2.0 license. The DataTorrent team includes veterans from the Yahoo! Hadoop team. Although some of the initial committers have not been developers on an entirely open source, community-driven project, we expect to bring to bear the open development practices of Malhar to the Apex project.
> > > > Additionally, several ASF veterans agreed to mentor the project and are listed in this proposal. The project will rely on their guidance and collective wisdom to quickly transition the entire team of initial committers towards practicing the Apache Way. DataTorrent is also driving the Kafka on YARN (KOYA) initiative.
> > > >
> > > > === Homogeneous Developers ===
> > > > While most of the initial committers are employed by DataTorrent, we have already seen a healthy level of interest from our existing customers and partners. We intend to convert that interest directly into participation and will be investing in activities to recruit additional committers from other companies.
> > > >
> > > > === Reliance on Salaried Developers ===
> > > > Most of the contributors are paid to work in the Big Data space. While they might wander from their current employers, they are unlikely to venture far from their core expertise and thus will continue to be engaged with the project regardless of their current employers.
> > > >
> > > > === Relationships with Other Apache Products ===
> > > > As mentioned in the Alignment section, Apex may consider various degrees of integration and code exchange with Apache Hadoop (YARN and HDFS), Apache Kafka, Apache HBase, Apache Flume, Apache Cassandra, Apache Accumulo, Apache Tez, Apache Hive, Apache Pig, Apache Storm, Apache Samza, Apache Spark, and Apache Slider. Given the success that the DataTorrent RTS product enjoyed, we expect integration points to be inside and outside the project. We look forward to collaborating with these communities as well as other communities under the Apache umbrella.
> > > >
> > > > === An Excessive Fascination with the Apache Brand ===
> > > > While we intend to leverage the Apache ‘branding’ when talking to other projects as a testament to our project’s ‘neutrality’, we have no plans for making use of the Apache brand in press releases nor posting billboards advertising acceptance of Apex into the Apache Incubator.
> > > >
> > > > == Documentation ==
> > > > See the current state of the project documentation, available as part of the GitHub repositories: https://github.com/DataTorrent/Apex and https://github.com/DataTorrent/Malhar. In addition, a list of demos that serve as a how-to guide is available at https://github.com/DataTorrent/Malhar/tree/master/demos
> > > >
> > > > == Initial Source ==
> > > > DataTorrent has released the source code for Apex under the Apache 2.0 License at https://github.com/DataTorrent/Apex, and that of Malhar under the Apache 2.0 License at https://github.com/DataTorrent/Malhar. We encourage ASF community members interested in this proposal to download the source code, review it and try out the software.
> > > >
> > > > == Source and Intellectual Property Submission Plan ==
> > > > As soon as Apex is approved to join the Apache Incubator, DataTorrent will execute a Software Grant Agreement and the source code will be transitioned onto ASF infrastructure. The code is already licensed under the Apache Software License, version 2.0. We know of no legal encumbrances that would inhibit the transfer of source code to the ASF.
> > > >
> > > > == External Dependencies ==
> > > > All dependencies fall under the permissive license categories, or weak copyleft (http://www.apache.org/legal/resolved.html#category-b).
> > > > We intend to remove the dependencies on GPL licensed technologies on which Apex or Malhar depend. These technologies are optional and have been marked as such.
> > > >
> > > > Embedded dependencies (relocated):
> > > > * None
> > > >
> > > > Runtime dependencies:
> > > > * activemq-client
> > > > * ant
> > > > * async-http-client
> > > > * bval-jsr303
> > > > * commons-beanutils
> > > > * commons-codec
> > > > * commons-lang3
> > > > * commons-compiler
> > > > * embassador
> > > > * fastutil
> > > > * guava
> > > > * hadoop-common
> > > > * hadoop-common-tests
> > > > * hadoop-yarn-client
> > > > * httpclient
> > > > * jackson-core-asl
> > > > * jackson-mapper-asl
> > > > * javax.mail
> > > > * jersey-apache-client4
> > > > * jersey-client
> > > > * jetty-servlet
> > > > * jetty-websocket
> > > > * jline
> > > > * kryo
> > > > * named-regexp
> > > > * netlet
> > > > * rhino (GPL 2.0, optional)
> > > > * slf4j-api
> > > > * slf4j-log4j12
> > > > * validation-api
> > > > * xbean-asm5-shaded
> > > > * zip4j
> > > >
> > > > Module or optional dependencies:
> > > > * accumulo-core
> > > > * aerospike-client
> > > > * amqp-client
> > > > * aws-java-sdk-kinesis
> > > > * cassandra-driver-core
> > > > * couchbase-client
> > > > * CouchbaseMock
> > > > * elasticsearch
> > > > * geoip-api (LGPL, optional)
> > > > * hbase
> > > > * hbase-client
> > > > * hbase-server
> > > > * hive-exec
> > > > * hive-service
> > > > * hiveunit
> > > > * javax.mail-api
> > > > * jedis
> > > > * jms-api
> > > > * jri (GPL, optional)
> > > > * jriengine (LGPL, optional)
> > > > * jruby (LGPL, optional)
> > > > * jython (PSF License, optional)
> > > > * jzmq (LGPL, optional)
> > > > * kafka_2.10
> > > > * lettuce (GPL, optional)
> > > > * libthrift
> > > > * Memcached-Java-Client
> > > > * mongo-java-driver
> > > > * mqtt-client
> > > > * mysql-connector-java (GPL2, optional)
> > > > * org.ektorp
> > > > * rengine (LGPL, optional)
> > > > * rome
> > > > * solr-core
> > > > * solr-solrj
> > > > * spymemcached
> > > > * sqlite4java
> > > > * super-csv
> > > > * twitter4j-core
> > > > * twitter4j-stream
> > > > * uadetector-resources
> > > > * org.apache.servicemix.bundles.splunk
> > > >
> > > > Build only dependencies:
> > > > * None
> > > >
> > > > Test only dependencies:
> > > > * activemq-broker
> > > > * activemq-kahadb-store
> > > > * greenmail
> > > > * hadoop-yarn-server-tests
> > > > * hsqldb
> > > > * janino
> > > > * junit
> > > > * MockFtpServer
> > > > * mockito-all
> > > > * testng
> > > >
> > > > Cryptography N/A
> > > >
> > > > == Required Resources ==
> > > > === Mailing lists ===
> > > > * priv...@apex.incubator.apache.org (moderated subscriptions)
> > > > * comm...@apex.incubator.apache.org
> > > > * d...@apex.incubator.apache.org
> > > > * iss...@apex.incubator.apache.org
> > > > * u...@apex.incubator.apache.org
> > > >
> > > > === Git Repository ===
> > > > * https://git-wip-us.apache.org/repos/asf/incubator-apex.git
> > > > * https://git-wip-us.apache.org/repos/asf/incubator-malhar.git
> > > >
> > > > === Issue Tracking ===
> > > > * JIRA Project Apex (APEX)
> > > > * JIRA Project Malhar (MALHAR)
> > > >
> > > > === Other Resources ===
> > > > * Means of setting up regular builds for Apex on builds.apache.org
> > > > * Means of setting up regular builds for Malhar on builds.apache.org
> > > >
> > > > === Rationale for Malhar and Apex having separate git and jira ===
> > > > We managed Malhar and Apex as two repos and two jiras on purpose. Both code bases are released under Apache 2.0 and are proposed for incubation. In terms of our vision to enable innovation around a native YARN data-in-motion platform that unifies stream processing as well as batch processing, Malhar and Apex go hand in hand.
> > > > Apex has a base API that consists of a Java API (functional) and attributes (operability). Malhar is a manifestation of this API, but from the user perspective, Malhar is itself an API to leverage business logic. Over the past three years we have found that the cadence of release and API changes in Malhar is much more rapid than in Apex and it was operationally much easier to separate them into their own repos. Two repos will reflect clear separation of engine (Apex) and operators/business logic (Malhar), and reflect different developer roles. It will allow for independent release cycles (operator change independent of engine due to stable API). We however do not believe in two levels of committers. We believe there should be one community that works across both and innovates with the idea that Malhar and Apex combined provide the value proposition. We are proposing that the Apache incubation process help us to foster development of one community (mailing list, committers), and yet be ok with two repos. We are proposing that this be taken up during incubation. The community will learn if this works. The decision on whether to split them into two projects should be taken after the learning curve during incubation.
> > > >
> > > > == Initial Committers ==
> > > > * Roma Ahuja (rahuja at directv dot com)
> > > > * Isha Arkatkar (isha at datatorrent dot com)
> > > > * Raja Ali (raji at silverspringnet dot com)
> > > > * Sunaina Chaudhary (SChaudhary at directv dot com)
> > > > * Bhupesh Chawda (bhupesh at datatorrent dot com)
> > > > * Chaitanya Chelobu (chaitanya at datatorrent dot com)
> > > > * Bright Chen (bright at datatorrent dot com)
> > > > * Pradeep Dalvi (pradeep dot dalvi at datatorrent dot com)
> > > > * Sandeep Deshmukh (sandeep at datatorrent dot com)
> > > > * Yogi Devendra (yogi at datatorrent dot com)
> > > > * Cem Ezberci (hasan dot ezberci at ge dot com)
> > > > * Timothy Farkas (tim at datatorrent dot com)
> > > > * Ilya Ganelin (ilya dot ganelin at capitalone dot com)
> > > > * Parag Goradia (parag dot goradia at ge dot com)
> > > > * Tushar Gosavi (tushar at datatorrent dot com)
> > > > * Priyanka Gugale (priyanka at datatorrent dot com)
> > > > * Gaurav Gupta (gaurav at datatorrent dot com)
> > > > * Sandesh Hegde (sandesh at datatorrent dot com)
> > > > * Siyuan Hua (siyuan at datatorrent dot com)
> > > > * Ajith Joseph (ajoseph at silverspring dot com)
> > > > * Amol Kekre (amol at datatorrent dot com)
> > > > * Chinmay Kolhatkar (chinmay at datatorrent dot com)
> > > > * Pramod Immaneni (pramod at datatorrent dot com)
> > > > * Anuj Lal (anuj dot lal at ge dot com)
> > > > * Dongsu Lee (dlee3 at directv dot com)
> > > > * Vitaly Li (blossom dot valley at gmail dot com)
> > > > * Dean Lockgaard (dean at datatorrent dot com)
> > > > * Rohan Mehta (rohan_mehta at apple dot com)
> > > > * Adi Mishra (apmishra at directv dot com, adi dot mishra at gmail dot com)
> > > > * Chetan Narsude (chetan at datatorrent dot com)
> > > > * Darin Nee (dnee at silverspring dot com)
> > > > * Alexander Parfenov (sasha at datatorrent dot com)
> > > > * Andrew Perlitch (andy at datatorrent dot com)
> > > > * Shubham Phatak (shubham at datatorrent dot com)
> > > > * Ashwin Putta (ashwin at datatorrent dot com)
> > > > * Rikin Shah (rikin dot shah at capitalone dot com)
> > > > * Luis Ramos (l dot ramos at ge dot com)
> > > > * Munagala Ramanath (ram at datatorrent dot com)
> > > > * Vlad Rozov (vlad dot rozov at datatorrent dot com)
> > > > * Atri Sharma (atri dot jiit at gmail dot com)
> > > > * Chandni Singh (chandni at datatorrent dot com)
> > > > * Venkatesh Sivasubramanian (venkateshs at ge dot com)
> > > > * Aniruddha Thombare (aniruddha at datatorrent dot com)
> > > > * Jessica Wang (jessica at datatorrent dot com)
> > > > * Thomas Weise (thomas at datatorrent dot com)
> > > > * David Yan (david at datatorrent dot com)
> > > > * Kevin Yang (yang dot k at ge dot com)
> > > > * Brennon York (brennon dot york at capitalone dot com)
> > > >
> > > > == Affiliations ==
> > > > * Apple: Vitaly Li, Rohan Mehta
> > > > * Barclays: Atri Sharma
> > > > * Class Software: Justin Mclean
> > > > * CapitalOne: Ilya Ganelin, Rikin Shah, Brennon York
> > > > * DataTorrent: everyone else on this proposal
> > > > * DirecTV: Roma Ahuja, Sunaina Chaudhary, Dongsu Lee, Adi Mishra
> > > > * General Electric: Cem Ezberci, Parag Goradia, Anuj Lal, Luis Ramos, Venkatesh Sivasubramanian, Kevin Yang
> > > > * Hortonworks: Alan Gates, Taylor Goetz, Chris Nauroth, Hitesh Shah
> > > > * MapR: Ted Dunning
> > > > * SilverSpring Networks: Raja Ali, Ajith Joseph, Darin Nee
> > > >
> > > > == Sponsors ==
> > > >
> > > > === Champion ===
> > > > Ted Dunning
> > > >
> > > > === Nominated Mentors ===
> > > >
> > > > The initial mentors are listed below:
> > > > * Ted Dunning - Apache Member, MapR
> > > > * Alan Gates - Apache Member, Hortonworks
> > > > * Taylor Goetz - Apache Member, Hortonworks
> > > > * Justin Mclean - Apache Member, Class Software
> > > > * Chris Nauroth - Apache Member, Hortonworks
> > > > * Hitesh Shah - Apache Member, Hortonworks
> > > >
> > > > === Sponsoring Entity ===
> > > >
> > > > We would like to propose
> > > > Apache Incubator to sponsor this project.