Re: [VOTE] Retire ALOIS podling

2011-06-26 Thread Mohammad Nour El-Din
+1

On Wed, Jun 22, 2011 at 7:06 PM, Henri Yandell  wrote:
> +1.
>
> Source code should be removed from SVN as the podling has not signed
> off on its copyright items.
>
> Hen
>
> On Tue, Jun 21, 2011 at 8:52 AM, Christian Grobmeier
>  wrote:
>> Hello,
>>
>> as already mentioned last week, the ALOIS project is dead and it seems
>> there is no way to recover in the near future (or even later). The
>> developers told me in a private message in March that they cannot
>> continue due to personal reasons. It seems this has come true.
>>
>> I have set up a vote on the dev mailing list:
>>  * http://s.apache.org/eBx
>> (Note: one of the voters responded on the private list - I counted the vote)
>>
>> So far, no releases have been made.
>>
>> That vote passed a few hours ago, after being open for 5 days.
>>
>> Please vote for the retirement of the ALOIS podling. If this vote passes,
>> I will proceed with the retirement steps and finally retire it.
>>
>> Thanks,
>> Christian
>>
>> [] +1 - please retire
>> [] +/-0
>> [] -1 - please don't retire, because...
>>
>> -
>> To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
>> For additional commands, e-mail: general-h...@incubator.apache.org
>>
>>
>
> -
> To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
> For additional commands, e-mail: general-h...@incubator.apache.org
>
>



-- 
Thanks
- Mohammad Nour
  Author of (WebSphere Application Server Community Edition 2.0 User Guide)
  http://www.redbooks.ibm.com/abstracts/sg247585.html
- LinkedIn: http://www.linkedin.com/in/mnour
- Blog: http://tadabborat.blogspot.com

"Life is like riding a bicycle. To keep your balance you must keep moving"
- Albert Einstein

"Writing clean code is what you must do in order to call yourself a
professional. There is no reasonable excuse for doing anything less
than your best."
- Clean Code: A Handbook of Agile Software Craftsmanship

"Stay hungry, stay foolish."
- Steve Jobs

-
To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
For additional commands, e-mail: general-h...@incubator.apache.org



Re: [PROPOSAL] Oozie for the Apache Incubator

2011-06-26 Thread Mohammad Nour El-Din
+1

On Sat, Jun 25, 2011 at 2:15 AM, Phillip Rhodes
 wrote:
> On Fri, Jun 24, 2011 at 3:46 PM, Mohammad Islam  wrote:
>
>> Hi,
>>
>> I would like to propose Oozie to be an Apache Incubator project.
>> Oozie is a server-based workflow scheduling and coordination system to
>> manage
>> data processing jobs for Apache Hadoop.
>>
>>
>> +1
>






Re: [PROPOSAL] Kafka for the Apache Incubator

2011-06-26 Thread Mohammad Nour El-Din
+1 on the proposal, looking forward to the [VOTE] thread starting.

On Sat, Jun 25, 2011 at 3:01 AM, Joe Key  wrote:
> +1
> We will be using it heavily here at HomeHealthCareSOS.com to relay app
> server logs to our DW and Hadoop cluster.
>
> --
> Joe Andrew Key (Andy)
>






Re: [PROPOSAL] Oozie for the Apache Incubator

2011-06-26 Thread Angelo Kaichen Huang
+1

Thanks to the team. I look forward to this project.

Thanks,
Angelo

> On Fri, Jun 24, 2011 at 3:46 PM, Mohammad Islam  wrote:
>
>> Hi,
>>
>> I would like to propose Oozie to be an Apache Incubator project.
>> Oozie is a server-based workflow scheduling and coordination system to
>> manage
>> data processing jobs for Apache Hadoop.
>>
>>
>> +1
>
-
To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
For additional commands, e-mail: general-h...@incubator.apache.org

Re: [PROPOSAL] Oozie for the Apache Incubator

2011-06-26 Thread Suresh Marru
Interesting project. Time permitting, I would like to contribute to the
workflow effort.

--Suresh

On Jun 24, 2011, at 3:46 PM, Mohammad Islam wrote:

> Hi,
> 
> I would like to propose Oozie to be an Apache Incubator project.  
> Oozie is a server-based workflow scheduling and coordination system to manage 
> data processing jobs for Apache Hadoop. 
> 
> 
> Here's a link to the proposal in the Incubator wiki
> http://wiki.apache.org/incubator/OozieProposal
> 
> 
> I've also pasted the initial contents below.
> 
> Regards,
> 
> Mohammad Islam
> 
> 
> Start of Oozie Proposal 
> 
> Abstract
> Oozie is a server-based workflow scheduling and coordination system to manage
> data processing jobs for Apache Hadoop™.
> 
> Proposal
> Oozie is an extensible, scalable and reliable system to define, manage,
> schedule, and execute complex Hadoop workloads via web services. More
> specifically, this includes:
> 
>   * XML-based declarative framework to specify a job or a complex workflow
>     of dependent jobs.
>   * Support for different types of jobs such as Hadoop Map-Reduce, Pipe,
>     Streaming, Pig, Hive and custom Java applications.
>   * Workflow scheduling based on frequency and/or data availability.
>   * Monitoring capability, automatic retry and failure handling of jobs.
>   * Extensible and pluggable architecture to allow arbitrary grid
>     programming paradigms.
>   * Authentication, authorization, and capacity-aware load throttling to
>     allow multi-tenant software as a service.
> 
> Background
> Most data processing applications require multiple jobs to achieve their
> goals, with inherent dependencies among the jobs. A dependency could be
> sequential, where one job can only start after another job has finished. Or
> it could be conditional, where the execution of a job depends on the return
> value or status of another job. In other cases, parallel execution of
> multiple jobs may be permitted – or desired – to exploit the massive pool of
> compute nodes provided by Hadoop.
> 
> These job dependencies are often expressed as a Directed Acyclic Graph, also
> called a workflow. A node in the workflow is typically a job (a computation
> on the grid) or another type of action such as an e-mail notification.
> Computations can be expressed in map/reduce, Pig, Hive or any other
> programming paradigm available on the grid. Edges of the graph represent
> transitions from one node to the next, as the execution of a workflow
> proceeds.
> 
> Describing a workflow in a declarative way has the advantage of decoupling
> job dependencies and execution control from application logic. Furthermore,
> the workflow is modularized into jobs that can be reused within the same
> workflow or across different workflows. Execution of the workflow is then
> driven by a runtime system without understanding the application logic of
> the jobs. This runtime system specializes in reliable and predictable
> execution: it can retry actions that have failed or invoke a cleanup action
> after termination of the workflow; it can monitor progress, success, or
> failure of a workflow, and send appropriate alerts to an administrator. The
> application developer is relieved from implementing these generic
> procedures.
> 
> Furthermore, some applications or workflows need to run in periodic
> intervals or when dependent data is available. For example, a workflow could
> be executed every day as soon as output data from the previous 24 instances
> of another, hourly workflow is available. The workflow coordinator provides
> such scheduling features, along with prioritization, load balancing and
> throttling to optimize utilization of resources in the cluster. This makes
> it easier to maintain, control, and coordinate complex data applications.
> 
> Nearly three years ago, a team of Yahoo! developers addressed these critical
> requirements for Hadoop-based data processing systems by developing a new
> workflow management and scheduling system called Oozie. While it was
> initially developed as a Yahoo!-internal project, it was designed and
> implemented with the intention of open-sourcing. Oozie was released as a
> GitHub project in early 2010. Oozie is used in production within Yahoo! and,
> since being open-sourced, it has been gaining adoption among external
> developers.
> 
> Rationale
> Commonly, applications that run on Hadoop require multiple Hadoop jobs in
> order to obtain the desired results. Furthermore, these Hadoop jobs are
> commonly a combination of Java map-reduce jobs, Streaming map-reduce jobs,
> Pipes map-reduce jobs, Pig jobs, Hive jobs, HDFS operations, Java programs
> and shell scripts.
> 
> Because of this, developers find themselves writing ad-hoc glue programs to
> combine these Hadoop jobs. These ad-hoc programs are di
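[Editor's note: the DAG-of-jobs model the proposal describes – nodes are jobs, edges are transitions, and a job runs only after all of its predecessors finish – can be sketched in a few lines of Python. This is an illustrative toy with made-up job names, not Oozie code:]

```python
# Toy sketch of a workflow DAG: each key is a job, each value lists the
# jobs it depends on. A job may run only after all its parents finish.
# This is NOT Oozie's implementation -- just the execution model.
from graphlib import TopologicalSorter

# Hypothetical workflow: an ingest step feeding a Pig job and a Hive
# job, whose outputs are combined by a final map-reduce step.
workflow = {
    "ingest": [],
    "pig-clean": ["ingest"],
    "hive-aggregate": ["ingest"],
    "mr-join": ["pig-clean", "hive-aggregate"],
}

def execution_order(dag):
    """Return one valid sequential execution order for the workflow."""
    return list(TopologicalSorter(dag).static_order())

print(execution_order(workflow))
```

[A real runtime such as Oozie layers retries, monitoring, and frequency- or data-triggered scheduling on top of this ordering; `graphlib` is in the Python 3.9+ standard library.]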

Re: [PROPOSAL] Oozie for the Apache Incubator

2011-06-26 Thread Thilina Gunarathne
+1..  Very interesting stuff..

thanks,
Thilina

On Sun, Jun 26, 2011 at 7:12 PM, Suresh Marru  wrote:

> Interesting Project. Time permitting, I would like to contribute to the
> workflow effort
>
> --Suresh
>
> On Jun 24, 2011, at 3:46 PM, Mohammad Islam wrote:
>
> > Hi,
> >
> > I would like to propose Oozie to be an Apache Incubator project.
> > Oozie is a server-based workflow scheduling and coordination system to manage
> > data processing jobs for Apache Hadoop.
> >
> > [...]

Re: [PROPOSAL] Oozie for the Apache Incubator

2011-06-26 Thread Mayank Bansal
+1

Thanks a lot, team. I look forward to contributing more to the project.

Thanks,
Mayank

On Sun, Jun 26, 2011 at 4:24 PM, Thilina Gunarathne wrote:

> +1..  Very interesting stuff..
>
> thanks,
> Thilina
>
> On Sun, Jun 26, 2011 at 7:12 PM, Suresh Marru  wrote:
>
> > Interesting Project. Time permitting, I would like to contribute to the
> > workflow effort
> >
> > --Suresh
> >
> > On Jun 24, 2011, at 3:46 PM, Mohammad Islam wrote:
> >
> > > Hi,
> > >
> > > I would like to propose Oozie to be an Apache Incubator project.
> > > Oozie is a server-based workflow scheduling and coordination system to manage
> > > data processing jobs for Apache Hadoop.
> > >
> > > [...]

KEYS and releases

2011-06-26 Thread Craig L Russell

Hi,

I've recently noticed a podling (ok, it was gora ;-) including KEYS as  
part of the release package to be voted on.


I've seen some projects that include KEYS as part of a release.

I've seen other projects add KEYS to the top level of the incubator/
directory, where the same file is updated and used for each
subsequent release.


My preference would be to manage KEYS separately from releases. Any
time a new prospective release manager is added to the project, the
KEYS file would be updated without a vote, since it's not part of a
release. No KEYS file would then need to be included in the release
artifacts.


Thoughts?

Craig


Craig L Russell
Secretary, Apache Software Foundation
Chair, OpenJPA PMC
c...@apache.org http://db.apache.org/jdo



RE: [VOTE] Retire ALOIS podling

2011-06-26 Thread Noel J. Bergman
+1

--- Noel



-
To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
For additional commands, e-mail: general-h...@incubator.apache.org



RE: [VOTE] Retire Stonehenge

2011-06-26 Thread Noel J. Bergman
+1

--- Noel


-
To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
For additional commands, e-mail: general-h...@incubator.apache.org



RE: KEYS and releases

2011-06-26 Thread Noel J. Bergman
It seems to me to be a bad idea to distribute keys with releases.  And don't
we already have some ASF-wide policy for managing keys?

--- Noel



-
To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
For additional commands, e-mail: general-h...@incubator.apache.org



Re: KEYS and releases

2011-06-26 Thread Christian Grobmeier
> It seems to me to be a bad idea to distribute keys with releases.

+1

> And don't
> we already have some ASF-wide policy for managing keys?

I don't know if there is a policy, but I recently found this:

http://people.apache.org/foaf/index.html
"PGP keys may additionally be added to your profile on
https://id.apache.org/. This will cause them to be added to
https://people.apache.org/keys/, and make them available to other
infrastructure tools in the future."

Then it would be available here:
https://people.apache.org/keys/group/

Why shouldn't all podlings use this as a central KEYS file?

Cheers
Christian
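
[Editor's note: for reference, the round trip a KEYS file supports looks roughly like the sketch below. The file names and key user ID are made up, and it assumes GnuPG 2.1 or later:]

```shell
# Hypothetical sketch of the KEYS round trip (GnuPG 2.1+; names made up).

# Release-manager side: create a signing key in a scratch keyring,
# sign the artifact, and export the public key into KEYS.
export GNUPGHOME="$(mktemp -d)"
gpg --batch --pinentry-mode loopback --passphrase '' \
    --quick-generate-key 'Release Manager <rm@example.org>' default default never
printf 'release artifact\n' > podling-0.1-incubating.tar.gz
gpg --batch --pinentry-mode loopback --passphrase '' \
    --armor --detach-sign podling-0.1-incubating.tar.gz
gpg --armor --export rm@example.org > KEYS

# Downloader side: a fresh keyring simulating someone who only has the
# published KEYS file; import it, then verify the detached signature.
export GNUPGHOME="$(mktemp -d)"
gpg --quiet --import KEYS
gpg --verify podling-0.1-incubating.tar.gz.asc podling-0.1-incubating.tar.gz
```

[Since KEYS is just exported public keys, adding a new release manager's key never touches a released artifact, which is why it can live outside the release.]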

-
To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
For additional commands, e-mail: general-h...@incubator.apache.org