+1 (binding)
On Sun, Apr 19, 2015 at 4:48 PM, Whitehall, Kim D (398M) <kim.d.whiteh...@jpl.nasa.gov> wrote:
>
> +1 from me too!
> Best regards
> Kim
> This email was sent from a mobile device. Please excuse typos and/or
> brevity.
>
> > On Apr 19, 2015, at 11:47 AM, "jan i" <j...@apache.org> wrote:
> >
> >> On Sunday, April 19, 2015, Louis Suárez-Potts <lui...@gmail.com> wrote:
> >>
> >>> On 19 Apr 2015, at 01:00, Mattmann, Chris A (3980) <chris.a.mattm...@jpl.nasa.gov> wrote:
> >>>
> >>> OK all, discussion has died down, we have 3 mentors, I think it’s
> >>> time to proceed to a VOTE.
> >>>
> >>> I am calling a VOTE now to accept the Climate Model Diagnostic
> >>> Analyzer (CMDA) into the Apache Incubator. The VOTE is open for
> >>> at least the next 72 hours:
> >>>
> >>> [ ] +1 Accept Apache Climate Model Diagnostic Analyzer into the Apache
> >>> Incubator.
> >>> [ ] +0 Abstain.
> >>> [ ] -1 Don’t accept Apache Climate Model Diagnostic Analyzer into the
> >>> Apache Incubator
> >>> because…
> >>
> >> +1
> >
> > +1 (binding)
> >
> > rgds
> > jan i
> >
> >> -louis (non-binding)
> >> PS this came across with double bang priority. Really?
> >>
> >>> I’ll try and close the VOTE out on Friday.
> >>>
> >>> Of course I am +1!
> >>>
> >>> P.S. the text of the latest wiki proposal is pasted below:
> >>>
> >>> Cheers,
> >>> Chris
> >>>
> >>>
> >>> = Apache ClimateModelDiagnosticAnalyzer Proposal =
> >>>
> >>> == Abstract ==
> >>>
> >>> The Climate Model Diagnostic Analyzer (CMDA) provides web services for
> >>> multi-aspect physics-based and phenomenon-oriented climate model
> >>> performance evaluation and diagnosis through the comprehensive and
> >>> synergistic use of multiple observational data, reanalysis data, and
> >>> model outputs.
> >>>
> >>> == Proposal ==
> >>>
> >>> The proposed web-based tools let users display, analyze, and download
> >>> earth science data interactively.
> >>> These tools help scientists quickly examine data to identify specific
> >>> features, such as trends and geographical distributions, and determine
> >>> whether further study is needed. All of the tools are designed and
> >>> implemented to be general, so that data from models, observations, and
> >>> reanalysis are processed and displayed in a unified way to facilitate
> >>> fair comparisons. The services prepare and display data as a colored
> >>> map or an X-Y plot and allow users to download the analyzed data.
> >>> Basic visual capabilities include 1) displaying a two-dimensional
> >>> variable as a map, zonal mean, and time series, and 2) displaying a
> >>> three-dimensional variable’s zonal mean, a two-dimensional slice at a
> >>> specific altitude, and a vertical profile. General analysis can be
> >>> done using the difference, scatter plot, and conditional sampling
> >>> services. All the tools support display options for linear or
> >>> logarithmic scales and allow users to specify a temporal range and
> >>> months of the year. The source/input datasets for these tools are
> >>> CMIP5 model outputs, Obs4MIP observational datasets, and ECMWF
> >>> reanalysis datasets. They are stored on the server and are selectable
> >>> by a user through the web services.
> >>>
> >>> === Service descriptions ===
> >>>
> >>> 1. '''Two-dimensional variable services'''
> >>>
> >>> * Map of a two-dimensional variable: This service displays a
> >>> two-dimensional variable as a colored longitude-latitude map, with
> >>> values represented by a color scheme. Longitude and latitude ranges
> >>> can be specified to magnify a specific region.
> >>>
> >>> * Two-dimensional variable zonal mean: This service plots the zonal
> >>> mean value of a two-dimensional variable as a function of latitude as
> >>> an X-Y plot.
> >>>
> >>> * Two-dimensional variable time series: This service displays the
> >>> average of a two-dimensional variable over the specified region as a
> >>> function of time as an X-Y plot.
> >>>
> >>> 2. '''Three-dimensional variable services'''
> >>>
> >>> * Map of a two-dimensional slice of a three-dimensional variable: This
> >>> service displays a two-dimensional slice of a three-dimensional
> >>> variable at a specific altitude as a colored longitude-latitude map,
> >>> with values represented by a color scheme.
> >>>
> >>> * Three-dimensional zonal mean: This service computes the zonal mean
> >>> of the specified three-dimensional variable and displays it as a
> >>> colored altitude-latitude map.
> >>>
> >>> * Vertical profile of a three-dimensional variable: This service
> >>> computes the area-weighted average of a three-dimensional variable
> >>> over the specified region and displays the average as a function of
> >>> pressure level (altitude) as an X-Y plot.
> >>>
> >>> 3. '''General services'''
> >>>
> >>> * Difference of two variables: This service displays the differences
> >>> between two variables, each of which can be either a two-dimensional
> >>> variable or a slice of a three-dimensional variable at a specified
> >>> altitude, as colored longitude-latitude maps.
> >>>
> >>> * Scatter and histogram plots of two variables: This service displays
> >>> the scatter plot (X-Y plot) of two specified variables and the
> >>> histograms of the two variables. The number of samples can be
> >>> specified, and the correlation is computed. Each of the two variables
> >>> can be either a two-dimensional variable or a slice of a
> >>> three-dimensional variable at a specific altitude.
> >>>
> >>> * Conditional sampling: This service lets the user sort a physical
> >>> quantity of two or three dimensions according to the values of another
> >>> variable (an environmental condition, e.g.
> >>> SST), which may be a two-dimensional variable or a slice of a
> >>> three-dimensional variable at a specific altitude. For a
> >>> two-dimensional quantity, the result is displayed as an X-Y plot, and
> >>> for a three-dimensional quantity, it is displayed as a colored map.
> >>>
> >>>
> >>> == Background and Rationale ==
> >>>
> >>> The latest Intergovernmental Panel on Climate Change (IPCC) Fourth
> >>> Assessment Report stressed the need for comprehensive and innovative
> >>> evaluation of climate models with newly available global observations.
> >>> The traditional approach to climate model evaluation, which compares a
> >>> single parameter at a time, identifies symptomatic model biases and
> >>> errors but fails to diagnose the underlying model problems. The model
> >>> diagnosis process requires physics-based multi-variable comparisons,
> >>> which typically involve large-volume and heterogeneous datasets and
> >>> computationally demanding, data-intensive operations. We propose to
> >>> develop a computationally efficient information system to enable
> >>> physics-based multi-variable model performance evaluation and
> >>> diagnosis through the comprehensive and synergistic use of multiple
> >>> observational data, reanalysis data, and model outputs.
> >>>
> >>> Satellite observations have been widely used in model-data
> >>> inter-comparisons and model evaluation studies. These studies normally
> >>> involve the comparison of a single parameter at a time using a time
> >>> and space average. For example, modeling cloud-related processes in
> >>> global climate models requires cloud parameterizations that provide
> >>> quantitative rules for expressing the location, frequency of
> >>> occurrence, and intensity of the clouds in terms of multiple
> >>> large-scale model-resolved parameters such as temperature, pressure,
> >>> humidity, and wind.
> >>> One can evaluate the performance of the cloud parameterization by
> >>> comparing the cloud water content with satellite data, and can thereby
> >>> identify symptomatic model biases or errors. However, in order to
> >>> understand the cause of the biases and errors, one has to
> >>> simultaneously investigate several parameters that are integrated in
> >>> the cloud parameterization.
> >>>
> >>> Such studies, aimed at a multi-parameter model diagnosis, require
> >>> locating, understanding, and manipulating multi-source observation
> >>> datasets, model outputs, and (re)analysis outputs that are physically
> >>> distributed, massive in volume, heterogeneous in format, and provide
> >>> little information on data quality and production legacy.
> >>> Additionally, these studies involve various data preparation and
> >>> processing steps that can easily become computationally demanding,
> >>> since many datasets have to be combined and processed simultaneously.
> >>> It is well known that scientists spend more than 60% of their research
> >>> time just preparing datasets before they can be analyzed.
> >>>
> >>> To address these challenges, we propose to build the Climate Model
> >>> Diagnostic Analyzer (CMDA), which will enable a streamlined and
> >>> structured preparation of multiple large-volume and heterogeneous
> >>> datasets, and provide a computationally efficient approach to
> >>> processing the datasets for model diagnosis. We will leverage the
> >>> existing information technologies and scientific tools that we
> >>> developed in our current NASA ROSES COUND, MAP, and AIST projects. We
> >>> will use open-source web-service technology.
> >>> We will make CMDA complementary to other climate model analysis tools
> >>> currently available to the research community (e.g., PCMDI’s CDAT and
> >>> NCAR’s CCMVal) by focusing on missing capabilities such as conditional
> >>> sampling, probability distribution functions, and cluster analysis of
> >>> multiple-instrument datasets. Users will be able to interact with CMDA
> >>> through a web browser.
> >>>
> >>> == Current Status ==
> >>>
> >>> The current version of ClimateModelDiagnosticAnalyzer was developed by
> >>> a team at the Jet Propulsion Laboratory (JPL). The project was
> >>> initiated as a NASA-sponsored project (ROSES-CMAC) in 2011.
> >>>
> >>> == Meritocracy ==
> >>>
> >>> The current developers are not familiar with meritocratic open source
> >>> development at Apache, but would like to encourage this style of
> >>> development for the project.
> >>>
> >>> == Community ==
> >>>
> >>> While ClimateModelDiagnosticAnalyzer started as a JPL research
> >>> project, it has been used in the 2014 Caltech Summer School sponsored
> >>> by the JPL Center for Climate Sciences. Some 23 students from
> >>> institutions around the world participated. We deployed the tool to
> >>> the Amazon Cloud and gave each student his or her own virtual machine.
> >>> Students gave positive feedback, mostly on the usability and speed of
> >>> our web services. We also collected a number of enhancement requests.
> >>> We seek to further grow the developer and user communities using the
> >>> Apache open source venue. During incubation we will explicitly seek
> >>> increased academic collaborations (e.g., with Carnegie Mellon
> >>> University) as well as industrial participation.
> >>>
> >>> One instance of our web services can be found at:
> >>> http://cmacws4.jpl.nasa.gov:8080/cmac/
> >>>
> >>> == Core Developers ==
> >>>
> >>> The core developers of the project are JPL scientists and software
> >>> developers.
> >>>
> >>> == Alignment ==
> >>>
> >>> Apache is the most natural home for taking the
> >>> ClimateModelDiagnosticAnalyzer project forward. It is well aligned
> >>> with some Apache projects such as Apache Open Climate Workbench.
> >>> ClimateModelDiagnosticAnalyzer also seeks to achieve an Apache-style
> >>> development model; it is seeking a broader community of contributors
> >>> and users in order to achieve its full potential and value to the
> >>> Climate Science and Big Data community.
> >>>
> >>> There are also a number of dependencies that will be mentioned below
> >>> in the Relationships with Other Apache Products section.
> >>>
> >>>
> >>> == Known Risks ==
> >>>
> >>> === Orphaned products ===
> >>>
> >>> Given the current level of intellectual investment in
> >>> ClimateModelDiagnosticAnalyzer, the risk of the project being
> >>> abandoned is very small. Carnegie Mellon University and JPL are
> >>> collaborating (2014-2015) to build a service for climate analytics
> >>> workflow recommendation using funding from NASA. A two-year NASA AIST
> >>> project (2015-2016) will soon start to add diagnostic analysis
> >>> methodologies such as conditional sampling, conditional probability
> >>> density functions, data co-location, and random forests. We will also
> >>> integrate provenance technology into CMDA so that the history of the
> >>> data products and workflows will be automatically collected and saved.
> >>> This information will also be indexed so that the products and
> >>> workflows are searchable by the community of climate scientists and
> >>> students.
> >>>
> >>> === Inexperience with Open Source ===
> >>>
> >>> The current developers of ClimateModelDiagnosticAnalyzer are
> >>> inexperienced with Open Source. However, our Champion Chris Mattmann
> >>> is experienced (Champion of ApacheOpenClimateWorkbench and AsterixDB)
> >>> and will be working closely with us, also as the Chief Architect of
> >>> our JPL section.
> >>>
> >>> === Relationships with Other Apache Products ===
> >>>
> >>> Clearly there is a direct relationship between this project and the
> >>> Apache Open Climate Workbench, already a top-level Apache project and
> >>> also brought to the ASF by its Champion (and ours), Chris Mattmann. We
> >>> plan on directly collaborating with the Open Climate Workbench
> >>> community via our Champion, and we also welcome ASF mentors familiar
> >>> with the OCW project to help mentor our project. In addition, our team
> >>> is extremely welcoming of ASF projects, and if there are synergies
> >>> with them we invite participation in the proposal and in the
> >>> discussion.
> >>>
> >>> === Homogeneous Developers ===
> >>>
> >>> The current community is within JPL but we would like to increase the
> >>> heterogeneity.
> >>>
> >>> === Reliance on Salaried Developers ===
> >>>
> >>> The initial committers are full-time JPL staff from 2013 to 2014. The
> >>> other committers from 2014 to 2015 are a mix of CMU faculty, students,
> >>> and JPL staff.
> >>>
> >>> === An Excessive Fascination with the Apache Brand ===
> >>>
> >>> We believe in the processes, systems, and framework Apache has put in
> >>> place. Apache is also known to foster a great community around its
> >>> projects and provide exposure. While brand is important, our
> >>> fascination with it is not excessive. We believe that the ASF is the
> >>> right home for ClimateModelDiagnosticAnalyzer and that having
> >>> ClimateModelDiagnosticAnalyzer inside of the ASF will lead to a better
> >>> long-term outcome for the Climate Science and Big Data community.
> >>>
> >>> === Documentation ===
> >>>
> >>> The ClimateModelDiagnosticAnalyzer services and documentation can be
> >>> found at: http://cmacws4.jpl.nasa.gov:8080/cmac/.
> >>>
> >>> === Initial Source ===
> >>>
> >>> Current source resides in ...
> >>>
> >>> === External Dependencies ===
> >>>
> >>> ClimateModelDiagnosticAnalyzer depends on a number of open source
> >>> projects:
> >>>
> >>> * Flask
> >>> * Gunicorn
> >>> * Tornado Web Server
> >>> * GNU Octave
> >>> * EPD Python
> >>> * NOAA Ferret
> >>> * gnuplot
> >>>
> >>> == Required Resources ==
> >>>
> >>> === Developer and user mailing lists ===
> >>>
> >>> * priv...@cmda.incubator.apache.org (with moderated subscriptions)
> >>> * comm...@cmda.incubator.apache.org
> >>> * d...@cmda.incubator.apache.org
> >>> * us...@cmda.incubator.apache.org
> >>>
> >>> A git repository
> >>>
> >>> https://git-wip-us.apache.org/repos/asf/incubator-cmda.git
> >>>
> >>> A JIRA issue tracker
> >>>
> >>> https://issues.apache.org/jira/browse/CMDA
> >>>
> >>> === Initial Committers ===
> >>>
> >>> The following is a list of the planned initial Apache committers (the
> >>> active subset of the committers for the current repository at Google
> >>> Code).
> >>>
> >>> * Seungwon Lee (seungwon....@jpl.nasa.gov)
> >>> * Lei Pan (lei....@jpl.nasa.gov)
> >>> * Chengxing Zhai (chengxing.z...@jpl.nasa.gov)
> >>> * Benyang Tang (benyang.t...@jpl.nasa.gov)
> >>> * Jia Zhang (jia.zh...@sv.cmu.edu)
> >>> * Wei Wang (wei.w...@sv.cmu.edu)
> >>> * Chris Lee (chris....@sv.cmu.edu)
> >>> * Xing Wei (xing....@sv.cmu.edu)
> >>>
> >>>
> >>> === Affiliations ===
> >>>
> >>> JPL
> >>>
> >>> * Seungwon Lee
> >>> * Lei Pan
> >>> * Chengxing Zhai
> >>> * Benyang Tang
> >>>
> >>> CMU
> >>>
> >>> * Jia Zhang
> >>> * Wei Wang
> >>> * Chris Lee
> >>> * Xing Wei
> >>>
> >>> == Sponsors ==
> >>>
> >>> NASA
> >>>
> >>> === Champion ===
> >>>
> >>> Chris Mattmann (NASA/JPL)
> >>>
> >>> === Nominated Mentors ===
> >>>
> >>> Greg Reddin<<BR>>
> >>> Chris Mattmann<<BR>>
> >>> Michael Joyce<<BR>>
> >>> James Carman
> >>>
> >>> === Sponsoring Entity ===
> >>>
> >>> The Apache Incubator
> >>>
> >>>
> >>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >>> Chris Mattmann, Ph.D.
> >>> Chief Architect
> >>> Instrument Software and Science Data Systems Section (398)
> >>> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> >>> Office: 168-519, Mailstop: 168-527
> >>> Email: chris.a.mattm...@nasa.gov
> >>> WWW: http://sunset.usc.edu/~mattmann/
> >>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >>> Adjunct Associate Professor, Computer Science Department
> >>> University of Southern California, Los Angeles, CA 90089 USA
> >>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >>>
> >>>
> >>> -----Original Message-----
> >>> From: <Mattmann>, Chris Mattmann <chris.a.mattm...@jpl.nasa.gov>
> >>> Reply-To: "general@incubator.apache.org" <general@incubator.apache.org>
> >>> Date: Monday, March 23, 2015 at 1:55 AM
> >>> To: "general@incubator.apache.org" <general@incubator.apache.org>
> >>> Cc: "Pan, Lei (398K)" <lei....@jpl.nasa.gov>, "Lee, Seungwon (398K)"
> >>> <seungwon....@jpl.nasa.gov>, "Zhai, Chengxing (398K)"
> >>> <chengxing.z...@jpl.nasa.gov>, "Tang, Benyang (398J)"
> >>> <benyang.t...@jpl.nasa.gov>, "jia.zh...@west.cmu.edu"
> >>> <jia.zh...@west.cmu.edu>
> >>> Subject: [PROPOSAL] Climate Model Diagnostic Analyzer
> >>>
> >>>> Hi Everyone,
> >>>>
> >>>> I am pleased to submit for consideration to the Apache Incubator
> >>>> the Climate Model Diagnostic Analyzer proposal. We are actively
> >>>> soliciting interested mentors in this project related to climate
> >>>> science and analytics and big data.
> >>>>
> >>>> Please find the wiki text of the proposal below and the link up
> >>>> on the wiki here:
> >>>> https://wiki.apache.org/incubator/ClimateModelDiagnosticAnalyzerProposal
> >>>>
> >>>> Thank you for your consideration!
> >>>> > >>>> Cheers, > >>>> Chris > >>>> (on behalf of the Climate Model Diagnostic Analyzer community) > >>>> > >>>> = Apache ClimateModelDiagnosticAnalyzer Proposal = > >>>> > >>>> == Abstract == > >>>> > >>>> The Climate Model Diagnostic Analyzer (CMDA) provides web services for > >>>> multi-aspect physics-based and phenomenon-oriented climate model > >>>> performance evaluation and diagnosis through the comprehensive and > >>>> synergistic use of multiple observational data, reanalysis data, and > >> model > >>>> outputs. > >>>> > >>>> == Proposal == > >>>> > >>>> The proposed web-based tools let users display, analyze, and download > >>>> earth science data interactively. These tools help scientists quickly > >>>> examine data to identify specific features, e.g., trends, geographical > >>>> distributions, etc., and determine whether a further study is needed. > >> All > >>>> of the tools are designed and implemented to be general so that data > >> from > >>>> models, observation, and reanalysis are processed and displayed in a > >>>> unified way to facilitate fair comparisons. The services prepare and > >>>> display data as a colored map or an X-Y plot and allow users to > download > >>>> the analyzed data. Basic visual capabilities include 1) displaying > >>>> two-dimensional variable as a map, zonal mean, and time series 2) > >>>> displaying three-dimensional variable’s zonal mean, a two-dimensional > >>>> slice at a specific altitude, and a vertical profile. General analysis > >> can > >>>> be done using the difference, scatter plot, and conditional sampling > >>>> services. All the tools support display options for using linear or > >>>> logarithmic scales and allow users to specify a temporal range and > >> months > >>>> in a year. The source/input datasets for these tools are CMIP5 model > >>>> outputs, Obs4MIP observational datasets, and ECMWF reanalysis > datasets. 
> >>>> They are stored on the server and are selectable by a user through the > >> web > >>>> services. > >>>> > >>>> === Service descriptions === > >>>> > >>>> 1. '''Two dimensional variable services''' > >>>> > >>>> * Map of two-dimensional variable: This services displays a two > >>>> dimensional variable as a colored longitude and latitude map with > values > >>>> represented by a color scheme. Longitude and latitude ranges can be > >>>> specified to magnify a specific region. > >>>> > >>>> * Two dimensional variable zonal mean: This service plots the zonal > >> mean > >>>> value of a two-dimensional variable as a function of the latitude in > >> terms > >>>> of an X-Y plot. > >>>> > >>>> * Two dimensional variable time series: This service displays the > >> average > >>>> of a two-dimensional variable over the specific region as function of > >> time > >>>> as an X-Y plot. > >>>> > >>>> 2. '''Three dimensional variable services''' > >>>> > >>>> * Map of a two dimensional slice of a three-dimensional variable: > This > >>>> service displays a two-dimensional slice of a three-dimensional > variable > >>>> at a specific altitude as a colored longitude and latitude map with > >> values > >>>> represented by a color scheme. > >>>> > >>>> * Three dimensional zonal mean: Zonal mean of the specified > >>>> three-dimensional variable is computed and displayed as a colored > >>>> altitude-latitude map. > >>>> > >>>> * Vertical profile of a three-dimensional variable: Compute the area > >>>> weighted average of a three-dimensional variable over the specified > >> region > >>>> and display the average as function of pressure level (altitude) as an > >> X-Y > >>>> plot. > >>>> > >>>> 3. 
'''General services''' > >>>> > >>>> * Difference of two variables: This service displays the differences > >>>> between the two variables, which can be either a two dimensional > >> variable > >>>> or a slice of a three-dimensional variable at a specified altitude as > >>>> colored longitude and latitude maps > >>>> > >>>> * Scatter and histogram plots of two variables: This service displays > >> the > >>>> scatter plot (X-Y plot) between two specified variables and the > >> histograms > >>>> of the two variables. The number of samples can be specified and the > >>>> correlation is computed. The two variables can be either a > >> two-dimensional > >>>> variable or a slice of a three-dimensional variable at a specific > >>>> altitude. > >>>> > >>>> * Conditional sampling: This service lets user to sort a physical > >>>> quantity of two or dimensions according to the values of another > >> variable > >>>> (environmental condition, e.g. SST) which may be a two-dimensional > >>>> variable or a slice of a three-dimensional variable at a specific > >>>> altitude. For a two dimensional quantity, the plot is displayed an X-Y > >>>> plot, and for a two-dimensional quantity, plot is displayed as a > >>>> colored-map. > >>>> > >>>> > >>>> == Background and Rationale == > >>>> > >>>> The latest Intergovernmental Panel on Climate Change (IPCC) Fourth > >>>> Assessment Report stressed the need for the comprehensive and > innovative > >>>> evaluation of climate models with newly available global observations. > >> The > >>>> traditional approach to climate model evaluation, which is the > >> comparison > >>>> of a single parameter at a time, identifies symptomatic model biases > and > >>>> errors but fails to diagnose the model problems. The model diagnosis > >>>> process requires physics-based multi-variable comparisons, which > >> typically > >>>> involve large-volume and heterogeneous datasets, and computationally > >>>> demanding and data-intensive operations. 
We propose to develop a > >>>> computationally efficient information system to enable the > physics-based > >>>> multi-variable model performance evaluations and diagnoses through the > >>>> comprehensive and synergistic use of multiple observational data, > >>>> reanalysis data, and model outputs. > >>>> > >>>> Satellite observations have been widely used in model-data > >>>> inter-comparisons and model evaluation studies. These studies normally > >>>> involve the comparison of a single parameter at a time using a time > and > >>>> space average. For example, modeling cloud-related processes in global > >>>> climate models requires cloud parameterizations that provide > >> quantitative > >>>> rules for expressing the location, frequency of occurrence, and > >> intensity > >>>> of the clouds in terms of multiple large-scale model-resolved > parameters > >>>> such as temperature, pressure, humidity, and wind. One can evaluate > the > >>>> performance of the cloud parameterization by comparing the cloud water > >>>> content with satellite data and can identify symptomatic model biases > or > >>>> errors. However, in order to understand the cause of the biases and > >>>> errors, one has to simultaneously investigate several parameters that > >> are > >>>> integrated in the cloud parameterization. > >>>> > >>>> Such studies, aimed at a multi-parameter model diagnosis, require > >>>> locating, understanding, and manipulating multi-source observation > >>>> datasets, model outputs, and (re)analysis outputs that are physically > >>>> distributed, massive in volume, heterogeneous in format, and provide > >>>> little information on data quality and production legacy. > Additionally, > >>>> these studies involve various data preparation and processing steps > that > >>>> can easily become computationally demanding since many datasets have > to > >> be > >>>> combined and processed simultaneously. 
It is notorious that scientists > >>>> spend more than 60% of their research time on just preparing the > dataset > >>>> before it can be analyzed for their research. > >>>> > >>>> To address these challenges, we propose to build Climate Model > >> Diagnostic > >>>> Analyzer (CMDA) that will enable a streamlined and structured > >> preparation > >>>> of multiple large-volume and heterogeneous datasets, and provide a > >>>> computationally efficient approach to processing the datasets for > model > >>>> diagnosis. We will leverage the existing information technologies and > >>>> scientific tools that we developed in our current NASA ROSES COUND, > MAP, > >>>> and AIST projects. We will utilize the open-source Web-service > >> technology. > >>>> We will make CMDA complementary to other climate model analysis tools > >>>> currently available to the research community (e.g., PCMDI’s CDAT and > >>>> NCAR’s CCMVal) by focusing on the missing capabilities such as > >> conditional > >>>> sampling, and probability distribution function and cluster analysis > of > >>>> multiple-instrument datasets. The users will be able to use a web > >> browser > >>>> to interface with CMDA. > >>>> > >>>> == Current Status == > >>>> > >>>> The current version of ClimateModelDiagnosticAnalyzer was developed > by a > >>>> team at The Jet Propulsion Laboratory (JPL). The project was initiated > >> as > >>>> a NASA-sponsored project (ROSES-CMAC) in 2011. > >>>> > >>>> == Meritocracy == > >>>> > >>>> The current developers are not familiar with meritocratic open source > >>>> development at Apache, but would like to encourage this style of > >>>> development for the project. > >>>> > >>>> == Community == > >>>> > >>>> While ClimateModelDiagnosticAnalyzer started as a JPL research > project, > >> it > >>>> has been used in The 2014 Caltech Summer School sponsored by the JPL > >>>> Center for Climate Sciences. Some 23 students from different > >> institutions > >>>> over the world participated. 
We deployed the tool to the Amazon Cloud > >> and > >>>> let every student each has his or her own virtual machine. Students > gave > >>>> positive feedback mostly on the usability and speed of our web > services. > >>>> We also collected a number of enhancement requests. We seek to further > >>>> grow the developer and user communities using the Apache open source > >>>> venue. During incubation we will explicitly seek increased academic > >>>> collaborations (e.g., with The Carnegie Mellon University) as well as > >>>> industrial participation. > >>>> > >>>> One instance of our web services can be found at: > >>>> http://cmacws.jpl.nasa.gov:8080/cmac/ > >>>> > >>>> == Core Developers == > >>>> > >>>> The core developers of the project are JPL scientists and software > >>>> developers. > >>>> > >>>> == Alignment == > >>>> > >>>> Apache is the most natural home for taking the > >>>> ClimateModelDiagnosticAnalyzer project forward. It is well-aligned > with > >>>> some Apache projects such as Apache Open Climate Workbench. > >>>> ClimateModelDiagnosticAnalyzer also seeks to achieve an Apache-style > >>>> development model; it is seeking a broader community of contributors > and > >>>> users in order to achieve its full potential and value to the Climate > >>>> Science and Big Data community. > >>>> > >>>> There are also a number of dependencies that will be mentioned below > in > >>>> the Relationships with Other Apache products section. > >>>> > >>>> > >>>> == Known Risks == > >>>> > >>>> === Orphaned products === > >>>> > >>>> Given the current level of intellectual investment in > >>>> ClimateModelDiagnosticAnalyzer, the risk of the project being > abandoned > >> is > >>>> very small. The Carnegie Mellon University and JPL are collaborating > >>>> (2014-2015) to build a service for climate analytics workflow > >>>> recommendation using fund from NASA. 
A two-year NASA AIST project > >>>> (2015-2016) will soon start to add diagnostic analysis methodologies > >> such > >>>> as conditional sampling method, conditional probability density > >> function, > >>>> data co-location, and random forest. We will also infuse the > provenance > >>>> technology into CMDA so that the history of the data products and > >>>> workflows will be automatically collected and saved. This information > >> will > >>>> also be indexed so that the products and workflows can be searchable > by > >>>> the community of climate scientists and students. > >>>> > >>>> === Inexperience with Open Source === > >>>> > >>>> The current developers of ClimateModelDiagnosticAnalyzer are > >> inexperienced > >>>> with Open Source. However, our Champion Chris Mattmann is experienced > >>>> (Champions of ApacheOpenClimateWorkbench and AsterixDB) and will be > >>>> working closely with us, also as the Chief Architect of our JPL > section. > >>>> > >>>> === Relationships with Other Apache Products === > >>>> > >>>> Clearly there is a direct relationship between this project and the > >> Apache > >>>> Open Climate Workbench already a top level Apache project and also > >> brought > >>>> to the ASF by its Champion (and ours) Chris Mattmann. We plan on > >> directly > >>>> collaborating with the Open Climate Workbench community via our > Champion > >>>> and we also welcome ASF mentors familiar with the OCW project to help > >>>> mentor our project. In addition our team is extremely welcoming of ASF > >>>> projects and if there are synergies with them we invite participation > in > >>>> the proposal and in the discussion. > >>>> > >>>> === Homogeneous Developers === > >>>> > >>>> The current community is within JPL but we would like to increase the > >>>> heterogeneity. > >>>> > >>>> === Reliance on Salaried Developers === > >>>> > >>>> The initial committers are full-time JPL staff from 2013 to 2014. 
> >>>> The other committers (2014 to 2015) are a mix of CMU faculty,
> >>>> students, and JPL staff.
> >>>>
> >>>> === An Excessive Fascination with the Apache Brand ===
> >>>>
> >>>> We believe in the processes, systems, and framework Apache has put in
> >>>> place. Apache is also known to foster a great community around its
> >>>> projects and provide exposure. While brand is important, our
> >>>> fascination with it is not excessive. We believe that the ASF is the
> >>>> right home for ClimateModelDiagnosticAnalyzer and that having
> >>>> ClimateModelDiagnosticAnalyzer inside of the ASF will lead to a
> >>>> better long-term outcome for the Climate Science and Big Data
> >>>> community.
> >>>>
> >>>> === Documentation ===
> >>>>
> >>>> The ClimateModelDiagnosticAnalyzer services and documentation can be
> >>>> found at: http://cmacws.jpl.nasa.gov:8080/cmac/
> >>>>
> >>>> === Initial Source ===
> >>>>
> >>>> Current source resides in ...
> >>>>
> >>>> === External Dependencies ===
> >>>>
> >>>> ClimateModelDiagnosticAnalyzer depends on a number of open source
> >>>> projects:
> >>>>
> >>>> * Flask
> >>>> * Gunicorn
> >>>> * Tornado Web Server
> >>>> * GNU Octave
> >>>> * EPD Python
> >>>> * NOAA Ferret
> >>>> * gnuplot
> >>>>
> >>>> == Required Resources ==
> >>>>
> >>>> === Developer and user mailing lists ===
> >>>>
> >>>> * priv...@cmda.incubator.apache.org (with moderated subscriptions)
> >>>> * comm...@cmda.incubator.apache.org
> >>>> * d...@cmda.incubator.apache.org
> >>>> * us...@cmda.incubator.apache.org
> >>>>
> >>>> A git repository
> >>>>
> >>>> https://git-wip-us.apache.org/repos/asf/incubator-cmda.git
> >>>>
> >>>> A JIRA issue tracker
> >>>>
> >>>> https://issues.apache.org/jira/browse/CMDA
> >>>>
> >>>> === Initial Committers ===
> >>>>
> >>>> The following is a list of the planned initial Apache committers (the
> >>>> active subset of the committers for the current
> >>>> repository at Google Code).
> >>>>
> >>>> * Seungwon Lee (seungwon....@jpl.nasa.gov)
> >>>> * Lei Pan (lei....@jpl.nasa.gov)
> >>>> * Chengxing Zhai (chengxing.z...@jpl.nasa.gov)
> >>>> * Benyang Tang (benyang.t...@jpl.nasa.gov)
> >>>>
> >>>> === Affiliations ===
> >>>>
> >>>> JPL
> >>>>
> >>>> * Seungwon Lee
> >>>> * Lei Pan
> >>>> * Chengxing Zhai
> >>>> * Benyang Tang
> >>>>
> >>>> CMU
> >>>>
> >>>> * Jia Zhang
> >>>> * Wei Wang
> >>>> * Chris Lee
> >>>> * Xing Wei
> >>>>
> >>>> == Sponsors ==
> >>>>
> >>>> NASA
> >>>>
> >>>> === Champion ===
> >>>>
> >>>> Chris Mattmann (NASA/JPL)
> >>>>
> >>>> === Nominated Mentors ===
> >>>>
> >>>> TBD
> >>>>
> >>>> === Sponsoring Entity ===
> >>>>
> >>>> The Apache Incubator
> >>>>
> >>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >>>> Chris Mattmann, Ph.D.
> >>>> Chief Architect
> >>>> Instrument Software and Science Data Systems Section (398)
> >>>> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> >>>> Office: 168-519, Mailstop: 168-527
> >>>> Email: chris.a.mattm...@nasa.gov
> >>>> WWW: http://sunset.usc.edu/~mattmann/
> >>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >>>> Adjunct Associate Professor, Computer Science Department
> >>>> University of Southern California, Los Angeles, CA 90089 USA
> >>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >>>>
> >>>> ---------------------------------------------------------------------
> >>>> To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
> >>>> For additional commands, e-mail: general-h...@incubator.apache.org
> >
> > --
> > Sent from My iPad, sorry for any misspellings.