I would love to help out. I don't know much about the problem domain, but I am a "sciency" kind of guy.
On Mon, Apr 6, 2015 at 12:30 PM, Mattmann, Chris A (3980) <chris.a.mattm...@jpl.nasa.gov> wrote:
> :) you volunteering as a mentor? Could use your help!
>
> Sent from my iPhone
>
>> On Apr 6, 2015, at 9:18 AM, James Carman <ja...@carmanconsulting.com> wrote:
>>
>> Apache Camdan?
>>
>> On Monday, March 23, 2015, Mattmann, Chris A (3980) <
>> chris.a.mattm...@jpl.nasa.gov> wrote:
>>
>>> Hi Everyone,
>>>
>>> I am pleased to submit for consideration to the Apache Incubator
>>> the Climate Model Diagnostic Analyzer proposal. We are actively
>>> soliciting mentors interested in this project, which relates to
>>> climate science, analytics, and big data.
>>>
>>> Please find the wiki text of the proposal below and the link up
>>> on the wiki here:
>>>
>>> https://wiki.apache.org/incubator/ClimateModelDiagnosticAnalyzerProposal
>>>
>>> Thank you for your consideration!
>>>
>>> Cheers,
>>> Chris
>>> (on behalf of the Climate Model Diagnostic Analyzer community)
>>>
>>> = Apache ClimateModelDiagnosticAnalyzer Proposal =
>>>
>>> == Abstract ==
>>>
>>> The Climate Model Diagnostic Analyzer (CMDA) provides web services for
>>> multi-aspect physics-based and phenomenon-oriented climate model
>>> performance evaluation and diagnosis through the comprehensive and
>>> synergistic use of multiple observational data, reanalysis data, and model
>>> outputs.
>>>
>>> == Proposal ==
>>>
>>> The proposed web-based tools let users display, analyze, and download
>>> earth science data interactively. These tools help scientists quickly
>>> examine data to identify specific features, e.g., trends, geographical
>>> distributions, etc., and determine whether a further study is needed. All
>>> of the tools are designed and implemented to be general so that data from
>>> models, observations, and reanalysis are processed and displayed in a
>>> unified way to facilitate fair comparisons.
>>> The services prepare and display data as a colored map or an X-Y plot
>>> and allow users to download the analyzed data. Basic visual capabilities
>>> include 1) displaying a two-dimensional variable as a map, zonal mean,
>>> and time series, and 2) displaying a three-dimensional variable’s zonal
>>> mean, a two-dimensional slice at a specific altitude, and a vertical
>>> profile. General analysis can be done using the difference, scatter plot,
>>> and conditional sampling services. All the tools support display options
>>> for using linear or logarithmic scales and allow users to specify a
>>> temporal range and months in a year. The source/input datasets for these
>>> tools are CMIP5 model outputs, Obs4MIP observational datasets, and ECMWF
>>> reanalysis datasets. They are stored on the server and are selectable by
>>> a user through the web services.
>>>
>>> === Service descriptions ===
>>>
>>> 1. '''Two dimensional variable services'''
>>>
>>> * Map of two-dimensional variable: This service displays a
>>> two-dimensional variable as a colored longitude and latitude map with
>>> values represented by a color scheme. Longitude and latitude ranges can
>>> be specified to magnify a specific region.
>>>
>>> * Two dimensional variable zonal mean: This service plots the zonal mean
>>> value of a two-dimensional variable as a function of latitude as an X-Y
>>> plot.
>>>
>>> * Two dimensional variable time series: This service displays the average
>>> of a two-dimensional variable over a specified region as a function of
>>> time as an X-Y plot.
>>>
>>> 2. '''Three dimensional variable services'''
>>>
>>> * Map of a two dimensional slice of a three-dimensional variable: This
>>> service displays a two-dimensional slice of a three-dimensional variable
>>> at a specific altitude as a colored longitude and latitude map with values
>>> represented by a color scheme.
>>>
>>> * Three dimensional zonal mean: The zonal mean of the specified
>>> three-dimensional variable is computed and displayed as a colored
>>> altitude-latitude map.
>>>
>>> * Vertical profile of a three-dimensional variable: This service computes
>>> the area-weighted average of a three-dimensional variable over the
>>> specified region and displays the average as a function of pressure level
>>> (altitude) as an X-Y plot.
>>>
>>> 3. '''General services'''
>>>
>>> * Difference of two variables: This service displays the differences
>>> between two variables, each of which can be either a two-dimensional
>>> variable or a slice of a three-dimensional variable at a specified
>>> altitude, as colored longitude and latitude maps.
>>>
>>> * Scatter and histogram plots of two variables: This service displays the
>>> scatter plot (X-Y plot) between two specified variables and the histograms
>>> of the two variables. The number of samples can be specified and the
>>> correlation is computed. The two variables can be either a two-dimensional
>>> variable or a slice of a three-dimensional variable at a specific altitude.
>>>
>>> * Conditional sampling: This service lets the user sort a physical
>>> quantity of two or three dimensions according to the values of another
>>> variable (an environmental condition, e.g. SST), which may be a
>>> two-dimensional variable or a slice of a three-dimensional variable at a
>>> specific altitude. For a two-dimensional quantity, the result is displayed
>>> as an X-Y plot, and for a three-dimensional quantity, it is displayed as a
>>> colored map.
>>>
>>>
>>> == Background and Rationale ==
>>>
>>> The latest Intergovernmental Panel on Climate Change (IPCC) Fourth
>>> Assessment Report stressed the need for the comprehensive and innovative
>>> evaluation of climate models with newly available global observations.
>>> The traditional approach to climate model evaluation, which is the
>>> comparison of a single parameter at a time, identifies symptomatic model
>>> biases and errors but fails to diagnose the model problems. The model
>>> diagnosis process requires physics-based multi-variable comparisons, which
>>> typically involve large-volume and heterogeneous datasets, and
>>> computationally demanding and data-intensive operations. We propose to
>>> develop a computationally efficient information system to enable the
>>> physics-based multi-variable model performance evaluations and diagnoses
>>> through the comprehensive and synergistic use of multiple observational
>>> data, reanalysis data, and model outputs.
>>>
>>> Satellite observations have been widely used in model-data
>>> inter-comparisons and model evaluation studies. These studies normally
>>> involve the comparison of a single parameter at a time using a time and
>>> space average. For example, modeling cloud-related processes in global
>>> climate models requires cloud parameterizations that provide quantitative
>>> rules for expressing the location, frequency of occurrence, and intensity
>>> of the clouds in terms of multiple large-scale model-resolved parameters
>>> such as temperature, pressure, humidity, and wind. One can evaluate the
>>> performance of the cloud parameterization by comparing the cloud water
>>> content with satellite data and can identify symptomatic model biases or
>>> errors. However, in order to understand the cause of the biases and
>>> errors, one has to simultaneously investigate several parameters that are
>>> integrated in the cloud parameterization.
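
A minimal sketch of the two evaluation styles contrasted in the paragraph above: a single-parameter, space-averaged model-minus-observation bias, and a conditional sampling of one quantity by another (e.g. cloud water content binned by SST). It assumes regular latitude-longitude grids held as NumPy arrays. The function names, the cosine-of-latitude area weighting, and the binning scheme are illustrative assumptions, not CMDA's actual implementation.

```python
import numpy as np


def area_weighted_mean(field, lat_deg):
    """Mean of a (lat, lon) field, weighting each row by cos(latitude).

    On a regular latitude-longitude grid the area of a cell is
    proportional to the cosine of its latitude. Non-finite values
    (NaN, inf) are treated as missing and skipped.
    """
    field = np.asarray(field, dtype=float)
    weights = np.cos(np.deg2rad(lat_deg))[:, np.newaxis] * np.ones_like(field)
    valid = np.isfinite(field)
    return np.sum(field[valid] * weights[valid]) / np.sum(weights[valid])


def conditional_sample(quantity, condition, bin_edges):
    """Average `quantity` inside each bin of `condition`.

    `quantity` and `condition` must have the same shape. Returns one
    mean per bin; bins that receive no samples are returned as NaN.
    """
    q = np.ravel(np.asarray(quantity, dtype=float))
    c = np.ravel(np.asarray(condition, dtype=float))
    valid = np.isfinite(q) & np.isfinite(c)
    q, c = q[valid], c[valid]

    # np.digitize gives 1..len(bin_edges)-1 for values inside the edges,
    # 0 below the first edge and len(bin_edges) at or above the last one.
    idx = np.digitize(c, bin_edges) - 1
    nbins = len(bin_edges) - 1
    means = np.full(nbins, np.nan)
    for b in range(nbins):
        in_bin = idx == b
        if in_bin.any():
            means[b] = q[in_bin].mean()
    return means


if __name__ == "__main__":
    # Synthetic stand-ins for a model field, an observed field, and SST.
    rng = np.random.default_rng(0)
    lat = np.linspace(-60.0, 60.0, 13)
    model = rng.normal(1.0, 0.2, size=(13, 24))
    obs = rng.normal(0.9, 0.2, size=(13, 24))
    sst = rng.uniform(285.0, 305.0, size=(13, 24))

    print("mean bias:", area_weighted_mean(model - obs, lat))
    print("bias by SST bin:",
          conditional_sample(model - obs, sst, np.arange(285.0, 306.0, 5.0)))
```

The first number only says whether the model is biased on average; the second shows how the bias varies with the environmental condition, which is the kind of multi-parameter view the proposal argues for.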
>>>
>>> Such studies, aimed at a multi-parameter model diagnosis, require
>>> locating, understanding, and manipulating multi-source observation
>>> datasets, model outputs, and (re)analysis outputs that are physically
>>> distributed, massive in volume, heterogeneous in format, and provide
>>> little information on data quality and production legacy. Additionally,
>>> these studies involve various data preparation and processing steps that
>>> can easily become computationally demanding since many datasets have to be
>>> combined and processed simultaneously. It is well known that scientists
>>> spend more than 60% of their research time just preparing datasets
>>> before they can be analyzed.
>>>
>>> To address these challenges, we propose to build the Climate Model
>>> Diagnostic Analyzer (CMDA), which will enable a streamlined and structured
>>> preparation of multiple large-volume and heterogeneous datasets, and
>>> provide a computationally efficient approach to processing the datasets
>>> for model diagnosis. We will leverage the existing information
>>> technologies and scientific tools that we developed in our current NASA
>>> ROSES COUND, MAP, and AIST projects. We will utilize open-source
>>> Web-service technology. We will make CMDA complementary to other climate
>>> model analysis tools currently available to the research community (e.g.,
>>> PCMDI’s CDAT and NCAR’s CCMVal) by focusing on missing capabilities such
>>> as conditional sampling, and probability distribution function and
>>> cluster analysis of multiple-instrument datasets. Users will be able to
>>> use a web browser to interface with CMDA.
>>>
>>> == Current Status ==
>>>
>>> The current version of ClimateModelDiagnosticAnalyzer was developed by a
>>> team at the Jet Propulsion Laboratory (JPL). The project was initiated as
>>> a NASA-sponsored project (ROSES-CMAC) in 2011.
>>>
>>> == Meritocracy ==
>>>
>>> The current developers are not familiar with meritocratic open source
>>> development at Apache, but would like to encourage this style of
>>> development for the project.
>>>
>>> == Community ==
>>>
>>> While ClimateModelDiagnosticAnalyzer started as a JPL research project, it
>>> was used in the 2014 Caltech Summer School sponsored by the JPL
>>> Center for Climate Sciences. Some 23 students from different institutions
>>> around the world participated. We deployed the tool to the Amazon Cloud
>>> and gave each student his or her own virtual machine. Students gave
>>> positive feedback, mostly on the usability and speed of our web services.
>>> We also collected a number of enhancement requests. We seek to further
>>> grow the developer and user communities using the Apache open source
>>> venue. During incubation we will explicitly seek increased academic
>>> collaborations (e.g., with Carnegie Mellon University) as well as
>>> industrial participation.
>>>
>>> One instance of our web services can be found at:
>>> http://cmacws.jpl.nasa.gov:8080/cmac/
>>>
>>> == Core Developers ==
>>>
>>> The core developers of the project are JPL scientists and software
>>> developers.
>>>
>>> == Alignment ==
>>>
>>> Apache is the most natural home for taking the
>>> ClimateModelDiagnosticAnalyzer project forward. It is well-aligned with
>>> some Apache projects such as Apache Open Climate Workbench.
>>> ClimateModelDiagnosticAnalyzer also seeks to achieve an Apache-style
>>> development model; it is seeking a broader community of contributors and
>>> users in order to achieve its full potential and value to the Climate
>>> Science and Big Data community.
>>>
>>> There are also a number of dependencies that will be mentioned below in
>>> the Relationships with Other Apache Products section.
>>>
>>>
>>> == Known Risks ==
>>>
>>> === Orphaned products ===
>>>
>>> Given the current level of intellectual investment in
>>> ClimateModelDiagnosticAnalyzer, the risk of the project being abandoned is
>>> very small. Carnegie Mellon University and JPL are collaborating
>>> (2014-2015) to build a service for climate analytics workflow
>>> recommendation using funding from NASA. A two-year NASA AIST project
>>> (2015-2016) will soon start to add diagnostic analysis methodologies such
>>> as conditional sampling, conditional probability density functions,
>>> data co-location, and random forests. We will also infuse provenance
>>> technology into CMDA so that the history of the data products and
>>> workflows will be automatically collected and saved. This information will
>>> also be indexed so that the products and workflows are searchable by
>>> the community of climate scientists and students.
>>>
>>> === Inexperience with Open Source ===
>>>
>>> The current developers of ClimateModelDiagnosticAnalyzer are inexperienced
>>> with Open Source. However, our Champion Chris Mattmann is experienced
>>> (Champion of Apache Open Climate Workbench and AsterixDB) and will be
>>> working closely with us, also as the Chief Architect of our JPL section.
>>>
>>> === Relationships with Other Apache Products ===
>>>
>>> Clearly there is a direct relationship between this project and the Apache
>>> Open Climate Workbench, already a top-level Apache project and also
>>> brought to the ASF by its Champion (and ours) Chris Mattmann. We plan on
>>> directly collaborating with the Open Climate Workbench community via our
>>> Champion, and we also welcome ASF mentors familiar with the OCW project to
>>> help mentor our project. In addition, our team is extremely welcoming of
>>> ASF projects, and if there are synergies with them we invite participation
>>> in the proposal and in the discussion.
>>>
>>> === Homogeneous Developers ===
>>>
>>> The current community is within JPL but we would like to increase the
>>> heterogeneity.
>>>
>>> === Reliance on Salaried Developers ===
>>>
>>> The initial committers are full-time JPL staff from 2013 to 2014. The
>>> other committers from 2014 to 2015 are a mix of CMU faculty, students and
>>> JPL staff.
>>>
>>> === An Excessive Fascination with the Apache Brand ===
>>>
>>> We believe in the processes, systems, and framework Apache has put in
>>> place. Apache is also known to foster a great community around their
>>> projects and provide exposure. While brand is important, our fascination
>>> with it is not excessive. We believe that the ASF is the right home for
>>> ClimateModelDiagnosticAnalyzer and that having
>>> ClimateModelDiagnosticAnalyzer inside of the ASF will lead to a better
>>> long-term outcome for the Climate Science and Big Data community.
>>>
>>> === Documentation ===
>>>
>>> The ClimateModelDiagnosticAnalyzer services and documentation can be found
>>> at: http://cmacws.jpl.nasa.gov:8080/cmac/.
>>>
>>> === Initial Source ===
>>>
>>> Current source resides in ...
>>>
>>> === External Dependencies ===
>>>
>>> ClimateModelDiagnosticAnalyzer depends on a number of open source projects:
>>>
>>> * Flask
>>> * Gunicorn
>>> * Tornado Web Server
>>> * GNU Octave
>>> * EPD Python
>>> * NOAA Ferret
>>> * gnuplot
>>>
>>> == Required Resources ==
>>>
>>> === Developer and user mailing lists ===
>>>
>>> * priv...@cmda.incubator.apache.org (with moderated subscriptions)
>>> * comm...@cmda.incubator.apache.org
>>> * d...@cmda.incubator.apache.org
>>> * us...@cmda.incubator.apache.org
>>>
>>> A git repository
>>>
>>> https://git-wip-us.apache.org/repos/asf/incubator-cmda.git
>>>
>>> A JIRA issue tracker
>>>
>>> https://issues.apache.org/jira/browse/CMDA
>>>
>>> === Initial Committers ===
>>>
>>> The following is a list of the planned initial Apache committers (the
>>> active subset of the committers for the current repository at Google Code).
>>>
>>> * Seungwon Lee (seungwon....@jpl.nasa.gov)
>>> * Lei Pan (lei....@jpl.nasa.gov)
>>> * Chengxing Zhai (chengxing.z...@jpl.nasa.gov)
>>> * Benyang Tang (benyang.t...@jpl.nasa.gov)
>>>
>>>
>>> === Affiliations ===
>>>
>>> JPL
>>>
>>> * Seungwon Lee
>>> * Lei Pan
>>> * Chengxing Zhai
>>> * Benyang Tang
>>>
>>> CMU
>>>
>>> * Jia Zhang
>>> * Wei Wang
>>> * Chris Lee
>>> * Xing Wei
>>>
>>> == Sponsors ==
>>>
>>> NASA
>>>
>>> === Champion ===
>>>
>>> Chris Mattmann (NASA/JPL)
>>>
>>> === Nominated Mentors ===
>>>
>>> TBD
>>>
>>> === Sponsoring Entity ===
>>>
>>> The Apache Incubator
>>>
>>>
>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>> Chris Mattmann, Ph.D.
>>> Chief Architect
>>> Instrument Software and Science Data Systems Section (398)
>>> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
>>> Office: 168-519, Mailstop: 168-527
>>> Email: chris.a.mattm...@nasa.gov
>>> WWW: http://sunset.usc.edu/~mattmann/
>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>> Adjunct Associate Professor, Computer Science Department
>>> University of Southern California, Los Angeles, CA 90089 USA
>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
>>> For additional commands, e-mail: general-h...@incubator.apache.org