:) You volunteering as a mentor? Could use your help!
> On Apr 6, 2015, at 9:18 AM, James Carman <ja...@carmanconsulting.com> wrote:
>
> Apache Camdan?
>
> On Monday, March 23, 2015, Mattmann, Chris A (3980) <chris.a.mattm...@jpl.nasa.gov> wrote:
>
>> Hi Everyone,
>>
>> I am pleased to submit for consideration to the Apache Incubator
>> the Climate Model Diagnostic Analyzer proposal. We are actively
>> soliciting interested mentors in this project related to climate
>> science and analytics and big data.
>>
>> Please find the wiki text of the proposal below and the link up
>> on the wiki here:
>>
>> https://wiki.apache.org/incubator/ClimateModelDiagnosticAnalyzerProposal
>>
>> Thank you for your consideration!
>>
>> Cheers,
>> Chris
>> (on behalf of the Climate Model Diagnostic Analyzer community)
>>
>> = Apache ClimateModelDiagnosticAnalyzer Proposal =
>>
>> == Abstract ==
>>
>> The Climate Model Diagnostic Analyzer (CMDA) provides web services for
>> multi-aspect physics-based and phenomenon-oriented climate model
>> performance evaluation and diagnosis through the comprehensive and
>> synergistic use of multiple observational data, reanalysis data, and model
>> outputs.
>>
>> == Proposal ==
>>
>> The proposed web-based tools let users display, analyze, and download
>> earth science data interactively. These tools help scientists quickly
>> examine data to identify specific features, e.g., trends, geographical
>> distributions, etc., and determine whether a further study is needed. All
>> of the tools are designed and implemented to be general so that data from
>> models, observation, and reanalysis are processed and displayed in a
>> unified way to facilitate fair comparisons. The services prepare and
>> display data as a colored map or an X-Y plot and allow users to download
>> the analyzed data.
>> Basic visual capabilities include 1) displaying a
>> two-dimensional variable as a map, zonal mean, and time series, and 2)
>> displaying a three-dimensional variable’s zonal mean, a two-dimensional
>> slice at a specific altitude, and a vertical profile. General analysis can
>> be done using the difference, scatter plot, and conditional sampling
>> services. All the tools support display options for using linear or
>> logarithmic scales and allow users to specify a temporal range and months
>> in a year. The source/input datasets for these tools are CMIP5 model
>> outputs, Obs4MIP observational datasets, and ECMWF reanalysis datasets.
>> They are stored on the server and are selectable by a user through the web
>> services.
>>
>> === Service descriptions ===
>>
>> 1. '''Two-dimensional variable services'''
>>
>> * Map of a two-dimensional variable: This service displays a
>> two-dimensional variable as a colored longitude and latitude map with values
>> represented by a color scheme. Longitude and latitude ranges can be
>> specified to magnify a specific region.
>>
>> * Two-dimensional variable zonal mean: This service plots the zonal mean
>> value of a two-dimensional variable as a function of latitude as an
>> X-Y plot.
>>
>> * Two-dimensional variable time series: This service displays the average
>> of a two-dimensional variable over a specified region as a function of time
>> as an X-Y plot.
>>
>> 2. '''Three-dimensional variable services'''
>>
>> * Map of a two-dimensional slice of a three-dimensional variable: This
>> service displays a two-dimensional slice of a three-dimensional variable
>> at a specific altitude as a colored longitude and latitude map with values
>> represented by a color scheme.
>>
>> * Three-dimensional zonal mean: The zonal mean of the specified
>> three-dimensional variable is computed and displayed as a colored
>> altitude-latitude map.
>>
>> * Vertical profile of a three-dimensional variable: This service computes
>> the area-weighted average of a three-dimensional variable over the specified
>> region and displays the average as a function of pressure level (altitude)
>> as an X-Y plot.
>>
>> 3. '''General services'''
>>
>> * Difference of two variables: This service displays the differences
>> between two variables, each of which can be either a two-dimensional
>> variable or a slice of a three-dimensional variable at a specified altitude,
>> as colored longitude and latitude maps.
>>
>> * Scatter and histogram plots of two variables: This service displays the
>> scatter plot (X-Y plot) between two specified variables and the histograms
>> of the two variables. The number of samples can be specified and the
>> correlation is computed. The two variables can be either a two-dimensional
>> variable or a slice of a three-dimensional variable at a specific altitude.
>>
>> * Conditional sampling: This service lets the user sort a physical
>> quantity of two or three dimensions according to the values of another
>> variable (an environmental condition, e.g., SST), which may be a
>> two-dimensional variable or a slice of a three-dimensional variable at a
>> specific altitude. For a two-dimensional quantity, the result is displayed
>> as an X-Y plot, and for a three-dimensional quantity, it is displayed as a
>> colored map.
>>
>>
>> == Background and Rationale ==
>>
>> The latest Intergovernmental Panel on Climate Change (IPCC) Fourth
>> Assessment Report stressed the need for the comprehensive and innovative
>> evaluation of climate models with newly available global observations. The
>> traditional approach to climate model evaluation, which is the comparison
>> of a single parameter at a time, identifies symptomatic model biases and
>> errors but fails to diagnose the model problems.
>> The model diagnosis
>> process requires physics-based multi-variable comparisons, which typically
>> involve large-volume and heterogeneous datasets, and computationally
>> demanding and data-intensive operations. We propose to develop a
>> computationally efficient information system to enable the physics-based
>> multi-variable model performance evaluations and diagnoses through the
>> comprehensive and synergistic use of multiple observational data,
>> reanalysis data, and model outputs.
>>
>> Satellite observations have been widely used in model-data
>> inter-comparisons and model evaluation studies. These studies normally
>> involve the comparison of a single parameter at a time using a time and
>> space average. For example, modeling cloud-related processes in global
>> climate models requires cloud parameterizations that provide quantitative
>> rules for expressing the location, frequency of occurrence, and intensity
>> of the clouds in terms of multiple large-scale model-resolved parameters
>> such as temperature, pressure, humidity, and wind. One can evaluate the
>> performance of the cloud parameterization by comparing the cloud water
>> content with satellite data and can identify symptomatic model biases or
>> errors. However, in order to understand the cause of the biases and
>> errors, one has to simultaneously investigate several parameters that are
>> integrated in the cloud parameterization.
>>
>> Such studies, aimed at a multi-parameter model diagnosis, require
>> locating, understanding, and manipulating multi-source observation
>> datasets, model outputs, and (re)analysis outputs that are physically
>> distributed, massive in volume, heterogeneous in format, and provide
>> little information on data quality and production legacy. Additionally,
>> these studies involve various data preparation and processing steps that
>> can easily become computationally demanding since many datasets have to be
>> combined and processed simultaneously.
>> Scientists notoriously
>> spend more than 60% of their research time just preparing datasets
>> before they can be analyzed.
>>
>> To address these challenges, we propose to build the Climate Model
>> Diagnostic Analyzer (CMDA), which will enable a streamlined and structured
>> preparation of multiple large-volume and heterogeneous datasets, and provide
>> a computationally efficient approach to processing the datasets for model
>> diagnosis. We will leverage the existing information technologies and
>> scientific tools that we developed in our current NASA ROSES COUND, MAP,
>> and AIST projects. We will utilize open-source web-service technology.
>> We will make CMDA complementary to other climate model analysis tools
>> currently available to the research community (e.g., PCMDI’s CDAT and
>> NCAR’s CCMVal) by focusing on missing capabilities such as conditional
>> sampling, and probability distribution function and cluster analysis of
>> multiple-instrument datasets. Users will be able to use a web browser
>> to interface with CMDA.
>>
>> == Current Status ==
>>
>> The current version of ClimateModelDiagnosticAnalyzer was developed by a
>> team at the Jet Propulsion Laboratory (JPL). The project was initiated as
>> a NASA-sponsored project (ROSES-CMAC) in 2011.
>>
>> == Meritocracy ==
>>
>> The current developers are not familiar with meritocratic open source
>> development at Apache, but would like to encourage this style of
>> development for the project.
>>
>> == Community ==
>>
>> While ClimateModelDiagnosticAnalyzer started as a JPL research project, it
>> has been used in the 2014 Caltech Summer School sponsored by the JPL
>> Center for Climate Sciences. Some 23 students from different institutions
>> around the world participated. We deployed the tool to the Amazon Cloud and
>> gave each student his or her own virtual machine.
>> Students gave
>> positive feedback, mostly on the usability and speed of our web services.
>> We also collected a number of enhancement requests. We seek to further
>> grow the developer and user communities using the Apache open source
>> venue. During incubation we will explicitly seek increased academic
>> collaborations (e.g., with Carnegie Mellon University) as well as
>> industrial participation.
>>
>> One instance of our web services can be found at:
>> http://cmacws.jpl.nasa.gov:8080/cmac/
>>
>> == Core Developers ==
>>
>> The core developers of the project are JPL scientists and software
>> developers.
>>
>> == Alignment ==
>>
>> Apache is the most natural home for taking the
>> ClimateModelDiagnosticAnalyzer project forward. It is well aligned with
>> Apache projects such as Apache Open Climate Workbench.
>> ClimateModelDiagnosticAnalyzer also seeks to achieve an Apache-style
>> development model; it is seeking a broader community of contributors and
>> users in order to achieve its full potential and value to the Climate
>> Science and Big Data community.
>>
>> There are also a number of dependencies that will be mentioned below in
>> the Relationships with Other Apache Products section.
>>
>>
>> == Known Risks ==
>>
>> === Orphaned products ===
>>
>> Given the current level of intellectual investment in
>> ClimateModelDiagnosticAnalyzer, the risk of the project being abandoned is
>> very small. Carnegie Mellon University and JPL are collaborating
>> (2014-2015) to build a service for climate analytics workflow
>> recommendation using funding from NASA. A two-year NASA AIST project
>> (2015-2016) will soon start to add diagnostic analysis methodologies such
>> as the conditional sampling method, conditional probability density
>> function, data co-location, and random forest. We will also infuse
>> provenance technology into CMDA so that the history of the data products and
>> workflows will be automatically collected and saved.
>> This information will
>> also be indexed so that the products and workflows can be searched by
>> the community of climate scientists and students.
>>
>> === Inexperience with Open Source ===
>>
>> The current developers of ClimateModelDiagnosticAnalyzer are inexperienced
>> with Open Source. However, our Champion Chris Mattmann is experienced
>> (Champion of Apache Open Climate Workbench and AsterixDB) and will be
>> working closely with us, also as the Chief Architect of our JPL section.
>>
>> === Relationships with Other Apache Products ===
>>
>> Clearly there is a direct relationship between this project and Apache
>> Open Climate Workbench, already a top-level Apache project and also brought
>> to the ASF by its Champion (and ours) Chris Mattmann. We plan on directly
>> collaborating with the Open Climate Workbench community via our Champion
>> and we also welcome ASF mentors familiar with the OCW project to help
>> mentor our project. In addition, our team is extremely welcoming of ASF
>> projects, and if there are synergies with them we invite participation in
>> the proposal and in the discussion.
>>
>> === Homogeneous Developers ===
>>
>> The current community is within JPL but we would like to increase the
>> heterogeneity.
>>
>> === Reliance on Salaried Developers ===
>>
>> The initial committers from 2013 to 2014 are full-time JPL staff. The
>> other committers from 2014 to 2015 are a mix of CMU faculty, students, and
>> JPL staff.
>>
>> === An Excessive Fascination with the Apache Brand ===
>>
>> We believe in the processes, systems, and framework Apache has put in
>> place. Apache is also known to foster a great community around its
>> projects and provide exposure. While brand is important, our fascination
>> with it is not excessive.
>> We believe that the ASF is the right home for
>> ClimateModelDiagnosticAnalyzer and that having
>> ClimateModelDiagnosticAnalyzer inside of the ASF will lead to a better
>> long-term outcome for the Climate Science and Big Data community.
>>
>> === Documentation ===
>>
>> The ClimateModelDiagnosticAnalyzer services and documentation can be found
>> at: http://cmacws.jpl.nasa.gov:8080/cmac/.
>>
>> === Initial Source ===
>>
>> Current source resides in ...
>>
>> === External Dependencies ===
>>
>> ClimateModelDiagnosticAnalyzer depends on a number of open source projects:
>>
>> * Flask
>> * Gunicorn
>> * Tornado Web Server
>> * GNU Octave
>> * EPD Python
>> * NOAA Ferret
>> * gnuplot
>>
>> == Required Resources ==
>>
>> === Developer and user mailing lists ===
>>
>> * priv...@cmda.incubator.apache.org (with moderated subscriptions)
>> * comm...@cmda.incubator.apache.org
>> * d...@cmda.incubator.apache.org
>> * us...@cmda.incubator.apache.org
>>
>> A git repository
>>
>> https://git-wip-us.apache.org/repos/asf/incubator-cmda.git
>>
>> A JIRA issue tracker
>>
>> https://issues.apache.org/jira/browse/CMDA
>>
>> === Initial Committers ===
>>
>> The following is a list of the planned initial Apache committers (the
>> active subset of the committers for the current repository at Google Code).
>>
>> * Seungwon Lee (seungwon....@jpl.nasa.gov)
>> * Lei Pan (lei....@jpl.nasa.gov)
>> * Chengxing Zhai (chengxing.z...@jpl.nasa.gov)
>> * Benyang Tang (benyang.t...@jpl.nasa.gov)
>>
>>
>> === Affiliations ===
>>
>> JPL
>>
>> * Seungwon Lee
>> * Lei Pan
>> * Chengxing Zhai
>> * Benyang Tang
>>
>> CMU
>>
>> * Jia Zhang
>> * Wei Wang
>> * Chris Lee
>> * Xing Wei
>>
>> == Sponsors ==
>>
>> NASA
>>
>> === Champion ===
>>
>> Chris Mattmann (NASA/JPL)
>>
>> === Nominated Mentors ===
>>
>> TBD
>>
>> === Sponsoring Entity ===
>>
>> The Apache Incubator
>>
>>
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> Chris Mattmann, Ph.D.
>> Chief Architect
>> Instrument Software and Science Data Systems Section (398)
>> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
>> Office: 168-519, Mailstop: 168-527
>> Email: chris.a.mattm...@nasa.gov
>> WWW: http://sunset.usc.edu/~mattmann/
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> Adjunct Associate Professor, Computer Science Department
>> University of Southern California, Los Angeles, CA 90089 USA
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
>> For additional commands, e-mail: general-h...@incubator.apache.org
>> ---------------------------------------------------------------------