Hello all,

I would like to start off with an introduction. My name is Andrew Evans. I 
have 3 years of programming and development experience in Java, Scala, Python, 
PostgreSQL, Spring (Boot, REST), etc. I am also working on a startup to make 
medium-sized datasets more useful and to build mobile applications that 
combine text and numeric data to make better predictions.

Between this and full-time work, I have started several open source (currently 
BSD 2-Clause licensed) projects which could benefit the community by giving 
big data users a simplified pipeline for ETL and acquisition, as well as a 
Scala/Java version of Fabric for simplified system administration.

I could really use some help making the following projects better. Full SRS 
and SDS documents are available.

OpenETL - A pipeline built around Pentaho that adds data quality assurance and 
other basics such as initial SQL importing, communications, file system 
management, and large document parsing as needed.
https://github.com/asevans48/OpenETL


Acquisition Tools - A set of tools for acquiring and initially parsing data 
from any source, over networks or via file systems, with the aim of also 
supporting images and NLP. The current system is parallelizable and threadable, 
with a few tools to improve acquisition and initial intake.
https://github.com/asevans48/AcquisitionTools


ScalaFabric - Much broader in scope, but still fairly simple. It includes 
wrappers around the AWS SDK and Mesos SDK as well as interaction with the REST 
templates for Marathon and Chronos using Apache HttpComponents. A pipeline is 
in place to allow entire clusters to be generated from a single line of code 
and serialized classes or JSON objects, currently using FasterXML.
https://github.com/asevans48/ScalaFabric

Potentially, all three could be wrapped into a single environment, with the 
last providing Carte or acquisition node support. The program is already set 
up to support multiple clusters.

If anyone is interested in helping, please let me know. Even a fork of one or 
more of the projects would be nice. I would be happy to shoot the SRS, SDS, and 
other docs over and get you integrated into the Scrum board at SeeNowDo. It is 
also possible to generate Javadoc from the code.

I do dream of one day making all three Apache-level projects.

Thank you for your time,

Andrew Evans
Java Dev @ Hygenics Data, LLC
Co-Founder and Dev @ SimplrTek, LLC and its subsidiaries
