Hi Chris,

First of all, thank you so much for your feedback :)

I'm definitely interested in following Apache Tika closely, not only for this
opportunity but also because I may have specific capabilities to implement for
some of our clients :-P
I didn't know anything about your involvement in other standards, so this is
awesome!

Inside Apache ManifoldCF we have a Tika Transformer that we can use in the
pipeline when we start a scheduled job against a content repository for an
indexing or migration process.
I know something about Tika because I also typically work on Solr, but
following the mailing list will probably give me a broader view of it.

I'll ask if we can bring more people onto the committee, and I think that your
contribution would be extremely valuable.
I'll let you know of any updates on this.

Cheers,
PJ


2018-05-23 2:07 GMT+02:00 Chris Mattmann <mattm...@apache.org>:

> PJ, I invite you to join and comment on the Tika lists. We are already
> working on standards in a number of the areas below, including even beyond
> some of the basic things you cite. For example, we are already doing
> Sentiment Analysis, Deep Learning, and other NLP, and have these integrated
> into Tika as part of a broader ecosystem.
>
>
>
> Feel free to join the discussion at d...@tika.apache.org.
>
>
>
> You can also read more about it under Advanced Content Integration here:
>
>
>
> https://wiki.apache.org/tika/#Advanced_Content_Extraction_with_Tika_-_Integration
>
>
>
> Look also at NER, Object Detection, Text Captioning and Computer Vision.
>
>
>
> Regarding participation in this committee at ECM, I’m definitely interested
> if it’s worthwhile.
>
>
>
> Chris Mattmann
>
>
>
>
>
>
>
> From: Piergiorgio Lucidi <piergior...@apache.org>
> Reply-To: "dev@community.apache.org" <dev@community.apache.org>
> Date: Tuesday, May 22, 2018 at 4:30 PM
> To: "dev@community.apache.org" <dev@community.apache.org>
> Subject: ASF involvement in the new ECM Standard Committee
>
>
>
> Hi,
>
>
>
> I'm directly involved in the new committee dedicated to designing the new
> white papers about the ECM / Content Services guidelines and toolkits. The
> main goal of these documents is to suggest best practices, guidelines and,
> starting from this year, Open Source technology stacks to use in the
> enterprise context.
>
>
>
> I spent the last three years contributing to the AIIM committee
> with Betsy Fanning, but now we will have a new home with a new team.
> Yesterday I had a very interesting discussion with Robert Blatt about the
> new direction to follow for the next development. The Open Source topic
> will be the most relevant one in the next iteration of our work, and we are
> discussing a potential white paper entirely dedicated to the Open
> Source alternatives in the market.
>
>
>
> Even if I'm currently contributing as an individual to this committee, it
> seems that we could be involved as a foundation in this project. I think
> that it could be a good opportunity to spread our brand through
> collaborations like this. We know best practices, approaches and technology
> stacks in which we have a huge amount of experience, skills and projects.
>
>
>
> I'm wondering if the ASF has ever been involved in this kind of
> contribution, or if there could be any problem with our involvement in this
> in terms of brand. I have to ask for more details about this program, but in
> the meantime I would like to receive some feedback from you. I'm also asking
> because Robert Blatt is very interested in involving us officially in the
> program.
>
>
>
> I would like to thank Shane for sharing the framework published by Mozilla
> some days ago in our ComDev room on HipChat.
> Mozilla published a very interesting report that also includes some
> technology stacks:
> https://blog.mozilla.org/blog/2018/05/15/whats-your-open-source-strategy-here-are-10-answers/
>
>
>
> Specifically, we are talking about areas such as Content, Search and
> Capture. Even if OCR is not present in our projects, we have some native
> integrations, for example with Tesseract in Tika. It would be interesting to
> understand which Apache projects can be combined with external libraries to
> build a custom Capture Services solution.
>
>
>
> For example, considering the involvement of Tesseract, the proposal could
> be the following:
>
>
>
>    - Apache ManifoldCF for crawling any source content repository (API ->
>    contents as images or PDF)
>    - Apache PDFBox for extracting images from PDF
>    - Apache ManifoldCF for injecting contents in Solr
>    - Tesseract for extracting text from images (configured inside Apache
>    Tika)
>    - Apache Solr for indexing extracted text
>
>
>
> We could also try to design a section entirely dedicated to the Apache
> technology stacks:
>
>
>
>    - Apache Content Services (Jackrabbit, ...)
>    - Apache Search Services (Lucene, Solr, ManifoldCF)
>    - Apache Semantic Services (UIMA, Stanbol, ...)
>    - Apache BigData Services (Hadoop, ...)
>    - Apache DevOps Services (Mesos, ...)
>    - Apache Libraries Services (Commons, ...)
>    - ... and so on :-P
>
>
>
> This potential work could also be useful internally, to create our new
> Apache brochures dedicated to specific areas of our proposal.
> I'm not talking about something focused only on technologies, but also on
> best practices, approaches and a good path to natural adoption.
>
>
>
> I'm trying to understand whether contributing on one side (ECM Standards)
> can help me to design and improve our Apache brochures.
> On the other hand, the Apache areas can also be useful for the new white
> papers.
>
>
>
> Please let me know what you think.
> Thank you.
>
>
>
> Cheers,
> PJ
>
>
>
> --
> Piergiorgio
>
>
>
>


-- 
Piergiorgio Lucidi
Open Source Evangelist and Digital Transformation Specialist
Member / Mentor / PMC Member / Committer @ The Apache Software Foundation
Community Star / Wiki Gardener / Global Forum Moderator @ Alfresco
Author and Technical Reviewer @ Packt Publishing
Technical Advisory Group Member @ Microsoft
Top Community Contributor @ Crafter
Project Leader / Committer @ JBoss
https://www.open4dev.com
