Arky,
  This sounds great.  Are you using the new language models I added to
Tika 2.x?  Those include the languages you mention and a couple more
you requested earlier (?).

 Cheers,

         Tim

On Wed, Nov 10, 2021 at 2:13 PM Arky <hitmana...@gmail.com> wrote:
>
> Hi,
>
> My personal interest is to get Tika to work better with Southern and
> Southeast Asian languages.
>
> Any conversation on how we could contribute corpora to help train the
> models for languages like Burmese, Thai, Khmer and Vietnamese would be
> great.
>
>
> Apart from general introductions, I would be happy to give a use case of
> how downstream projects use Tika for their work to ingest and extract
> data from multi-lingual documents.
>
> Cheers
>
> --arky
>
>
>
>
>
> On 11/11/21 1:18 AM, Tim Allison wrote:
> > But seriously... how about a hands-on workshop on tika-pipes for the
> > first week of December (focus on fileshare to Solr)?  We can follow
> > Eric's recommendation of having a brief go-around the room to introduce
> > each other and then a smaller actual tutorial.
> >
> > Was the day of week/time of day ok?  I realize that TWTh can be heavy
> > meeting days for some, but I also know that folks take MF off. :D
> >
> > On Tue, Nov 9, 2021 at 3:53 PM Tim Allison <talli...@apache.org> wrote:
> >>
> >> Will sign up Ken for next week....kidding.  Yes, that sounds great
> >> when you're ready!
> >>
> >> On Tue, Nov 9, 2021 at 3:16 PM Ken Krugler <kkrugler_li...@transpac.com> wrote:
> >>>
> >>> Hi Tim,
> >>>
> >>> Maybe how to embed Tika in a scalable processing framework (Flink, Spark, 
> >>> AWS Lambda???) to process a large corpus in parallel?
> >>>
> >>> — Ken
> >>>
> >>>> On Nov 9, 2021, at 11:00 AM, Tim Allison <talli...@apache.org> wrote:
> >>>>
> >>>> All,
> >>>>    Many thanks to those who attended today.  It was great to e-meet
> >>>> old friends and users from around the world.  Many thanks to Lewis
> >>>> McGibbney for getting the ball rolling on these.
> >>>>    Let's use this thread to discuss possible topics and scheduling for
> >>>> the next meetups?
> >>>>
> >>>> Question 1: Pace...one a month or so?
> >>>>
> >>>> Question 2: Topics?
> >>>> a) tika-pipes hands-on workshop
> >>>> b) get to know the users -- 5 minute go-around the room "this is how
> >>>> we use it; these are our pain points"
> >>>> c) ???
> >>>>
> >>>>   Again, thank you!
> >>>>
> >>>>            Best,
> >>>>
> >>>>                   Tim
> >>>
> >>> --------------------------
> >>> Ken Krugler
> >>> http://www.scaleunlimited.com
> >>> Custom big data solutions
> >>> Flink, Pinot, Solr, Elasticsearch
> >>>
> >>>
> >>>
>
