How about mapping each string to a number? Maybe you can do it with a custom Transformer.
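A minimal sketch of what such a mapping could look like, in plain Scala and without any FlinkML dependency. The names `StringIndexer`, `fit` and `transform` are hypothetical and chosen only for illustration; wrapping this in an actual FlinkML Transformer is not shown here.

```scala
// Hypothetical helper, not part of FlinkML: assigns each distinct string a
// numeric index so that token sequences can be stored as Double values.
object StringIndexer {

  // Build the vocabulary: each distinct token gets an index (0, 1, 2, ...)
  // in order of its first occurrence.
  def fit(sentences: Seq[Seq[String]]): Map[String, Int] =
    sentences.flatten.distinct.zipWithIndex.toMap

  // Replace each token by its index; tokens not in the vocabulary map to -1.0.
  def transform(vocab: Map[String, Int], sentence: Seq[String]): Array[Double] =
    sentence.map(token => vocab.getOrElse(token, -1).toDouble).toArray
}

object StringIndexerExample extends App {
  val sentences = Seq(Seq("the", "dog", "barks"), Seq("the", "cat", "sleeps"))
  val vocab = StringIndexer.fit(sentences)

  // "loudly" was never seen during fit, so it is encoded as -1.0.
  val encoded = StringIndexer.transform(vocab, Seq("the", "cat", "barks", "loudly"))
  println(encoded.mkString(", ")) // prints "0.0, 3.0, 2.0, -1.0"
}
```

Note that the resulting numbers are only identifiers: algebraic operations such as a scalar product on them carry no meaning, so this only helps where the algorithm treats the values as categories.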

> On Jan 19, 2016, at 12:02 AM, Hilmi Yildirim <hilmi.yildi...@dfki.de> wrote:
>
> Ok. In this case I will use an Array instead.
>
> On 18.01.2016 at 14:56, Theodore Vasiloudis wrote:
>> I agree with Till, the data types are different here so you need a custom
>> string vector.
>>
>> The Vector abstraction in FlinkML is designed with numerical vectors in
>> mind.
>>
>> On Mon, Jan 18, 2016 at 2:33 PM, Till Rohrmann <trohrm...@apache.org> wrote:
>>
>>> Hi Hilmi,
>>>
>>> I think in your case it makes sense to define a custom vector of strings.
>>> The easiest implementation could be an Array[String] or List[String].
>>>
>>> The reason why it does not make much sense to make Vector and DenseVector
>>> generic is that these types are algebraic data types. How would you define
>>> algebraic operations such as scalar product, outer product, multiplication,
>>> etc. on a vector of strings? You would have to provide different
>>> implementations for the different type parameters.
>>>
>>> Cheers,
>>> Till
>>>
>>>
>>> On Mon, Jan 18, 2016 at 1:40 PM, Hilmi Yildirim <hilmi.yildi...@dfki.de>
>>> wrote:
>>>
>>>> Hi,
>>>> as I explained in a previous e-mail, I need a LabeledVector where the
>>>> label is also a vector. After we discussed this issue, I created a new
>>>> class named LabeledSequenceVector with the labels as a Vector. In my use
>>>> case, I want to train a POS tagger, so the "vector" is a vector of
>>>> strings and the "labels" are also a vector of strings. If I use the Flink
>>>> Vector/DenseVector implementation, the vector can only hold Double
>>>> values, but I need String values.
>>>>
>>>> Best Regards,
>>>> Hilmi
>>>>
>>>>
>>>> On 18.01.2016 at 13:33, Chiwan Park wrote:
>>>>
>>>>> Hi Hilmi,
>>>>>
>>>>> In NLP, which types are used for vector values? I think we can cover
>>>>> the typical case using double values.
>>>>>
>>>>> On Jan 18, 2016, at 9:19 PM, Hilmi Yildirim <hilmi.yildi...@dfki.de> wrote:
>>>>>>
>>>>>> Hi,
>>>>>> the Vector and DenseVector implementations of Flink ML only allow Double
>>>>>> values. But there are cases where the values are not Doubles, e.g. in NLP.
>>>>>> Does it make sense to make the implementations generic, i.e. Vector[T] and
>>>>>> DenseVector[T]?
>>>>>>
>>>>>> Best Regards,
>>>>>> Hilmi
>>>>>>
>>>>>> --
>>>>>> ==================================================================
>>>>>> Hilmi Yildirim, M.Sc.
>>>>>> Researcher
>>>>>>
>>>>>> DFKI GmbH
>>>>>> Intelligente Analytik für Massendaten
>>>>>> DFKI Projektbüro Berlin
>>>>>> Alt-Moabit 91c
>>>>>> D-10559 Berlin
>>>>>> Phone: +49 30 23895 1814
>>>>>>
>>>>>> E-Mail: hilmi.yildi...@dfki.de
>>>>>>
>>>>>> -------------------------------------------------------------
>>>>>> Deutsches Forschungszentrum fuer Kuenstliche Intelligenz GmbH
>>>>>> Firmensitz: Trippstadter Strasse 122, D-67663 Kaiserslautern
>>>>>>
>>>>>> Geschaeftsfuehrung:
>>>>>> Prof. Dr. Dr. h.c. mult. Wolfgang Wahlster (Vorsitzender)
>>>>>> Dr. Walter Olthoff
>>>>>>
>>>>>> Vorsitzender des Aufsichtsrats:
>>>>>> Prof. Dr. h.c. Hans A. Aukes
>>>>>>
>>>>>> Amtsgericht Kaiserslautern, HRB 2313
>>>>>> -------------------------------------------------------------
>>>>>
>>>>> Regards,
>>>>> Chiwan Park

Regards,
Chiwan Park