Hi Someshwar,

Thanks for the response. I have added my comments to the ticket
<https://issues.apache.org/jira/browse/SPARK-48463>.

Thanks,
Chhavi Bansal

On Thu, 6 Jun 2024 at 17:28, Someshwar Kale <skale1...@gmail.com> wrote:

> As a fix, you may consider adding a transformer to rename columns (perhaps
> replacing every dot in a column name with an underscore) and using the
> renamed columns in your pipeline, as below:
>
> val renameColumn = new RenameColumn().setInputCol("location.longitude").setOutputCol("location_longitude")
> val si = new StringIndexer().setInputCol("location_longitude").setOutputCol("longitutdee")
> val pipeline = new Pipeline().setStages(Array(renameColumn, si))
> pipeline.fit(flattenedDf).transform(flattenedDf).show()
>
> Refer to my comment
> <https://issues.apache.org/jira/browse/SPARK-48463?focusedCommentId=17852751&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17852751>
> for elaboration.
> Thanks!!
>
> *Regards,*
> *Someshwar Kale*
>
> On Thu, Jun 6, 2024 at 3:24 AM Chhavi Bansal <meetchhavi1...@gmail.com> wrote:
>
>> Hello team,
>> I was exploring the feature transformations exposed via MLlib on a nested
>> dataset, and encountered an error while applying any transformer to a
>> column with dot-notation naming. I thought of raising a ticket on Spark,
>> https://issues.apache.org/jira/browse/SPARK-48463, where I have
>> mentioned the entire scenario.
>>
>> I wanted to get suggestions on what would be the best way to solve the
>> problem while using the dot notation. One workaround is to use `_` while
>> flattening the dataframe, but that would mean the additional overhead
>> of converting back to `.` (dot notation), since that’s the convention for
>> our other flattened data.
>>
>> I would be happy to make a contribution to the code if someone can shed
>> some light on how this could be solved.
>>
>> --
>> Thanks and Regards,
>> Chhavi Bansal

--
Thanks and Regards,
Chhavi Bansal
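
A note on the quoted workaround: `RenameColumn` is not, to my knowledge, a stage that ships with Spark ML, so the snippet presumably relies on a custom transformer. The linked JIRA comment may define it differently; what follows is only a minimal sketch, assuming a `Transformer` subclass with `inputCol`/`outputCol` params (the class body and param docs are my own illustration, not code from the thread).

```scala
import org.apache.spark.ml.Transformer
import org.apache.spark.ml.param.{Param, ParamMap}
import org.apache.spark.ml.util.Identifiable
import org.apache.spark.sql.{DataFrame, Dataset}
import org.apache.spark.sql.types.StructType

// Hypothetical custom stage: renames a single column so that later stages
// (e.g. StringIndexer) never see a dot inside the column name.
class RenameColumn(override val uid: String) extends Transformer {

  def this() = this(Identifiable.randomUID("renameColumn"))

  final val inputCol = new Param[String](this, "inputCol", "name of the existing column")
  final val outputCol = new Param[String](this, "outputCol", "name to give the column")

  def setInputCol(value: String): this.type = set(inputCol, value)
  def setOutputCol(value: String): this.type = set(outputCol, value)

  // withColumnRenamed matches the existing name literally, so a name such as
  // "location.longitude" is not parsed as a struct-field access here.
  override def transform(dataset: Dataset[_]): DataFrame =
    dataset.withColumnRenamed($(inputCol), $(outputCol))

  // Mirror the rename in the schema so Pipeline.fit can validate later stages.
  // Uses an exact (case-sensitive) match; a missing column leaves the schema as-is.
  override def transformSchema(schema: StructType): StructType =
    StructType(schema.fields.map { f =>
      if (f.name == $(inputCol)) f.copy(name = $(outputCol)) else f
    })

  override def copy(extra: ParamMap): RenameColumn = defaultCopy(extra)
}
```

With a class like this on the classpath, the quoted `Pipeline` snippet should run as written. If the dotted names need to be restored afterwards, one option would be a second `RenameColumn` stage at the end of the pipeline with the input and output names swapped, though I have not checked how that interacts with every downstream consumer.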