Hi Patrick,

Tried doing that, but it's becoming null for all the dates after the transformation with functions:

df2 = dflead.select('Enter_Date', f.to_date(dflead.Enter_Date))

[image: Inline image 1]

Any insight?

Thanks,
Aakash.
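For reference, a minimal sketch of the kind of fix discussed in the thread below, assuming Spark 2.1.1 and that 'Enter_Date' was ingested as a string such as '08/15/15' (the dflead and df2 names are taken from the snippets in this thread):

from pyspark.sql import functions as f

# Sketch only: on Spark 2.1, f.to_date() takes no format pattern (the
# two-argument form arrived in Spark 2.2), so non-ISO strings are cast to
# date and come back as null. Parsing with an explicit pattern via
# unix_timestamp() and then casting avoids that.
df2 = dflead.select(
    'Enter_Date',
    f.from_unixtime(f.unix_timestamp('Enter_Date', 'MM/dd/yy'))
     .cast('date')
     .alias('date')
)

The cast to date does what to_date() would otherwise do; doing it after unix_timestamp() simply means the string has already been parsed with an explicit pattern.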
On Fri, Aug 18, 2017 at 12:23 AM, Patrick Alwell <palw...@hortonworks.com> wrote:

> Aakash,
>
> I’ve had similar issues with date-time formatting. Try using the functions
> library from pyspark.sql and the DF withColumn() method.
>
> ——————————————————————————————
>
> from pyspark.sql import functions as f
>
> lineitem_df = lineitem_df.withColumn('shipdate', f.to_date(lineitem_df.shipdate))
>
> ——————————————————————————————
>
> You should have first ingested the column as a string, and then leveraged
> the DF API to make the conversion to DateType.
>
> That should work.
>
> Kind Regards
>
> -Pat Alwell
>
>
> On Aug 17, 2017, at 11:48 AM, Aakash Basu <aakash.spark....@gmail.com> wrote:
>
> Hey all,
>
> Thanks! I had a discussion with the person who authored that package and
> informed them about this bug, but in the meantime, with the same package,
> found a small tweak to ensure the job is done.
>
> Now that is fine: I'm getting the date as a string by predefining the
> schema, but when I later try to convert it to a datetime format, it comes out like this -
>
> >>> from pyspark.sql.functions import from_unixtime, unix_timestamp
> >>> df2 = dflead.select('Enter_Date', from_unixtime(unix_timestamp('Enter_Date', 'MM/dd/yyy')).alias('date'))
> >>> df2.show()
>
> <image.png>
>
> Which is not correct, as it is converting the 15 to 0015 instead of 2015.
> Do you guys think using the DateUtil package will solve this? Or is there
> any other solution with this built-in package?
>
> Please help!
>
> Thanks,
> Aakash.
>
> On Thu, Aug 17, 2017 at 12:01 AM, Jörn Franke <jornfra...@gmail.com> wrote:
>
>> You can use Apache POI DateUtil to convert double to Date
>> (https://poi.apache.org/apidocs/org/apache/poi/ss/usermodel/DateUtil.html).
>> Alternatively you can try HadoopOffice (https://github.com/ZuInnoTe/hadoopoffice/wiki);
>> it supports Spark 1.x and Spark 2.0 data sources.
>>
>> On 16. Aug 2017, at 20:15, Aakash Basu <aakash.spark....@gmail.com> wrote:
>>
>> Hey Irving,
>>
>> Thanks for the quick revert. In Excel that column is purely string. I
>> actually want to import it as a string and later play around with the DF
>> to convert it back to date type, but the API itself is not allowing me to
>> assign a schema to the DF dynamically, and I'm forced to use inferSchema,
>> which converts all numeric columns to double (though I don't know how the
>> date column is getting converted to double if it is a string in the Excel
>> source).
>>
>> Thanks,
>> Aakash.
>>
>>
>> On 16-Aug-2017 11:39 PM, "Irving Duran" <irving.du...@gmail.com> wrote:
>>
>> I think there is a difference between the actual value in the cell and
>> how Excel formats that cell. You probably want to import that field as a
>> string, or not have it as a date format in Excel.
>>
>> Just a thought....
>>
>>
>> Thank You,
>>
>> Irving Duran
>>
>> On Wed, Aug 16, 2017 at 12:47 PM, Aakash Basu <aakash.spark....@gmail.com> wrote:
>>
>>> Hey all,
>>>
>>> Forgot to attach the link to the discussion about overriding the schema
>>> through the external package.
>>>
>>> https://github.com/crealytics/spark-excel/pull/13
>>>
>>> You can see my comment there too.
>>>
>>> Thanks,
>>> Aakash.
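On the 0015-versus-2015 behaviour quoted above, a small sketch of how the two patterns differ; the one-row DataFrame and the `spark` session name are purely illustrative:

from pyspark.sql import functions as f

# Hypothetical one-row DataFrame just to illustrate the two patterns;
# assumes an existing SparkSession named `spark` (Spark 2.x).
demo = spark.createDataFrame([('08/15/15',)], ['Enter_Date'])

# Java's SimpleDateFormat (which unix_timestamp uses) reads 'yy' as a
# two-digit year relative to the current date (15 -> 2015), while 'yyy' or
# 'yyyy' read the digits literally (15 -> year 0015).
demo.select(
    f.from_unixtime(f.unix_timestamp('Enter_Date', 'MM/dd/yyy')).alias('yyy_pattern'),
    f.from_unixtime(f.unix_timestamp('Enter_Date', 'MM/dd/yy')).alias('yy_pattern'),
).show(truncate=False)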
>>>
>>> On Wed, Aug 16, 2017 at 11:11 PM, Aakash Basu <aakash.spark....@gmail.com> wrote:
>>>
>>>> Hi all,
>>>>
>>>> I am working on PySpark (*Python 3.6 and Spark 2.1.1*) and trying to
>>>> fetch data from an Excel file using
>>>> *spark.read.format("com.crealytics.spark.excel")*, but it is inferring
>>>> double for a date type column.
>>>>
>>>> The detailed description is given here (the question I posted) -
>>>>
>>>> https://stackoverflow.com/questions/45713699/inferschema-using-spark-read-formatcom-crealytics-spark-excel-is-inferring-d
>>>>
>>>> Found it to be a probable bug in the crealytics Excel read package.
>>>>
>>>> Can somebody help me with a workaround for this?
>>>>
>>>> Thanks,
>>>> Aakash.
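If the crealytics reader does hand the column back as a double, one workaround (independent of the POI DateUtil route mentioned above) is to treat the value as an Excel serial date, i.e. days counted from 1899-12-30. A minimal sketch, with the DataFrame and column names (df, 'Enter_Date') purely illustrative:

from datetime import date, timedelta

from pyspark.sql import functions as f
from pyspark.sql.types import DateType

# Sketch only: Excel stores dates as serial numbers counted from 1899-12-30
# (that offset absorbs Excel's historical 1900 leap-year quirk for modern dates).
excel_serial_to_date = f.udf(
    lambda serial: date(1899, 12, 30) + timedelta(days=int(serial))
    if serial is not None else None,
    DateType(),
)

# `df` and 'Enter_Date' are placeholders for the DataFrame and column
# produced by the Excel read.
df = df.withColumn('Enter_Date', excel_serial_to_date(df['Enter_Date']))

Whether this or predefining the schema (so the column never becomes a double in the first place) is cleaner depends on which spark-excel version is in use.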