Re: How to delete a record from parquet files using dataframes

Jakob Odersky Wed, 24 Feb 2016 15:38:39 -0800

You can `filter` (scaladoc
<http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.DataFrame@filter%28String%29:DataFrame>)
your DataFrames before saving them to, or after reading them from, Parquet
files.
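
To make that concrete, here is a minimal sketch of the read, filter, and
rewrite cycle. It assumes a Spark 2.x or later `SparkSession` (on 1.x you
would go through `sqlContext` instead). The paths and the `id` column are
hypothetical placeholders, not something from the original question.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

object DeleteFromParquet {
  def main(args: Array[String]): Unit = {
    // The master URL is expected to come from spark-submit.
    val spark = SparkSession.builder()
      .appName("delete-from-parquet")
      .getOrCreate()

    // Hypothetical locations; substitute your own.
    val inputPath  = "hdfs:///data/records"
    val outputPath = "hdfs:///data/records_cleaned"

    val records = spark.read.parquet(inputPath)

    // Keep every row except the one to "delete" (hypothetical column `id`).
    // Note: rows whose `id` is null are dropped as well, because the
    // comparison evaluates to null rather than true for them.
    val kept = records.filter(col("id") =!= 42)

    // Write the surviving rows to a new location; the original files
    // are left untouched.
    kept.write.parquet(outputPath)

    spark.stop()
  }
}
```

The sketch writes to a separate output path on purpose: overwriting the
directory that the same job is still reading from is not safe. Once the new
files are verified, the old directory can be removed or swapped out.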


On Wed, Feb 24, 2016 at 1:28 AM, Cheng Lian <l...@databricks.com> wrote:

> Parquet files are immutable once written. So the only way to remove data
> from a written Parquet file is to write a new Parquet file without the
> unwanted rows.
>
> Cheng
>
>
> On 2/17/16 5:11 AM, SRK wrote:
>
>> Hi,
>>
>> I am saving my records as Parquet files in HDFS using DataFrames.
>> How can I delete records using DataFrames?
>>
>> Thanks!
>>
>>
>>
>> --
>> View this message in context:
>> http://apache-spark-user-list.1001560.n3.nabble.com/How-to-delete-a-record-from-parquet-files-using-dataframes-tp26242.html
>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>> For additional commands, e-mail: user-h...@spark.apache.org
>>
>>
>>
