Appreciated, Michael, but this doesn't help in my case: the filter string is submitted from outside my program. Is there any other alternative? Some literal string parser, or anything I can do beforehand?

Saif
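One possible pre-processing step (a sketch only, not from the thread; the helper name escapeAggColumns and the naive string replacement are illustrative) is to wrap any of the DataFrame's own column names that contain parentheses in backticks inside the incoming predicate before handing it to filter():

    import org.apache.spark.sql.DataFrame

    // Sketch: backtick-quote column names containing parentheses inside an
    // externally supplied predicate string. Naive substring replacement;
    // assumes the external string spells the column names exactly as they
    // appear in df.columns (e.g. "sum(OpenAccounts)").
    def escapeAggColumns(df: DataFrame, predicate: String): String =
      df.columns
        .filter(_.contains("("))          // only names the SQL parser would misread
        .sortBy(-_.length)                // longest names first to limit partial overlaps
        .foldLeft(predicate)((pred, c) => pred.replace(c, s"`$c`"))

    // usage: df.filter(escapeAggColumns(df, "sum(OpenAccounts) > 5")).show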
From: Michael Armbrust [mailto:mich...@databricks.com]
Sent: Wednesday, April 13, 2016 6:29 PM
To: Ellafi, Saif A.
Cc: user
Subject: Re: Strange bug: Filter problem with parenthesis

You need to use `backticks` to reference columns that have non-standard characters.

On Wed, Apr 13, 2016 at 6:56 AM, <saif.a.ell...@wellsfargo.com> wrote:

Hi,

I am debugging a program, and for some reason a line calling the following is failing:

df.filter("sum(OpenAccounts) > 5").show

It says it cannot find the column OpenAccounts, as if it were applying the sum() function and looking for a column by that name, which does not exist. This works fine if I rename the column to something without parentheses. I can't reproduce this issue in the Spark shell (1.6.0); any ideas on how I can analyze this? This is an aggregation result, with the default column names afterwards.

PS: A workaround is to use toDF(cols) to rename all columns, but I am wondering whether toDF has any impact on the underlying RDD structure (e.g. repartitioning, cache, etc.).

Appreciated,
Saif
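For reference, a minimal sketch of the two fixes discussed above; the pre-aggregation DataFrame `raw`, the key column "SomeKey", and the alias "OpenAccountsSum" are hypothetical names, not from the thread:

    // 1) Backtick-quote the generated column name inside the filter expression:
    df.filter("`sum(OpenAccounts)` > 5").show

    // 2) Avoid the generated name altogether by aliasing the aggregate when it
    //    is produced (similar in effect to the toDF(...) rename workaround):
    import org.apache.spark.sql.functions.sum
    val aggregated = raw.groupBy("SomeKey").agg(sum("OpenAccounts").as("OpenAccountsSum"))
    aggregated.filter("OpenAccountsSum > 5").show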