Here is what I ended up doing. Improvements are welcome.
from pyspark.sql import SQLContext, Row
from pyspark.sql.types import StructType, StructField, IntegerType, StringType
from pyspark.sql.functions import asc, desc, sum, count
sqlContext = SQLContext(sc)
error_schema = StructType([
I have the following simple example that I can't get to work correctly.
In [1]:
from pyspark.sql import SQLContext, Row
from pyspark.sql.types import StructType, StructField, IntegerType, StringType
from pyspark.sql.functions import asc, desc, sum, count
sqlContext = SQLContext(sc)
error_schema
I am using PySpark with Netezza. I am getting a Java exception when trying to show the first row of a join. I can show the first row of each of the two dataframes separately, but not the result of the join. I get the same error for any action I take (first, collect, show). Am I doing something wrong?