I am new to PySpark and am learning it in order to complete my thesis project at university.
I am trying to create a DataFrame by reading a table from a PostgreSQL database, but I run into a problem when I try to connect my PySpark application to the PostgreSQL server. Could you please explain the steps required to connect to the database successfully? I am using Python 2.7, spark-2.3.0-bin-hadoop2.7, the PyCharm IDE, and Windows.

What I have done so far is launch a pyspark shell with --jars /path to postgresql jar/ and then run:

df = sqlContext.read.jdbc(url='jdbc:postgresql://localhost:port/[database]?user='username'&password='paswd', table='table name')
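For reference, here is a minimal sketch of how a read like this is commonly set up. It is not a confirmed fix for the problem above. It assumes the PostgreSQL JDBC driver jar is visible to Spark and passes the credentials through the `properties` argument instead of embedding them in the URL. The helper names (`jdbc_settings`, `read_table`, `main`), the `JAR_PATH` placeholder, the database and table names, and the port 5432 (PostgreSQL's default) are all illustrative assumptions, not values from the original setup.

```python
# Sketch only: names, paths, and credentials below are placeholders.

JAR_PATH = "/path/to/postgresql.jar"  # placeholder: point this at the real driver jar


def jdbc_settings(host, port, database, user, password):
    """Build the JDBC URL and connection properties for PostgreSQL."""
    url = "jdbc:postgresql://{0}:{1}/{2}".format(host, port, database)
    properties = {
        "user": user,
        "password": password,
        # Driver class shipped in the PostgreSQL JDBC jar.
        "driver": "org.postgresql.Driver",
    }
    return url, properties


def read_table(spark, table, url, properties):
    """Read one table into a DataFrame through the JDBC data source."""
    return spark.read.jdbc(url=url, table=table, properties=properties)


def main():
    # Imported here so the helpers above can be used without Spark installed.
    from pyspark.sql import SparkSession

    # When the script is started from an IDE instead of the pyspark shell,
    # the --jars flag given to the shell does not apply, so the jar is
    # configured on the session itself.
    spark = (
        SparkSession.builder
        .appName("postgres-read-sketch")
        .config("spark.jars", JAR_PATH)
        .getOrCreate()
    )

    # Hypothetical connection details; replace with real ones.
    url, properties = jdbc_settings("localhost", 5432, "mydb", "username", "passwd")
    df = read_table(spark, "my_table", url, properties)
    df.printSchema()
    spark.stop()
```

Calling `main()` from a script run in PyCharm would attempt the connection. Keeping the user name and password in `properties` avoids the nested quotes that the URL form requires, and naming the driver class explicitly can help when Spark does not pick it up from the jar on its own.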