I am using Spark 1.3 and would like to join on multiple columns using the Python interface (SparkSQL).
The following works. I first register both DataFrames as temp tables, then join on a single column:
numeric.registerTempTable("numeric")
Ref.registerTempTable("Ref")
test = numeric.join(Ref, numeric.ID == Ref.ID, joinType='inner')
I would now like to join them based on multiple columns.
I get SyntaxError: invalid syntax with this:
test = numeric.join(Ref,
numeric.ID == Ref.ID AND numeric.TYPE == Ref.TYPE AND
numeric.STATUS == Ref.STATUS , joinType='inner')
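For context on the error: AND is not a Python operator, so the parser rejects the expression before Spark ever sees it. PySpark column expressions are normally combined with & instead, and each comparison needs its own parentheses because & binds more tightly than == in Python. The sketch below illustrates this with a small stand-in class so it runs without a Spark installation. Col and join_condition are hypothetical names used only for illustration and are not part of pyspark. The commented lines at the end show what the equivalent join would probably look like with the real DataFrames.

```python
# Hypothetical stand-in for a Spark Column, used only to show how Python
# evaluates the condition. It is NOT part of pyspark.
class Col:
    def __init__(self, expr):
        self.expr = expr

    def __eq__(self, other):
        # Build an expression instead of returning a bool, as Spark columns do.
        return Col("(%s = %s)" % (self.expr, other.expr))

    def __and__(self, other):
        # `&` is overloadable; the keyword `and` is not, and `AND` does not exist.
        return Col("(%s AND %s)" % (self.expr, other.expr))


def join_condition(left, right, columns):
    """Combine one equality test per column with `&` (illustrative helper)."""
    cond = None
    for name in columns:
        # Parenthesise each comparison so `==` is evaluated before `&`.
        eq = (Col(left + "." + name) == Col(right + "." + name))
        cond = eq if cond is None else (cond & eq)
    return cond


cond = join_condition("numeric", "Ref", ["ID", "TYPE", "STATUS"])
print(cond.expr)

# With the real DataFrames the same shape would be (assumed, untested here):
# cond = ((numeric.ID == Ref.ID) &
#         (numeric.TYPE == Ref.TYPE) &
#         (numeric.STATUS == Ref.STATUS))
# test = numeric.join(Ref, cond, 'inner')
```

Passing the join type positionally avoids depending on the keyword name, which differs between Spark versions.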