- `SORT BY` in Spark SQL results in a narrow dependency.
- `sort` in the Dataset API and `ORDER BY` in Spark SQL result in a wide dependency.
What is the difference between `SORT BY` in Spark SQL, `sort` in the Dataset API, and `ORDER BY` in Spark SQL?
Vinay K L
1 Answer
There are two different things here:
- In general, Spark uses `sort` as an alias for `orderBy` (see "What is the difference between sort and orderBy functions in Spark").
- Hive has a `SORT BY` clause, which sorts data locally per partition. In Spark, this operation is called `sortWithinPartitions`.
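To make the distinction concrete, here is a minimal sketch in plain Python. It is a toy model, not Spark code: a "dataset" is just a list of lists standing in for partitions, and the function names `sort_within_partitions` and `order_by` are made up for illustration. The first mimics what `SORT BY` / `sortWithinPartitions` does (no rows cross partition boundaries, so the dependency is narrow); the second mimics the result of `ORDER BY` / `sort` / `orderBy` (rows have to be redistributed across partitions, which is why Spark needs a shuffle and the dependency is wide).

```python
# Toy model only: each inner list stands in for one partition.
# Nothing here calls Spark; the helper names are hypothetical.

def sort_within_partitions(partitions):
    """Mimics SORT BY / sortWithinPartitions.

    Every partition is sorted independently. No row leaves its
    partition, so each output partition depends on exactly one
    input partition (narrow dependency).
    """
    return [sorted(p) for p in partitions]


def order_by(partitions):
    """Mimics the result of ORDER BY / sort / orderBy.

    For simplicity this gathers all rows, sorts them, and cuts the
    result into contiguous ranges. Spark reaches the same kind of
    layout differently (it redistributes rows by value range and then
    sorts each partition), but either way an output partition can
    depend on every input partition (wide dependency).
    """
    n = len(partitions)
    rows = sorted(x for p in partitions for x in p)
    size = -(-len(rows) // n)  # ceiling division
    return [rows[i * size:(i + 1) * size] for i in range(n)]


data = [[5, 1, 9], [4, 8, 2], [7, 3, 6]]

local = sort_within_partitions(data)
total = order_by(data)

print(local)  # [[1, 5, 9], [2, 4, 8], [3, 6, 7]]
print(total)  # [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
```

Reading the partitions of `local` back to back gives `1, 5, 9, 2, 4, 8, ...`, which is not globally ordered, while `total` is ordered across partition boundaries. That matches the dependencies observed in the question.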
user11056709