# Bucketing, Sorting and Partitioning

For file-based data sources, it is also possible to bucket and sort or partition the output. Bucketing and sorting are applicable only to persistent tables:

```
peopleDF.write.bucketBy(42, "name").sortBy("age").saveAsTable("people_bucketed")
```

Partitioning, in contrast, can be used with both `save` and `saveAsTable` when using the Dataset APIs:

```
usersDF.write.partitionBy("favorite_color").format("parquet").save("file:///tmp/namesPartByColor.parquet")
```

It is possible to use both partitioning and bucketing for a single table:

```
usersDF
  .write
  .partitionBy("favorite_color")
  .bucketBy(42, "name")
  .saveAsTable("users_partitioned_bucketed")
```

`partitionBy` creates a directory structure as described in the Partition Discovery section, with one directory per distinct value of each partition column. It therefore has limited applicability to columns with high cardinality.
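
To make the cardinality concern concrete, the following is a minimal sketch in plain Scala (no Spark dependency) of the `column=value` directory naming used by partitioned output. The object `PartitionLayoutSketch` and its helpers are hypothetical names introduced for illustration only. The sketch also ignores details such as escaping of special characters in values.

```scala
object PartitionLayoutSketch {
  // Hypothetical helper: builds the directory a row would be written under,
  // using one `column=value` path segment per partition column.
  def partitionDir(basePath: String, partitionValues: Seq[(String, String)]): String =
    (basePath +: partitionValues.map { case (col, value) => s"$col=$value" }).mkString("/")

  // Hypothetical helper: the number of directories created for a single
  // partition column equals the number of distinct values in that column.
  def directoryCount(columnValues: Seq[String]): Int =
    columnValues.distinct.size
}
```

For example, `partitionDir("/tmp/namesPartByColor.parquet", Seq("favorite_color" -> "red"))` yields `/tmp/namesPartByColor.parquet/favorite_color=red`. A column with millions of distinct values would produce millions of such directories, which is why partitioning suits low-cardinality columns.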

`bucketBy` distributes data across a fixed number of buckets and can be used when the number of unique values is unbounded.
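
The idea behind a fixed bucket count can be sketched in plain Scala as a hash of the bucketing column taken modulo the number of buckets. This is an illustration of the general technique, not Spark's implementation: `BucketAssignmentSketch` and `bucketFor` are hypothetical names, and `String.hashCode` is an assumed stand-in for whatever hash function Spark applies internally.

```scala
object BucketAssignmentSketch {
  // Hypothetical helper: maps a value to a bucket id in [0, numBuckets).
  // String.hashCode is a stand-in hash chosen only for illustration;
  // floorMod keeps the result non-negative even for negative hash codes.
  def bucketFor(value: String, numBuckets: Int): Int =
    java.lang.Math.floorMod(value.hashCode, numBuckets)
}
```

However many distinct names appear in the data, `bucketFor(name, 42)` always returns one of 42 bucket ids, and equal values always map to the same bucket. The number of output buckets therefore stays bounded even when the column's cardinality does not.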
