Chen Hirsh's Data Engineering Blog

Write data to one CSV file in Databricks

Exporting data to a CSV file in Databricks can result in multiple part files, odd filenames, and extra metadata files, none of which is ideal when sharing data externally. This guide explores two practical solutions: using Pandas for small datasets, and using Spark’s coalesce to consolidate partitions into a single, clean file. Learn how to choose the right approach for your use case so that your CSV exports are efficient and easy to share.
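As a rough illustration of the two approaches named above, here is a minimal sketch, not the post's actual code. The function names and paths are hypothetical, `spark_df` is assumed to be an existing Spark DataFrame, and the file-copy helper assumes the output directory is visible through the local filesystem (on Databricks that typically means a `/dbfs/...` style path, which is an assumption about your workspace setup).

```python
import glob
import os
import shutil


def write_small_with_pandas(spark_df, target_path):
    """Small datasets: collect to the driver and let pandas write one file."""
    # toPandas() loads every row into driver memory, so this only suits small data.
    spark_df.toPandas().to_csv(target_path, index=False)


def write_single_part_with_spark(spark_df, output_dir):
    """Larger datasets: collapse to one partition so Spark emits one part file."""
    # Spark still writes a directory containing the part file plus marker files.
    (
        spark_df.coalesce(1)
        .write.mode("overwrite")
        .option("header", True)
        .csv(output_dir)
    )


def promote_part_file(output_dir, target_path):
    """Copy the single part-*.csv file in output_dir to a clean filename."""
    parts = sorted(glob.glob(os.path.join(output_dir, "part-*.csv")))
    if len(parts) != 1:
        raise ValueError(f"expected exactly one part file, found {len(parts)}")
    shutil.copyfile(parts[0], target_path)
    return target_path
```

A possible usage, with hypothetical paths: call `write_single_part_with_spark(df, out_dir)` and then `promote_part_file(local_view_of_out_dir, "/dbfs/tmp/report.csv")`. Keep in mind that `coalesce(1)` funnels all the data through a single task, so it trades parallelism for a single output file.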

The Databricks Debugger

Exploring the Databricks Debugger: Writing flawless code on the first try is a dream, but debugging is a reality for most developers. In this post, I dive into the new Databricks code cell debugger, sharing my first impressions and tips for getting started with this powerful tool.