Category: Data Engineering

Write data to one CSV file in Databricks

Exporting data to CSV in Databricks often produces multiple part files with auto-generated names, plus metadata files, none of which is ideal when sharing data externally. This guide explores two practical solutions: using Pandas for small datasets, and using Spark’s coalesce to consolidate partitions so that a single file is written. Learn how to choose the right approach for your use case and keep your CSV exports efficient and easy to share.
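
As a rough illustration of the coalesce approach mentioned above: even with a single partition, Spark writes a directory containing one part-*.csv file alongside marker files such as _SUCCESS. The sketch below shows one possible clean-up step using only the Python standard library. The function name promote_single_csv and both paths are hypothetical, and it assumes the output directory is reachable as an ordinary local path (for example through a /dbfs mount); it is not taken from the original post.

```python
import shutil
from pathlib import Path

# Assumed to have run beforehand on the cluster (standard PySpark calls):
#   df.coalesce(1).write.mode("overwrite").option("header", True).csv(spark_out_dir)
# The small-data alternative would be df.toPandas().to_csv(target_file, index=False).


def promote_single_csv(spark_out_dir: str, target_file: str) -> Path:
    """Move the lone part-*.csv out of a Spark output directory.

    Hypothetical helper: both arguments are illustrative local paths.
    """
    out_dir = Path(spark_out_dir)
    target = Path(target_file)

    # The target must live outside the directory that gets deleted below.
    if out_dir.resolve() in target.resolve().parents:
        raise ValueError("target_file must be outside spark_out_dir")

    # After coalesce(1) there should be exactly one data file.
    parts = sorted(out_dir.glob("part-*.csv"))
    if len(parts) != 1:
        raise ValueError(f"expected exactly one part file, found {len(parts)}")

    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(parts[0]), str(target))

    # Remove the leftover directory with _SUCCESS and other marker files.
    shutil.rmtree(out_dir)
    return target
```

Calling promote_single_csv("/dbfs/tmp/export_tmp", "/dbfs/exports/report.csv") would, under these assumptions, leave one file with a readable name and no metadata next to it. Keep in mind that coalesce(1) funnels all rows through a single task, so this route suits exports that fit comfortably on one worker.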

How to plan a successful Data Architecture

In the era of cloud computing, creating and changing data services is easy, so every project comes with architecture decisions to make, and every developer has to weigh them.
This is a short summary of a meetup talk I gave on Data Architecture for the “Microsoft Data Engineers Club” community.