Think Twice Before Deleting a User: Avoiding Ownership Chaos in Databricks

Have you ever made a change in your system and immediately regretted it? A few weeks ago, I did just that while working with a customer on their Databricks platform. Their IT team moved a user to another domain. Databricks treats this as a new user, so I added the new user and granted them the same permissions as the old one.

And then, without thinking twice, I deleted the old user from Databricks.

A few hours later, all hell broke loose:

  1. Workflows owned by that user failed
  2. SQL queries using views owned by that user failed

So what is the ownership concept in Databricks, what are the best practices around it, and how do you fix things if you have already deleted a user?

Databricks Ownership Concept

In Databricks, many objects have an owner. The owner is usually the creator of the object, but you can reassign ownership to another user or group. The owner has full permissions on the object, including the ability to delete it and to grant permissions on it.

SQL object owner

Every SQL object, like catalogs, schemas, tables, and views, has an owner. You can view the owner in the Catalog Explorer UI, and change it by clicking the pencil icon and reassigning the object. The owner can be a user or a group.

Workflows owner

Workflows have two roles to notice: the job owner and the job's "run as" user. The job owner (by default, the creator of the job) can fully manage the job, including changing its permissions; the job will fail if that user is deleted. The "run as" user is the identity the job uses to access resources, like tables and files, and the job may fail if that user lacks the needed permissions (or is deleted).

The "run as" user appears at the top of the job details (click the pencil to change it):

To view the job owner, go to the bottom of the job details. To change the owner, click Edit permissions, remove the current owner, and add another (note that only users can own jobs, not groups):

Other Objects Ownership

Folders and notebooks also have owners. But since deleting their owner does not break anything, and admins can still access these objects and reassign them, I will not cover them here.

Best Practices Regarding Ownership

To avoid issues like the one I described above, follow these best practices:

SQL Objects

  • For high-level SQL objects, like catalogs and schemas, change the owner to a group after creation
  • For SQL objects that are created and not changed frequently, like Volumes, also change the owner to a group
  • For tables and views, which are created and changed frequently, it is not feasible to change the owner each time an object is created. Instead, be careful before removing a user, in case they own objects. More on that later.

To change an object's owner, you can use the UI (click the pencil next to the owner; see the screenshot above) or a SQL command. For example, for a schema:

USE CATALOG my_catalog;
ALTER SCHEMA chen
    SET OWNER TO `new user`;
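When you need to reassign many objects, it can help to generate these statements programmatically. Below is a minimal sketch; the `build_set_owner` helper is my own illustration (not a Databricks API), and it escapes backticks in the owner name by doubling them, as Databricks SQL expects for quoted identifiers:

```python
def build_set_owner(object_type: str, object_name: str, new_owner: str) -> str:
    """Build an ALTER ... SET OWNER TO statement.

    object_type: e.g. "CATALOG", "SCHEMA", "TABLE", "VIEW".
    object_name: fully qualified name, already quoted if needed.
    new_owner: user email, group name, or service principal.
    """
    # Backticks inside a quoted identifier are escaped by doubling them
    safe_owner = new_owner.replace("`", "``")
    return f"ALTER {object_type} {object_name} SET OWNER TO `{safe_owner}`;"

# The same statement as the SQL example above
print(build_set_owner("SCHEMA", "my_catalog.chen", "new user"))
# ALTER SCHEMA my_catalog.chen SET OWNER TO `new user`;
```

You would then pass each generated statement to `spark.sql(...)` in a notebook.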

Workflows

  • Set the “run as” user to a service principal instead of an actual user
  • Set the owner of the job to be a service principal instead of an actual user

For how to create a service principal, see https://docs.databricks.com/en/admin/users-groups/service-principals.html

To change the job owner and "run as" user, see the screenshots above.

Note – to use the service principal to run jobs, you need to grant two permissions:

  1. Grant the Service principal: user role on that service principal to your own account, so you can assign it permissions on the job
  2. Grant the Service principal: user role on that service principal to the service principal itself. Otherwise, it cannot be used to run jobs.

Before removing a user

Jobs – Go to Workflows and manually check whether the user is a job owner. At the moment, I don't have a better solution in the UI.
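If you prefer a programmatic check, the Jobs API list endpoint reports each job's creator in a `creator_user_name` field (note the creator is not always the current owner if permissions were later reassigned). A hedged sketch, with the filtering shown as a pure function over already-fetched job entries rather than a live API call:

```python
def jobs_created_by(jobs: list, user: str) -> list:
    """Return names of jobs whose creator matches the given user.

    `jobs` is the "jobs" array from a Jobs API list response;
    each entry carries "creator_user_name" and "settings"/"name".
    """
    return [
        j.get("settings", {}).get("name", "<unnamed>")
        for j in jobs
        if j.get("creator_user_name") == user
    ]

# Example with a mocked API response (hypothetical names)
sample = [
    {"creator_user_name": "chen@example.com", "settings": {"name": "nightly_etl"}},
    {"creator_user_name": "other@example.com", "settings": {"name": "reports"}},
]
print(jobs_created_by(sample, "chen@example.com"))  # ['nightly_etl']
```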

SQL tables – If using Unity Catalog, use the information schema tables to list objects owned by that user:

select table_catalog, table_schema, table_name, table_type
from system.information_schema.tables
where table_owner = 'chen@****.com'
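Tables are not the only securables a user can own; similar queries work for other object types. A small sketch that builds the equivalent query per securable type — the information-schema view and owner-column names here (`schemata.schema_owner`, `volumes.volume_owner`) are assumptions based on the Unity Catalog information schema, so verify them in your workspace:

```python
# Securable type -> (information-schema view, owner column).
# Assumed names; check your workspace's information schema.
OWNER_VIEWS = {
    "table": ("system.information_schema.tables", "table_owner"),
    "schema": ("system.information_schema.schemata", "schema_owner"),
    "volume": ("system.information_schema.volumes", "volume_owner"),
}

def owned_objects_query(securable: str, owner: str) -> str:
    """Build a query listing objects of one type owned by a user."""
    view, owner_col = OWNER_VIEWS[securable]
    return f"select * from {view} where {owner_col} = '{owner}'"

print(owned_objects_query("schema", "chen@example.com"))
# select * from system.information_schema.schemata where schema_owner = 'chen@example.com'
```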

You can use a Python loop to change the owner of each table programmatically. In the example below, I list the tables I own in one schema and change their owner to "admin_group".

fix_list = spark.sql(
    "select table_catalog, table_schema, table_name "
    "from system.information_schema.tables "
    "where table_owner = 'chen@****.com' and table_schema = 'my_schema'"
).collect()
for row in fix_list:
    # Fully qualify and backtick-quote the name to avoid ambiguity
    full_name = f"`{row.table_catalog}`.`{row.table_schema}`.`{row.table_name}`"
    print(full_name)
    spark.sql(f"ALTER TABLE {full_name} SET OWNER TO `admin_group`")
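The loop above assumes plain tables, but the information schema also lists views (see the `table_type` column), and views are reassigned with `ALTER VIEW` rather than `ALTER TABLE`. A sketch that picks the right command per object type — the `choose_alter` helper is my own illustration, and it assumes views appear with `table_type = 'VIEW'`:

```python
def choose_alter(table_type: str, full_name: str, new_owner: str) -> str:
    """Pick ALTER VIEW for views and ALTER TABLE for everything else."""
    keyword = "VIEW" if table_type == "VIEW" else "TABLE"
    return f"ALTER {keyword} {full_name} SET OWNER TO `{new_owner}`"

print(choose_alter("VIEW", "my_catalog.my_schema.v1", "admin_group"))
# ALTER VIEW my_catalog.my_schema.v1 SET OWNER TO `admin_group`
```

In the loop, you would also select `table_type` from the information schema and run `spark.sql(choose_alter(row.table_type, full_name, "admin_group"))` for each row.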

What if I already removed the user?

If a user is deleted, every object they owned shows its owner as "unknown". So you should:

  1. Go to Workflows, find jobs whose owner is unknown, and reassign them (see above)
  2. Use the query above to find objects whose owner is "unknown" and reassign them to another user or group.

Conclusion

Use the best practices above for object ownership, check before deleting a user, and generally think twice before doing something on your system. But if you do make a mistake, at least write a blog about it :-).
