| 1 | +# Databricks Unity Catalog Quickstart 🌐🚀 |
| 2 | + |
| 3 | +**Accelerate Your Unity Catalog Setup with Optimized Terraform Automation!** |
| 4 | + |
| 5 | +Welcome to the **databricks-uc-quickstart** repository! This project helps you deploy Unity Catalog (UC) on Databricks swiftly and efficiently, using Terraform scripts pre-configured with recommended settings. Eliminate tedious setup and configuration overhead to quickly launch your data governance initiatives. |
| 6 | + |
| 7 | +## 📋 Best Practices Enforced |
| 8 | + |
| 9 | +This quickstart enforces the following Unity Catalog best practices: |
| 10 | + |
| 11 | +- **Catalog Design Defaults**: Pre-configured catalog structures optimized for data governance |
| 12 | +- **Workspace Defaults**: Standard workspace configurations for consistent deployments |
| 13 | +- **Role Permission Defaults**: Pre-defined role-based access controls following least privilege principles |
| 14 | +- **Storage Setup Defaults**: Optimized storage configurations for Unity Catalog |
| 15 | +- **Data Sharing Defaults**: Secure data sharing configurations ready for collaboration |
| 16 | +- **UC Defaults and Compatibility**: Keeps the Unity Catalog defaults compatible with existing Databricks features |
| 17 | +- **Volume Defaults**: Standard volume configurations for data storage |
| 18 | +- **Enable System Tables and Grant Role Access**: Automatic system table enablement with appropriate role permissions |
| 19 | + |
| 20 | +Additionally, this quickstart includes **Industry Templates for ABAC** (Attribute-Based Access Control) implemented through Python and SQL notebooks, allowing users to leverage industry-ready functions and policies for fine-grained access control. |
| 21 | + |
| 22 | +The Terraform configurations can be customized by modifying the variables in your deployment, while ABAC policies are managed through the provided notebooks. |
| 23 | + |
| 24 | +## 🌟 Key Benefits |
| 25 | + |
| 26 | +- **Automated Terraform Deployment**: Effortlessly set up and manage Unity Catalog. |
| 27 | +- **Instant Setup**: Deploy UC with recommended default configurations. |
| 28 | +- **Reduced Boilerplate**: Minimal setup, so you can focus on your core data projects. |
| 29 | +- **Flexible & Customizable**: Easily adapt configurations to match your unique requirements. |
| 30 | + |
| 31 | +## 🏗️ What Gets Deployed |
| 32 | + |
| 33 | +This Terraform quickstart deploys a complete Unity Catalog environment with the following components: |
| 34 | + |
| 35 | +### **Core Infrastructure** |
| 36 | +- **Three Unity Catalog Catalogs**: Production, Development, and Sandbox, one per environment |
| 37 | +- **Cloud Storage**: Dedicated storage accounts/buckets for each catalog with proper IAM/RBAC |
| 38 | +- **External Locations**: Secure storage credential and external location mappings |
| 39 | +- **System Schemas**: Access, billing, compute, and storage system tables (if permissions allow) |
| 40 | + |
| 41 | +### **Access Management** |
| 42 | +- **User Groups**: Production service principals, developers, and sandbox users |
| 43 | +- **Catalog Permissions**: Role-based access control with environment-specific privileges |
| 44 | +- **System Schema Grants**: Appropriate permissions for monitoring and governance |
| 45 | + |
| 46 | +### **Compute Resources** |
| 47 | +- **Cluster Policies**: Environment-specific policies with cost controls and security settings |
| 48 | +- **Clusters**: Pre-configured clusters for each environment with proper access controls |
| 49 | + |
| 50 | +### **Cloud-Specific Resources** |
| 51 | + |
| 52 | +**AWS Deployment:** |
| 53 | +- S3 buckets with versioning and encryption |
| 54 | +- IAM roles and policies for Unity Catalog access |
| 55 | +- Cross-account trust relationships |
| 56 | + |
| 57 | +**Azure Deployment:** |
| 58 | +- Storage accounts with containers |
| 59 | +- Managed identities and access connectors |
| 60 | +- RBAC assignments for Databricks integration |
| 61 | + |
| 62 | +## 🚀 Quick Start |
| 63 | + |
| 64 | +Follow these steps to rapidly deploy Unity Catalog using Terraform: |
| 65 | + |
| 66 | +### 📌 Prerequisites |
| 67 | + |
| 68 | +Ensure you have: |
| 69 | + |
| 70 | +- A Databricks Account |
| 71 | +- [Terraform Installed](https://developer.hashicorp.com/terraform/downloads) |
| 72 | +- Basic knowledge of Databricks and Terraform |
| 73 | + |
| 74 | +**Workspace Requirements:** |
| 75 | +- An existing Databricks workspace is required |
| 76 | +- The workspace ID must be provided in the Terraform configuration (see `template.tfvars.example`) |
| 77 | +- The quickstart will configure Unity Catalog resources and permissions within your existing workspace |
| 78 | + |
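The prerequisites above can be checked before any Terraform run. The following is a minimal sketch, not part of the repository: the helper names `missing_tools` and `is_workspace_id` are hypothetical, the sample ID is made up, and the check assumes that workspace IDs are purely numeric.

```bash
#!/usr/bin/env bash
# Pre-flight check (illustrative sketch; helper names are hypothetical).
set -euo pipefail

# Print the name of every command in "$@" that is not found on PATH.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Assumption: a Databricks workspace ID consists of digits only.
is_workspace_id() {
  [[ "$1" =~ ^[0-9]+$ ]]
}

missing="$(missing_tools git terraform)"
if [ -n "$missing" ]; then
  # Unquoted on purpose so several names print on one line.
  echo "missing tools:" $missing
else
  echo "all tools found"
fi

# Made-up ID for illustration only; use the ID of your own workspace.
if is_workspace_id "1234567890123456"; then
  echo "workspace ID format looks valid"
fi
```

The script only reports problems and never exits with an error, so it is safe to run on a machine where Terraform is not installed yet.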
| 79 | +### 🛠 Installation Steps |
| 80 | + |
| 81 | +1. **Clone this Repository:** |
| 82 | + |
| 83 | + ```bash |
| 84 | + git clone https://github.com/databrickslabs/sandbox.git |
| 85 | + cd sandbox/uc-quickstart/ |
| 86 | + ``` |
| 87 | + |
| 88 | +2. **Choose Your Cloud Provider:** |
| 89 | + |
| 90 | + Navigate to the appropriate directory based on your cloud provider: |
| 91 | + |
| 92 | + **For AWS:** |
| 93 | + ```bash |
| 94 | + cd aws/ |
| 95 | + ``` |
| 96 | + |
| 97 | + **For Azure:** |
| 98 | + ```bash |
| 99 | + cd azure/ |
| 100 | + ``` |
| 101 | + |
| 102 | +3. **Follow Cloud-Specific Setup:** |
| 103 | + |
| 104 | +   Each cloud provider has its own prerequisites and configuration steps, detailed in its README: |
| 105 | + - [AWS Setup Instructions](aws/README.md) |
| 106 | + - [Azure Setup Instructions](azure/README.md) |
| 107 | + |
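The cloud-specific READMEs are authoritative for the actual commands. As a rough orientation, the sketch below prints a generic Terraform workflow in dry-run mode. It assumes that `template.tfvars.example` sits in the chosen cloud directory and that copying it to `terraform.tfvars` (a file Terraform loads automatically) fits the repository's layout. The plan file name `uc.tfplan` and the helper names are hypothetical.

```bash
#!/usr/bin/env bash
# Generic Terraform workflow sketch; defer to aws/README.md or azure/README.md.
set -euo pipefail

# DRY_RUN=1 (the default) only prints each command; DRY_RUN=0 executes it.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Create terraform.tfvars from the template unless it already exists.
prepare() {
  [ -f terraform.tfvars ] || run cp template.tfvars.example terraform.tfvars
}

# Initialize providers, save a plan, then apply exactly that saved plan.
deploy() {
  run terraform init
  run terraform plan -out=uc.tfplan
  run terraform apply uc.tfplan
}

# With no argument, show the whole sequence; otherwise run one named step.
case "${1:-all}" in
  prepare) prepare ;;
  deploy)  deploy ;;
  all)     prepare; deploy ;;
esac
```

For a real run, execute the steps separately so that the variables can be edited in between: run the `prepare` step with `DRY_RUN=0`, fill in the workspace ID and the other values in `terraform.tfvars`, then run the `deploy` step with `DRY_RUN=0`.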
| 108 | +### ✅ Verify Deployment |
| 109 | + |
| 110 | +Once deployment is complete, open your Databricks workspace and confirm that the components listed under "What Gets Deployed" exist: the three catalogs, their external locations, the user groups, and the cluster policies and clusters. |
| 111 | + |
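In addition to checking the workspace UI, the Terraform state can be inspected from the command line. The sketch below counts resources of one type in the output of `terraform state list`. The resource addresses in the sample input are hypothetical, because the real addresses depend on how the modules in this repository are named. `count_type` is an illustrative helper, not part of the project.

```bash
#!/usr/bin/env bash
# Count resources of a given type in `terraform state list` output (sketch).
set -euo pipefail

# Read state addresses from stdin and count those of resource type "$1".
# The trailing "\." keeps longer type names with the same prefix from matching.
count_type() {
  grep -c -E "(^|\.)$1\." || true
}

# Hypothetical sample standing in for real `terraform state list` output.
sample_state='databricks_catalog.prod
databricks_catalog.dev
databricks_catalog.sandbox
databricks_cluster_policy.dev'

catalogs="$(printf '%s\n' "$sample_state" | count_type databricks_catalog)"
echo "catalogs in state: $catalogs"
```

Against a real deployment, the equivalent call would be `terraform state list | count_type databricks_catalog`, and a result of 3 would match the three catalogs described above.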
| 112 | +## 🔧 Need Help? |
| 113 | + |
| 114 | +For cloud-specific troubleshooting and detailed configuration help: |
| 115 | +- **AWS Issues:** See [AWS README](aws/README.md#troubleshooting) |
| 116 | +- **Azure Issues:** See [Azure README](azure/README.md#troubleshooting) |
| 117 | +- **General Questions:** Check the [Unity Catalog documentation](https://docs.databricks.com/en/data-governance/unity-catalog/index.html) |
| 118 | + |
| 119 | +## 📄 License |
| 120 | + |
| 121 | +This project is licensed under the Databricks License; see [LICENSE](../LICENSE) for full details. |
| 122 | + |