Databricks SQL allows customers to operate a multi-cloud lakehouse architecture that provides data warehousing performance at data lake economics.
- Databricks SQL integrates with the BI tools you use today, like Tableau and Microsoft Power BI, to query the most complete and recent data in your data lake.
- Complement existing BI tools with a SQL-native interface that allows data analysts and data scientists to query data lake data directly within Databricks.
- Share query insights through rich visualizations and drag-and-drop dashboards with automatic alerting for important changes in your data.
- Bring reliability, quality, scale, security, and performance to your data lake to support traditional analytics workloads using your most recent and complete data.
In addition to support for your existing BI tools, Databricks SQL offers a full-featured SQL-native query editor that allows data analysts to write queries in a familiar syntax and easily explore Delta Lake table schemas. Regularly used SQL code can be saved as snippets for quick reuse, and query results can be cached to keep run times short.
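As an illustration, a short session in the query editor might look like the following; the table and column names here are hypothetical:

```sql
-- List tables in the current schema and inspect a Delta table's columns
SHOW TABLES;
DESCRIBE TABLE sales_orders;

-- A typical analyst query; results can be cached to keep re-runs fast
SELECT region,
       SUM(order_total) AS revenue
FROM   sales_orders
WHERE  order_date >= '2021-01-01'
GROUP BY region
ORDER BY revenue DESC;
```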
Easily Create Visualizations and Share Dashboards
Once queries are built, Databricks SQL also allows analysts to make sense of the results through a wide variety of rich visualizations. The visualizations can be quickly organized into dashboards with an intuitive drag-and-drop interface. Dashboards can be easily shared with stakeholders, both within and outside the organization, via a web browser. To keep everyone current, dashboards can be configured to automatically refresh, as well as to alert the team to meaningful changes in the data.
Data Lake Administration
Traditionally, it took a dedicated engineering effort to gain meaningful visibility into what queries and workloads were being run on data lakes. Now, Databricks SQL gives customers granular visibility into how data is being used and accessed at any time across their entire lakehouse infrastructure. Administrators can explore usage across SQL Endpoints, users, and time, and drill down into the phases of each query’s execution to troubleshoot problems and support audits.
Reliability and Governance for Data Lakes
No more malformed data ingestion, difficulty deleting data for compliance, or issues modifying data for change data capture. Databricks SQL is built upon Delta Lake, an open and structured approach to building data lakes that adds the reliability, quality, and performance capabilities data lakes natively lack. Delta Lake provides ACID transactions on data lakes, ensuring that every operation either fully succeeds or fully aborts, so failed operations can simply be retried without building new data pipelines. Additionally, Delta Lake records all past transactions on your data lake, so it’s easy to access and use previous versions of your data for compliance and machine learning use cases.
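As a sketch of the versioning capability described above, Delta Lake exposes table history and time travel directly in SQL; the table name below is hypothetical:

```sql
-- Inspect the transaction history recorded for a Delta table
DESCRIBE HISTORY customer_events;

-- Query an earlier state of the table, by version number or by timestamp
SELECT * FROM customer_events VERSION AS OF 12;
SELECT * FROM customer_events TIMESTAMP AS OF '2021-03-01';
```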
Increasingly, enterprises are deploying applications in multiple clouds. For many organizations, this creates an exponential increase in architectural complexity. Databricks SQL allows data teams to adopt a single data management and SQL analytics toolset to standardize operating procedures across multiple clouds. Combined with a commitment to using open source standards, this makes Databricks SQL the most flexible and open analytics platform available in the cloud.
- Connect existing BI tools to one source of truth for all your data
- Collaboratively explore the freshest data
- Build data-enhanced applications
“Now more than ever, organizations need a data strategy that enables speed and agility to be adaptable. As organizations are rapidly moving their data to the cloud, we’re seeing growing interest in doing analytics on the data lake. The introduction of Databricks SQL delivers an entirely new experience for customers to tap into insights from massive volumes of data with the performance, reliability and scale they need. We’re proud to partner with Databricks to bring that opportunity to life.”
— Francois Ajenstat, Chief Product Officer, Tableau
“Shell has been undergoing a digital transformation as part of our ambition to deliver more and cleaner energy solutions. As part of this, we have been investing heavily in our data lake architecture. Our ambition has been to enable our data teams to rapidly query our massive datasets in the simplest possible way. The ability to execute rapid queries on petabyte scale datasets using standard BI tools is a game changer for us. Our co-innovation approach with Databricks has allowed us to influence the product roadmap and we are excited to see this come to market.”
— Dan Jeavons, GM Data Science, Shell
“At Atlassian, we need to ensure teams can collaborate well across functions to achieve constantly evolving goals. A simplified lakehouse architecture would empower us to ingest high volumes of user data and run the analytics necessary to better predict customer needs and improve the experience of our customers. A single, easy-to-use cloud analytics platform allows us to rapidly improve and build new collaboration tools based on actionable insights.”
— Rohan Dhupelia, Data Platform Senior Manager, Atlassian
“At Wejo, we’re collecting data from more than 50 million accessible connected cars to build a better driving experience. Databricks and a robust lakehouse architecture will allow us to provide automated analytics to our customers, empowering them to glean insights on nearly 5 trillion data points per month, all in a streaming environment from car to marketplace in seconds.”
— Daniel Tibble, Head of Data, wejo
“As a company focused on providing data-driven research to our customers, the massive amount of data in our data lake is our lifeblood. By leveraging Databricks and Delta Lake, we have already been able to democratize data at scale, while lowering the cost of running production workloads by 60%, saving us millions of dollars. We’re excited to build on this momentum by leveraging the Databricks lakehouse architecture that will further empower everyone across our organization – from research analysts to data scientists – to interchangeably use the same data, helping us to provide innovative insights to our customers faster than ever before.”
— Steve Pulec, Chief Technology Officer, YipitData