Transform with Hadoop Native SQL

Do you want to drive significant business change from your data?

Could your team be even more productive with Hadoop?

Are you transforming your business with Hadoop?

Successful companies that disrupt and lead industries are able to quickly develop applications. These organizations use data, analytics, and machine learning in new and inventive ways to create engaging experiences for their customers and employees. This begins with Hadoop. And the key to moving fast with data is to be able to quickly and easily express questions that are raised from business problems. For this, you need SQL.

Existing SQL on Hadoop solutions have their drawbacks: some are not suited for interactive workloads, others suffer from operational complexity, many have scalability limitations. Few offer the full range of SQL functionality or lack key functionality for data science, and too many lack a robust enough execution engine to handle workloads from legacy data warehousing systems. Others still are closed source or lack transparency as to their governance and future.

The future of SQL on Hadoop is Hadoop Native SQL. To be Hadoop Native is to understand and govern an open source project using The Apache Way. To be Hadoop Native is to integrate with HDFS, YARN and Hadoop related projects in ways that improve the whole. To be Hadoop Native is to be open. And so, Hadoop Native SQL exploits, improves and extends Hadoop in its most natural and intuitive ways so as to realize its full potential of reimagining data infrastructure.

What does Hadoop Native mean?

Are you generating hundreds, thousands or millions of reports a week with Hadoop? Are you able to use SQL for data wrangling, business intelligence and data science in Hadoop? Is your SQL layer fully integrated with your Hadoop infrastructure? Are you replacing legacy data warehousing with Hadoop? If you haven’t realized transformative changes with Hadoop, you’re not alone. But that’s about to change with Hadoop Native SQL.

Connecting a simple SQL analytics solution, alone, next to Hadoop does not realize the full potential of Hadoop. An advanced SQL query processing engine, operating natively in Hadoop, can deliver new levels of performance and functionality. In addition, adopting an Hadoop Native approach can assure integrated development and execution that keeps pace with the overall Apache Hadoop rate of innovation.

Hadoop Native means transparent collaboration via the Apache Way

Apache communities must have full access and the ability to contribute and drive project development and roadmaps at will. Developers plug into the Apache Hadoop release plan and procedures, not the other way around. And any SQL solution for Apache Hadoop must be transparent as well.

Hadoop Native means operating directly in and with Hadoop clusters

Query engines must execute fast and low-latency advanced SQL analytics directly in HDFS without connectors. In addition, Hadoop Native means that you don’t have to move your data to another analytics repository. Keep it all in Hadoop (HDFS).

Hadoop Native means dynamically scaling, irrespective of data or cluster size

As your data infrastructure grows, SQL analytics must scale along with it. Hadoop Native SQL analytics enable data scientists to tackle datasets of any size - small to enormous, all executed directly on the native Apache Hadoop cluster.

Hadoop Native means common management with the Hadoop ecosystem

Tools that integrate and smoothly interoperate with YARN for resource management and Apache Ambari for installation and operation simplify overall management. Integration with Hcatalog for common source of data truth across multiple data types.

Why Hadoop Native SQL Matters

With Hadoop Native SQL, Hadoop becomes the analytical database that the industry has been driving toward. Now, by adding advanced performance capabilities and robust SQL compliance, driven by over a decade of development experience, Hadoop Native SQL can support more workloads, larger volumes and more advanced queries. This is a major shift for the Apache open source and database industries. You can be a part of the future.

“What today is about is actually trying to move past the notion of ‘SQL on Hadoop’ to something where SQL is synonymous with Hadoop, SQL is inside Hadoop. SQL is Hadoop. These kinds of trends I think have really been a process of discovery for the industry and we're excited to participate.“

Scott Yara


The Journey to Hadoop Native SQL

To start the journey to Hadoop Native SQL, Pivotal has contributed HAWQ to the Apache Software Foundation. It has been accepted into incubation. Our commitment is to realize the full potential of Hadoop Native SQL within this community, via The Apache Way. We hope you’ll join us.

Pivotal will continue to commercialize the fruits of this work, as Pivotal HDB, powered by Apache HAWQ (incubating), in its Data Suite.

Learn More