The World’s First Open-Source, Multi-Cloud Data Platform Built for Advanced Analytics

Advanced analytics meets traditional business intelligence with Pivotal Greenplum, the world’s first fully-featured, multi-cloud, massively parallel processing (MPP) data platform based on the open source Greenplum Database. Pivotal Greenplum provides comprehensive and integrated analytics on multi-structured data. Powered by one of the world’s most advanced cost-based query optimizers, Pivotal Greenplum delivers unmatched analytical query performance on massive volumes of data.

Pivotal Greenplum for Advanced Analytics

With Pivotal Greenplum, data professionals can test diverse models in parallel on multi-structured data sets, spanning machine learning, text, graph, and geospatial analytics. They can rapidly create and deploy models for complex applications in cybersecurity, predictive maintenance, risk management, fraud detection, and many other areas.

Integrated Data for Wider Use Cases

Analyze more types of data in a single environment—structured, text, geospatial, and graph. Petabyte scale for deep analytical insights.

In-Database Analytics for Speed

MPP architecture trains more models in less time. Ensemble modeling for more predictive power.

Adapt Quickly to Changing Data

Scale-out, highly concurrent architecture enables rapid modeling of individuals and entities at scale. Statistical methods via familiar SQL, harnessed to massive parallelism, accelerate common data prep and quality tasks.

Case Study
How Conversant Uses Data Science to Bring Ultra Transparency to Online Advertising

Greenplum for Extreme Weather Predictions and Analytics at Japan's NICT

Greenplum 5: Proven, Open-Source, Multi-Cloud Data Analytics Platform


Analyze data at speed and scale. In combination with Apache MADlib, the open-source library of analytical functions for the PostgreSQL family of databases, Pivotal Greenplum can execute more than 50 statistical, machine learning, and graph methods via SQL.
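As a rough illustration of what "via SQL" means in practice, the sketch below builds the SQL call for MADlib's linear regression training function, `madlib.linregr_train`. The table name `houses`, the output table `houses_linregr`, and the column names are assumptions made for illustration only; the helper functions are hypothetical and do not connect to a database or validate identifiers.

```python
def quote_literal(value: str) -> str:
    """Quote a Python string as a SQL string literal (single quotes doubled)."""
    return "'" + value.replace("'", "''") + "'"


def linregr_train_sql(source_table, out_table, dependent, independents):
    """Build a call to madlib.linregr_train as a SQL string.

    Hypothetical helper: it only formats text. Table and column names
    are passed through as given, so callers must supply trusted values.
    """
    # MADlib takes the feature list as a SQL array expression;
    # the leading 1 adds an intercept term to the model.
    features = "ARRAY[1, " + ", ".join(independents) + "]"
    args = [source_table, out_table, dependent, features]
    quoted = ", ".join(quote_literal(a) for a in args)
    return "SELECT madlib.linregr_train(" + quoted + ");"


# Assumed example schema: a 'houses' table with price, tax, bath, size columns.
sql = linregr_train_sql("houses", "houses_linregr", "price", ["tax", "bath", "size"])
print(sql)
```

The resulting statement could be run from any SQL client connected to a database where MADlib is installed; the trained coefficients would then be read from the output table with an ordinary SELECT.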

Flexible integration of data and analytics. Apache Spark developers can use the bi-directional Pivotal Greenplum-Spark Connector to transfer data to and from Spark, or to run highly parallel models where the data resides: in Spark, or in Pivotal Greenplum with MADlib. The Pivotal Greenplum Extension Framework (PXF) provides an abstraction framework and pre-built plugins for many data sources, including HDFS, Hive, and HBase, to work with your data lake. You can also read data from and write data to cloud storage, including Amazon S3 objects.

Customize analytics with procedural languages. Pivotal Greenplum supports development in R, Python, Java, and other standard languages. Its shared-nothing, MPP architecture can distribute execution across the entire cluster for processing that can be orders of magnitude faster than single-node R or Python.
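The idea behind shared-nothing, distributed execution can be sketched with a toy simulation: rows are hashed to segments by key, each segment aggregates only its own rows, and the partial results are gathered at the end. This is a simplified model under stated assumptions, not Greenplum's implementation; the segment count, the CRC32 hash, and the sample rows are all chosen for illustration, and the "segments" here run sequentially in one process rather than in parallel on separate hosts.

```python
import zlib
from collections import defaultdict

NUM_SEGMENTS = 4  # assumption: a toy cluster with four segments


def segment_for(key: str, num_segments: int = NUM_SEGMENTS) -> int:
    """Map a distribution key to a segment with a deterministic hash."""
    return zlib.crc32(key.encode("utf-8")) % num_segments


def distribute(rows, num_segments: int = NUM_SEGMENTS):
    """Place each (key, value) row on exactly one segment."""
    segments = [[] for _ in range(num_segments)]
    for key, value in rows:
        segments[segment_for(key, num_segments)].append((key, value))
    return segments


def local_sum(segment_rows):
    """Aggregate one segment's rows without looking at any other segment."""
    totals = defaultdict(float)
    for key, value in segment_rows:
        totals[key] += value
    return dict(totals)


def gather(partials):
    """Combine per-segment results; keys never overlap across segments."""
    result = {}
    for partial in partials:
        result.update(partial)
    return result


# Hypothetical sample data: (sensor id, reading)
rows = [("sensor_a", 1.0), ("sensor_b", 2.0), ("sensor_a", 3.0), ("sensor_c", 4.0)]
segments = distribute(rows)
totals = gather(local_sum(s) for s in segments)
print(totals)
```

Because every row for a given key lands on the same segment, each segment can finish its share of the work independently, which is what allows the per-segment step to run concurrently in a real cluster.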

Protect and isolate workloads with containers. PL/Container is a new preview feature that implements a trusted language execution engine capable of isolating executors from the host OS. Containers are pre-configured with Pivotal Greenplum for data science workloads and can also be customized or created for different end-user workloads.

Use familiar tools. Work with leading BI and advanced analytics software that is ODBC/JDBC compatible or has native integrations, including SAS, IBM Cognos, SAP Analytics Solutions, Qlik, Tableau, Apache Zeppelin, and Jupyter.

Analyze data where it is. Pivotal Greenplum is infrastructure-agnostic and available for bare metal, private cloud, and public cloud deployments.