Parallel Postgres for enterprise analytics at scale

With improved transaction processing capability and support for streaming ingest, Greenplum can address workloads across a spectrum of analytic and operational contexts, from traditional business intelligence to deep learning. Greenplum is designed to run anywhere—on-premises, in public and private clouds, and in modern containerized environments like Kubernetes—for easier installation, operation, and upgrades.

Analytics from BI to AI

Consolidate more workloads in a single environment

Greenplum reduces data silos by providing you with a single, scale-out environment for converging analytic and operational workloads, like streaming ingestion. Execute point queries, fast data ingestion, data science exploration, and long-running reporting queries with greater scale and concurrency.

Deploy anywhere

Run analytics on public and private clouds, Kubernetes, or on-premises

Greenplum provides your enterprise with flexibility and choice because it can be deployed on all major public and private cloud platforms, on-premises, and with container orchestration systems like Kubernetes. Deploy and manage hundreds of Greenplum instances easily.

Open-source innovation

Pre-integrated components for easier consumption

Pivotal Greenplum is based on PostgreSQL and the Greenplum Database project. It offers optional, use-case-specific extensions such as PostGIS for geospatial analysis and GPText (based on Apache Tika and Apache Solr) for document extraction, search, and natural language processing. These are pre-integrated to ensure a consistent experience, not a “wild-west” DIY approach to open source. Instead of depending on expensive proprietary databases, users can benefit from the contributions of a vibrant community of developers.

Enterprise data science

Streamline data science operations and simplify workflows

Tackle data science from experimentation to massive deployment with Apache MADlib, the open-source library of in-cluster machine learning functions for the Postgres family of databases. MADlib with Greenplum provides multi-node, multi-GPU, and deep learning capabilities. It also offers automation-friendly features such as model versioning and the capability to push models from training to production via a REST API. Users avoid the pain of porting and re-coding analytical models.

“Whatever use case we can dream up and whatever ways we can think of to better understand the user, Greenplum allows us to do it.”

John Conley, Vice President of Data Warehousing, Conversant

Architecture




Features


Create automated, repeatable deployments with Kubernetes

Greenplum for Kubernetes replaces StatefulSets with an application-specific operator that provides an automation layer for easier deployment and operation wherever Kubernetes is installed. Deployed on Pivotal Container Service (PKS), Greenplum can provide stateful data persistence to your Cloud Foundry applications.

Cloud-agnostic for flexible deployment

Greenplum is available on leading public cloud marketplaces—Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform (GCP)—with “bring your own license” (BYOL) and hourly consumption models. It’s also available for VMware vSphere and OpenStack private clouds. Best of all, it’s the same Greenplum version and the same tools across all clouds for a consistent experience.

Value and performance in an appliance-like experience

Greenplum Building Blocks (GBB) is the most performant way to run Pivotal Greenplum in an on-premises deployment. It’s a Pivotal-certified and supported blueprint for Dell hardware configurations that replace proprietary appliances. Users can also deploy Greenplum on HP- and Cisco-certified configurations, as well as their own commodity hardware.

Analytics from business intelligence to artificial intelligence

Machine learning, deep learning, graph, text, and statistical methods are all provided in one scale-out MPP database. Use geospatial analytics based on open-source PostGIS, and text analytics based on Apache Solr with Greenplum’s GPText. Greenplum also offers extensive support for R and Python analytical libraries, as well as Keras and TensorFlow.

Handle streaming data and cloud data with ease

Greenplum includes integration with the Kafka ecosystem, certified by Confluent. Together with improved low-latency writes, Greenplum provides fast event processing for streaming use cases. The ability to query Amazon S3 objects in place leads to better integration of cloud data.

Maximize uptime and protect data integrity

Greenplum has features for high availability, intelligent fault detection, and fast online differential recovery, as well as full and incremental backup and disaster recovery. Security and authentication features address enterprise policy and regulatory requirements.

Industry-leading performance

With its cost-based query optimizer designed for large-scale data workloads, Greenplum scales interactive and batch-mode analytics to petabyte-scale datasets without degrading query performance or throughput.

Based on open-source projects

Avoid proprietary vendor lock-in. The Greenplum Database open-source project is 100% in alignment with the PostgreSQL community. All major Pivotal Greenplum contributions are part of the Greenplum Database project and share the same database core, including the MPP architecture, analytical interfaces, and security capabilities.

Massively parallel, highly concurrent architecture

Greenplum features a shared-nothing architecture that automates parallel processing of data and queries and petabyte-scale data ingestion. Its open-source, cost-based query optimizer (GPORCA) was developed specifically to address advanced analytics, creating query plans that execute complex joins at breakthrough performance on large data volumes.
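
To make the shared-nothing idea concrete, here is a minimal Python sketch of the general pattern: rows are spread across independent segments by hashing a distribution key, each segment aggregates only its own rows, and a coordinator merges the partial results. This is a toy model, not Greenplum code. The segment count, the CRC32 hash, the sample rows, and every function name are assumptions made for illustration, and Greenplum's actual hashing and execution engine differ.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
import zlib

NUM_SEGMENTS = 4  # hypothetical number of segments in the toy cluster


def segment_for(key, num_segments=NUM_SEGMENTS):
    """Pick a segment from a stable hash of the distribution key.

    CRC32 is used only because it is deterministic across runs;
    it is not the hash function Greenplum uses.
    """
    return zlib.crc32(key.encode("utf-8")) % num_segments


def distribute(rows, num_segments=NUM_SEGMENTS):
    """Place each (customer, amount) row on exactly one segment."""
    segments = [[] for _ in range(num_segments)]
    for customer, amount in rows:
        segments[segment_for(customer, num_segments)].append((customer, amount))
    return segments


def local_sum(segment_rows):
    """Aggregate the rows held by one segment, touching no other segment's data."""
    totals = defaultdict(int)
    for customer, amount in segment_rows:
        totals[customer] += amount
    return dict(totals)


def parallel_sum_by_customer(rows, num_segments=NUM_SEGMENTS):
    """Run local_sum on every segment concurrently, then merge the partial results."""
    segments = distribute(rows, num_segments)
    with ThreadPoolExecutor(max_workers=num_segments) as pool:
        partials = list(pool.map(local_sum, segments))
    merged = {}
    for partial in partials:
        # The grouping key is also the distribution key, so each customer
        # lives on a single segment and the partial results never overlap.
        merged.update(partial)
    return merged


rows = [("alice", 10), ("bob", 5), ("alice", 7), ("carol", 3), ("bob", 1)]
totals = parallel_sum_by_customer(rows)
```

Because the grouping key matches the distribution key in this sketch, the merge step is a plain union of disjoint results. Grouping on a different column would require the coordinator to combine overlapping partial sums, or the rows to be redistributed first.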

Use Cases

Enterprise analytics and AI

With support for advanced algorithms such as multi-layer perceptron and convolutional neural networks in Apache MADlib, users can begin to tackle cutting-edge use cases in speech recognition, image recognition, machine translation, and computer vision. With optional support for REST APIs, you can train, test, and deploy in a single language (SQL), reducing the occurrence of errors when putting models into production at scale.

Flexible deployment on Kubernetes, cloud, or on-premises

Move your analytics workloads to the platform of your choice under the terms and in the timeframes you choose. Deploy in Kubernetes, AWS, Microsoft Azure, GCP, private clouds, or on-premises with GBB. Have the freedom to select the best platform for each project and workload based on ease of use, performance, and total cost of ownership (TCO).

Enterprise data warehouse modernization and replatforming

Replatform legacy enterprise data warehouses (EDWs) to replace expensive, proprietary databases. Modernize with the only open source-based, multi-cloud platform for analytics offering the full range of data warehouse functionality that your enterprise demands. Gain the power of an MPP system in conjunction with proven technology to reduce the cost and complexity of application migration.

Disclaimer
This website contains statements which are intended to outline the general direction of certain of Pivotal's offerings. It is intended for information purposes only and may not be incorporated into any contract. Any information regarding the pre-release of Pivotal offerings, future updates or other planned modifications is subject to ongoing evaluation by Pivotal and is subject to change. All software releases are on an “if and when available” basis and are subject to change. This information is provided without warranty of any kind, express or implied, and is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions regarding Pivotal's offerings. Any purchasing decisions should only be based on features currently available. The development, release, and timing of any features or functionality described for Pivotal's offerings on this website remain at the sole discretion of Pivotal. Pivotal has no obligation to update forward-looking information on this website.
