With Pivotal Greenplum, data professionals can test diverse models in parallel on multi-structured data sets, applying machine learning, text, graph, and geospatial analytics. Rapidly create and deploy models for complex applications in cybersecurity, predictive maintenance, risk management, fraud detection, and many other areas.
Analyze more types of data in a single environment—structured, text, geospatial, and graph. Petabyte scale for deep analytical insights.
MPP architecture trains more models in less time. Ensemble modeling for more predictive power.
Scale-out, highly concurrent architecture enables rapid modeling of individuals and entities at scale. Statistical methods via familiar SQL, harnessed to massive parallelism, accelerate common data prep and quality tasks.
Analyze data at speed and scale. In combination with Apache MADlib, the open-source library of analytical functions for the PostgreSQL family of databases, Pivotal Greenplum can execute over 50 statistical, machine learning, and graph methods, via SQL.
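As an illustration of what "via SQL" means in practice, the sketch below builds the SQL statement for MADlib's linear regression training function, `madlib.linregr_train`, from Python. The table name `houses`, the output table `houses_model`, and the column names are hypothetical placeholders, and the helper functions are illustrative only. The script only prints the statement; running it would require a Greenplum database with MADlib installed.

```python
def quote_literal(value):
    """Quote a string as a SQL literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def linregr_train_sql(source_table, out_table, dependent, independents):
    """Build a call to madlib.linregr_train as a SQL string.

    MADlib takes the feature list as a SQL array expression passed as text.
    The leading constant 1 adds an intercept term to the model.
    """
    features = "ARRAY[" + ", ".join(["1"] + list(independents)) + "]"
    args = [source_table, out_table, dependent, features]
    quoted = ", ".join(quote_literal(arg) for arg in args)
    return "SELECT madlib.linregr_train(" + quoted + ");"


# Hypothetical table and column names, used only for illustration.
sql = linregr_train_sql("houses", "houses_model", "price", ["tax", "bath", "size"])
print(sql)
```

The statement can then be sent through any PostgreSQL-compatible client; the fitted coefficients are written to the output table, where they can be queried like any other data.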
Flexible integration of data and analytics. Apache Spark developers can use the bi-directional Pivotal Greenplum-Spark Connector either to transfer data to and from Spark or to execute highly parallel models where the data resides, whether in Spark or, with MADlib, in Pivotal Greenplum. The Pivotal Greenplum Extension Framework (PXF) provides an abstraction framework and pre-built plugins for many data sources, including HDFS, Hive, and HBase, so you can work with your data lake. You can also read data from and write data to cloud storage, including Amazon S3 objects.
Customize analytics with procedural languages. Pivotal Greenplum supports development in R, Python, Java, and other standard languages. Its shared-nothing, MPP architecture can distribute execution across the entire cluster for processing that can be orders of magnitude faster than single-node R or Python.
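The idea behind shared-nothing execution can be sketched in a few lines of plain Python. This is a conceptual toy, not Greenplum's implementation: rows are assigned to a hypothetical number of segments by a distribution key, each segment computes a partial aggregate over only its own rows, and a final step combines the partials. In a real cluster the per-segment work runs in parallel on separate hosts; here it runs in a simple loop.

```python
NUM_SEGMENTS = 4  # hypothetical segment count, chosen for illustration


def distribute(rows, key, num_segments=NUM_SEGMENTS):
    """Assign each row to a segment based on its integer distribution key."""
    segments = [[] for _ in range(num_segments)]
    for row in rows:
        segments[row[key] % num_segments].append(row)
    return segments


def partial_sum_count(segment_rows, column):
    """Compute a (sum, count) pair using only the rows held by one segment."""
    total = 0.0
    count = 0
    for row in segment_rows:
        total += row[column]
        count += 1
    return total, count


def combine_mean(partials):
    """Merge per-segment (sum, count) pairs into a global mean."""
    total = sum(t for t, _ in partials)
    count = sum(c for _, c in partials)
    return total / count if count else None


# Illustrative data: ten rows with made-up customer IDs and amounts.
rows = [{"customer_id": i, "amount": float(i)} for i in range(1, 11)]
segments = distribute(rows, "customer_id")
partials = [partial_sum_count(seg, "amount") for seg in segments]
mean_amount = combine_mean(partials)
print(mean_amount)
```

Because each segment only needs its own rows and returns a small partial result, adding segments spreads the scan across more workers without any shared state between them.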
Protect and isolate workloads with containers. PL/Container is a new preview feature that implements a trusted language execution engine capable of isolating executors from the host OS. Containers are pre-configured with Pivotal Greenplum for data science workloads and can also be customized or created for different end-user workloads.
Use familiar tools. Work with leading BI and advanced analytics software that is ODBC/JDBC compatible or has native integrations, including SAS, IBM Cognos, SAP Analytics Solutions, Qlik, Tableau, Apache Zeppelin, and Jupyter.
Analyze data where it is. Pivotal Greenplum is infrastructure-agnostic and available for bare metal, private cloud, and public cloud deployments.