Data Science & Advanced Analytics
Uncover value using platforms for data science and advanced analytics that connect with Pivotal Greenplum.

Conduct data science investigations on extremely large data sets stored in Pivotal Greenplum. Leverage native in-database analytics for data preparation, machine learning, graph analysis, statistics, and text analysis. Push R and Python processing down to the database, where the data resides.

Choose your favorite language

Pivotal Greenplum provides highly parallelized language interfaces for SQL, Python, and R. Python and R run in trusted containers via the PL/Container extension.
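
As a rough illustration of the container-based approach, the sketch below assembles the DDL for a small PL/Container Python function. The helper name `plcontainer_function_ddl`, the function name `pylog100`, and the runtime ID `plc_python_shared` are assumptions made for this example; the runtime ID must match one configured in your cluster. The helper only builds the statement text and does not connect to a database.

```python
import textwrap


def plcontainer_function_ddl(name: str, body: str,
                             runtime: str = "plc_python_shared") -> str:
    """Build CREATE FUNCTION DDL for a zero-argument PL/Container function.

    Hypothetical helper: the runtime ID is an assumed example value.
    """
    body = textwrap.dedent(body).strip()
    return (
        f"CREATE OR REPLACE FUNCTION {name}() RETURNS double precision AS $$\n"
        f"# container: {runtime}\n"  # first body line selects the container runtime
        f"{body}\n"
        f"$$ LANGUAGE plcontainer;"
    )


plc_ddl = plcontainer_function_ddl(
    "pylog100",
    """
    import math
    return math.log10(100)
    """,
)
print(plc_ddl)
```

Once the statement has been run through any SQL client, the function could be called with `SELECT pylog100();`.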

Leverage powerful text analytics capabilities

The GPText library provides massively parallel text indexing, search, and natural language processing (NLP) based on Apache Solr, and uses Apache Tika to extract content from raw document sources.

Process large data sets in parallel

Queries execute simultaneously across multiple nodes for faster processing of multi-petabyte data sets.

Mix data types and data sources

Organizations can combine data in Pivotal Greenplum with other data sources supported by their data science platforms, or via external tables through Greenplum. They can utilize relational, graph, geospatial, and text data in mixed algorithms.
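
To make the external-table route more concrete, here is a minimal sketch that builds a `CREATE EXTERNAL TABLE` statement for a file served over gpfdist. The helper name `external_table_ddl`, the table name `ext_sales`, its columns, and the host `etlhost:8081` are all hypothetical; the helper only produces SQL text and assumes the inputs are trusted.

```python
def external_table_ddl(table, columns, location, fmt="CSV"):
    """Build DDL for a readable external table (hypothetical helper).

    columns is a list of (name, sql_type) pairs; no quoting or escaping is
    applied, so the inputs are assumed to be trusted.
    """
    cols = ", ".join(f"{name} {sql_type}" for name, sql_type in columns)
    return (
        f"CREATE EXTERNAL TABLE {table} ({cols})\n"
        f"LOCATION ('{location}')\n"
        f"FORMAT '{fmt}';"
    )


ext_ddl = external_table_ddl(
    "ext_sales",
    [("id", "int"), ("amount", "numeric")],
    "gpfdist://etlhost:8081/sales.csv",  # assumed gpfdist host and file
)
print(ext_ddl)
```

After the table is created, it can be joined with regular Greenplum tables in ordinary SQL queries.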

Utilize SQL-based machine learning

Call Apache MADlib functions for data preparation, classification, and machine learning from SQL, R, or Python.
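
As one possible example, the sketch below builds the SQL call for MADlib's linear regression training function, `madlib.linregr_train`. The helper name `linregr_train_sql`, the table names `houses` and `houses_linregr`, and the column names are assumptions for illustration. Running the statement would require a database driver and a Greenplum instance with MADlib installed, neither of which is shown here.

```python
def linregr_train_sql(source_table, out_table, dependent, independents):
    """Build a madlib.linregr_train call (hypothetical helper).

    Arguments are inserted as-is, so they are assumed to be trusted names.
    """
    # The leading 1 adds an intercept term to the feature array.
    features = ", ".join(["1"] + list(independents))
    return (
        "SELECT madlib.linregr_train("
        f"'{source_table}', '{out_table}', '{dependent}', 'ARRAY[{features}]');"
    )


train_sql = linregr_train_sql(
    "houses", "houses_linregr", "price", ["tax", "bath", "size"]
)
print(train_sql)
```

The fitted coefficients would then be available in the output table, for example via `SELECT * FROM houses_linregr;`.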

Enable self-service analytics pipelines

Enterprises can leverage data science platforms that simplify development, data access, and data governance so that analytics and data science users can easily pick data sources, including Pivotal Greenplum, from a catalog.

Anaconda
partner

Anaconda, Inc. develops a Python data science platform that helps companies adopt an open data science analytics architecture. It offers Anaconda Distribution, which distributes data science packages, and Anaconda Enterprise, an open data science platform.

Boundless
partner

Boundless Suite is a complete open source geospatial software platform for managing data and building maps and applications across web browsers, desktops, and mobile devices.

Dataiku
partner

Dataiku DSS is the collaborative data science software platform for teams of data scientists, data analysts, and engineers to explore, prototype, build, and deliver their own data products more efficiently.

SAS
partner

SAS helps customers at more than 83,000 sites make better decisions faster through innovative analytics, business intelligence and data management software and services.