Dataiku & Pivotal Greenplum Enable Self-Service Analytics and Data Science at Scale

Together, Dataiku and Pivotal Greenplum free data teams to collaborate on petabyte-scale data sets, on-premises and in all major public clouds. Data access and governance are built into a visual development experience, so teams do not have to wait for access to data sources or for DBA assistance.

High-Performance Analytics at Petabyte Scale

Dataiku enables data science and analytics teams to load data sets into Pivotal Greenplum and transform them there during data preparation. Analysts and data scientists can then leverage Greenplum for in-database parallel processing of complex queries, data visualization and reporting.

Simplify Collaboration across Data Teams

Dataiku and Pivotal Greenplum let cross-functional teams work on the same projects, sharing datasets and code without impacting performance.

Mature Your Data Analytics Operations

Dataiku enables self-service analytics of large datasets stored in Pivotal Greenplum, making it easier and faster to operationalize machine learning models and ensure a tangible business impact. The Dataiku platform also enforces data governance for user roles and teams.

Watch
Dataiku: For Everyone in the Data-Powered Organization

Dataiku Overview

Dataiku is the path to Enterprise AI that powers self-service analytics for data analysts, scientists and engineers, while ensuring the operationalization of machine learning. With Dataiku, customers can take predictive analytics and machine learning projects from inception to production at scale to realize tangible business impact.

More about Dataiku




Integration Features

Loading and transforming petabyte-scale datasets during data preparation, model building and visualization.

Writing code recipes that create datasets from the results of a SQL query on existing SQL datasets.

Invoking in-database machine learning functions in Apache MADlib.

Performing in-database parallel processing of visual recipes and reporting.

Facilitating cross-team collaboration by sharing datasets, code and best practices from a single repository, without impacting performance.

Utilizing the underlying MPP architecture of Greenplum to guarantee scalability and performance.


How It Works

Dataiku can be deployed on-premises or in the cloud (e.g. AWS or Azure) and connects via JDBC to Pivotal® Greenplum deployments. Dataiku users can then connect to, load, transform and query data tables stored within Pivotal Greenplum.
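As a rough illustration of the JDBC connection described above, the sketch below assembles a connection URL in Python. It assumes the Greenplum deployment is reached through the PostgreSQL-compatible JDBC driver on the default port 5432. The helper name `greenplum_jdbc_url` and the host and database names are hypothetical, and the exact fields Dataiku asks for when defining a connection are not shown here.

```python
from urllib.parse import quote


def greenplum_jdbc_url(host: str, database: str, port: int = 5432, ssl: bool = False) -> str:
    """Build a PostgreSQL-style JDBC URL for a Greenplum coordinator host.

    Assumption: the deployment accepts connections from the PostgreSQL
    JDBC driver, so the URL uses the jdbc:postgresql:// scheme.
    """
    if not host or not database:
        raise ValueError("host and database are required")
    # Percent-encode the database name so unusual characters stay valid in the URL.
    url = f"jdbc:postgresql://{host}:{port}/{quote(database, safe='')}"
    if ssl:
        # ssl=true is a connection parameter of the PostgreSQL JDBC driver.
        url += "?ssl=true"
    return url


if __name__ == "__main__":
    # Hypothetical host and database names, for illustration only.
    print(greenplum_jdbc_url("gp-coordinator.example.com", "analytics"))
```

The resulting string is the kind of value a JDBC-based connection expects. Credentials are deliberately left out of the URL and would be supplied separately.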

To facilitate visual development, data engineers can create custom SQL Recipes in Dataiku that invoke the in-database analytics functions of Pivotal Greenplum, such as those for data preparation and machine learning in Apache MADlib, for geospatial analysis in PostGIS, and for text analytics in GPText. This allows data science teams to leverage the MPP architecture of Pivotal Greenplum to process terabyte- and petabyte-sized data sets in parallel for faster results.
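To make the idea of invoking an in-database function more concrete, here is a minimal sketch that generates the SQL text for a MADlib linear regression call, which could then be pasted into a SQL recipe. It assumes MADlib is installed in a schema named `madlib`. The helper `madlib_linregr_sql` and the table and column names are hypothetical, and how Dataiku runs the statement is not covered.

```python
import re

# Accept plain or schema-qualified identifiers only, so the generated SQL
# cannot carry injected text.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)?$")


def _check_ident(name: str) -> str:
    if not _IDENT.match(name):
        raise ValueError(f"not a simple SQL identifier: {name!r}")
    return name


def madlib_linregr_sql(source_table: str, out_table: str, target: str, features: list) -> str:
    """Return a SELECT statement that trains a linear regression with MADlib.

    The model is fitted inside the database, so the training rows never
    leave Greenplum. The leading 1 in the array is the intercept term.
    """
    if not features:
        raise ValueError("at least one feature column is required")
    cols = ", ".join(_check_ident(f) for f in features)
    return (
        "SELECT madlib.linregr_train(\n"
        f"    '{_check_ident(source_table)}',\n"
        f"    '{_check_ident(out_table)}',\n"
        f"    '{_check_ident(target)}',\n"
        f"    'ARRAY[1, {cols}]'\n"
        ");"
    )


if __name__ == "__main__":
    # Hypothetical table and column names, for illustration only.
    print(madlib_linregr_sql("houses", "houses_linregr", "price", ["tax", "bath", "size"]))
```

Generating the statement as text keeps the example independent of any database driver. In practice the statement would be written directly in SQL.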

Read the documentation


Getting Started

Let's talk about it.

Contact us about Dataiku.
