Conduct data science investigations on extremely large data sets stored in Pivotal Greenplum. Leverage native in-database analytics for data preparation, machine learning, graphing, statistics, and text analysis. Push down processing of R and Python programs.
Pivotal Greenplum provides language interfaces for SQL, Python, and R over highly parallelized connections. Python and R run in trusted containers via the PL/Container extension.
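As a rough illustration of how a Python routine might be registered to run inside a container, the sketch below assembles the DDL for a PL/Container function. The helper name `plcontainer_function_ddl`, the function name `pylog100`, and the runtime ID `plc_python_shared` are assumptions for illustration; the runtime ID in particular must match whatever an administrator has configured on the cluster.

```python
def plcontainer_function_ddl(name, returns, body, runtime_id="plc_python_shared"):
    """Build CREATE FUNCTION DDL for a zero-argument PL/Container function.

    runtime_id is an assumed example value; it has to match a runtime
    that is actually configured for PL/Container on the cluster.
    """
    lines = [
        f"CREATE OR REPLACE FUNCTION {name}() RETURNS {returns} AS $$",
        # The first line of the body names the container runtime to use.
        f"# container: {runtime_id}",
        body.strip("\n"),
        "$$ LANGUAGE plcontainer;",
    ]
    return "\n".join(lines)


# Hypothetical function body: plain Python that runs inside the container.
ddl = plcontainer_function_ddl(
    "pylog100",
    "double precision",
    "import math\nreturn math.log10(100)",
)
print(ddl)
```

The generated statement would be run once through any SQL client, after which the function can be called from ordinary queries.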
The GPText library provides massively parallel text indexing, search, and natural language processing (NLP) based on Apache Solr, and uses Apache Tika to extract content from raw document sources.
Queries execute simultaneously across multiple nodes for faster processing of multi-petabyte data sets.
Organizations can combine data in Pivotal Greenplum with other data sources supported by their data science platforms, or reach those sources through Greenplum external tables. They can use relational, graph, geospatial, and text data together in mixed algorithms.
Call Apache MADlib functions for data preparation, classification, and machine learning from SQL, R, or Python.
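A minimal sketch of what calling MADlib from Python could look like: the helper below builds the SQL for MADlib's `linregr_train` function, and a second helper runs it over a DB-API connection. The table and column names (`houses`, `price`, `tax`, `bath`, `size`) and both helper names are hypothetical, and the connection is assumed to come from a driver such as psycopg2.

```python
def linregr_train_sql(source_table, out_table, dependent, independents):
    """Build a SQL call to madlib.linregr_train.

    Names are interpolated directly, so they must come from trusted input.
    """
    # MADlib takes the independent variables as a SQL array expression;
    # the leading 1 adds an intercept term to the model.
    features = "ARRAY[1, " + ", ".join(independents) + "]"
    return (
        "SELECT madlib.linregr_train("
        f"'{source_table}', '{out_table}', '{dependent}', '{features}');"
    )


def run_statement(conn, sql):
    """Execute one statement on a DB-API connection (for example psycopg2)."""
    cur = conn.cursor()
    try:
        cur.execute(sql)
        conn.commit()
    finally:
        cur.close()


# Hypothetical table and columns, for illustration only.
sql = linregr_train_sql("houses", "houses_linregr", "price", ["tax", "bath", "size"])
print(sql)
```

Because the training runs inside the database, only the SQL text travels over the connection; the fitted coefficients land in the output table and can be read back with an ordinary `SELECT`.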
Enterprises can leverage data science platforms that simplify development, data access, and data governance so that analytics and data science users can easily pick data sources, including Pivotal Greenplum, from a catalog.
Anaconda, Inc. develops a Python data science platform that helps companies adopt an open data science analytics architecture. It offers Anaconda Distribution, for distributing data science packages, and Anaconda Enterprise, an open data science platform.
Boundless Suite is a complete open source geospatial software platform for managing data, building maps and applications across web browsers, desktops, and mobile devices.
Dataiku DSS is a collaborative data science software platform that lets teams of data scientists, data analysts, and engineers explore, prototype, build, and deliver their own data products more efficiently.
SAS helps customers at more than 83,000 sites make better decisions faster through innovative analytics, business intelligence and data management software and services.