Spring XD simplifies the development of big data applications by supporting numerous data source types, processing modules, and repositories. An open source distributed runtime, Spring XD is highly extensible and dynamically scalable, and it provides a declarative language that business analysts and data scientists can use without writing and compiling code.
Moving data out of its native repositories for statistical analysis is slow and limits machine learning to a subset of the available data. Apache MADlib is an open source library of machine learning algorithms designed to run inside scale-out database systems. Because the algorithms run where the data lives, data scientists can quickly build the features of an analytical model against large data sets in Greenplum, Apache HAWQ, and PostgreSQL.
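As a rough illustration of what in-database training looks like, the Python sketch below builds the SQL statement that invokes MADlib's linear regression trainer, madlib.linregr_train, so the model is fitted inside the database instead of on exported data. The helper name build_linregr_train_sql and the table and column names (houses, houses_model, price, tax, bath, size) are hypothetical and chosen only for this example. The sketch only produces the statement text; running it would require a database with MADlib installed and a client connection, neither of which is shown here.

```python
def build_linregr_train_sql(source_table, model_table, dependent, features):
    """Return a SQL statement that calls MADlib's linregr_train in-database.

    Hypothetical helper for illustration. It does no quoting or escaping,
    so it should only be given trusted, hard-coded names.
    """
    # MADlib takes the independent variables as a SQL array expression;
    # the leading constant 1 adds an intercept term to the model.
    independent = "ARRAY[1, " + ", ".join(features) + "]"
    return (
        "SELECT madlib.linregr_train("
        f"'{source_table}', '{model_table}', '{dependent}', '{independent}');"
    )


# Hypothetical table and column names, for illustration only.
sql = build_linregr_train_sql("houses", "houses_model", "price", ["tax", "bath", "size"])
print(sql)
```

The statement is plain SQL, so it can be submitted from any client that can reach the database; only the fitted model, written to the output table, needs to leave it.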