Pivotal GemFire

The Scale-Out, In-Memory Distributed Data Grid for Mission-Critical Applications




Scaling Out Strategic Data-Driven Applications

For today’s data-driven companies, executing on business strategy means developing and deploying custom data-intensive applications at scale. These applications help companies optimize business processes, create new revenue opportunities and enhance competitive differentiation.

To succeed, these applications must deliver performance, scale, global reach and always-on availability. Developers often have difficulty meeting these service level expectations by scaling out traditional RDBMSs, and they cannot meet data consistency requirements with most in-memory data grids and caching technologies.

Application developers and IT architects now face ever higher service level requirements: many terabytes of operational data, used in thousands to hundreds of thousands of concurrent transactions, at a global scale previously seen only in the most extreme applications.

Pivotal GemFire is an in-memory distributed data grid for high-scale custom applications. GemFire provides in-memory access to all operational data, spread across hundreds of nodes with a “shared nothing” architecture. This enables GemFire to provide low-latency data access to applications at massive scale, with many concurrent transactions involving terabytes of operational data. Designed to maintain consistency of concurrent operations across its distributed data nodes, Pivotal GemFire can support ACID transactions for massively scaled applications such as stock trading, financial payments and ticket sales, in proven customer deployments of more than 10 million user transactions a day. Originally developed to serve data for mission-critical applications in the financial industry, GemFire offers built-in failover and resilient self-healing clusters that allow developers to meet the most stringent service level requirements for data accessibility.

Scale-Out Performance

In-Memory Storage: GemFire stores all required data in RAM across distributed nodes, providing the fastest access to data while minimizing the performance penalty of reading from disk.

Elastic, Linear Scalability: GemFire provides linear scalability, allowing you to predictably increase capacity, both in operations per second and in data storage, simply by adding nodes to a cluster. Data distribution and system resource usage are automatically adjusted as nodes are added or removed, making it easy to scale up or down quickly to meet expected or unexpected spikes in demand.

Optimized Data Distribution Across Nodes: GemFire automatically distributes data across nodes to optimize latency and usage of system resources. You can also configure partitioning and replication of data to further improve application response time. GemFire directs processing operations to the specific nodes where the data resides in order to reduce latency and network traffic.
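As a rough illustration of how keys can be spread across nodes and operations routed to the node that owns the data, the sketch below hashes each key into a fixed number of buckets and assigns buckets to nodes round-robin. The class name BucketRouter, the bucket count and the node names are assumptions made for this example; it is a conceptual model, not GemFire's actual placement algorithm or API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of key-to-node routing; not GemFire's placement algorithm.
class BucketRouter {
    private final int bucketCount;
    private final List<String> nodes;

    BucketRouter(int bucketCount, List<String> nodes) {
        if (bucketCount <= 0 || nodes.isEmpty()) {
            throw new IllegalArgumentException("need at least one bucket and one node");
        }
        this.bucketCount = bucketCount;
        this.nodes = new ArrayList<>(nodes);
    }

    // Map a key to a bucket; floorMod keeps the result non-negative
    // even when hashCode() is negative.
    int bucketFor(Object key) {
        return Math.floorMod(key.hashCode(), bucketCount);
    }

    // In this sketch, buckets are assigned to nodes round-robin.
    String nodeFor(Object key) {
        return nodes.get(bucketFor(key) % nodes.size());
    }

    // Count how many of the given keys land on each node.
    Map<String, Integer> distribution(List<?> keys) {
        Map<String, Integer> counts = new HashMap<>();
        for (String node : nodes) {
            counts.put(node, 0);
        }
        for (Object key : keys) {
            counts.merge(nodeFor(key), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        BucketRouter router =
            new BucketRouter(113, Arrays.asList("node-a", "node-b", "node-c"));
        // An operation on this key would be sent to the node printed here.
        System.out.println("order-42 is served by " + router.nodeFor("order-42"));
    }
}
```

Because the bucket count stays fixed, adding or removing a node only changes the bucket-to-node assignment, which is one common way to keep rebalancing cheap.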

Consistent Data Grid Operations Across Globally Distributed Applications

Performance-Optimized Persistence: To ensure durability of data in the event of node failure, GemFire writes to disk a log of all creates, updates, and deletes of data managed by a node. When the node comes back online, this log is read to reconstruct the last consistent state of the in-memory data grid on that node.
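The log-and-replay idea can be sketched as follows. This is a minimal conceptual model, assuming a hypothetical OperationLog class in which an in-memory list stands in for the on-disk log file; it does not reflect GemFire's actual disk store format or API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of log-based recovery; an in-memory list stands in
// for the on-disk log file.
class OperationLog {
    enum Op { CREATE, UPDATE, DELETE }

    static final class Entry {
        final Op op;
        final String key;
        final String value; // null for DELETE

        Entry(Op op, String key, String value) {
            this.op = op;
            this.key = key;
            this.value = value;
        }
    }

    private final List<Entry> entries = new ArrayList<>();

    // Every change is appended; existing entries are never modified.
    void append(Op op, String key, String value) {
        entries.add(new Entry(op, key, value));
    }

    // Replay the log in order to rebuild the last consistent state.
    Map<String, String> replay() {
        Map<String, String> state = new HashMap<>();
        for (Entry e : entries) {
            if (e.op == Op.DELETE) {
                state.remove(e.key);
            } else {
                state.put(e.key, e.value);
            }
        }
        return state;
    }

    public static void main(String[] args) {
        OperationLog log = new OperationLog();
        log.append(Op.CREATE, "a", "1");
        log.append(Op.UPDATE, "a", "2");
        log.append(Op.CREATE, "b", "3");
        log.append(Op.DELETE, "b", null);
        System.out.println(log.replay()); // prints {a=2}
    }
}
```

Appending sequentially rather than updating records in place is what keeps the write path fast; the cost is paid at recovery time, when the log is replayed.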

Configurable Consistency: GemFire can provide ACID consistency across distributed nodes to support high-capacity transactional applications. You can also configure consistency models for higher performance, such as allowing the entire grid to cache and operate on data, or turn consistency off for the highest-performance caching.

Distributed Queries and Regional Functions: Pivotal GemFire supports the Object Query Language (OQL) for authoring queries. Queries are sent to the appropriate nodes that serve relevant partitions of data. Query results are then merged and sent back to the client application. Developers can define indexes on key values to improve performance. In a similar fashion, when functions that operate on classes of data are invoked, processing will be routed to appropriate nodes responsible for serving partitions of targeted data.
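The scatter-gather pattern described above can be sketched in a few lines. The example assumes a hypothetical ScatterGatherQuery class in which each inner list plays the role of one node's partition and a Java predicate stands in for an OQL WHERE clause; it illustrates the pattern only and is not GemFire's query engine.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch of a scatter-gather query over partitioned data.
class ScatterGatherQuery {

    // Evaluate the filter against one partition, as a single node would.
    static List<Integer> queryPartition(List<Integer> partition, Predicate<Integer> filter) {
        List<Integer> local = new ArrayList<>();
        for (Integer value : partition) {
            if (filter.test(value)) {
                local.add(value);
            }
        }
        return local;
    }

    // Send the filter to every partition, then merge the partial results.
    // Sorting gives a stable order regardless of partition layout.
    static List<Integer> query(List<List<Integer>> partitions, Predicate<Integer> filter) {
        List<Integer> merged = new ArrayList<>();
        for (List<Integer> partition : partitions) {
            merged.addAll(queryPartition(partition, filter));
        }
        Collections.sort(merged);
        return merged;
    }

    public static void main(String[] args) {
        List<List<Integer>> partitions = Arrays.asList(
            Arrays.asList(5, 12, 7),
            Arrays.asList(30, 1),
            Arrays.asList(18));
        System.out.println(query(partitions, v -> v > 10)); // prints [12, 18, 30]
    }
}
```

In a real grid the per-partition step would run in parallel on the nodes that hold the data, so only matching entries cross the network.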

High Availability, Resilience and Global Scale

Cluster Resilience and Fail Over: GemFire provides continuous uptime with built-in high availability and disaster recovery. Multiple failure detection models detect and react to failures quickly, ensuring that the cluster is always available and that the data set is always complete.

Resilient Self-Healing: GemFire self-healing automation allows a node to quickly rejoin a cluster once it becomes operational again, with fast startup, reconnect, and incremental updates of changed data, all handled without administrator intervention.

Rolling Upgrade: When it is time to deploy a new version of an application or of GemFire software, system administrators can take advantage of redundancy zones to automatically update all nodes in one portion of a cluster at the same time. The remaining redundant nodes stay operational, serving the application in a highly available manner. This means no planned maintenance downtime is required for version upgrades or patching.

Cluster-to-Cluster WAN Connectivity: GemFire allows multiple clusters to be connected via WAN gateways. This allows application data access to span the globe, and allows companies to meet local data requirements, such as country-specific privacy regulations. WAN-connected clusters also enable multi-site failover capability, ensuring ongoing availability and built-in disaster recovery in the case of catastrophic failure.

Powerful Developer Features

Data Types, Languages and API Support: GemFire allows developers to manage data from user-defined classes as well as JSON documents. Native language clients and support are provided for the Java, C++, and C# programming languages. Applications written in other programming languages can access the same features via a REST API. Other supported APIs include Java HashMap, Memcached, and Spring Data GemFire.

Flexible Schemas and Versioning: GemFire schema serialization, called PDX, allows data types to be dynamically modified, such as when new kinds or versions of an application are deployed against the same data nodes. The system automatically bridges between application versions, allowing the different versions to work with the same data, since schema type information is dynamically discovered while processing queries from any version of the application. Data types in PDX are language independent, allowing applications written in any language to access the same data.
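One way to picture how different application versions can share the same data is to store each record as named fields rather than as a fixed binary layout. The sketch below assumes a hypothetical VersionTolerantRecord class and invented field names; it illustrates the general idea of name-based, version-tolerant access and is not the PDX wire format or API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of version-tolerant records: fields are stored by name,
// so readers simply skip fields they do not know and fall back to a default
// for fields an older writer never set.
class VersionTolerantRecord {
    private final Map<String, Object> fields = new HashMap<>();

    VersionTolerantRecord set(String name, Object value) {
        fields.put(name, value);
        return this;
    }

    boolean has(String name) {
        return fields.containsKey(name);
    }

    // Return the named field, or the supplied default when it is absent.
    Object get(String name, Object defaultValue) {
        return fields.getOrDefault(name, defaultValue);
    }

    public static void main(String[] args) {
        // Written by "version 1" of an application, which knows only "name".
        VersionTolerantRecord v1Record = new VersionTolerantRecord().set("name", "Ada");
        // Written by "version 2", which added an "email" field.
        VersionTolerantRecord v2Record = new VersionTolerantRecord()
            .set("name", "Grace")
            .set("email", "grace@example.com");

        // A version 2 reader handles the older record via a default value.
        System.out.println(v1Record.get("email", "unknown")); // prints unknown
        // A version 1 reader ignores the extra field in the newer record.
        System.out.println(v2Record.get("name", ""));         // prints Grace
    }
}
```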

Out of Box Caching and Powerful Application Features: Developers can add GemFire caching to their applications running on Pivotal tc Server with little or no modification to their application code. tc Server will cache user sessions, even across web servers and data centers. Spring-based Hibernate L2 caching is also supported in tc Server. Developers need only annotate their code to invoke this Spring framework capability.

GemFire provides powerful advanced application features to developers who want to leverage its distributed caching and data grid capabilities. As with many data grid platforms, developers can embed and generate queries, in GemFire’s case using OQL. OQL can also be used to set up “continuous queries” that return a streaming result set, updated whenever there are new entries meeting your query criteria. GemFire provides a sophisticated event handling mechanism, offering a publish and subscribe approach and durable asynchronous queues suitable for mission-critical application requirements.
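The continuous query idea, in which a client registers criteria once and is then notified of each new matching entry, can be sketched as below. The ContinuousQueryStore class is hypothetical, and a Java predicate stands in for the OQL criteria; this is a single-process illustration of the publish and subscribe pattern, not GemFire's continuous query API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;
import java.util.function.Predicate;

// Hypothetical sketch of a continuous query: listeners are called for every
// new entry whose value meets the registered criteria.
class ContinuousQueryStore {

    private static final class Subscription {
        final Predicate<Integer> criteria;
        final Consumer<String> listener;

        Subscription(Predicate<Integer> criteria, Consumer<String> listener) {
            this.criteria = criteria;
            this.listener = listener;
        }
    }

    private final Map<String, Integer> data = new HashMap<>();
    private final List<Subscription> subscriptions = new ArrayList<>();

    // Register criteria once; the listener receives the key of each match.
    void subscribe(Predicate<Integer> criteria, Consumer<String> listener) {
        subscriptions.add(new Subscription(criteria, listener));
    }

    // Store the entry, then notify every subscription it satisfies.
    void put(String key, int value) {
        data.put(key, value);
        for (Subscription s : subscriptions) {
            if (s.criteria.test(value)) {
                s.listener.accept(key);
            }
        }
    }

    public static void main(String[] args) {
        ContinuousQueryStore store = new ContinuousQueryStore();
        store.subscribe(v -> v > 100, key -> System.out.println("match: " + key));
        store.put("trade-1", 50);  // no output, criteria not met
        store.put("trade-2", 150); // prints match: trade-2
    }
}
```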

Easy Administration of Distributed Nodes

Automated Tuning & Simplified Cluster Configuration: GemFire is built to automate administrative tasks as much as possible. This includes automatically tuning system resources between nodes in a cluster by intelligently managing the placement of data while reducing network round trips. Data is replicated only to those nodes that need it, and requests for access are routed intelligently using the most direct path available. Data placement and resource allocation are adjusted automatically if nodes are added to or removed from the cluster. Furthermore, node configuration is handled centrally, with automatic redundancy for high availability. New nodes can get their configuration from the centralized configuration manager upon startup to quickly join a cluster with no additional system administration tasks.

Comprehensive Monitoring & Administration Tools: GemFire provides a comprehensive set of online and offline tools for monitoring and administering clusters. The online dashboard allows drill down into cluster and node status, and querying of stored data. The offline analytics tool allows diagnosis of system bottlenecks through analysis of historical statistics logging. A command line tool allows administrators to take action on clusters and nodes such as starting, stopping and configuring settings.

Flexible Deployment Options: GemFire runs in Java Virtual Machines in 32- and 64-bit mode on Windows, Linux, and Solaris operating systems. Client nodes running in C++, C#, .NET, and Java are supported. Other popular web-scale programming languages such as Ruby, Node.js, Scala, and Python can access GemFire capabilities via the REST API. GemFire grids can be set up with active/active multi-site bi-directional WAN replication to enable disaster recovery, business continuity, and geographical proximity for the lowest possible latency worldwide.
