Grid Appliances: Simplifying Grid Computing with Self-organizing Wide-area Virtual Clusters

A Grid system is undoubtedly powerful; however, the expertise and complexity involved in setting up and maintaining one deter wider adoption of Grid computing, especially by non-experts. This project focuses on reducing the complexity and operator involvement required to deploy and manage a wide-area Grid through the use of virtualization and self-organizing networks [2]. A flagship implementation of this approach is the Grid Appliance [1], which provides a simple interface to set up, maintain, connect to, and add features to a Grid system. To enable this, the following key technologies are used:

  • Systems virtualization: system virtual machines such as VMware and Xen provide an abstracted machine on which a complete computer system can be installed. This allows users to time-share a full-fledged guest environment (OS, libraries, tools, applications) with their host without requiring modifications to the host configuration.
  • Software packaging in appliances: with system virtualization, a complete environment can be conveniently distributed as a single file (an appliance image), and a developer can tailor the system to his or her purpose.
  • Network virtualization: IP-over-P2P (IPOP) allows machines in different domains to communicate seamlessly, as if they were on the same local area network.
  • Grid middleware: virtualization enables us to leverage existing middleware; in particular, the Grid Appliance integrates Condor, which supports flexible resource discovery through match-making and provides various fault-tolerant job scheduling and data transfer features.
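
The match-making idea mentioned in the last item can be illustrated with a small sketch. This is not Condor's ClassAd language or API; the `Ad` class, the attribute names (`memory_mb`, `os`, `owner`), and the greedy assignment loop are all assumptions made for illustration. It only shows the core principle: a job and a machine are paired when each one's requirements are satisfied by the other's advertised attributes.

```python
from typing import Callable, Dict, List


class Ad:
    """Hypothetical, simplified stand-in for a ClassAd-style advertisement."""

    def __init__(self, name: str, attrs: Dict[str, object],
                 requirements: Callable[[Dict[str, object]], bool]):
        self.name = name
        self.attrs = attrs                # what this party advertises
        self.requirements = requirements  # predicate over the other party's attrs


def mutual_match(job: Ad, machine: Ad) -> bool:
    # Both sides must accept each other.
    return job.requirements(machine.attrs) and machine.requirements(job.attrs)


def match_jobs(jobs: List[Ad], machines: List[Ad]) -> Dict[str, str]:
    """Greedily give each job the first free machine that mutually matches."""
    free = list(machines)
    assignments: Dict[str, str] = {}
    for job in jobs:
        for machine in free:
            if mutual_match(job, machine):
                assignments[job.name] = machine.name
                free.remove(machine)  # safe: we stop iterating right after
                break
    return assignments


# Illustrative data only; names and values are made up.
machines = [
    Ad("vm-small", {"memory_mb": 512, "os": "linux"}, lambda job: True),
    Ad("vm-large", {"memory_mb": 2048, "os": "linux"},
       lambda job: job["owner"] != "guest"),
]
jobs = [
    Ad("simulation", {"owner": "alice"}, lambda m: m["memory_mb"] >= 1024),
    Ad("report", {"owner": "guest"}, lambda m: m["os"] == "linux"),
]
result = match_jobs(jobs, machines)
```

Here `simulation` is paired with `vm-large` because `vm-small` does not meet its memory requirement, and `report` takes the remaining `vm-small`. A real scheduler would also rank candidate matches and handle preemption, which this sketch omits.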

Currently, development is focused on improving the management interface to facilitate the creation of multiple independent pools, including in local environments with stringent firewall requirements (e.g., a corporate intranet Grid), and on providing seamless integration with X.509-based IPsec for strong host authentication and end-to-end traffic encryption. Other important features under development include virtual file systems that use peer-to-peer data transfers to improve the handling of large write-once, read-many files; dynamic publication and discovery of job managers and workers using a distributed hash table (DHT); and flexible, decentralized, self-organizing resource discovery using unstructured peer-to-peer queries.
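
As a rough illustration of DHT-based publication and discovery, the sketch below keeps everything in one process and uses a consistent-hashing ring to decide which node stores each key. It is not the project's DHT or its API: the `ToyDHT` class, the key format (`pool1:managers`), the node names, and the addresses are all hypothetical. In a deployed system the nodes would be separate peers and lookups would be routed over the overlay.

```python
import bisect
import hashlib
from typing import Dict, List, Set


def ring_id(name: str) -> int:
    """Map a node name or key onto the ring using SHA-1."""
    return int.from_bytes(hashlib.sha1(name.encode("utf-8")).digest(), "big")


class ToyDHT:
    """In-memory stand-in for a DHT; each key lives on its successor node."""

    def __init__(self, node_names: List[str]):
        ring = sorted((ring_id(n), n) for n in node_names)
        self._ids = [node_id for node_id, _ in ring]
        self._names = [name for _, name in ring]
        self._store: Dict[str, Dict[str, Set[str]]] = {n: {} for n in node_names}

    def owner(self, key: str) -> str:
        # First node whose ID is >= the key's ID, wrapping around the ring.
        idx = bisect.bisect_left(self._ids, ring_id(key)) % len(self._ids)
        return self._names[idx]

    def publish(self, key: str, value: str) -> None:
        self._store[self.owner(key)].setdefault(key, set()).add(value)

    def discover(self, key: str) -> Set[str]:
        return set(self._store[self.owner(key)].get(key, set()))


# Illustrative usage; the pool name and virtual IP addresses are made up.
dht = ToyDHT(["node-a", "node-b", "node-c"])
dht.publish("pool1:managers", "10.128.0.5")
dht.publish("pool1:workers", "10.128.0.17")
dht.publish("pool1:workers", "10.128.0.42")
workers = dht.discover("pool1:workers")
```

A new worker would publish its address under its pool's key when it joins, and a job manager would call `discover` on the same key to find it, with no central registry to configure.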

[1] Wolinsky, D., A. Agrawal, P. Boykin, J. Davis, A. Ganguly, V. Paramygin, P. Sheng, R. Figueiredo, "On the design of Virtual Machine Sandboxes for Distributed Computing in Wide Area Overlays of Virtual Workstations," VTDC, 2006.
[2] Ganguly, A., A. Agrawal, P. Boykin, R. Figueiredo, "WOW: Self-Organizing Wide Area Overlay Networks of Virtual Workstations," HPDC, 2006.

This material is based upon work supported by the National Science Foundation under Grant Nos. 0438246 and 0537455 (PI: Renato Figueiredo), two SUR grants from IBM, and a gift from VMware Corp. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.