"A Grid is a collection of distributed computing resources available over a local or wide area network that appear to an end user or application as one large virtual computing system." - IBM

"Conceptually, a grid is quite simple. It is a collection of computing resources that perform tasks. In its simplest form, a grid appears to users as a large system that provides a single point of access to powerful distributed resources." - Sun
"[The Grid] lets people share computing power, databases, and other on-line tools securely across corporate, institutional, and geographic boundaries without sacrificing local autonomy." - Globus Alliance
"Grid computing is computing as a utility - you do not care where data resides, or what computer processes your requests. Analogous to the way utilities work, clients request information or computation and have it delivered - as much as they want, and whenever they want." - Oracle
Grid computing has been so heavily hyped in the technology press this year that most major players in the software arena have staked a claim in what they see as the 'next big thing'. Yet as the definitions above show, no one seems quite sure what they are working towards. Indeed, there appears to be more activity in arguing over a definition than in developing the technology.
So is the Grid just marketing hype, or can we expect to see some real benefits from these ideas?
To tackle this question, we first need to look at the types of application that businesses will be running on the computing platforms of the future.
Of particular interest is the ratio between the network usage, processing time and disk storage required for a particular task. The following example explains why this matters:
Currently, one pound will buy you 1 GB of internet traffic, 8 hours of CPU time or 10 million database accesses.[1]
The SETI@Home project has so far used 1,643,925 years of CPU time, donated by millions of computers around the world, searching for patterns or signals in radio telescope data. That is roughly 14.4 billion CPU-hours; bought at the rate above, doing this in the traditional manner would have cost around 1.8 billion pounds.
However, due to the nature of the task at hand, SETI parcelled the work into about a billion packages and sent them out to people volunteering their spare CPU cycles. As each task could be described in just 0.5MB of data, this required a total network bandwidth of 500,000GB, which cost them about half a million pounds.
A calculation worth nearly two billion pounds, done for half a million, is a pretty good saving!
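The arithmetic behind this saving can be checked in a few lines of Python. The prices and task sizes below are the illustrative figures quoted in this article, not measured values:

```python
# Back-of-the-envelope check of the SETI@Home economics, using the
# prices quoted above: one pound buys 1 GB of traffic or 8 hours of CPU.
GBP_PER_GB = 1.0          # network cost: £1 per GB transferred
CPU_HOURS_PER_GBP = 8.0   # CPU cost: 8 hours per £1

cpu_years = 1_643_925                # total donated CPU time
cpu_hours = cpu_years * 24 * 365     # roughly 14.4 billion CPU-hours
parcels = 1_000_000_000              # about a billion work packages
mb_per_parcel = 0.5                  # each parcel is 0.5 MB of data

# Cost of buying the CPU time outright versus shipping the parcels out.
traditional_cost = cpu_hours / CPU_HOURS_PER_GBP
network_cost = parcels * mb_per_parcel / 1000 * GBP_PER_GB

print(f"CPU cost:      £{traditional_cost:,.0f}")  # roughly £1.8 billion
print(f"Network cost:  £{network_cost:,.0f}")      # roughly £500,000
print(f"Saving factor: {traditional_cost / network_cost:,.0f}x")
```

At these prices, donated CPU cycles undercut bought ones by a factor of several thousand - which is the whole economic case for the project.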
However, this relies on one key feature of the SETI calculation: the work could be sent out in 0.5MB parcels, and each parcel represented about 14 hours of work. At the prices above, each parcel cost about £1.75 in CPU time but only £0.0005 in network traffic - a ratio of CPU cost to network cost of roughly 3,500:1. It is this ratio that made SETI@Home viable, and it is not a common feature of the tasks businesses will want to perform.
Most business-related calculations that could benefit from the huge computing resources the Grid promises rely heavily on access to large amounts of proprietary information. The cost of shipping this information across the network will immediately negate the benefit of having someone else manage the processing resources for you.
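This trade-off can be made concrete with a rough break-even test. The function below is an illustrative sketch, not a real scheduling API: the price constants are the figures quoted earlier in this article, and the `min_ratio` safety margin is a hypothetical threshold.

```python
# Rough grid-viability test: off-loading a task only pays off when its
# CPU cost comfortably exceeds the cost of moving its data.
# Prices are the illustrative figures quoted in this article.
GBP_PER_GB = 1.0         # £1 per GB of network traffic
CPU_HOURS_PER_GBP = 8.0  # £1 per 8 hours of CPU time

def is_grid_viable(cpu_hours: float, data_gb: float,
                   min_ratio: float = 100.0) -> bool:
    """Return True when the CPU cost outweighs the data-shipping cost
    by at least `min_ratio` (a hypothetical safety margin)."""
    cpu_cost = cpu_hours / CPU_HOURS_PER_GBP
    network_cost = data_gb * GBP_PER_GB
    return cpu_cost >= min_ratio * network_cost

# A SETI-style parcel: 14 hours of work on 0.5 MB of data - easily viable.
print(is_grid_viable(cpu_hours=14, data_gb=0.0005))   # True

# A data-heavy business query: 10 minutes of CPU over 50 GB of records.
print(is_grid_viable(cpu_hours=10 / 60, data_gb=50))  # False
```

The second case is the typical business workload described above: the data dwarfs the computation, so shipping it to someone else's processors costs more than it saves.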