Distributed computing is where you use a large collection of computers to do some processing that would take a long time if you were to use only one. Basically what you do is split your application up into lots of little pieces, and each computer works on a separate piece of the problem, and communicates with the others to pass data around. Because lots of machines are working in parallel, they can often get the job done quicker. For example, if you are rendering a movie, each frame could be processed independently of the others, and with enough hardware you could finish the process in hours rather than years.
Writing programs in a way that makes this possible can be very difficult. It’s a bit like managing a large organisation – different people must be assigned different tasks, and they all have to work together in a way that gets the job done efficiently. It’s important to ensure that each person can mostly work independently, and not have to spend lots of time in meetings or waiting for other people to finish things. This is hard to get right, whether you’re dealing with people or computers. In both cases the problem is basically the same, but with computers we can create programming tools that make it easier to solve.
An example of these sorts of tools are workflow systems, which allow you to expose the important operations used by your applications as services, and then define a high-level structure that describes how these operations are combined together to produce the results you want. The services are accessed over a network, in much the same way as your browser accesses a web site. A workflow engine manages the process of running your application by making calls out to these different services in the correct order, and transferring output data produced by each operation as inputs to others. An example of a workflow is shown below.
Creating a workflow is basically a form of programming, and there are various languages around designed for this purpose. The problem with these languages however is that they aren’t very powerful, at least compared to normal programming languages like Java and C. If your application consists of independent tasks, or tasks which have simple dependency relationships with each other, then this is OK. But for more complex workflows, you soon start running into the limitations of workflow languages, which means you have to resort to low-level parallel programming technologies such as MPI. The latter is more powerful, but also much harder to write in, as you have to worry about lots of low-level details, and it’s easier to introduce errors.
What I’ve done in my project is to provide a middle ground between these two types of technologies. I’ve created a programming language that lets you specify arbitrarily complex workflows involving application logic and data manipulation, in addition to invoking tasks exposed as services. It is a functional programming language, which means that everything is expressed in terms of data rather than actions, and functions are only allowed to compute results, rather than modify state. This allows the compiler to automatically parallelise the program, without forcing you to explicitly deal with things like how the work is distributed between computers.
Using this language, you can create a wide range of complex workflows that do everything from simple data queries to complex scientific data processing. It is designed as a generic mechanism upon which higher level languages can be implemented, such as those designed for XML processing, which can be useful for some types of applications. The main benefit of my approach is that it makes it easier to write certain types of parallel programs than other languages, because your code is automatically parallelised and can easily integrate with existing services written in other languages.
For full details, see my publications, in particular my thesis, titled Applying Functional Programming Theory to the Design of Workflow Engines.