Applications are becoming increasingly concurrent, yet the traditional way of doing this, threads and locks, is very troublesome. This article highlights Erlang and Scala, two programming languages that use a different approach to concurrency: the actor model. The continuous increase of computer processor clock rate has recently slowed down, in part due to problems with heat dissipation. As the clock rate of a processor is increased, the heat generated increases too. Increasing current clock rates even further is troublesome, as the heat generated would push computer chips towards the limits of what is physically possible. Instead, processor manufacturers have turned towards multi-core processors, which are processors capable of doing multiple calculations in parallel. There has been an increased interest in software engineering techniques to fully utilize the capabilities offered by these processors. At the same time, applications are more frequently built in a distributed way. In order to perform efficiently, without wasting most of the time on waiting for I/O operations, applications are forced to do more and more operations concurrently, to obtain maximum efficiency. However, reasoning about concurrent systems can be far from trivial, as there can be a large number of interacting processes, with no predefined execution order. This paper discusses the concurrency model in Erlang and Scala, two languages that have recently gained in popularity, in part due to their support for scalable concurrency. This first section introduces both languages, by describing their key features. ErlangErlang is a concurrent programming language, originally designed by Ericsson (Erlang is named after A. K. Erlang, a Danish mathematician and Ericsson: Ericsson Language). It was designed to be distributed and fault-tolerant, for use in highly-available (non-stop) soft real-time telecom applications. Erlang is a pure functional language, it features single assignment and eager evaluation. It also has built-in language constructs for distribution and concurrency. Erlang has a dynamic type system, where the typing of expressions is checked at run-time [1]. Typing information is fully optional and only used by the static type-checker, the run-time environment even allows running applications with invalid typing specifications. Run-time type checking is always performed by looking at the types of the data itself. This usually allows applications to run correctly, even when encountering unexpected data types. The listings below show a simple program, written in Erlang and its execution in the Erlang shell. This program performs a recursive calculation of the factorial function. A simple program in Erlang
Using the Erlang program
To allow for non-stop application, Erlang support hot-swapping of code. This means that it is possible to replace code while executing a program, making it possible to do upgrades and maintenance without interrupting the running system. ScalaScala, which stands for Scalable Language, is a programming language that aims to provide both object-oriented and functional programming styles, while staying compatible with the Java Virtual Machine (JVM). It was designed in 2001 by Martin Odersky and his group at EPFL in Lausanne, Switzerland, by combining the experiences gathered from designing multiple other anguages. The Scala language has been designed to be scalable [2], as it is supposed to scale along with the needs of its users. Offering functional constructs allows the developer to write short and concise code, whereas having object-oriented concepts allows the language to be used for large complex projects. Scala is fully interoperable with the Java language, which allows developers to use all Java libraries from within Scala. As such, it is not a separate language community, but rather allows you to take advantage of the enormous Java ecosystem. There is also an initiative to build a Scala version that runs on top of the .NET Common Language Runtime (CLR) [3], this ensures the portability of Scala to other underlying platforms. Scala tries to stay to stay close to pure object-oriented programming and therefore does not have constructs like static fields and methods. Every value in Scala is an object and every operation is a method call [4]. Scala allows you to define your own operators and language constructs. This makes it possible to extend the Scala language according to your specific needs, which again helps it to grow (scale up) along with the project. Scala uses strict-typing, yet allows most of the typing to be unspecified. When type information is not specified, the compiler will do smart type inference and attempt to infer this information from the code itself. This save programming effort and allows for more generic code. Type information is only required when the compiler cannot prove that correct type usage will happen. The listing below shows a simple program in Scala, which shows some of its functional features, as well as some object-oriented features. As can be seen, an object with a method is defined. The definition of this method however is done in a pure functional style. There is no need to write a return statement, as the value of the function is considered to be the return statement. A simple program in Scala
A different way of concurrency: The Actor ModelThe problem with threadsThe traditional way of offering concurrency in a programming language is by using threads. In this model, the execution of the program is split up into concurrently running tasks. It is as if the program is being executed multiple times, the difference being that each of these copies operated on shared memory. This can lead to a series of hard to debug problems, as can be seen below. The first problem, on the left, is the lost-update problem. Suppose two processes try to increment the value of a shared object The solution to this problems is the use of locks. Locks provide mutual exclusion, meaning that only one process can acquire the lock at the same time. By using a locking protocol, making sure the right locks are acquired before using an object, lost-update problems are avoided. However, locks have their own share of problems. One of them is the deadlock problem, which is pictured on the right. In this example two processes try to acquire the same two locks These are just some of the problems that might occur when attempting to use threads and locks.
Both Erlang and Scala take a different approach to concurrency: the Actor Model. It is necessary to look at the concepts of the actor model first, before studying the peculiarities of the languages itself. The Actor ModelThe Actor Model, which was first proposed by Carl Hewitt in 1973 [5] and was improved, among others, by Gul Agha [6]. This model takes a different approach to concurrency, which should avoid the problems caused by threading and locking. In the actor model, each object is an actor. This is an entity that has a mailbox and a behaviour. Messages can be exchanged between actors, which will be buffered in the mailbox. Upon receiving a message, the behaviour of the actor is executed, upon which the actor can: send a number of messages to other actors, create a number of actors and assume new behaviour for the next message to be received. Of importance in this model is that all communications are performed asynchronously. This implies that the sender does not wait for a message to be received upon sending it, it immediately continues its execution. There are no guarantees in which order messages will be received by the recipient, but they will eventually be delivered. A second important property is that all communications happen by means of messages: there is no shared state between actors. If an actor wishes to obtain information about the internal state of another actor, it will have to use messages to request this information. This allows actors to control access to their state, avoiding problems like the lost-update problem. Manipulation of the internal state also happens through messages. Each actor runs concurrently with other actors: it can be seen as a small independently running process. Actors in ErlangIn Erlang, which is designed for concurrency, distribution and scalability, actors are part of the language itself. Due to its roots in the telecom industry, where a very large amount of concurrent processes are normal, it is almost impossible to think of Erlang without actors, which are also used to provide distribution. Actors in Erlang are called processes and are started using the built-in A simple application that uses an actor can be seen below. In this application, an actor is defined which acts as a basic counter. We send 100.000 increment messages to the actor and then request it to print its internal value. Actors in Erlang
Lines 1 & 2 defines the module and the exported functions. Lines 4 till 7 contain the Actor scheduling in ErlangErlang uses a preemptive scheduler for the scheduling of processes [7]. When they have executed for a too long period of time (usually measured in the amount of methods invoked or the amount of CPU-cycles used), or when they enter a \texttt{receive} statement with no messages available, the process is halted and placed on a scheduling queue. This allows for a large number of processes to run, with a certain amount of fairness. Long running computations will not cause other processes to become unresponsive. Starting with release R11B, which appeared in May 2006, the Erlang run-time environment has support for symmetric multiprocessing (SMP) [8]. This means that it is able to schedule processes in parallel on multiple CPUs, allowing it to take advantage of multi-core processors. The functional nature of Erlang allows for easy parallelization. An Erlang lightweight process (actor) will never run in parallel on multiple processors, but using a multi-threaded run-time allows multiple processes to run at the same time. Big performance speedups have been observed using this technique. Actors in ScalaActors in Scala are available through the The same sample application, this time written in Scala can be seen below Actors in Scala
We can see the following code: Lines 1 & 2 import the abstract Lines 7 till 21 define the Finally, on lines 23 till 32, we find the main application, which first constructs a counter, then sends 100.000 The output of this program shows that the counter has been incremented up to a value of 100.000. This means that in this case all our messages were delivered in order. This might not always be the case: recall that there are no guarantees on the order of message delivery in the actor model. The example above shows the ease with which actors can be used in Scala, even though they are not part of the language itself. It also shows the similarities between the actors library in Scala and the Erlang language constructs. This is no coincidence, as the Scala actors library was heavily inspired by Erlang. The Scala developers have however expanded upon Erlangs concepts and added a number of features, which will be highlighted in the following sections. RepliesThe authors of To facilitate this, a
A Synchronous messagingScala contains a construct which can be used to wait for message replies. This allows for synchronous invocation, which is more like method invocation. The syntax to do this is shown below, to the right, contrasted by the normal way of writing this to the left. When entering the
ChannelsMessages, as used in the previous examples, are used in a somewhat loosely typed fashion. It is possible to send any kind of message to an actor. Messages that are not matched by any of the Scala has a very rich type system and the Scala developers wanted to take advantage of this. Therefore they added the concept of channels [9]. These allow you to specify the type of messages that can be accepted using generics. This enables type-safe communication. The mailbox of an actor is a channel that accepts any type of message. Thread-based vs. Event-based actorsScala makes the distinction between thread-based and event-based actors. Thread-based actors are actors which each run in their own JVM thread. They are scheduled by the Java thread scheduler, which uses a preemptive priority-based scheduler. When the actor enters a There is an important drawback to this method: each thread can be considered as being heavy-weight and uses a certain amount of memory and imposes some scheduling overhead. When large amounts of actors are started, the virtual machine might run out of memory or it might perform suboptimal due to large scheduling overhead. In situations where this is unacceptable, event-based actors can be used. These actor are not implemented by means of one thread per actor, yet instead they run on the same thread. An actor that waits for a message to be received is not represented by a blocked thread, but by a closure. This closure captures the state of the actor, such that it's computation can be continued upon receiving a message [10]. The execution of this closure happens on the thread of the sender. Event-based actors provide a more light-weight alternative, allowing for very large numbers of concurrently running actors. They should however not be used for parallelism: since all actors execute on the same thread, there is no scheduling fairness. A Scala programmer can use event-based actors by using a Actor scheduling in ScalaScala allows programmers to mix thread-based actors and event-based actors in the same program: this way programmers can choose whether they want scalable, lightweight event-based actors, or thread-based actors that allow for parallelism, depending on the situation in which they are needed. Scala uses a thread pool to execute actors. This thread pool will be resized automatically whenever necessary. If only event-based actors are used, the size of this thread pool will remain constant. When blocking operations are used, like EvaluationThe next section will evaluate some of the features in both languages. It aims to show that while some of these features facilitate the implementation, their power also comes with a risk. One should be aware of the possible drawbacks of the chosen technologies, to avoid potential pitfalls. The dangers of synchronous message passingThe synchronous message passing style available in Scala (using It should however be used with great care. Due to the very selective nature of the This is a dangerous situation, as it might lead to deadlocks. To see this, consider the following example:
In this example, each actor sends a message to the other actor. It will never receive an answer, as the actor is first awaiting a different message. If it were implemented using a message loop, like you would generally do in Erlang, no problem would arise. This is shown below. This does not mean that synchronous message passing should be avoided: in certain cases, it is necessary and in these cases the extra syntax makes this much easier to program. It is however important to be aware of the potential problems caused by this programming style.
The full source code for these examples can be found online, see the appendix for more information. Safety in Scala concurrencyAnother potential pit-fall in Scala comes from the fact that it mixes actors with object-oriented programming. It is possible to expose the internal state of an actor through publicly available methods for retrieving and modifying this state. When doing so, it is possible to modify an object by directly invoking its methods, that is: without using messages. Doing so means that you no longer enjoy the safety provided by the actor model. Erlang on the other hand, due to its functional nature, strictly enforces the use of messages between processes: there is no other way to retrieve and update information in other processes. This illustrates possibly the biggest trade-off between Erlang and Scala: having a pure functional language, like Erlang, is safe, but more difficult for programmers to use. The object-oriented, imperative style of Scala is more familiar and makes programming easier, yet requires more discipline and care to produce safe and correct programs. SummaryThis article described the actor model for the implementation of concurrency in applications, as an alternative to threading and locking. It highlighted Erlang and Scala, two languages with an implementation of the actor model and showed how these languages implement this model. Erlang is a pure functional language, providing little more than the basic features of functional languages. This should certainly not be seen as a weakness though: this simplicity allows it to optimize specifically for the cases for which it was defined as well as implement more advanced features like hot-swapping of code. Scala on the other hand uses a mix of object-oriented and functional styles. This makes it easier for a programmer to write code, especially given the extra constructs offered by Scala, but this flexibility comes with a warning: discipline should be used to avoid inconsistencies. The differences between these languages should be seen and evaluated in their design context. Both however provide an easy to use implementation of the actor model, which greatly facilitates the implementation of concurrency in applications. References
Appendix: Source code + PDFThe source code for the sample programs and a PDF version can be found on http://ruben./files/articles/concurrency-in-erlang-scala/erlang-scala-concurrency.tar.gz. It has been tested on Ubuntu Linux 8.10, with Erlang 5.6.3 and Scala 2.7.2final (installed from Debian packages). It should work on any platform. |
|