Tuesday, July 22, 2008

OSCON: Practical Erlang Programming

This afternoon I'm sitting in "Practical Erlang Programming" given by Francesco Cesarini. Erlang has been around for almost twenty years, and is a niche language. However, we're starting to hear more and more about it due to the growth in the number of multi-core machines. So I figured I should go and find out what all the fuss was about...


Update: I think Francesco has really overestimated the capacity of the wireless network. He's just told ninety people to download the source bundle and install Erlang.

Update: Okay, we're kicking off with data types: integers, floats and atoms are the simple types. Then we have tuples and lists. Interestingly, variables in Erlang are single assignment; the value of a variable cannot be changed once it has been bound. Puzzling, variables are not very variable at that point? (In the shell, f() forgets all bindings, which is why the last match in the session here succeeds.)

1> A = 123.
123
2> A.
123
3> A = 124.
** exception error: no match of right hand side value 124
4> f().
ok
5> A = 124.
124

Update: Pattern matching is used for assigning values to variables, controlling the execution flow of programs and extracting values from compound data.
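
To make those last two uses concrete, here is a minimal sketch of my own rather than anything from Francesco's slides. The module name match_demo and the functions coords/1 and describe/1 are made up for illustration.

```erlang
-module(match_demo).
-export([coords/1, describe/1]).

%% Extracting values: the pattern {point, X, Y} binds X and Y
%% when the argument is a three-element tuple tagged 'point'.
coords({point, X, Y}) ->
    {X, Y}.

%% Controlling flow: clauses are tried top to bottom and the first
%% one whose pattern matches the argument is the one that runs.
describe([]) ->
    empty;
describe([_Single]) ->
    one_element;
describe([_First | _Rest]) ->
    many_elements.
```

Calling match_demo:coords({point, 1, 2}) hands back the two coordinates as a tuple, while match_demo:coords(foo) fails because no clause matches.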

Update: Moving on to function calls, this looks, well, odd.

area( {square, Side} )      ->
Side * Side ;
area( {circle, Radius } ) ->
3.14 * Radius * Radius;
area( {triangle, A, B, C} ) ->
S = ( A + B + C )/2,
math:sqrt(S*(S-A)*(S-B)*(S-C));
area( Other ) ->
{error, invalid_object}.

Functions have clauses separated by ';', with the final clause ending in '.'. Erlang programs consist of a collection of modules that contain functions that call each other. Function and module names must be atoms.

factorial(0) -> 
1;
factorial(N) ->
N * factorial(N-1).

Variables are local to functions and allocated and deallocated automatically.

Update: Modules are stored in files with the .erl suffix, and the module and file names must be the same. Modules are named with the -module(Name). directive.

-module(demo).
-export([double/1]).

% Exporting the function double with arity 1

double(X) ->
times(X, 2).

times( X, N ) ->
X * N.

Compiling this from the Erlang shell, note that times/2 is not exported and so cannot be called from outside the module:

1> cd("/Users/aa/").
2> c(demo).
{ok,demo}
3> demo:double(10).
20
4> demo:times(1,2).
** exception error: undefined function demo:times/2

Update: We've now looked at the basics, and we're moving on to sequential Erlang: conditionals, guards and recursion.

case lists:member(foo, List) of
true -> ok;
false -> {error, unknown}
end

In a case expression one clause must always match, otherwise you get a run-time error. You can put '_' or an unbound variable in the last clause to ensure this happens.
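
A quick sketch of the catch-all idea, assuming a made-up module case_demo with a classify/1 function (neither is from the tutorial material):

```erlang
-module(case_demo).
-export([classify/1]).

%% The final '_' clause matches anything, so the case expression
%% can never fail with a case_clause error.
classify(Shape) ->
    case Shape of
        {square, _Side}   -> square;
        {circle, _Radius} -> circle;
        _                 -> unknown
    end.
```

Without the last clause, case_demo:classify(42) would raise an error instead of returning unknown.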

if
X < 1 -> smaller;
X > 1 -> greater ;
X == 1 -> equal
end

Again, one branch must always succeed. By using true as the last guard you ensure that the last clause will always succeed should the previous ones evaluate to false; see it as an 'else' clause. Guards also work in function heads, so we can have,

factorial(N) when N > 0 ->
N * factorial( N - 1 );
factorial(0) ->
1.

instead of having this,

factorial(0) -> 
1;
factorial(N) ->
N * factorial(N-1).

Of these two the top one is the faster one, but you shouldn't really worry about that when using Erlang, apparently...

All variables in guards have to be bound. If all guards have to succeed, use ',' to separate them; if only one has to succeed, use ';' to separate them. Guards have to be free of side effects.
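
Here's my own sketch of those guard rules, plus the true-as-else trick from earlier. The module guard_demo and its three functions are hypothetical names, not something shown in the session.

```erlang
-module(guard_demo).
-export([compare/1, in_range/1, int_or_float/1]).

%% 'true' as the last guard behaves like an 'else' branch.
compare(X) ->
    if
        X < 1 -> smaller;
        X > 1 -> greater;
        true  -> equal
    end.

%% ',' means every guard test has to succeed.
in_range(N) when is_integer(N), N >= 1, N =< 10 ->
    true;
in_range(_) ->
    false.

%% ';' means at least one guard has to succeed.
int_or_float(X) when is_integer(X); is_float(X) ->
    true;
int_or_float(_) ->
    false.
```

A guard that fails, or that would raise an error, simply causes the next clause to be tried, which is why in_range(foo) falls through to false rather than crashing.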

Update: On to recursion,

average(X) -> sum(X) / len(X).

sum([H|T]) -> H + sum(T);
sum([]) -> 0.

len([_|T]) -> 1 + len(T);
len([]) -> 0.

Note the pattern of recursion is the same in both cases. Taking a list and evaluating an element is a very common pattern...
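
Since the two functions share a shape, one way to express that shape once is with lists:foldl/3 from the standard library. This wasn't covered at this point in the tutorial, so treat it as my own sketch; the module name fold_demo is made up.

```erlang
-module(fold_demo).
-export([sum/1, len/1, average/1]).

%% Both functions walk the list once, combining each element
%% with an accumulator that starts at 0.
sum(List) ->
    lists:foldl(fun(H, Acc) -> H + Acc end, 0, List).

len(List) ->
    lists:foldl(fun(_H, Acc) -> 1 + Acc end, 0, List).

%% As with the hand-written version, an empty list is an error
%% here because of the division by zero.
average(List) ->
    sum(List) / len(List).
```

The '/' operator always returns a float, so fold_demo:average([1,2,3]) gives 2.0 rather than 2.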

Update: Now on to Built-In Functions (BIFs)...

date()
time()
length(List)
size(Tuple)
atom_to_list(Atom)
list_to_tuple(List)
integer_to_list(234)
tuple_to_list(Tuple)

BIFs are by convention regarded as being in the erlang module. There are BIFs for process and port handling, object access and examination, meta programming, type conversion, etc.
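
To get a feel for what these return, here is a small sketch that asserts the results by pattern matching. The module bif_demo and its run/0 function are my own invention for illustration.

```erlang
-module(bif_demo).
-export([run/0]).

%% Each line matches a BIF's result against the value we expect;
%% a mismatch would crash with a badmatch error.
run() ->
    3 = length([a, b, c]),
    3 = erlang:length([a, b, c]),   % same BIF, module-qualified
    2 = size({x, y}),
    "hello" = atom_to_list(hello),
    {1, 2, 3} = list_to_tuple([1, 2, 3]),
    "234" = integer_to_list(234),
    [a, b] = tuple_to_list({a, b}),
    {_Year, _Month, _Day} = date(),
    {_Hour, _Minute, _Second} = time(),
    ok.
```

If every match holds, bif_demo:run() returns ok.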

Update: We're running through all the possible run time errors.

Update: We're breaking (late!) for coffee...

Update: ...and we're back, and walking through some examples, and onwards to concurrent Erlang.

Pid2 = spawn(Mod, Func, Args)

Before the spawn, the code is executed by Pid1; afterwards a new process, Pid2, is created. The identifier Pid2 is known only to Pid1. A process terminates abnormally when a run-time error occurs, and normally when there is no more code to execute. Processes do not share data; the only way to exchange it is message passing. Sending a message will never fail, and messages sent to non-existent processes are thrown away. Received messages are stored in the process mailbox, and the process retrieves them inside a receive clause,

receive
{reset, Board} -> reset(Board);
{shut_down, Board} -> {error, unknown_msg}
end

Unlike a case block, receive suspends the process until a message which matches one of its clauses is received. Message passing is asynchronous, so the sender never waits for the receiver. One of the things you look for when stress testing Erlang systems is running out of memory because of full mailboxes.
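
Putting spawn, send and receive together, here is a minimal echo process. It's a sketch of my own, assuming a made-up module echo_demo with functions start/0, loop/0 and ping/2, not the example used in the tutorial.

```erlang
-module(echo_demo).
-export([start/0, loop/0, ping/2]).

%% Spawn a new process running loop/0 and return its pid.
start() ->
    spawn(echo_demo, loop, []).

%% The echo process suspends in receive until a matching message
%% arrives, replies to the sender, then waits again.
loop() ->
    receive
        {From, Msg} ->
            From ! {self(), Msg},
            loop();
        stop ->
            ok
    end.

%% Send a message tagged with our own pid, then wait for the
%% reply that comes back from that particular process.
ping(Pid, Msg) ->
    Pid ! {self(), Msg},
    receive
        {Pid, Reply} -> Reply
    end.
```

After Pid = echo_demo:start(), calling echo_demo:ping(Pid, hello) returns hello, and Pid ! stop lets the process run out of code and terminate normally.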

Update: We're getting into a static versus dynamic typing argument. The bizarre thing is that even Francesco seems to think that static typing is a good thing. Why is that? I'm really surprised; after all, I'd argue that there are a bunch of reasons to use dynamically typed languages in preference to statically typed ones.

Update: It's also interesting that some people in the audience here aren't getting the "let it crash" mantra coming from Francesco. In a highly concurrent language where everything is a process, letting a process crash is just how you handle errors. A process crash is essentially the same as throwing an exception.

Update: I'm starting to lose the thread of the talk now. Pity, Francesco has just got to the interesting bit. It's been a long day...

Update: ...and we're done. Chris was also blogging the tutorial so head over to his post for more coverage.