Summary
In order to make the basic concepts of Erlang accessible to a Python user, I split Erlang into four pieces:
- The 'data manipulation' stuff
Erlang is very similar to Python here.
You can do list comprehensions, ranges, dicts. Practically everything you do in Python is available in Erlang.
On top of this, Erlang gives you pattern matching and atoms. I really like these bits.
Oh and Erlang has no mutable state.
- The OTP stuff
Fred Hebert's brilliant book dedicates a number of chapters to OTP. But what on earth is it?
OTP is a massive topic but most texts introduce gen_server early on. As an approximation, we will show how the gen_server can be seen as a clever kind of import. Erlang's OTP is much bigger than this, but this should give you a sense of where OTP sits in the stack.
- The process stuff
OTP is possible because of the central role that processes have been given in Erlang. Processes and OTP are at the core of Erlang's strategy for building fault-tolerant code.
But processes are just processes. There is nothing else going on that you need to know about, at least for this level of understanding.
- Fault-tolerant systems
This is where Erlang stands out: you can build systems which are protected from those very-rare-but-hard-to-find bugs.
Of course, in a system that handles a lot of load, the term 'very rare' is relative. I quote from Fred's Zen of Erlang:
... a once in a billion bug will show up every 3 hours in a system doing 100,000 requests a second ...
... a once in a million bug could similarly show up once every 10 seconds on such a system ...
Erlang 'data manipulation' stuff: lists, functions, etc.
Here is the factorial function in Erlang:
factorial(0) -> 1;
factorial(N) -> N * factorial(N-1).
You can see for yourself how pattern matching is used.
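For a Python reader, it may help to see roughly what the two clauses are doing. The sketch below is only an approximation: Erlang tries the clauses from top to bottom and runs the first one whose pattern matches the argument, which an if/else chain mimics. The base case of 0 is an assumption here, chosen to match the fac function in the escript example further down.

```python
def factorial(n):
    # Clause 1 matches only the literal base case
    if n == 0:
        return 1
    # Clause 2 matches anything else and binds it to N
    return n * factorial(n - 1)
```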
Here is a list comprehension which produces a list of tuples:
# In Python (range() excludes its upper bound, hence 11):
[ (x, y, x*y) for x in range(1, 11) for y in range(1, 11) ]
% In Erlang:
[ {X, Y, X*Y} || X<-lists:seq(1, 10), Y<-lists:seq(1, 10) ]
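One detail worth checking when translating between the two: lists:seq(1, 10) includes both endpoints, while Python's range stops one short of its second argument. A quick sketch of the Python side, where the name table is just an illustrative choice:

```python
# lists:seq(1, 10) yields 1..10 inclusive, so the Python equivalent is range(1, 11)
table = [(x, y, x * y) for x in range(1, 11) for y in range(1, 11)]

print(len(table))   # 100 tuples, one per (x, y) pair
print(table[0])     # (1, 1, 1)
print(table[-1])    # (10, 10, 100)
```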
Note that all variables in Erlang must start with a capital letter (or an underscore); a name that starts with a lowercase letter is interpreted as an atom.
To run Erlang code you can use the escript command. The escript documentation page has an example for the factorial function:
#!/usr/bin/env escript
% vim: ft=erlang
main([String]) ->
    try
        N = list_to_integer(String),
        F = fac(N),
        io:format("factorial ~w = ~w\n", [N,F])
    catch
        _:_ ->
            usage()
    end;
main(_) ->
    usage().

usage() ->
    io:format("usage: factorial integer\n"),
    halt(1).

fac(0) -> 1;
fac(N) -> N * fac(N-1).
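Here is a rough Python rendering of the same script, as a sketch rather than a line-by-line translation. The two main clauses become a length check on the argument list, and the catch-all _:_ handler becomes a broad except. It leans on math.factorial instead of a hand-written fac, and returns an exit code rather than calling halt; both are choices made here for brevity.

```python
import math

def usage():
    print("usage: factorial integer")
    return 1  # stands in for halt(1)

def main(args):
    # main([String]) only matches a list with exactly one element;
    # main(_) catches every other shape of argument list.
    if len(args) != 1:
        return usage()
    try:
        n = int(args[0])       # plays the role of list_to_integer(String)
        f = math.factorial(n)
        print(f"factorial {n} = {f}")
        return 0
    except Exception:          # loosely plays the role of the _:_ clause
        return usage()
```

Calling main(["5"]) prints the result and returns 0, while main([]) or main(["abc"]) falls through to usage.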
OTP
The gen_server 'behaviour'
Suppose I wanted to make it possible to 'hot swap' code in Python. Here is one simple way to do that.
I create two files, one of functions:
# my_functions.py
def func_a(x):
    return 2 * x

def func_b(x):
    return x + 100
and one which acts as a layer of indirection:
# my_module.py
from my_functions import func_a
func = func_a
At the Python prompt I can call the function in my_module:
> import my_module
> my_module.func(10)
20
I can hot-swap too:
> from my_functions import func_b
> my_module.func = func_b
> my_module.func(10)
120
The important point is that this has been done by creating a layer of indirection. I do not call the functions directly, I call something which goes on to call the functions. No magic here.
This is somewhat analogous to how gen_server works: you register your functions with a gen_server and call your functions via the gen_server.
Registering is done by implementing the handle_call/3 and handle_cast/2 callback functions in your gen_server module. Hot swapping is supported through the code_change/3 callback, which lets the server update its state when new code is loaded.
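To make the analogy a little more concrete, here is a toy Python sketch of the shape of a gen_server. None of this is OTP's real API: ToyServer, CounterV1 and CounterV2 are hypothetical names, and a real gen_server runs in its own process and receives messages, which this sketch leaves out. What it does show is the indirection: callers only ever talk to the server, and the server forwards to whichever callback object is currently installed.

```python
class CounterV1:
    """Hypothetical callback module: the code you would hand to the server."""
    def init(self):
        return 0  # initial state

    def handle_call(self, request, state):
        # Synchronous request: return (reply, new_state)
        if request == "get":
            return state, state
        raise ValueError(f"unknown call: {request}")

    def handle_cast(self, request, state):
        # Fire-and-forget request: return new_state only
        if request == "increment":
            return state + 1
        return state


class CounterV2(CounterV1):
    """A newer version of the callbacks that increments in steps of 10."""
    def handle_cast(self, request, state):
        if request == "increment":
            return state + 10
        return state

    def code_change(self, old_state):
        # Chance to migrate the old state to the new code's format
        return old_state


class ToyServer:
    """The layer of indirection: owns the state, forwards to the callbacks."""
    def __init__(self, callbacks):
        self.callbacks = callbacks
        self.state = callbacks.init()

    def call(self, request):
        reply, self.state = self.callbacks.handle_call(request, self.state)
        return reply

    def cast(self, request):
        self.state = self.callbacks.handle_cast(request, self.state)

    def swap_code(self, new_callbacks):
        # Hot swap: migrate the state, then install the new callbacks
        self.state = new_callbacks.code_change(self.state)
        self.callbacks = new_callbacks


server = ToyServer(CounterV1())
server.cast("increment")
print(server.call("get"))      # 1
server.swap_code(CounterV2())  # state survives the swap
server.cast("increment")
print(server.call("get"))      # 11
```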
What does this mean?
OTP is a lot of things and contains a lot of goodies, but the idea I want to emphasize is that gen_server and some of the other OTP 'behaviours' (that is the name they get given) are playing with the way the code itself is managed.
You do not have to use these OTP structures, but if you do then you can benefit from a lot of code-management that people have perfected over the years.
Processes
The short piece above on OTP should give you a sense of how Erlang brings something quite different to the table: you get much finer control over how your code is loaded/run/stopped/started/etc.
The building block for this is the process. In Erlang it is trivially easy to create a new process and run a function within that process. Here we run a simple multiply-and-print function from within a newly-spawned process:
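% this_module.erl
-module(this_module).
% spawn/3 calls the function by module and name, so it must be exported
-export([start/0, my_multiplication_function/2]).

my_multiplication_function(Arg1, Arg2) ->
    io:format("~p times ~p is ~p", [Arg1, Arg2, Arg1 * Arg2]).

start() ->
    spawn(this_module, my_multiplication_function, [2, 10]).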
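The closest everyday Python analogue is probably starting a thread, sketched below. This is only an analogy: Erlang processes are lightweight, isolated from one another, and scheduled by the Erlang VM, none of which is true of a Python thread.

```python
import threading

def my_multiplication_function(arg1, arg2):
    print(f"{arg1} times {arg2} is {arg1 * arg2}")

def start():
    # Like spawn/3: name the function and its arguments, then let it run on its own
    worker = threading.Thread(target=my_multiplication_function, args=(2, 10))
    worker.start()
    return worker  # loosely comparable to the pid that spawn returns

worker = start()
worker.join()  # wait for the worker to finish before carrying on
```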
But why?
One of the main reasons for wrapping functions inside processes is that it allows us to build fault-tolerant systems.
If we run a function inside a process and a one-in-a-million bug happens that kills the function then our process will exit and we can react to that -- probably by restarting the process and re-running the function (perhaps with a modified set of parameters that are less likely to break).
Fault-tolerant systems
When Erlangers talk about 'let it crash', they do not mean that they write shoddy code which does not handle edge cases and therefore breaks.
They mean that if a function breaks for some unexpected (and probably hard to debug) reason, then:
- let the process die,
- let the process tell a supervising process that it is dead and give it a stack trace,
- so that the supervisor process (or its supervisor, or its supervisor's supervisor, ...) can potentially restart that process and function in a safer state.
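The steps above can be sketched in Python, with heavy caveats. The names run_in_worker, supervise and flaky_divide are hypothetical, a thread stands in for an Erlang process, and the worker has to catch its own exception here, whereas in Erlang the runtime notifies the supervisor when a process dies. Treat it as an illustration of the restart idea, not of how OTP supervisors are implemented.

```python
import threading
import traceback

def run_in_worker(func, args):
    """Run func(*args) in its own thread.

    Returns ("ok", result) or ("crashed", stack_trace_text).
    """
    outcome = {}

    def body():
        try:
            outcome["status"] = ("ok", func(*args))
        except Exception:
            # Let it die, but leave a stack trace for the supervisor
            outcome["status"] = ("crashed", traceback.format_exc())

    worker = threading.Thread(target=body)
    worker.start()
    worker.join()
    return outcome["status"]

def supervise(func, args, safe_args, max_restarts=3):
    """Run func once; on a crash, restart it with safer arguments."""
    status = run_in_worker(func, args)
    restarts = 0
    while status[0] == "crashed" and restarts < max_restarts:
        restarts += 1
        status = run_in_worker(func, safe_args)
    return status, restarts

def flaky_divide(a, b):
    return a / b  # crashes with ZeroDivisionError when b == 0

print(supervise(flaky_divide, (1, 0), (1, 1)))  # (('ok', 1.0), 1)
```

The max_restarts limit is there so that a worker which keeps crashing does not get restarted forever.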
These random 'breaks for unexpected reasons' will happen very often in a system which handles many many requests. I quote from Fred's Zen of Erlang:
... a once in a billion bug will show up every 3 hours in a system doing 100,000 requests a second ...
... a once in a million bug could similarly show up once every 10 seconds on such a system ...
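The arithmetic behind those two figures is easy to check: divide the odds of the bug by the request rate to get the average gap between occurrences. The function name below is just for illustration.

```python
def seconds_between_bugs(one_in_n, requests_per_second):
    """Average time in seconds between occurrences of a one-in-N bug."""
    return one_in_n / requests_per_second

billion = seconds_between_bugs(1_000_000_000, 100_000)
million = seconds_between_bugs(1_000_000, 100_000)

print(billion / 3600)  # about 2.8 hours, which the quote rounds to 3
print(million)         # 10.0 seconds
```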
Answering my own StackOverflow question
Go have a look at my Erlang noobie question on StackOverflow, and my answer.
At that point in time I had grokked that Erlang's data-manipulation language was sufficiently similar to Python and had gotten some familiarity with the process-spawning parts, but I had not yet grokked how OTP fitted into the language.