Announcing purescript-sequences
In many mainstream programming languages, the data structures that you are provided with are mutable by default. In particular, this is the case with arrays in JavaScript.
This is arguably quite unfortunate: immutability can offer some very significant benefits. For example, consider the following (JavaScript) code:
var s1 = getDataset1()
var s2 = getDataset2()
var allData = s1.concat(s2)
Assume that getDataset1
and getDataset2
each returns an array — so we
are getting two separate data sets and concatenating them.
Now suppose that later on in the code, I want to modify s1
; let’s say, I want
to take s1[0]
and increment it: s1[0] += 1
. Should allData
change in the
same way? Hopefully not; it seems this would lead to horrible bugs which would
be a nightmare to track down.
So we want to ensure that changes in s1
are not reflected in allData
.
Sadly, the only way we can do this in the presence of mutability is to copy
the whole array — despite the fact that, in many cases, we don’t even
need two copies of the data, nor do we need to spend the extra time copying
them.
In short, mutability puts us in a bind where we often have no choice but to waste our computer’s valuable memory and time in order to protect us from ourselves.
In PureScript, however, values are immutable by default. So we should be
able to sidestep this problem. Taking our example from above, there should be
no need to defensively copy everything; our concat
operation can reuse
parts of its arguments in order to decrease the amount of work it has to do, as
well as the amount of memory your program needs to use.
Currently, most PureScript code uses the Prim.Array
type, which is a normal
JavaScript array. This is often very useful when you want to do interop with
JavaScript libraries. But it prevents us from claiming the benefits of
immutability.
purescript-sequences is an
attempt to reap these benefits. It implements a general-purpose sequence data
type, Seq
. The API is similar to that of an array, but the internal
arrangement of data is very different. By leveraging immutability, it is able
to push a new element on to either end or remove an element from either end in
constant time, and concatenate two sequences in logarithmic time —
compare this to the corresponding JavaScript array operations, which are both
linear time.
I’ve just released version 0.1.0, which I believe is ready for public consumption. I’ve dogfooded it too; my multiplayer pacman game (source) now uses it internally.
I’ve also put together a benchmark, which shows the time taken
to cons a certain number of elements onto sequences and arrays. As expected,
the sequence results look linear: each cons
is O(1), so we expect that doing
n cons
operations is O(n). The array results look quadratic, which is again
what we expect: for arrays, each cons
is O(n), so we expect n cons
operations to be O(n2).
However, Seq
appears to be slower than native arrays up to some crossover
point. On my machine, it’s around 6,000 elements. So this library probably
won’t give you a speed boost in all cases. It also probably won’t be worth
using if you need to pass arrays back and forth between JavaScript libraries
and PureScript code, as converting between the two types is O(n).
Anyway, enjoy! Let me know if it works for you (or even, if it doesn’t).