Code By Martin

Simple JSON Parser in Ioke

After having enjoyed most of the days off over Christmas my fingers started itching - time to do some programming. Luckily around the same time I stumbled across Ola Bini’s posts on the Ioke language. As the Ioke guide begins:

Ioke is a general purpose language. It is a strongly typed, extremely dynamic, prototype object oriented language. It is homoiconic and its closest ancestors are Io, Smalltalk, Ruby and Lisp - but it’s quite a distance from all of them.

So what does this all mean? It means it’s fun! The language syntax is very regular (although not quite as regular as lisp) - code is data, data is code and everything is a message. It’s also one of relatively few languages that provide macros similar to those in lisp. In Ioke this means that a message stream can be operated on (modified, transformed, etc) before it’s evaluated, using normal Ioke methods. Anyone who groks lisp should be familiar with the power this gives.

In Ioke, a chain of messages is separated by spaces, with the first message sent to the current receiver and each subsequent message sent to the result of the previous one. For instance, in the code

"ioke rocks" upper println

the text literal “ioke rocks” results in an internal message which creates a Text object. To this object the upper message is sent, which creates an upper case copy of the Text object. To this copy the println message is sent, which prints the upper case text to the console.

As the Ioke reader handles space separated tokens, it’s quite easy to create macros to process non-Ioke code. As a learning exercise I decided to try to implement a JSON parser. Nothing too fancy - it doesn’t even have to be secure - just enough to parse simple JSON into Ioke objects.

JSON is usually parsed in one of two ways - into custom objects/classes or into collections like dictionaries and arrays. For this exercise I decided to create Ioke dictionaries and arrays, and I wanted the result to work like this:

json({
  "string" : "string1",
  "int" : 1234,
  "arr" : ["item1", "item2"],
  "dict" : {"key1":"value1"}
}) println  ;; => {dict={key1=value1}, arr=[item1, item2], int=1234, string=string1}

json(["string",
  1234,
  ["item1", "item2"],
  {"key1": "val1"}
]) println ;; => [string, 1234, [item1, item2], {key1=val1}]

Now, looking at JSON, it’s suspiciously similar to Ioke - [] is used for arrays, {} is used for “dictionaries”, commas separate entries, double quotes delimit strings, etc. The main problem is the colon “:” used for separating keys and values in a dictionary. As Ioke is a dynamic language we can redefine how messages are handled. Instead of letting the colon define symbols, we can redefine it to create pairs suitable for a dict:

Text cell(":") = macro(call resendToMethod("=>"))

“:” is originally defined on DefaultBehavior Literals, but in the JSON input the “:” message is always sent to Text, so it’s enough to redefine it there.

This single redefinition allows us to parse JSON directly with the Ioke reader - very nice.

But… Redefining “:” for all Text objects is not really a good idea (especially not when Ola adds concurrency constructs!). An alternative would be to redefine how the text objects are created and add a new “:” cell just for our JSON text objects - but I never got this to work, as the internal:createText message works with a raw Java string, which I couldn’t work with in a macro/method…

Update: With Ola’s recent commit, the previously described workaround can now be simplified to the following code.

Now the above redefinition of the “:” message can be used, but it is visible only within the scope of a let:

json = macro(let(Text cell(":"), DefaultBehavior cell("=>"),
    call argAt(0)
))

;; Used as:

json({
  "string" : "string1",
  "int" : 1234,
  "arr" : ["item1", "item2"],
  "dict" : {"key1":"value1"}
  }) println

json(["string",
  1234,
  ["item1", "item2"],
  {"key1": "val1"}
  ]) println

As can be seen, the Ioke macro functionality gives us nice control over how messages are handled, and the environment can be changed before the arguments are evaluated. I’m looking forward to the day when we have full Java interop and can use these powerful constructs to control Java code!

Source code to the above example is committed to my Ioke fork at: git://github.com/melwin/ioke.git

/M

PS: The thing that confuses me the most: how the heck are you supposed to pronounce Ioke?

Update: Sam Aaron shared the fact that Ola pronounces it eye-oh-key in the following interview: http://www.akitaonrails.com/2008/11/22/rails-podcast-brasil-qcon-special-ola-bini-jruby-ioke

Ioke Syntax Highlighter for GeSHi

Speaking of Ioke - just added a syntax highlighter definition for GeSHi in my Ioke fork repository. The commit can be found here:

http://github.com/melwin/ioke/commit/54cfd7e54de0be910385c6ec805693fd3ed4e294

The keyword definitions I borrowed (read: stole) from Sam Aaron’s TextMate bundle (also in the Ioke repository). As Sam just said in #ioke on freenode: “what goes around comes around”. :)

Highlighting test:

m = #/({areaCode}\d{3})-({localNumber}\d{5})/ =~ number

    describe("start",
      it("should return the start index of group zero, which is the whole group",
        (#/foo/ =~ "foobar") start should == 0
        (#/foo/ =~ "abcfoobar") start should == 3

        (#/foo/ =~ "foobar") start(0) should == 0
        (#/foo/ =~ "abcfoobar") start(0) should == 3
      )

      it("should return the start index of another group",
        (#/(..) (..) (..)/ =~ "fooab cd efbar") start(2) should == 6
      )

      it("should return the start index from the name of a named group",
        (#/({one}..) ({two}..) ({three}..)/ =~ "fooab cd efbar") start(:two) should == 6
      )

      it("should return -1 for a group that wasn't matched",
        (#/(..)((..))?/ =~ "ab") start(2) should == -1
        (#/({no}..)(({way}..))?/ =~ "ab") start(:way) should == -1

        (#/(..)((..))?/ =~ "ab") start(10) should == -1
        (#/({no}..)(({way}..))?/ =~ "ab") start(:blarg) should == -1
      )

      it("should validate type of receiver",
        Regexp Match should checkReceiverTypeOn(:start)
      )
    )
    
    x = #/bla #{"foo"} bar/

/M

Caching Filter Queries With Coherence

A pretty nice thing that I recently tried with a customer was storing the results of filter queries in a separate query cache.

For instance, consider the following method:

public Collection getByFilter1(String cacheName, Filter f) {
    NamedCache c = CacheFactory.getCache(cacheName);
    return c.entrySet(f);
}

Such a query is executed across all nodes holding cache data in the cluster. The filter acts on the values in the cache, not the keys, and as all values are usually not available locally on any single node, the query needs to run on the separate nodes.

The query above is correct, but not very efficient: the entrySet call both runs the query on the separate nodes and pulls all the matching values back from them. A more efficient way is to only get the keys of the matching values from the nodes and then retrieve the values through the local near cache:

public Collection getByFilter2(String cacheName, Filter f) {
    NamedCache c = CacheFactory.getCache(cacheName);
    Set keys = c.keySet(f);
    return c.getAll(keys).values();
}

However, in this case we still perform the query every time.

Now - the clever part. Not so much on my side, but the Coherence engineers have thought things through and given the Filter implementations proper hashCode and equals methods. This, together with the fact that they are serializable, makes it possible to use them as keys in a cache! Sweet! Without changing our method’s interface we can add a query cache so that each query is only performed once.

public Collection getByFilter3(String cacheName, Filter f) {
    NamedCache c = CacheFactory.getCache(cacheName);

    NamedCache queryC = CacheFactory.getCache(cacheName + ".querycache");
    Set keys = (Set)queryC.get(f);

    if(keys == null) {
        keys = c.keySet(f);
        queryC.put(f, keys);
    }

    return c.getAll(keys).values();
}

Note that we only save the keys in the query cache. This is to avoid having several caches with the same data. When near caching is used, getting the data for the keys can still be a local-only operation. Compared to getting the data from the separate nodes this is several orders of magnitude faster - depending on the usage patterns.

Of course if queries are different every time, the query cache will not help much. But in most high load applications the same data tends to be needed several times.

Additionally, the properties queried on should in most cases be indexed. This is important to avoid too much overhead when searching for an entry in a cache.
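
For example, adding an index on the property the filters use is a one-time call per cache. A small sketch, in Scala for brevity (the same calls work from Java); the cache name and the getSymbol property are made up for illustration, not taken from the post:

import com.tangosol.net.CacheFactory
import com.tangosol.util.extractor.ReflectionExtractor

object Indexing {
  // Index the property the filters query on, so keySet(filter) does not have to
  // deserialize and inspect every value on the storage nodes.
  def addSymbolIndex(cacheName: String) {
    val cache = CacheFactory.getCache(cacheName)
    cache.addIndex(new ReflectionExtractor("getSymbol"), false, null)  // getSymbol is hypothetical
  }
}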

One thing to think about when adding query caches is: how is data updated?

As part of updating the value caches, the query caches should preferably be emptied of the outdated cached queries. This could be done “manually” in the code that updates the data, or by hooking in code that cleans the entries from the query cache using the map listener mechanism. The relevant code could do a query using the ContainsFilter.
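
A hedged sketch of such a listener, again in Scala for brevity and not taken from the post; the cache names simply mirror the naming convention used above:

import com.tangosol.net.{CacheFactory, NamedCache}
import com.tangosol.util.{AbstractMapListener, MapEvent}
import com.tangosol.util.extractor.IdentityExtractor
import com.tangosol.util.filter.ContainsFilter

// Evicts cached query results that include a key which has just changed.
class QueryCacheInvalidator(queryCache: NamedCache) extends AbstractMapListener {
  private def invalidate(event: MapEvent) {
    // Query-cache values are Sets of keys, so a ContainsFilter over the value itself
    // finds every cached result that contains the changed key.
    val stale = new ContainsFilter(IdentityExtractor.INSTANCE, event.getKey)
    val it = queryCache.keySet(stale).iterator
    while (it.hasNext) queryCache.remove(it.next)
  }
  override def entryInserted(event: MapEvent) { invalidate(event) }
  override def entryUpdated(event: MapEvent) { invalidate(event) }
  override def entryDeleted(event: MapEvent) { invalidate(event) }
}

object QueryCacheInvalidator {
  // Registration, using the same cache naming as the methods above.
  def register(cacheName: String) {
    val data = CacheFactory.getCache(cacheName)
    data.addMapListener(new QueryCacheInvalidator(CacheFactory.getCache(cacheName + ".querycache")))
  }
}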

To summarize: a little code can go a long way to improve performance without affecting the interface used by an application. This is handy when query-heavy applications are adapted to use a distributed cache like Coherence.

/M

Scala Syntax Highlighting for WP

Writing the previous post I realized that the WordPress plugin for syntax highlighting I was using (Highlight Source Pro) didn’t support Scala.

Instead I tried the WP-Syntax plugin, but this didn’t support Symbol literals:

val sym = 'foobar
println("Symbol is: " + sym)
val other = 'barfoo

This seems to be because the single quote is interpreted as a quotation character. A fix is to remove the single quote from the GeSHi language definition file scala.php in the wp-syntax/geshi/geshi directory. While there, one can also specify a regex for symbols and give them their own color:

val sym = 'foobar
println("Symbol is: " + sym)
val other = 'barfoo

Also check out the scala.php in the LAMP repository - it seems to add some additional keywords and other colors. I used this and made the changes described above.

Full new scala.php for Geshi after the break.

Clustering Scala Actors With Oracle Coherence for Fun and Profit

[Disclaimer: I work for Oracle.]

Although I haven’t used it too much yet, Scala is definitely one of the languages I find most interesting right now. Many customers I work with are heavily Java focused, and getting a more flexible and powerful language with superb Java interoperability to use on the JVM feels very liberating. Now I just need to convince the customers that Scala is the future… :] But if Gosling likes it, it must be good, right?

A few months ago Jonas Bonér wrote about how Scala actors can be clustered with Terracotta. I really enjoyed the article and I think the idea of distributed, redundant actors is very appealing. The actor paradigm is a nice way of developing concurrent applications (see the intro in Jonas’ blog entry for more info) and if we can liberate the actors from the confines of a single JVM and easily distribute them over multiple hosts - all the better.

Clustering with Coherence

I’m not going to compare Coherence and Terracotta here. In short, Coherence provides, among other things, a distributed caching and code execution mechanism without single points of failure. Coherence can be downloaded for evaluation purposes from the Oracle web site.

The idea I wanted to try was to have Scala actors that store their state in the distributed Coherence cache and run as normal Scala actors on a single node in the cluster at a time. The node the actor runs on should be allocated by Coherence, and if that node fails, the actor should automatically be started on another node with its state maintained and without any lost messages.

Also, I wanted this to work as similarly to normal Scala actors as possible, with compatibility between the two.

The Result

Before investigating the proof of concept solution, let’s look at the result and what it gives us.

Here’s a simple test application with a normal Scala actor. It uses the recommended way of creating actors with actor { ... }:

package coherencetest1

import java.util.Date

import scala.actors.Actor._

object ActorTest {
  def main(args : Array[String]) : Unit = {

    actor {
      var pings : Int = 0
      
      println("Actor started.")
      
      self ! ('ping, 1)
      
      loop {
        react {
        case ('ping, i : Int) =>
          pings = pings + 1
          println(new Date + " - Got ping: " + i + " Total pings: " + pings)
          
          Thread.sleep(1000)

          self ! ('ping, i+1)
        }
      }
    }

  }
}

When this code is run, a simple actor that sends a message to itself is created and started. It sleeps for 1 second to pace the execution and to simulate a task that takes time to perform (of course, normally you shouldn’t sleep in real actors as you tie up the thread).

When the code is run, the following is displayed:

Actor started.
Sun Jun 29 15:57:16 CEST 2008 - Got ping: 1 Total pings: 1
Sun Jun 29 15:57:17 CEST 2008 - Got ping: 2 Total pings: 2
Sun Jun 29 15:57:18 CEST 2008 - Got ping: 3 Total pings: 3
Sun Jun 29 15:57:19 CEST 2008 - Got ping: 4 Total pings: 4
Sun Jun 29 15:57:20 CEST 2008 - Got ping: 5 Total pings: 5
Sun Jun 29 15:57:21 CEST 2008 - Got ping: 6 Total pings: 6
...

Nothing too fancy, but a decent test case for our actor distribution. An important aspect of this actor is that it defines a local variable pings and prints a message in the initialization part, before the loop and react. The value of the local var must be maintained and the initialization code must only be run once, and not when an actor is started on a new node after a failure.

Let’s make it a distributed Actor:

package coherencetest1

import java.util.Date

import scala.actors.coherence.CoActor._

@serializable
object DactorTest {
  def main(args : Array[String]) : Unit = {

    dactor {
      var pings : Int = 0
      
      println("Actor started.")
      
      self ! ('ping, 1)
      
      loop {
        react {
        case ('ping, i : Int) =>
          pings = pings + 1
          println(new Date + " - Got ping: " + i + " Total pings: " + pings)
          
          Thread.sleep(1000)

          self ! ('ping, i+1)
        }
      }
    }

  }
}

What have we done here? Three things:

  1. Import `scala.actors.coherence.CoActor._` instead of `scala.actors.Actor._`
  2. Made the application object serializable
  3. Create the actor using `dactor { … }` instead of `actor { … }`

The first point is simple - we need access to the new functionality, so we import the new CoActor object instead of the standard Actor object.

For number two - this is slightly nasty. If I interpret things correctly: as the code block created as a parameter to react needs to be serializable (so that the actor can be distributed over the network), all enclosing types need to be serializable too. I struggled with this for a while and the only other option seems to be creating a proper, named, serializable type… But since I want to be able to create an actor in-line, we need to do it this way.

For the last point - dactor { ... } is simply the function used to create a distributed actor instead of a normal actor.

Let’s run it:

2008-06-29 16:11:18.779 Oracle Coherence 3.3.1/389 <Info> (thread=main, member=n/a): Loaded operational configuration from resource "jar:file:/opt/coherence-3.3.1/lib/coherence.jar!/tangosol-coherence.xml"
2008-06-29 16:11:18.785 Oracle Coherence 3.3.1/389 <Info> (thread=main, member=n/a): Loaded operational overrides from resource "jar:file:/opt/coherence-3.3.1/lib/coherence.jar!/tangosol-coherence-override-dev.xml"
2008-06-29 16:11:18.786 Oracle Coherence 3.3.1/389 <D5> (thread=main, member=n/a): Optional configuration override "/tangosol-coherence-override.xml" is not specified

Oracle Coherence Version 3.3.1/389
 Grid Edition: Development mode
Copyright (c) 2000-2007 Oracle. All rights reserved.

2008-06-29 16:11:19.042 Oracle Coherence GE 3.3.1/389 <Info> (thread=main, member=n/a): Loaded cache configuration from resource "file:/crypt/dev/scala/CoherenceTest1/config/scalacoherence.xml"
2008-06-29 16:11:19.331 Oracle Coherence GE 3.3.1/389 <Warning> (thread=main, member=n/a): UnicastUdpSocket failed to set receive buffer size to 1428 packets (2096304 bytes); actual size is 714 packets (1048576 bytes). Consult your OS documentation regarding increasing the maximum socket buffer size. Proceeding with the actual value may cause sub-optimal performance.
2008-06-29 16:11:19.459 Oracle Coherence GE 3.3.1/389 <D5> (thread=Cluster, member=n/a): Service Cluster joined the cluster with senior service member n/a
2008-06-29 16:11:22.662 Oracle Coherence GE 3.3.1/389 <Info> (thread=Cluster, member=n/a): Created a new cluster with Member(Id=1, Timestamp=2008-06-29 16:11:19.343, Address=192.168.54.1:8088, MachineId=24065, Location=process:31397@dellicious, Edition=Grid Edition, Mode=Development, CpuCount=2, SocketCount=1) UID=0xC0A836010000011AD4A9DCAF5E011F98
2008-06-29 16:11:22.834 Oracle Coherence GE 3.3.1/389 <D5> (thread=DistributedCache, member=1): Service DistributedCache joined the cluster with senior service member 1
Actor started.
Sun Jun 29 16:11:23 CEST 2008 - Got ping: 1 Total pings: 1
Sun Jun 29 16:11:24 CEST 2008 - Got ping: 2 Total pings: 2
Sun Jun 29 16:11:25 CEST 2008 - Got ping: 3 Total pings: 3
Sun Jun 29 16:11:26 CEST 2008 - Got ping: 4 Total pings: 4
Sun Jun 29 16:11:27 CEST 2008 - Got ping: 5 Total pings: 5
Sun Jun 29 16:11:28 CEST 2008 - Got ping: 6 Total pings: 6
...

After the Coherence initialization (which happens automatically and which I’ve disabled in the outputs below) the actor starts up as expected. However, if we start this on two nodes, two actors will be created, and there is no way for a new JVM to get hold of a reference to a specific existing actor. To handle this, let’s specify a name for the actor that we create, using the dactor(name : Symbol) { ... } function:

...
object DactorTest {
  def main(args : Array[String]) : Unit = {

    dactor('pingActor) {
      var pings : Int = 0
...

This simply means: Give me a reference to pingActor, but if it doesn’t exist - create it with the following body. This mechanism makes it easy to have a single instance of an actor even if the same application is running on multiple nodes, without having to explicitly check if an actor has already been created or not.
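
As a small usage sketch (not from the post), this means that any other JVM in the cluster can get hold of the same actor simply by asking for it by name; the body is only used if the actor doesn’t exist yet:

// On any node in the cluster: returns a proxy to the already running 'pingActor,
// or creates the actor with this body if it doesn't exist yet.
val ping = dactor('pingActor) {
  loop {
    react {
    case msg => println("Got: " + msg)
    }
  }
}

ping ! ('ping, 100)   // goes through the proxy to wherever the actor is currently running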

Now we can run the program on two different nodes. After the actor has started and is running on one node, I’ll kill that node:

Node 1:
Actor started.
Sun Jun 29 16:30:41 CEST 2008 - Got ping: 1 Total pings: 1
Sun Jun 29 16:30:42 CEST 2008 - Got ping: 2 Total pings: 2
Sun Jun 29 16:30:43 CEST 2008 - Got ping: 3 Total pings: 3
Sun Jun 29 16:30:44 CEST 2008 - Got ping: 4 Total pings: 4
Sun Jun 29 16:30:45 CEST 2008 - Got ping: 5 Total pings: 5
Sun Jun 29 16:30:46 CEST 2008 - Got ping: 6 Total pings: 6
Sun Jun 29 16:31:02 CEST 2008 - Got ping: 19 Total pings: 19
Sun Jun 29 16:31:03 CEST 2008 - Got ping: 20 Total pings: 20
Sun Jun 29 16:31:04 CEST 2008 - Got ping: 21 Total pings: 21
Sun Jun 29 16:31:05 CEST 2008 - Got ping: 22 Total pings: 22
Sun Jun 29 16:31:06 CEST 2008 - Got ping: 23 Total pings: 23
Sun Jun 29 16:31:07 CEST 2008 - Got ping: 24 Total pings: 24
Sun Jun 29 16:31:08 CEST 2008 - Got ping: 25 Total pings: 25
Sun Jun 29 16:31:09 CEST 2008 - Got ping: 26 Total pings: 26
Sun Jun 29 16:31:10 CEST 2008 - Got ping: 27 Total pings: 27
Sun Jun 29 16:31:11 CEST 2008 - Got ping: 28 Total pings: 28
...

Node 2:
Sun Jun 29 16:30:47 CEST 2008 - Got ping: 6 Total pings: 6
Sun Jun 29 16:30:48 CEST 2008 - Got ping: 7 Total pings: 7
Sun Jun 29 16:30:49 CEST 2008 - Got ping: 8 Total pings: 8
Sun Jun 29 16:30:50 CEST 2008 - Got ping: 9 Total pings: 9
Sun Jun 29 16:30:51 CEST 2008 - Got ping: 10 Total pings: 10
Sun Jun 29 16:30:53 CEST 2008 - Got ping: 11 Total pings: 11
Sun Jun 29 16:30:54 CEST 2008 - Got ping: 12 Total pings: 12
Sun Jun 29 16:30:55 CEST 2008 - Got ping: 13 Total pings: 13
Sun Jun 29 16:30:56 CEST 2008 - Got ping: 14 Total pings: 14
Sun Jun 29 16:30:57 CEST 2008 - Got ping: 15 Total pings: 15
Sun Jun 29 16:30:58 CEST 2008 - Got ping: 16 Total pings: 16
Sun Jun 29 16:30:59 CEST 2008 - Got ping: 17 Total pings: 17
Sun Jun 29 16:31:00 CEST 2008 - Got ping: 18 Total pings: 18
Sun Jun 29 16:31:01 CEST 2008 - Got ping: 19 Total pings: 19^C

First Node 1 started up and ran the actor until Node 2 started. At this point the actor was distributed to Node 2 (determined by the automatic cache partitioning done by Coherence) and started there. As can be seen, the local state (the total pings) was persisted and transferred over. When Node 2 was killed the actor was migrated back and started on Node 1. Note that the state of the actor is persisted for each message, so a sudden shutdown of a JVM is not a problem.

One might wonder why the messages for pings 6 and 19 can be seen in both outputs - this happens because the actor was migrated while the actor thread was sleeping, before the react body was complete. This causes the new node to rerun the message (since the processing of the message didn’t complete on the old node), and the support code on the old node makes sure all messages sent by the old actor are discarded as it has been terminated. It’s a bit tricky coding actors to be fully idempotent as not everything is handled in a transaction, but limiting side effects to sending messages at the end of the processing makes it fairly reliable.

Here’s a slightly more complex example:

package coherencetest1

import java.util.Date

import scala.actors.coherence.CoActor._

@serializable
object DactorTest2 {
  def main(args : Array[String]) : Unit = {
    init
    
    readLine
    
    val numActors = 80

    val actors = for(id <- 1 to numActors)
      yield dactor {
        loop {
          react {
          case 'ping =>
            println(new Date + " - Actor " + id + " got ping.")
            reply(('pong, id))
          }
        }
      }
    
    actors.map(_ ! 'ping).force
    
    var pongs = 0
    
    while(pongs < numActors) {
      receive {
      case ('pong, x : Int) =>
        pongs = pongs + 1
      }
    }
    
    println("Got " + pongs + " pongs.")
    
    readLine
  }
}

In this example 80 distributed actors are created and sent a ping message. After that the main thread receives all the pong replies. The output of this, when run on 4 nodes, looks like this:

Node 1:
Sun Jun 29 09:19:34 CEST 2008 - Actor 1 got ping.
Sun Jun 29 15:19:34 CEST 2008 - Actor 12 got ping.
Sun Jun 29 15:19:34 CEST 2008 - Actor 15 got ping.
Sun Jun 29 15:19:34 CEST 2008 - Actor 18 got ping.
Sun Jun 29 15:19:35 CEST 2008 - Actor 22 got ping.
Sun Jun 29 15:19:35 CEST 2008 - Actor 25 got ping.
Sun Jun 29 15:19:35 CEST 2008 - Actor 27 got ping.
Sun Jun 29 15:19:35 CEST 2008 - Actor 29 got ping.
Sun Jun 29 15:19:35 CEST 2008 - Actor 41 got ping.
Sun Jun 29 15:19:35 CEST 2008 - Actor 42 got ping.
Sun Jun 29 15:19:35 CEST 2008 - Actor 47 got ping.
Sun Jun 29 15:19:35 CEST 2008 - Actor 50 got ping.
Sun Jun 29 15:19:35 CEST 2008 - Actor 54 got ping.
Sun Jun 29 15:19:35 CEST 2008 - Actor 56 got ping.
Sun Jun 29 15:19:35 CEST 2008 - Actor 61 got ping.
Sun Jun 29 15:19:36 CEST 2008 - Actor 75 got ping.
Sun Jun 29 15:19:36 CEST 2008 - Actor 79 got ping.
Sun Jun 29 15:19:36 CEST 2008 - Actor 80 got ping.
Got 80 pongs.

Node 2:
Sun Jun 29 15:19:34 CEST 2008 - Actor 9 got ping.
Sun Jun 29 15:19:34 CEST 2008 - Actor 13 got ping.
Sun Jun 29 15:19:34 CEST 2008 - Actor 19 got ping.
Sun Jun 29 15:19:34 CEST 2008 - Actor 20 got ping.
Sun Jun 29 15:19:34 CEST 2008 - Actor 21 got ping.
Sun Jun 29 15:19:35 CEST 2008 - Actor 23 got ping.
Sun Jun 29 15:19:35 CEST 2008 - Actor 24 got ping.
Sun Jun 29 15:19:35 CEST 2008 - Actor 28 got ping.
Sun Jun 29 15:19:35 CEST 2008 - Actor 33 got ping.
Sun Jun 29 15:19:35 CEST 2008 - Actor 36 got ping.
Sun Jun 29 15:19:35 CEST 2008 - Actor 46 got ping.
Sun Jun 29 15:19:35 CEST 2008 - Actor 52 got ping.
Sun Jun 29 15:19:35 CEST 2008 - Actor 57 got ping.
Sun Jun 29 15:19:35 CEST 2008 - Actor 58 got ping.
Sun Jun 29 15:19:35 CEST 2008 - Actor 62 got ping.
Sun Jun 29 15:19:36 CEST 2008 - Actor 67 got ping.
Sun Jun 29 15:19:36 CEST 2008 - Actor 68 got ping.
Sun Jun 29 15:19:36 CEST 2008 - Actor 71 got ping.
Sun Jun 29 15:19:36 CEST 2008 - Actor 77 got ping.
Node 3:
Sun Jun 29 15:19:34 CEST 2008 - Actor 2 got ping.
Sun Jun 29 15:19:34 CEST 2008 - Actor 4 got ping.
Sun Jun 29 15:19:34 CEST 2008 - Actor 5 got ping.
Sun Jun 29 15:19:34 CEST 2008 - Actor 6 got ping.
Sun Jun 29 15:19:34 CEST 2008 - Actor 7 got ping.
Sun Jun 29 15:19:34 CEST 2008 - Actor 8 got ping.
Sun Jun 29 15:19:34 CEST 2008 - Actor 10 got ping.
Sun Jun 29 15:19:34 CEST 2008 - Actor 11 got ping.
Sun Jun 29 15:19:35 CEST 2008 - Actor 30 got ping.
Sun Jun 29 15:19:35 CEST 2008 - Actor 31 got ping.
Sun Jun 29 15:19:35 CEST 2008 - Actor 32 got ping.
Sun Jun 29 15:19:35 CEST 2008 - Actor 34 got ping.
Sun Jun 29 15:19:35 CEST 2008 - Actor 35 got ping.
Sun Jun 29 15:19:35 CEST 2008 - Actor 37 got ping.
Sun Jun 29 15:19:35 CEST 2008 - Actor 39 got ping.
Sun Jun 29 15:19:35 CEST 2008 - Actor 40 got ping.
Sun Jun 29 15:19:35 CEST 2008 - Actor 43 got ping.
Sun Jun 29 15:19:35 CEST 2008 - Actor 44 got ping.
Sun Jun 29 15:19:35 CEST 2008 - Actor 45 got ping.
Sun Jun 29 15:19:35 CEST 2008 - Actor 53 got ping.
Sun Jun 29 15:19:35 CEST 2008 - Actor 59 got ping.
Sun Jun 29 15:19:35 CEST 2008 - Actor 60 got ping.
Sun Jun 29 15:19:35 CEST 2008 - Actor 63 got ping.
Sun Jun 29 15:19:35 CEST 2008 - Actor 64 got ping.
Sun Jun 29 15:19:35 CEST 2008 - Actor 66 got ping.
Sun Jun 29 15:19:36 CEST 2008 - Actor 69 got ping.
Sun Jun 29 15:19:36 CEST 2008 - Actor 70 got ping.
Sun Jun 29 15:19:36 CEST 2008 - Actor 72 got ping.
Sun Jun 29 15:19:36 CEST 2008 - Actor 74 got ping.

Node 4:
Sun Jun 29 15:19:34 CEST 2008 - Actor 3 got ping.
Sun Jun 29 15:19:34 CEST 2008 - Actor 14 got ping.
Sun Jun 29 15:19:34 CEST 2008 - Actor 16 got ping.
Sun Jun 29 15:19:34 CEST 2008 - Actor 17 got ping.
Sun Jun 29 15:19:35 CEST 2008 - Actor 26 got ping.
Sun Jun 29 15:19:35 CEST 2008 - Actor 38 got ping.
Sun Jun 29 15:19:35 CEST 2008 - Actor 48 got ping.
Sun Jun 29 15:19:35 CEST 2008 - Actor 49 got ping.
Sun Jun 29 15:19:35 CEST 2008 - Actor 51 got ping.
Sun Jun 29 15:19:35 CEST 2008 - Actor 55 got ping.
Sun Jun 29 15:19:35 CEST 2008 - Actor 65 got ping.
Sun Jun 29 15:19:36 CEST 2008 - Actor 73 got ping.
Sun Jun 29 15:19:36 CEST 2008 - Actor 76 got ping.
Sun Jun 29 15:19:36 CEST 2008 - Actor 78 got ping.

The program was started on all 4 nodes, but only the first node passes the first readLine. The other nodes just init the distributed actor framework and wait. As can be seen, the actors were distributed over the running nodes as expected - however, with small numbers like this the distribution can be a bit uneven (compare Node 3 and Node 4).

The Solution

The solution that allows the distribution of serializable actors is based on the Coherence backing map listeners. These can be used to get notified of which specific node is the master for a certain object. As there is only one master for an object at any point, and a new master is allocated automatically if the current one fails, we can use this to determine where the actor should run.

The object returned from dactor { ... } is a Proxy object - very similar to the Proxy used in the standard Scala remote actors. In fact, this whole thing is built in a way similar to the standard Scala remote actors, with Proxy objects acting on behalf of the sender on the receiver side and on behalf of the receiver on the sender side.

Additionally, the Coherence grid invocation mechanism allows us to deliver messages for a running actor directly to the node where it runs.
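
To illustrate what targeting “the node where it runs” can look like in Coherence terms, here is a hedged sketch, not taken from the actual implementation: it assumes a cache named "actors" holding the serialized actors, an invocation scheme named "InvocationService" in the cache configuration, and that getKeyOwner is available on the cache service in the Coherence version at hand.

import java.util.Collections
import com.tangosol.net.{AbstractInvocable, CacheFactory, DistributedCacheService, InvocationService}

// A serializable task executed on the member that owns the actor's cache entry.
class DeliverMessage(actorName: String, msg: Any) extends AbstractInvocable {
  def run() {
    // On the owning node: look up the locally running actor registered under
    // actorName and hand it msg. The local lookup is left out of this sketch.
  }
}

object DeliverMessage {
  def sendToOwner(actorName: String, msg: Any) {
    val actors = CacheFactory.getCache("actors")  // assumed name of the cache holding the actors
    val owner = actors.getCacheService.asInstanceOf[DistributedCacheService].getKeyOwner(actorName)
    val invocations = CacheFactory.getService("InvocationService").asInstanceOf[InvocationService]
    // Run the task only on the member that owns the key, i.e. where the actor is running.
    invocations.query(new DeliverMessage(actorName, msg), Collections.singleton(owner))
  }
}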

When a new dactor is created, the following happens:

  1. `dactor { … }` creates a new anonymous `CoActor` class with the dactor body set as the `init` method.
  2. The CoActor.distribute method is called, which in turn saves the anonymous class instance to the Coherence cache. The object gets serialized when this happens. The key to the object in the cache is either the passed-in symbol name, or a name created from a `java.util.UUID`.
  3. The `dactor` function returns a Proxy object with the name of the newly created distributed actor.
  4. Meanwhile, in the node which Coherence designates the master of the object, the backing map listener gets an insert event and starts the actor.
  5. The default `act()` method is called, which in a CoActor calls the `init` method first.
  6. The `init` method contains the `dactor { … }` body, which gets executed.
  7. The body executes as normally up until the `loop` block. `loop` in the CoActor first saves the body of the loop into a variable in CoActor so that it can continue executing just this body after it’s started in a new node, without running the initialization code again. After this the loop body is run as normal.
  8. When `react` is hit, the thread breaks and the actor waits for incoming messages using the normal actor scheduling mechanism.
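
To tie the creation steps together, here is a rough conceptual sketch of what the named dactor factory could look like. This is not the code from the repository; the exact CoActor and Proxy signatures are guesses based on the step list above:

// Conceptual sketch only - signatures are guesses, not taken from the actual repository.
def dactor(name: Symbol)(body: => Unit): Proxy = {
  val a = new CoActor {
    override def init() { body }   // step 1: the dactor body becomes the init method
  }
  a.distribute(name)               // step 2: serialize the actor into the Coherence cache under `name`
  new Proxy(name)                  // step 3: the caller only ever talks to the actor via the proxy
  // Steps 4-8 then happen on whichever node Coherence makes the master of that cache entry.
}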

When a message is sent to a distributed actor using a Proxy object the following happens:

  1. The message is sent using the Coherence invocation mechanism which transports the message to the master node where the cached actor runs.
  2. In the master node, a Proxy actor representing the sender (which does not need to be a distributed actor) is created - this is because the actor framework always needs a sending actor which the receiver can reply to.
  3. The sender Proxy is asked to send the actual message to the real distributed actor receiver.
  4. The normal Scala actor framework handles the passing of the message and scheduling of the actors.
  5. An overridden `send` in CoActor locks the actor and stores the message and a Save message in the mailbox.
  6. The Save message gets the actor to first persist itself to the Coherence cache with the real message still in the mailbox. This is to ensure the message isn’t lost in case of a node failure.
  7. After this the real message is processed by the actor, and an overridden `react` saves the actor again once the message has been processed. This is to update the cache with the new state of the actor.

If the distributed actor does a reply or sends a message to the sender, the Proxy which represents the non-distributed actor gets called as follows:

  1. As the receiver (the original sender) isn’t a distributed actor handled by Coherence, we cannot use the invocation mechanism. Instead the message is just put into a message cache.
  2. A `MapListener` on every node checks to see if the newly inserted message in the message cache is intended for an actor running on that specific node.
  3. If so, the message is deleted from the cache and delivered to the local actor through a Proxy representing the sender - just as in the previous case.

The Limitations

The distributed actors are a bit limited in what they can do, as they always need to be serializable and I didn’t want to change any of the standard Scala code. For instance, when a synchronous invocation is made, the return channel is stored in the Scala actor. The return channel implementation used in Scala isn’t serializable, so I decided not to implement this feature for now.

Basically, only message sending (!), reply, sender, loop and react are allowed in the distributed actors. However, they can interoperate with normal actors as can be seen in this example:

val distActor = dactor('distActor) {
  loop {
    react {
    case ('ping) =>
      reply('pong)
    }
  }
}

actor {
  distActor ! 'ping
  loop {
    react {
    case x =>
      println("Actor got message: " + x)
    }
  }
}

The Proxy objects can be used by actors as they serialize correctly.

The Conclusion

Distributing (or GridEnabling(tm), or whatever the word du jour is) actors - using the processing power and resilience multiple computers give, while hiding the complexity from the developer - is a nice way to scale up an actor-based application fairly easily. To add more processing power or redundancy, just fire up new nodes.

The proof of concept I made here just scratches the surface, but it’s interesting to see that it can be done with Coherence while maintaining the syntax and behavior expected by Scala developers.

The Source

The highly experimental and hackish source code for the proof of concept is available in the git repository at:

http://martin.elwin.com/git/scala-coherence.git

Dependencies are Scala (2.7.1 recommended) and Oracle Coherence. There are some scripts for Linux which are trivial to adapt to other operating environments.

/M

DCOP and Xmobar

For me, the computer is a work tool that should get out of the way as much as possible to allow me to perform the task at hand. I spend many hours every day working with, primarily, my laptop, and seemingly insignificant improvements to the desktop environment and development tools add up over time to improve my overall efficiency. Because of this I like to spend time looking into how the tools I use can be tuned to better help in what I need to do.

One of the fairly recent changes I’ve made is to switch to a tiling window manager. For those too lazy to read the Wikipedia entry: a tiling window manager automatically arranges the open windows to fill the available space of the screen, saving you the job of arranging the windows yourself. That sounds simple, but can be quite complex in practice.

I started out using Ion3 in mid 2007. Ion3 was nice and smooth, worked great, and really got me hooked on tiling window managers. However, as a big fan of open source, I was a bit concerned by the author’s approach to controlling the development of Ion (see this or this for instance), so I switched to wmii. wmii was also quite nice, but I never got into it as well as Ion, for some reason.

That’s when I found xmonad - a very nice window manager written in Haskell. Haskell is also used as the configuration language, which makes the configuration possibilities close to endless (actually, they probably are endless, as Haskell is Turing equivalent (although there are people who doubt it!) :).

Today, some 6 months later, I use xmonad 0.7 together with KDE 3.5.9 (similar to the setup described on the Haskell wiki) on Kubuntu 8.04 and am very happy with it.

Status Bar

However, xmonad is just a tiling window manager; I also want a status bar to allow me to see chosen pieces of information at a glance. I use KDE3 and have the KDE3 panel kicker running, but it’s hidden unless I put the mouse in the bottom left corner. I don’t use it normally and only keep it around to access the system tray (which I almost never need). Instead of the ugly and bulky kicker I want a smaller, slimmer alternative…

Enter xmobar - a text based status bar, also written in Haskell. xmobar comes with a number of plugins to show different aspects of the system on the status bar - date, cpu, memory, network activity, etc. A few things it doesn’t do out of the box, which I want to see, are the number of new mails, the current keyboard layout (I use US mainly, but switch to Swedish for the national chars) and the speaker mute state. Luckily, xmobar provides a plugin to execute shell commands, and using this mechanism we can put anything on the status bar, as long as we can figure out a command to run to produce the wanted text.

New Mails

I use a Maildir mail storage (together with dovecot to provide an IMAP interface to my mail), and since I don’t automatically split mails into groups/folders it’s enough to just check the number of mails in my Maildir’s new folder. The xmobar command for this ends up as:

Run Com "sh" ["-c", "\"ls ~/Maildir/new | wc -l\""] "mailcount" 600

This command invokes the sh shell and passes a command string using the -c switch. The command string contains the actual shell command we want to execute: ls ~/Maildir/new | wc -l

Executing sh and passing a command string allows us to use a pipe to pass the output of ls to wc to get the actual file count back.

The command is mapped to the mailcount alias, and the interval is set to 600 tenths of a second = 60 seconds.

Keyboard Layout

Since I use KDE3, keyboard layout switching is handled by the kxkb application (it shows up as a flag in the tray if multiple layouts are configured). KDE3 uses DCOP (unlike KDE4, which uses D-Bus) for IPC. Available DCOP services can easily be interrogated from the command line using the… you guessed it… dcop command.

Running dcop without parameters shows a list of available applications. On my system it shows:

% dcop
konsole-9099
kdebluetooth
kicker
kxkb
guidance-6223
kded
adept_notifier
kmix
knotify
kio_uiserver
klauncher
khotkeys
kwalletmanager
digikam-6484
klipper
ksmserver
knetworkmanager

Woohoo! kxkb is there, so let’s dig deeper. With a bit of trial and error we can probe the internals of the dcop-exposed application:

% dcop kxkb
qt
MainApplication-Interface
kxkb
% dcop kxkb kxkb
QCStringList interfaces()
QCStringList functions()
bool setLayout(QString layoutPair)
QString getCurrentLayout()
QStringList getLayoutsList()
void forceSetXKBMap(bool set)
% dcop kxkb kxkb getCurrentLayout
us

Sweet! Now we have the dcop “path” to find the current keyboard layout. Let’s add it to the xmobar configuration:

Run Com "dcop" ["kxkb", "kxkb", "getCurrentLayout"] "kbd" 20

Speaker Mute State

Using dcop again we can find the mute state:

% dcop kmix
qt
MainApplication-Interface
Mixer0
kmix
kmix-mainwindow#1
% dcop kmix Mixer0
QCStringList interfaces()
QCStringList functions()
void setVolume(int deviceidx,int percentage)
void setMasterVolume(int percentage)
void increaseVolume(int deviceidx)
void decreaseVolume(int deviceidx)
int volume(int deviceidx)
int masterVolume()
void setAbsoluteVolume(int deviceidx,long int absoluteVolume)
long int absoluteVolume(int deviceidx)
long int absoluteVolumeMin(int deviceidx)
long int absoluteVolumeMax(int deviceidx)
void setMute(int deviceidx,bool on)
void setMasterMute(bool on)
void toggleMute(int deviceidx)
void toggleMasterMute()
bool mute(int deviceidx)
bool masterMute()
int masterDeviceIndex()
void setRecordSource(int deviceidx,bool on)
bool isRecordSource(int deviceidx)
void setBalance(int balance)
bool isAvailableDevice(int deviceidx)
QString mixerName()
int open()
int close()
% dcop kmix Mixer0 masterMute
true

Perfect - now we can create the last xmobar run command:

Run Com "dcop" ["kmix", "Mixer0", "masterMute"] "mute" 20

Final Configuration

So, here’s the final .xmobarrc:

Config { font = "xft:Consolas-8"
       , bgColor = "black"
       , fgColor = "grey"
       , position = Bottom
       , commands = [ Run Network "eth0" ["-L","0","-H","32","--normal","green","--high","red"] 50
                    , Run Network "wlan0" ["-L","0","-H","32","--normal","green","--high","red"] 51
                    , Run Cpu ["-L","3","-H","50","--normal","green","--high","red"] 52
                    , Run Memory ["-t","Mem: <usedratio>%"] 54
                    , Run Date "%a %b %_d %H:%M:%S" "date" 10
                    , Run StdinReader
                    , Run Com "dcop" ["kxkb", "kxkb", "getCurrentLayout"] "kbd" 20
                    , Run Com "sh" ["-c", "\"ls ~/Maildir/new | wc -l\""] "mailcount" 600
                    , Run Com "dcop" ["kmix", "Mixer0", "masterMute"] "mute" 20
                    ]
       , sepChar = "%"
       , alignSep = "}{"
       , template = "<fc=#ee9a00>%date%</fc> | %cpu% | %memory% | %eth0% - %wlan0% } %StdinReader% { Mail: %mailcount% | Kbd: %kbd% | Mute: %mute%"
       }

And a small screenshot for your viewing pleasure:

(Yes, my laptop’s hostname is dellicious… ;)

Scala XML and Java DOM

How do we go from a DOM to Scala XML most efficiently? Well… That’s what I asked myself the other day. The Scala API’s scala.xml.XML object has helper functions to create Scala XML structures from the usual suspects: Strings, InputStream, Reader, File, etc. But DOM Document or Element is missing.

Going via a Byte array, Char array or String is simple. For instance, the following outputs the DOM to a char array, which is then used as the source of the Scala XML creation:

Using Char array
//dom is the DOM Element
val charWriter = new CharArrayWriter()
TransformerFactory.newInstance.newTransformer.transform(new DOMSource(dom), new StreamResult(charWriter))
val xml = XML.load(new CharArrayReader(charWriter.toCharArray))

This works fine, and is reasonably fast. However, it does allocate some unnecessary memory (the char array) and performs some unnecessary parsing (of the char array) - both of which we’d really like to avoid.

How do we do this? Well, here’s one option that I came up with:

Using SAX
val saxHandler = new NoBindingFactoryAdapter()
saxHandler.scopeStack.push(TopScope)
TransformerFactory.newInstance.newTransformer.transform(new DOMSource(dom), new SAXResult(saxHandler))
saxHandler.scopeStack.pop
val xml = saxHandler.rootElem

What’s going on here? Well, the Scala XML library uses SAX to parse XML and create the XML structure. One way of generating SAX events is to walk a DOM tree, which is handled by the javax.xml.transform.Transformer with a DOMSource as input and a SAXResult as output. The extension of the DefaultHandler needed for handling the SAX events is implemented by scala.xml.parsing.FactoryAdapter, which is extended by the NoBindingFactoryAdapter used to construct the XML structure. Because of this, we can do violence to the API and use the NoBindingFactoryAdapter directly as a SAX DefaultHandler - nice! The scopeStack calls are done to maintain the scope information, which I stole from the loadXML method in the FactoryAdapter class.

However, let’s take a moment to reflect on this. Using the Scala XML library in this way is not really ideal. Even if it’s possible, I’ve not seen it described as a supported way of using the library, and it should therefore be done only after considering that the next release of Scala might remove this possibility.

[Update 2008-07-02: Burak Emir kindly added a comment; “Don’t worry, the SAX factory adapter is not going to go away.” - good to know!]

That said - let’s consider this an exploration of possibilities which could potentially lead to an update of the Scala XML API to allow a DOM to be used as a source instead…!

A quick test gives the following result.

Test data is a 6897-byte XML file containing 118 elements with some 4 different namespaces.

I ran each test for 1000 iterations, with a full garbage collection before the first iteration. For every 100 iterations I printed the delta of the free memory, and I timed the complete 1000 iterations.

Char array: 100 iterations use around 28 MB, full test: 1414 ms
SAX: 100 iterations use around 18 MB, full test: 970 ms

So, in conclusion, no overwhelming difference, but around 1/3 faster and 1/3 less memory consumed. Can we do better? I’m not sure. :)

The next step is to do Scala XML to DOM… This could be more interesting. I see two options:

  1. Implement the DOM API wrapping Scala XML
  2. Generate SAX events based on the Scala XML and use that to build a DOM

Option 1 would be more efficient - but the DOM API isn’t fun to implement. Option 2 would be much simpler, but probably less efficient and require more allocations. Gotta think about this one…
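
Until one of those is in place, the fallback in this direction mirrors the char array route above: serialize the Scala XML to a String and let a standard DOM parser rebuild it. A minimal sketch using plain JAXP, nothing specific to the code in this post:

import java.io.StringReader
import javax.xml.parsers.DocumentBuilderFactory
import org.xml.sax.InputSource

// Scala XML -> String -> DOM: allocates the intermediate String and re-parses it,
// which is exactly the kind of overhead options 1 and 2 are meant to avoid.
def toDom(xml: scala.xml.Node): org.w3c.dom.Document = {
  val factory = DocumentBuilderFactory.newInstance()
  factory.setNamespaceAware(true)   // keep the namespaces from the Scala XML intact
  factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml.toString)))
}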

/M

UML Use Case Diagrams & Graphviz

First post!

Oh, right, this isn’t Slashdot…

Anywho, here goes:

While at a customer, a few colleagues and I were joking about Use Case diagrams, which in their most basic form tend to be somewhat meaningless. However, in more complex requirements definitions, they certainly can help define actors and sections of functionality to go into certain releases, for instance.

As I was at the time pondering how best to visualize the customer’s SOA architecture and service dependencies by generating Graphviz diagrams, I thought - hey, why not use Graphviz to quickly produce some snazzy Use Case Diagrams?

Well… the snazziness is debatable, and I didn’t manage it as quickly as I had hoped. But all considered, I think they turned out quite well.

To begin with, I have to figure out how to render a stick figure for a node. When using the postscript output, you can create custom shapes, but since I want to create just a PNG image, I need to use an image as the custom shape file.

With Inkscape I managed to produce the following:

Use Case Stick Figure

Very nice.

Saving this as stick.png, we can create a simple diagram:

1.dot
digraph G {
    "User" [shapefile="stick.png"];
    "Log In" [shape=ellipse];
    "User"->"Log In" [arrowhead=none]
}

This can be run through the Graphviz dot program thus:

cat 1.dot | dot -Tpng > 1.png

Which produces:

Use Case 1

Hmmm… Not quite what we wanted. Let’s make a few changes:

  • Make the graph left-right.
  • Get rid of the box.
  • Move the actor label to below the stick figure.

The most difficult of these is moving the label. This seems to require creating a cluster subgraph containing a label and the node with the custom shape. An example of this can be seen in this Graphviz tutorial. Another alternative is using HTML in the label - but this looks ugly. So, doing this, and refactoring a bit to create nodes explicitly, we get:

2.dot
digraph G {
   rankdir=LR;

    subgraph clusterUser {label="User"; labelloc="b"; peripheries=0; user};
    user [shapefile="stick.png", peripheries=0, style=invis];

    login [label="Log In", shape=ellipse];

    user->login [arrowhead=none];
}

… which gives us:

Use Case 2

Not too bad. Let’s add some more use cases:

3.dot
digraph G {
    rankdir=LR;
    labelloc="b";
    peripheries=0;

    /* Actor Nodes */

    node [shape=plaintext, style=invis];

    subgraph clusterUser {label="User"; user};
    user [shapefile="stick.png"];

    subgraph clusterAdmin {label="Administrator"; admin};
    admin [shapefile="stick.png"];


    /* Use Case Nodes */

    node [shape=ellipse, style=solid];

    log_in [label="Log In"];

    log_in_pwd [label="Log In Password"];
    log_in_cert [label="Log In Certificate"];

    manage_user [label="Manage User"];
    change_email [label="Change Email"];
    change_pwd [label="Change Password"];
    

    /* Edges */

    edge  [arrowhead="oarrow"];

    admin->user;

    edge [arrowhead=none];
    
    user->log_in;
    admin->manage_user;

    edge [arrowtail="vee", label="<<extend>>", style=dashed];

    log_in->manage_user;
    log_in->log_in_pwd;
    log_in->log_in_cert;

    manage_user->change_email;
    manage_user->change_pwd;
}

Run through dot this produces the following image:

Use Case 3

The obvious problem with this is that the nodes aren’t placed where we really want them to be. Automatic node placement in a directed graph is one of the strong points of dot, but the default placement doesn’t really correspond to how we want our Use Case diagram to look. So, let’s add some hints to help dot place the nodes the way we like:

4.dot
digraph G {
    rankdir=LR;
    labelloc="b";
    peripheries=0;

    /* Actor Nodes */

    node [shape=plaintext, style=invis];

    subgraph clusterUser {label="User"; user};
    subgraph clusterAdmin {label="Administrator"; admin};

    {
        rank=min;

        user [shapefile="stick.png"];
        admin [shapefile="stick.png"];
    }


    /* Use Case Nodes */

    node [shape=ellipse, style=solid];

    {
        rank=same;

        log_in [label="Log In"];
        manage_user [label="Manage User"];
    }

    log_in_pwd [label="Log In Password"];
    log_in_cert [label="Log In Certificate"];

    change_email [label="Change Email"];
    change_pwd [label="Change Password"];
    

    /* Edges */

    edge  [arrowhead="oarrow"];

    admin->user;

    edge [arrowhead=none];
    
    user->log_in;
    admin->manage_user;

    edge [arrowtail="vee", label="<<extend>>", style=dashed];

    log_in->manage_user;
    log_in->log_in_pwd;
    log_in->log_in_cert;

    manage_user->change_email;
    manage_user->change_pwd;
}

Which produces:

Use Case 4

Sweet!

Now, I don’t see most people working with requirements modeling getting too excited by this. The ones I’ve worked with tend to prefer graphical tools. But Graphviz is an excellent option when one wants to visualize data sets by automatically generating diagrams - and the more that can be done automatically, the better!

/M

Compressed and Encrypted Backup With SquashFS and LUKS

Hardy Heron (KDE flavor) has been out for a while now and this weekend I finally decided to upgrade my (sweet sweet Dell M90) laptop from Gutsy. I used this as justification to get a new Hitachi SATA 200GB 7200RPM disk to replace the old 250GB 5200RPM in an effort to boost performance a little. Getting a new disk also makes the upgrade a lot less risky - I can keep the old disk as it is while setting up the new system.

This time around I decided to get rid of the Windows partition previously used for dual booting (haven’t booted into Windows in 6 months). I also wanted to switch from the extremely useful and amazing Truecrypt to using Linux native LUKS encryption for my work and private data. I originally used Truecrypt on Linux with an NTFS file system to make the encrypted drive compatible with both Windows and Linux, but since then I’ve switched to an ext3 file system and don’t need the capability to mount it both under Linux and Windows. If you do, make sure to try - nay, use! - Truecrypt. It’s very very nice.

So, the 200GB disk ended up being partitioned like so:

$sudo fdisk -l /dev/sda

Disk /dev/sda: 200.0 GB, 200049647616 bytes
255 heads, 63 sectors/track, 24321 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x000cbcfd

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1        3647    29294496   83  Linux
/dev/sda2            3648        7294    29294527+  83  Linux
/dev/sda3            7295       23707   131837422+  83  Linux
/dev/sda4           23708       24321     4931955   82  Linux swap

That is, 30GB for root, 30GB to be encrypted, 135GB for data and the rest for swap.

Of course, with Hardy you can encrypt the root file system just by selecting to do so in the alternate installer, but I don’t feel a need to go that way right now - maybe later.

Backups

Now, slowly approaching the main topic of this post: while playing around with the new disk and the old data I was thinking about putting a new backup strategy in place. At the same time (I can do anything while simultaneously going through my subscriptions in Google Reader :) I happened upon the JWZ post about backups. JWZ, who obviously is a smart guy, is onto something. I’m already doing backups (of part of my data) using rsync to my server at home - not a full disk image, admittedly, which was JWZ’s prescription. However, I wanted to try something different and achieve a few other things:

  • Retain multiple snapshots of data at a time
  • Compress the data
  • Encrypt the data

For a while I was considering ZFS (which I in a moment of temporary megalomania started porting to a native Linux kernel module, strictly for private use! - maybe something for another post) with its nice snapshot and compression support, but this idea was quickly discarded for being a bit too unnatural for most Linux setups. I added the additional requirement that the solution should work without having too many non-standard dependencies.

CromFS has a nice feature set, but it’s a FUSE file system and it’s not available in Hardy. The comparison chart on the CromFS page however led me to SquashFS. SquashFS is available out of the box in Hardy, and by apt-getting the squashfs-tools package, everything needed is installed. The nice HowTo gives a good introduction to the commands required to use it. Note - SquashFS, like most compressed file systems, is read-only. This suits me perfectly as I want to make a snapshot for backup purposes, but it might not be what you want.

The idea I got was to use SquashFS as the compressed snapshot file system and wrap it in LUKS for encryption. By doing this I get a single compressed file which I can mount on a modern Linux distribution to access the data. It’s securely and (somewhat) efficiently stored.

Luckily, what I wanted to do is very similar to what the Gentoo guide on how to burn an encrypted CD image describes - namely wrapping an already existing file system image in a LUKS encrypted container. Most of the following is based on the Gentoo guide.

There are a few steps to the process:

  1. Create SquashFS image from source data
  2. Create LUKS container
  3. Put SquashFS image inside LUKS container

The trickiest (which isn’t really that tricky) part is figuring out how large to make the LUKS container. As we don’t know beforehand how large the SquashFS image is, we need to create it and then calculate how large the LUKS container should be.

Step 1: SquashFS

Let’s start with creating the SquashFS image:

$sudo mksquashfs /crypt /tmp/cryptbackup.sqsh -e /crypt/stuff
Parallel mksquashfs: Using 2 processors
Creating little endian 3.1 filesystem on /tmp/cryptbackup.sqsh, block size 131072.
[==========================================================================] 2297/2297 100%
Exportable Little endian filesystem, data block size 131072, compressed data, compressed metadata, compressed fragments, duplicates are removed
Filesystem size 13606.69 Kbytes (13.29 Mbytes)
        21.57% of uncompressed filesystem size (63077.89 Kbytes)
Inode table size 38419 bytes (37.52 Kbytes)
        32.62% of uncompressed inode table size (117780 bytes)
Directory table size 33629 bytes (32.84 Kbytes)
        58.10% of uncompressed directory table size (57880 bytes)
Number of duplicate files found 115
Number of inodes 3176
Number of files 1877
Number of fragments 45
Number of symbolic links  945
Number of device nodes 0
Number of fifo nodes 0
Number of socket nodes 0
Number of directories 354
Number of uids 5
        ....
Number of gids 12
        ....

This creates a new compressed SquashFS image from the data in the /crypt directory (the contents of this directory become the contents of the image root). The -e flag excludes all files in the given directories - here, everything in /crypt/stuff. Note that it might be more efficient to store this image on a separate disk, if available. Even an external USB 2.0 disk might be faster to write to while reading the data from the main disk.

The SquashFS image was compressed by about 50% in my case - 10GB of data stored in a 5GB image. Of course, the compression ratio achieved depends on the type of data stored.

Step 2: LUKS Container

Step 1 done, we now need to create a LUKS container to store the SquashFS image in. How big should it be? Well… The Gentoo guide linked to above calculates the size of the LUKS overhead by checking the difference between a LUKS container and the mapped block device. It turns out that this overhead is 1032 blocks (each block being 512 bytes), no matter what the block size is. Googling this seems to confirm it, so for now I’m assuming that LUKS always adds 1032 blocks of overhead.

The size in 512 byte blocks of the SquashFS image can be found by doing:

$ls -l --block-size=512 /tmp/cryptbackup.sqsh
-rwx------ 1 root root 27216 2008-05-11 20:55 /data/cryptbackup.sqsh

Which in the above case indicates that the file is 27216 512-byte blocks in size (this is a test file…).

Adding 1032 blocks gives us the size needed for the LUKS container - 28248 blocks - let’s create it (while letting the shell handle the calculation for us):

$sudo dd if=/dev/zero of=/tmp/cryptbackupluks.img bs=512 count=1 seek=$((27216+1032))

Note that this creates a sparse file on most modern file systems, so it’s quite quick. We don’t need to fill it with random numbers or anything as the whole container will be updated when we write the SquashFS image to it.

Now, let’s map it. First locate an available loop device:

$sudo losetup -f
/dev/loop0

loop0 is available - no loops on this system.

Set up the container file as a loop device:

$sudo losetup /dev/loop0 /tmp/cryptbackupluks.img

Then make it a LUKS volume:

$sudo cryptsetup luksFormat /dev/loop0

WARNING!
========
This will overwrite data on /dev/loop0 irrevocably.

Are you sure? (Type uppercase yes): YES
Enter LUKS passphrase:
Verify passphrase:
Command successful.

Make sure you remember the password… ;P

And open the device:

$sudo cryptsetup luksOpen /dev/loop0 cryptbackup
Enter LUKS passphrase:
key slot 0 unlocked.
Command successful.

Now the device is available as /dev/mapper/cryptbackup, ready to accept our SquashFS image.

Step 3: Put SquashFS Image into LUKS Container

Let’s validate the overhead of LUKS:

$echo $((`sudo blockdev --getsize /dev/loop0`-`sudo blockdev --getsize /dev/mapper/cryptbackup`))
1032

Sweet! So, the size of the mapped device should be the same as our SquashFS image:

$sudo blockdev --getsize /dev/mapper/cryptbackup
27217

Hmmm… Close enough… ;P The one extra block comes from the dd command above: seek=28248 skips 28248 blocks and count=1 then writes one more, so the container ends up 28249 blocks in size - one more than we strictly asked for. Anywho, at least it is big enough for our 27216 block image. Let’s transfer it:

$sudo dd if=/tmp/cryptbackup.sqsh of=/dev/mapper/cryptbackup bs=512
27216+0 records in
27216+0 records out
13934592 bytes (14 MB) copied, 0.0544451 s, 256 MB/s

Done and done. To verify that it works we can mount the file system:

$sudo mkdir /mnt/cryptbackup
$sudo mount /dev/mapper/cryptbackup /mnt/cryptbackup

If all went well, ls /mnt/cryptbackup should now give the contents of the original directory.

To unmount, do:

$sudo umount /mnt/cryptbackup
$sudo cryptsetup luksClose cryptbackup
$sudo losetup -d /dev/loop0

Now remove the old SquashFS image /tmp/cryptbackup.sqsh and store the LUKS container /tmp/cryptbackupluks.img in a safe location. I use a portable external hard disk and the server at home to save the backup images. To mount the image later, just run a few of the above commands again, slightly modified:

$LOOP=`sudo losetup -s -f /tmp/cryptbackupluks.img`
$sudo cryptsetup luksOpen $LOOP cryptbackup
Enter LUKS passphrase:
key slot 0 unlocked.
Command successful.
$sudo mount /dev/mapper/cryptbackup /mnt/cryptbackup

Now, all that remains is to create some helper scripts to avoid having to write all this every time I want to make a backup…

A drawback of this method is that it takes quite a while to perform the backups. It’s not incremental either, so the backups will presumably take longer and longer to make each time (as more data is accumulated). The next thing might be to try to create a base image with SquashFS and then do incremental backups with UnionFS or something… Hmmm……

/M