Saturday, November 10, 2012

Auto-threading

Auto-threading is when the compiler decides when to thread. I decided that my language would allow the compiler to automatically use light-weight threads where it can. Light-weight threads are those that do not limit their memory access. They can read and write to any part of the app's memory. It is up to the compiler to prevent any conflict by only thread when the threads will not access the same memory. To support this, the language itself has the following features.

Copy on write

To speed up subroutine calls, parameters are passed by reference but should the subroutine change them, a copy is made. To do this, a reference count is kept¹. If it is more than 1, a copy is made before the write occurs. This allows each thread to work with its own copy of memory and avoid collisions.

¹ The reference count is also used for garbage collection. If it reaches zero, the memory is reclaimed and reused.

No Globals

There are no global variables. Since languages like Java already do not allow globals, this restraint shouldn't be a great burden to programmers. Subroutines can only write to their own variables, which include their arguments.

Read-only Module Configuration Variables

When a module is loaded, it goes through an initiation phase. During this phase, the module can write to its module-scoped variables. This allows the module to adapt to the system it is in. But once the phase is over, the variables become read-only. They cannot be changed during the running of the app.

Complex Data Records: Binary Trees

In order to see how these restraints work, below is an example of a binary tree. Two subroutine are provided: insert a new value; and walk through the tree returning a list of its values. Both of them use the down-and-up technique of passing the thingy to modify down and then back up.

This is the record of a node in the binary tree:

    record Node:
        Any     value
        Node    left
        Node    right

You will notice that unlike other languages, there are no references (aka pointers). The language places a special marker in the variable to indicate the slot is empty, which can be tested for.

Insert

This is the subroutine to insert a value:

    subroutine insert
        given:
            Node tree
            Any  value
        returns:
            Node tree

        if empty tree:
            tree.value ← value
        else if value < tree.value:
            tree.left ← insert tree.left, value
        else
            tree.right ← insert tree.right, value

    return tree

The first part of the code declares the subroutine's interface. Note that the type Any is a special type that denotes any other type.

In the code, this line: tree.value ← value uses auto-vivification to create a new Node. All the elements of the new Node are created as empty. Then its value is set.

Finally, subroutines ends with a return statement. Although there must be a return at the end of every subroutine, additional returns may be present inside.

Walk

This subroutine walks through the tree and returns a list of values.

    subroutine walk
        given:
            Node tree
        returns:
            List values

        if not empty tree:
            push values, walk tree.left
            push values, tree.value
            push values, walk tree.right

    return values

A fairly straight-forward subroutine. But it can be rewritten as:

    subroutine walk
        given:
            Node tree
        returns:
            List values

        if not empty tree:
            List left  ← walk tree.left
            List right ← walk tree.right

            push values, left, tree.value, right

    return values

Written this way, the compiler can auto-thread to walk the left and right sides of the tree.