Anatomy of the debugging (and verification) process in Xr0

When you first fire up 0db on a simple program like

#include <stdlib.h>

void *
alloc() ~ [ return malloc(1); ]
{
	return malloc(1);
}

you are greeted with a fairly inscrutable prompt:

path init abstract state

What does this mean, and what does it have to do with the code above? Our goal here is to explain the phases that Xr0 moves through, which will helpfully make 0db more usable.

Anatomy of a Xr0-annotated function

Xr0 adds annotations to C, such as

~ [ return malloc(1); ]

above. The annotation attached to a function is called its abstract. Abstracts contain1 all the information that callers need to safely interact with a function. For example, this abstract says to callers of alloc, “this function allocates memory on the heap and returns it to you”. More complicated functions will have more complicated abstracts:

#include <stdlib.h>

void
myfree(void *p) ~ [
        setup: p = malloc(1);
        free(p);
]{
        free(p);
}

The “setup” statement in myfree’s abstract communicates to callers that p must be something returned from a call to malloc, so that there is no possibility of a double free or otherwise invalid invocation of free.

Phases of verification

Xr0 verifies a function by checking that the final states of the abstract and the body match, and that there are no memory leaks. It passes through the following phases in constructing these final states:

Additionally, if there are setup statements in the abstract, there is an additional SETUP phase introduced at the beginning of both ABSTRACT and ACTUAL.

Example

Let’s see how these phases play out by debugging myfree. At first we get

path init abstract state

This somewhat-cryptic message means that the state for ABSTRACT has been initiated.2

Typing “n” or “next”, we get our first look at the state:

phase:	ABSTRACT

text:
-->     setup: p = malloc(1);
        free(p);

context:
        typedef int size_t

rconst:
        $0: ptr:{MIN:MAX}       (p)

stack:
        0: {void * := {0:<rconst:$0>}} (p, π)
        ---------------------------- myfree

This now tells us clearly where we are and that we’re in the ABSTRACT phase. Typing “next” twice, we come to the end of the ABSTRACT phase:

phase:	ABSTRACT

text:
        <end of frame>

context:
        typedef int size_t

rconst:
        $0: ptr:{MIN:MAX}       (p)

stack:
        0: {void * := {0:<ptr:heap:0>}} (p, π)
        ---------------------------- myfree

The <end of frame> simply indicates that we’re at the end of myfoo’s abstract.

Next the actual state is initiated:

path init actual state

Because the setup command is situated in the abstract, it executes immediately in a special frame that appears before we see the ACTUAL state:

phase:  SETUP (ACTUAL)

text:
        <end of frame>

context:
        typedef int size_t

rconst:
        $0: ptr:{MIN:MAX}       (p)

stack:
        ---------------------------- setup
        0: {void * := {0:<ptr:heap:0>}} (p, π)
        ---------------------------- myfree

heap:
        0:

Hitting next again, we finally see the ACTUAL state:

phase:  ACTUAL

text:
-->     free(p);

context:
        typedef int size_t

rconst:
        $0: ptr:{MIN:MAX}       (p)

stack:
        0: {void * := {0:<ptr:heap:0>}} (p, π)
        ---------------------------- myfree

heap:
        0:

Since there is only one line, we reach <end of frame> by typing “next”:

phase:  ACTUAL

text:
        <end of frame>

context:
        typedef int size_t

rconst:
        $0: ptr:{MIN:MAX}       (p)

stack:
        0: {void * := {0:<ptr:heap:0>}} (p, π)
        ---------------------------- myfree

Finally, the program exits, indicating that the AUDIT phase was successful, because the state produced in ACTUAL matches that from ABSTRACT.

Stepping and “inter

Something very peculiar happens if instead of using “next” we try “step” or “s”. Returning to the moment

phase:  ABSTRACT

text:
-->     setup: p = malloc(1);
        free(p);

context:
        typedef int size_t

rconst:
        $0: ptr:{MIN:MAX}       (p)

stack:
        0: {void * := {0:<rconst:$0>}} (p, π)
        ---------------------------- myfree

and stepping we get

phase:  SETUP (ABSTRACT)

text:
-->     p = malloc(1);

context:
        typedef int size_t

rconst:
        $0: ptr:{MIN:MAX}       (p)

stack:
        ---------------------------- setup
        0: {void * := {0:<rconst:$0>}} (p, π)
        ---------------------------- myfree

This reflects that we’ve stepped into the statement

setup: p = malloc(1);

as the frame labelled “setup” on the stack shows. If we type “s” again things get stranger still:

phase:  SETUP (ABSTRACT)

text:
-->     call malloc(1);
        movret <t0>;
        p = <t0>;

context:
        typedef int size_t

rconst:
        $0: ptr:{MIN:MAX}       (p)

stack:
        ---------------------------- inter
        ---------------------------- setup
        0: {void * := {0:<rconst:$0>}} (p, π)
        ---------------------------- myfree

This is because we’re now inside the execution of the expression

malloc(1)

We call this the “inter” frame of the expression or statement. Xr0 models it as the invocation

call malloc(1);

followed by commands to capture the returned value inside a temporary variable <t0> and then assign this value to p. But if we step again things get even stranger:

phase:  SETUP (ABSTRACT)

text:
-->     return .malloc(size);

context:
        typedef int size_t

rconst:
        $0: ptr:{MIN:MAX}       (p)

stack:
        0: {int := {0:<int:{1}>}} (size, π)
        ---------------------------- malloc
        ---------------------------- inter
        ---------------------------- setup
        0: {void * := {0:<rconst:$0>}} (p, π)
        ---------------------------- myfree

We are now inside malloc’s abstract, which is defined axiomatically in libx as

axiom void *
malloc(int size) ~ [ return .malloc(size); ];

So the state above captures that malloc has been invoked with size equal to 1. The .malloc is the Xr0 internal function that allocates memory on the heap, so if we step we see

phase:  SETUP (ABSTRACT)

text:
-->     call .malloc(size);
        movret <t0>;
        return <t0>;

context:
        typedef int size_t

rconst:
        $0: ptr:{MIN:MAX}       (p)

stack:
        ---------------------------- inter
        0: {int := {0:<int:{1}>}} (size, π)
        ---------------------------- malloc
        ---------------------------- inter
        ---------------------------- setup
        0: {void * := {0:<rconst:$0>}} (p, π)
        ---------------------------- myfree

This is as deep into (or as high upon) the stack we get, because the fact that .malloc is a Xr0 primitive means the next step doesn’t take us into anything, but onto the next line:

phase:  SETUP (ABSTRACT)

text:
        call .malloc(size);
-->     movret <t0>;
        return <t0>;

context:
        typedef int size_t

return: <ptr:heap:0>

rconst:
        $0: ptr:{MIN:MAX}       (p)

stack:
        ---------------------------- inter
        0: {int := {0:<int:{1}>}} (size, π)
        ---------------------------- malloc
        ---------------------------- inter
        ---------------------------- setup
        0: {void * := {0:<rconst:$0>}} (p, π)
        ---------------------------- myfree

heap:
        0:

Note that a heap location has been created, and a pointer to it is stored in the return register. Stepping again this value is saved in <t0>:

phase:  SETUP (ABSTRACT)

text:
        call .malloc(size);
        movret <t0>;
-->     return <t0>;

context:
        typedef int size_t

rconst:
        $0: ptr:{MIN:MAX}       (p)

stack:
        0: {void * := {0:<ptr:heap:0>}} (<t0>)
        ---------------------------- inter
        0: {int := {0:<int:{1}>}} (size, π)
        ---------------------------- malloc
        ---------------------------- inter
        ---------------------------- setup
        0: {void * := {0:<rconst:$0>}} (p, π)
        ---------------------------- myfree

heap:
        0:

Now if we step we’re at the end of the frame, with the pointer is in the return value:

phase:  SETUP (ABSTRACT)

text:
        <end of frame>

context:
        typedef int size_t

return: <ptr:heap:0>

rconst:
        $0: ptr:{MIN:MAX}       (p)

stack:
        0: {void * := {0:<ptr:heap:0>}} (<t0>)
        ---------------------------- inter
        0: {int := {0:<int:{1}>}} (size, π)
        ---------------------------- malloc
        ---------------------------- inter
        ---------------------------- setup
        0: {void * := {0:<rconst:$0>}} (p, π)
        ---------------------------- myfree

heap:
        0:

This is the end of the frame corresponding to the evaluation of .malloc(size). Stepping again we find ourselves at the end of the frame for the statement

return .malloc(size);

so that we’re really at the end of the evaluation of call malloc(1):

phase:  SETUP (ABSTRACT)

text:
        <end of frame>

context:
        typedef int size_t

return: <ptr:heap:0>

rconst:
        $0: ptr:{MIN:MAX}       (p)

stack:
        0: {int := {0:<int:{1}>}} (size, π)
        ---------------------------- malloc
        ---------------------------- inter
        ---------------------------- setup
        0: {void * := {0:<rconst:$0>}} (p, π)
        ---------------------------- myfree

heap:
        0:

Thus with another step we find ourselves back in myfree, in the midst of the evaluation of malloc(1):

phase:  SETUP (ABSTRACT)

text:
        call malloc(1);
-->     movret <t0>;
        p = <t0>;

context:
        typedef int size_t

return: <ptr:heap:0>

rconst:
        $0: ptr:{MIN:MAX}       (p)

stack:
        ---------------------------- inter
        ---------------------------- setup
        0: {void * := {0:<rconst:$0>}} (p, π)
        ---------------------------- myfree

heap:
        0:

With a few more steps we finally get past the original setup command:

phase:  ABSTRACT

text:
        setup: p = malloc(1);
-->     free(p);

context:
        typedef int size_t

rconst:
        $0: ptr:{MIN:MAX}       (p)

stack:
        0: {void * := {0:<ptr:heap:0>}} (p, π)
        ---------------------------- myfree

heap:
        0:

Think about the fact that this is what is involved in the simulation of a simple statement like

p = malloc(1);

Return to contents page.


  1. I.e. in v1.0.0 they will contain all this information. For now they contain enough to prevent the kinds of bugs listed on our homepage↩︎

  2. It mentions “path” because in functions that contain conditions Xr0 branches into these according to some splitting logic that examines all possible execution paths. ↩︎