Sunday, January 29, 2017

A simple language with indentation based blocks. Part 1: Parsing combinators

In this post, I'm going to start with the implementation of the parser using the F# programming language. There are several tools for creating parsers for example FParsec http://www.quanttec.com/fparsec/ or FsYacc/FsLex https://en.wikibooks.org/wiki/F_Sharp_Programming/Lexing_and_,arsing. However for this experiment I wanted to create a small set of simple parser combinators, just to learn more about them.

The contents of this post is loosely based on the definition of the Parser Monad . There are many great articles on the net about them. One of this papers is the incredibly nice Monadic Parsing in Haskell http://www.cs.nott.ac.uk/~pszgmh/pearl.pdf .

Let's start by defining the type of the state of our parser:

   type ReaderState = { Data : string;
                        Position: int;
                        Indentation: int list}

This state contains the following:

  • Data : the input string containing the program to parse
  • Position : the current position of the parser inside the Data string
  • Indentation : an indentation stack (more on that in future posts)

And now we're going to define the type of a "parser" :

ReaderState ->  (('a * ReaderState) option )

This means that a "parser" is a function that takes the reader state and returns a tuple of some parsed value ('a) and the new parsing state. Notice that parsers could fail, so the result it's wrapped in an option type https://docs.microsoft.com/en-us/dotnet/articles/fsharp/language-reference/options.

A very simple example of a parser is the following:

   let readChar(state : ReaderState) : (string * ReaderState) option =
       if state.Data.Length > state.Position then
          Some (state.Data.[state.Position].ToString(),
                { state with Position = state.Position + 1 } )
       else
          None

We can test this parser using the following function:

   let parse (input : string) (parser : (ReaderState ->  (('a * ReaderState) option ))) =
       parser {  Data = input ; Position = 0; Indentation = [0] }

For example:

> parse "hello" readChar;;        
val it : (string * ReaderState) option = Some ("h", {Data = "hello";
                                                     Position = 1;
                                                     Indentation = [0];})

Here we can see that the result is a successful optional Some with a string with the single recognized char . Also part of the result is the new reader state .

We can also specify operations for several characters for example:

   let readZeroOrMoreChars (pred:char -> bool) (state : ReaderState) : (string * ReaderState) option =
       let
         secondPosition = readingChar state.Position pred state.Data
         in
           Some ( state.Data.Substring(state.Position,
                                       secondPosition - state.Position),
                 { state with Position = secondPosition })

We can use this function as follows:

> parse "1231abc" (readZeroOrMoreChars System.Char.IsDigit);;
val it : (string * ReaderState) option = Some ("1231", {Data = "1231abc";
                                                        Position = 4;
                                                        Indentation = [0];})

Using this function we can define a parser for whitespace:

   let whitespace = readZeroOrMoreChars System.Char.IsWhiteSpace

Connecting parsers

The parsers I've been showing so far look very simple, and they are!. The goal is to have small parsers and combine them to create more interesting ones. The next operation is called concatParsers2 and it's goal is create a new parser that will execute two parsers in sequence. That is, apply the first one and with the resulting state, execute the second one.

For example, the following parser for recognizing symbols use the readWithConditionOnChar parser with the readZeroOrMoreChars in order to create a PSymbol with the recognized element:

   let symbol =
       concatParsers2
          (readWithConditionOnChar  (fun c -> System.Char.IsLetter(c, 0)))      
          (fun initialChar ->
               concatParsers2
                  (readZeroOrMoreChars (fun c -> System.Char.IsLetter(c) || System.Char.IsDigit(c)))
                  (fun suffixString -> (preturn (PSymbol (initialChar + suffixString))))
           )

In our little language a symbol is represented by :

  • a letter
  • zero or more letters or numbers

We can use the readWithConditionOnChar and readZeroOrMoreChars to archive these goals, for example:

> parse "hello" (readWithConditionOnChar  (fun c -> System.Char.IsLetter(c, 0)))- ;; 
val it : (string * ReaderState) option = Some ("h", {Data = "hello";
                                                     Position = 1;
                                                     Indentation = [0];})
> parse "hi1and2and3" (readZeroOrMoreChars (fun c -> System.Char.IsLetter(c) || - System.Char.IsDigit(c)));;
val it : (string * ReaderState) option =
  Some ("hi1and2and3", {Data = "hi1and2and3";
                        Position = 11;
                        Indentation = [0];})

Notice that we cannot use just readZeroOrMoreChars since it will recognize symbols starting with numbers (like '13hello').

We can create a small parser for recognizing a letter (readWithConditionOnChar) and we can use readZeroOrMoreChars to read the rest. We could combine these two blocks to get another parser. The key for composing our parsers is the concatParsers2 function. This function has the following signature:

> concatParsers2;;
val it :
  ((ReaderState -> ('a * ReaderState) option) ->
     ('a -> ReaderState -> ('b * ReaderState) option) -> ReaderState ->
     ('b * ReaderState) option) = <fun:clo@21-1>
                     

This signature means that concatParsers2 receives a parser that produces a value of type 'a. The second argument is a function that receives a value of type 'a (the result of the first parser) and returns a parser of another type 'b. The result of concatParsers2 is itself another parser that produces a value of type 'b.

The implementation looks like this:

   let concatParsers2 (parser1 : (ReaderState ->  (('a * ReaderState) option )))
                      (parser2N : ('a ->  (ReaderState ->  (('b * ReaderState) option )))) =
       fun input -> match (parser1 input) with
                    | Some (matchedTxt, restState) -> (parser2N matchedTxt) restState
                    | _ -> None

This small function lets us create parsers using small blocks. But before we do that, we need to create another useful operation.

let preturn aValue (state : ReaderState ) = Some (aValue, state)

This operation is very simple, but really important!. It lets us prepare a result. By taking a look at the signature, we can see that it matches our parser signature. We can now used it in conjunction with concatParsers2 to produce results:

>  parse "AXXXC" 
-        (concatParsers2 recognizeABC                     
-                      (fun firstChar ->
-                           (concatParsers2
-                                recognizeXs              
-                                (fun xs -> 
-                                     (concatParsers2
-                                         recognizeABC
-                                         (fun lastChar ->
-                                              preturn (firstChar, xs, lastChar)- ))))));; 
val it : ((string * string * string) * ReaderState) option =
  Some (("A", "XXX", "C"), {Data = "AXXXC";
                            Position = 5;
                            Indentation = [0];})

Given that we define recognizeABC and recognizeXs as follows:

> let recognizeABC = readWithConditionOnChar (fun c -> c = "A" || c = "B" || c= "C");;

val recognizeABC : (ReaderState -> (string * ReaderState) option)

> let recognizeXs = readZeroOrMoreChars (fun c -> c = 'X' ) ;;    

val recognizeXs : (ReaderState -> (string * ReaderState) option)

Using the concatParsers2 function seems a little verbose. We can borrow inspiration from the Haskell's Monad type class https://hackage.haskell.org/package/base-4.9.1.0/docs/Control-Monad.html#t:Monad by defining versions of the >>= operator as follows:

   let inline (>>=) (parser1 : (ReaderState ->  (('a * ReaderState) option )))
                    (parser2N : ('a ->  (ReaderState ->  (('b * ReaderState) option )))) = concatParsers2 parser1 parser2N

Now using this operator we can write:

parse "AXXXC" ( recognizeABC >>= (fun firstChar -> 
                recognizeXs  >>= (fun xs -> 
                recognizeABC >>= (fun lastChar ->
                preturn (firstChar, xs, lastChar)))))

Ignoring output from previous parsers

Another useful operation allows us to connect two parsers but

   let concatParsers (parser1 : (ReaderState ->  (('a * ReaderState) option )))
                     (parser2 : (ReaderState ->  (('b * ReaderState) option ))) =
       fun input -> match (parser1 input) with
                    | Some (_, restState) -> parser2 restState
                    | _ -> None

Notice that it's similar to the concatParsers2 but it ignores the result of parser1. This is useful for discarting whitespace:

let whitespaceNoNl =
    readZeroOrMoreChars
        (fun c -> System.Char.IsWhiteSpace(c) && c <> '\n')

let colon  =
    concatParsers whitespaceNoNl (readSpecificChar ':')
    

As with the concatParsers2 we can use another well known operator for this operation:

   let inline (>>) (parser1 : (ReaderState ->  (('a * ReaderState) option )))
                   (parser2 : (ReaderState ->  (('b * ReaderState) option ))) = concatParsers parser1 parser2

Useful parsing operations

We need to create more operations that help us create complex parsers. For example, we can create a small operation for optional elements:

   let optionalP (parser : (ReaderState ->  (('a * ReaderState) option ))) (defaultValue:'a) =
       fun input -> match (parser input) with
                    | result & Some _ -> result
                    | _ -> (Some (defaultValue, input))

We can use this new operator to parse numbers as follows:

   let number =
              ( (optionalP (readSpecificChar '-') "") >>= (fun neg -> 
                digitP  >>= (fun firstChar -> 
                (readZeroOrMoreChars (fun c ->  System.Char.IsDigit(c))) >>= (fun chars ->
                (optionalP decimalPartP "") >>= (fun dec ->                                                                               
                preturn (PNumber (neg + firstChar + chars + dec))))

Choosing from two options

The following operation helps us to try to use one parser and fallback to another if it didn't work:

   let disjParser (parser1 : (ReaderState ->  (('a * ReaderState) option )))
                  (parser2 : (ReaderState ->  (('a * ReaderState) option ))) =
       fun input -> match (parser1 input) with
                    | result & Some _ -> result
                    | _ -> parser2 input

A simple scenario using this parser looks like this:

> parse "39" (disjParser symbol number);; 
val it : (PExpr * ReaderState) option =
  Some (PNumber "39", {Data = "39";
                       Position = 2;
                       Indentation = [0];})
> parse "hello" (disjParser symbol number);;
val it : (PExpr * ReaderState) option =
  Some (PSymbol "hello", {Data = "hello";
                          Position = 5;
                          Indentation = [0];})

Repetitions

Identifying sequences of elements identified by a parser is very useful. We can define the following operations

   let rec oneOrMore parser accumulated =
       parser >>=
           (fun lastResult ->
              let result = lastResult::accumulated
              in disjParser (oneOrMore parser result) (preturn (List.rev result)))

   let zeroOrMore parser accumulated =
       disjParser (oneOrMore parser accumulated) (preturn accumulated)

A simple example of using this operation looks like this:

> parse "1   2 3  4" (zeroOrMore (whitespace >> number) []);;
val it : (PExpr list * ReaderState) option =
  Some
    ([PNumber "1"; PNumber "2"; PNumber "3"; PNumber "4"],
     {Data = "1   2 3  4";
      Position = 10;
      Indentation = [0];})

The next post will show how to create more interesting things with the pieces we have so far.

A simple language with indentation based blocks (part 0)

One of the first things I noticed when reading about Python, is the use of indentation for defining code blocks. For example:

def foo(x):
   while x < 20:
      if x > 10:
         print "argument is greater than 10"
         print "value: " + x
      else:
         print "the argument is less than or equal to 10"
         print "its value is : " + x

If you're only familiar to C-based language this seems a bit strange. Not using characters like '{}' or words like BEGIN/END seems fragile at first. For example one could expect something like:

def foo(x) {
   while x < 20 {
      if x > 10 {
         print "argument is greater than 10"
         print "value: " + x
      } else {
         print "the argument is less than or equal to 10"
         print "its value is : " + x
   }
}

I've always wondered how do you create a parser for this kind of a language. In the following series of posts I'm going to record my experiences writing a parser for a simple little language. This language will use a style similar to Python. I'm going to write the code using F# .

Posts

  1. Parsing combinators

Saturday, August 13, 2016

Solving a small primary school problem with Prolog

A couple of days ago my small son came home with math homework from school. The problem: add parenthesis to the following arithmetic expression so it makes sense.

14 * 3 - 8 / 2 = 17

When I saw that, I thought it was a nice little programming exercise. Also Prolog seems like an appropriate language to write the a solution for this problem.

To solve this problem we need at least to:

  1. Choose a representation for the input formula and the results
  2. A way to generate all possible combinations of arithmetic expressions
  3. Something to evaluate the arithmetic expression so we can get the result
  4. Let Prolog find the answer we need!

First, we need to generate all possible expressions from given the problem .

Input representation

We're going to represent the input formula as a list of the parts of the expression.

For example, given the following expression:

14 * 3 - 8 / 2 

The input representation for this formula is the following:

[ 14, '*', 3, '-', 8, '/', 2 ]

To represent the output formula I'm going to use a term with the form op(operator, left, right).

For example, to represent the following possible groupings:

(9*(6+(6/(6-9))))

It will be represented as:

 op(*, 9, op(+, 6, op(/, 6, op(-, 6, 9))))

Generating expression groupings

Given the representation of the problem we can write a predicate to generate all possible groupings of these operations.

After some unsuccessful attempts I came with the following predicate:

arith_op([X], X) :- number(X),!.
arith_op(Arr, op(Op, X, Y)) :-
    append(First, [Op | Second], Arr),
    arith_op(First, X),
    arith_op(Second, Y).

What I really like about Prolog is that with relative few words we can find a solution for problems like this.

Now I can take advantage from Prolog's backtracking mechanism and find all possible solutions for the following input.

?- arith_op([ 1, '*', 2, '+', 3, '/', 4]  ,X).
X = op(*, 1, op(+, 2, op(/, 3, 4))) ;
X = op(*, 1, op(/, op(+, 2, 3), 4)) ;
X = op(+, op(*, 1, 2), op(/, 3, 4)) ;
X = op(/, op(*, 1, op(+, 2, 3)), 4) ;
X = op(/, op(+, op(*, 1, 2), 3), 4) ;
false.

Evaluating the arithmetic expressions

Having a way to evaluate the expression is useful so we can verify the result of the operation. A simple way to implement it looks like this:

eval(op(Op,X,Y),Result) :-
     eval(X,R1),eval(Y,R2),
     ( (Op = '+',  Result is (R1 + R2))
     ; (Op = '-', Result is (R1 - R2))
     ; (Op = '*', Result is (R1 * R2))
     ; (Op = '/', Result is (R1 / R2))), !.
eval(X, X).

With this predicate we can get the result of an operation. For example:

?- eval(op('+', op('*', 34, 23), 34), R).
R = 816.

Solving the problem

With these two predicates we can solve the problem like this:

?- arith_op([ 14, '*', 3,'-', 8, '/', 2 ]  ,Operation), eval(Operation, 17).
Operation = op(/, op(-, op(*, 14, 3), 8), 2) ;
false.

Now it is useful to present the results using infix notation with parenthesis. To do this we can write the following predicate:

forprint(op(Op,X,Y)) :-
    writef("("),
    forprint(X),
    writef(Op),
        forprint(Y),
    writef(")"),!.
forprint(X) :-
    write(X),!.

Now we can write:

arith_op([ 14, '*', 3,'-', 8, '/', 2 ]  ,Operation), eval(Operation, 17), forprint(Operation).
(((14*3)-8)/2)
Operation = op(/, op(-, op(*, 14, 3), 8), 2) ;
false.

I can also use this predicate to generate samples of results for other groupings. For example:

?- arith_op([ 14, '*', 3,'-', 8, '/', 2 ]  ,Operation), eval(Operation, Result), Result > 0, forprint(Operation).
((14*3)-(8/2))
Operation = op(-, op(*, 14, 3), op(/, 8, 2)),
Result = 38 ;
(((14*3)-8)/2)
Operation = op(/, op(-, op(*, 14, 3), 8), 2),
Result = 17 ;
false.

Saturday, May 21, 2016

Some things I learned while creating a small program in Mercury

Some time ago I started creating a program using the Mercury programming language to create images using the Escape Time algorithm. The goal was to learn about the language by solving a small problem.

The current result of this experiment can be found here https://github.com/ldfallas/graphicswithmercury/. Here's a list of things I learned.

Terms for configuration files

For this program I wanted to have a configuration file to specify :

  • The resolution of the final image
  • The coordinates used to render the fractal
  • The formula to use with the escape time algorithm
  • The palette to be used to render the final image

To create this configuration file I could use XML or create a special file format and parse it using Mercury's DCG. However I chose to use a different alternative, which is to use the term syntax.

Here's an example of the configuration file:

  fractal_config(
   image_resolution(320,200),
   top_left(-2.0, 1.5),
   bottom_right(1.0 , -1.5),
   formula(z*z + z + c),
   palette(
      single(10,20,30),
      range(from(10, 30, 40),  to(30, 50, 76),127),
      range(from(200, 100, 50),to(150, 0, 0),100),
      range(from(200, 100, 50),to(150, 10, 10),27),
      single(0,0,0)
   )
).      

Here I'm saying that:

  • The image will have a 320px by 200px resolution
  • The real coordinates of this image are between -2.0 and 1.0 in the X axis and 1.5 and 1.5 in the Y axis
  • The formula used in the escape time algorithm will be z*z + z + c
  • The palette will be constructed with the given ranges of colors

In order to read these term I used the term and parser modules which provides an easy interface for reading terms.

Here's a code snippet showing how the file is being loaded.

:- pred read_fractal_configuration_from_file(
            string::in,
            maybe_error(fractal_configuration)::out,
            io::di, io::uo) is det.

read_fractal_configuration_from_file(FileName, Configurati      onResult, !IO) :-
    parser.read_term_filename( FileName,  ReadTermResult, !IO),
    ((ReadTermResult = term(_, Term),
         term_to_fractal_configuration(Term, ConfigurationResult))
      ; (ReadTermResult = error(ErrMessage, _),
          ConfigurationResult = error(ErrMessage))
      ; (ReadTermResult = eof,
          ConfigurationResult = error("Empty file"))
     ).

The parser.read_term_filename reads these terms to a term data structure. The term_to_fractal_configuration predicate creates a configuration structure from these terms. An error is returned if the file doesn't have the expected structure. This is archived using the maybe_error data type.

Here's an example of how the first part of the configuration is loaded:

:- pred term_to_fractal_configuration(
            term(string)::in,
            maybe_error(fractal_configuration)::out) is det.

term_to_fractal_configuration(Term, Result) :-
    (if Term = functor(atom("fractal_config"),Args,_) then
        term_to_fractal_config_resolution(Args, Result)
     else
        error_message_with_location("Expecting 'fractal_config'",Term, Message),
        Result = error(Message)
    ).

One interesting feature of the term library is that it stores line number information. This makes it easy to report errors that occurred in a specific line of the input file:

:- pred error_message_with_location(
            string::in,
            term(string)::in,
            string::out) is det.

error_message_with_location(Msg, functor(_, _, context(_, Line)), ResultMsg) :-
     string.append(", at line:",string.int_to_string(Line),TmpString),
     string.append(Msg, TmpString, ResultMsg).
error_message_with_location(Msg, variable(_, context(_,Line)), ResultMsg) :-
     string.append(", at line:",string.int_to_string(Line),TmpString),
     string.append(Msg, TmpString, ResultMsg).

Now the main predicate for reading the documentation from terms is the following:

:- pred term_to_fractal_config_resolution(
            list(term(string))::in, 
            maybe_error(fractal_configuration)::out).

term_to_fractal_config_resolution(Terms, Result) :-
   (if Terms = [functor(atom("image_resolution"),
                     [ functor(integer(Width), _, _),
                       functor(integer(Height), _, _) ],
                     _)|Rest1] then
       (if  Rest1 = [functor(atom("top_left"),
                     [ functor(float(LeftX), _, _),
                       functor(float(TopY), _, _) ],
                     _)|Rest2] then
                (if  Rest2 = [functor(atom("bottom_right"),
                     [ functor(float(RightX), _, _),
                       functor(float(BottomY), _, _) ],
                     _)|Rest3] then
                    
                    (if Rest3 = [functor(atom("formula"), [Term], _)|Rest4], term_to_expression(Term, ok(Expr)) then
                        (if Rest4 = [PaletteConfig], term_to_palette_config(PaletteConfig, ok(Palette)) then 
                              Result  = ok(config( { Width, Height },
                                                   { LeftX, TopY },
                                                   { RightX, BottomY },
                                                   Expr,
                                                   Palette
 ))
                          
                           else
                              Result = error("Error reading palette")
                        )
                    else
                      Result = error("Error reading formula"))
                 else
                    Result = error("Error expecting: bottom_right(float,float)")
        )

        else
           Result = error("Error expecting: top_left(float,float)")
        )
    else
       Result = error("Error expecting: image_resolution(int,int)")
    ).

One improvement opportunity here is to separate this predicate into several parts to avoid this nesting structure.

As shown above our final goal is to create a result of the following type:

:- type fractal_configuration --->
    config( { int, int },              % image resolution
            { float, float },          % top left cartesian coordinates
            { float, float },          % bottom right cartesian coordinates
            expression,                % formula
            array({ int, int, int })). % palette

This structure provides the necessary information to render the fractal. One special datatype here is expression which is used to store the formula used with the escape time algorithm.

This data type looks like this:


:- type operator ---> times ; plus ; minus ; division.

:- type expression ---> 
     literal_num(float)
     ; var(string)
     ; imaginary
     ; bin_operation(expression, operator, expression).

Since the term library parser can parse arithmetic expressions, I can write simple code that translates terms to these abstract datatype.

Here's the definition of the predicate that does the translation:

:- pred term_to_expression(term(string)::in, maybe_error(expression)::out) is det.

term_to_expression(functor(atom(AtomStr),Ops,_), Expr) :-
   (if (Ops = [Op1,Op2],
        op_to_string(Operator,AtomStr),
        term_to_expression(Op1, ok(Op1Expr)),
        term_to_expression(Op2, ok(Op2Expr))) then
      Expr = ok(bin_operation(Op1Expr, Operator, Op2Expr))
    else
      (if Ops = [] then
          (if AtomStr = "i" then
             Expr = ok(imaginary)
           else
             Expr = ok(var(AtomStr)))
       else
          Expr = error("Error"))
   ).
term_to_expression(functor(float(X),_,_), Expr) :-
   Expr = ok(literal_num(X)).
term_to_expression(functor(integer(X),_,_), Expr) :-
   Expr = ok(literal_num(float(X))).
term_to_expression(functor(big_integer(_,_),_,_), error("Error")).
term_to_expression(functor(string(_),_,_), error("Error")).
term_to_expression(functor(implementation_defined(_),_,_), error("Error")).
term_to_expression(variable(_,_), error("Error")).

Reading the palette

The palette used by the escape time algorithm is just an array of colors. The configuration file allows two kinds of elements for specifying the colors :

  • single(RED, GREEN, BLUE) a single color for the current entry
  • range(from(RED1, GREEN1, BLUE1), to(RED1, GREEN2, BLUE2), COUNT) to create a range of colors of COUNT steps between the two colors.

The following code reads the palette configuration:

:- pred terms_to_palette(
            list(term(string))::in, 
            list({int, int, int})::in,
            maybe_error(array({int,int,int}))::out) is det.

terms_to_palette([],TmpResult, ok(ResultArray)) :-
   list.reverse(TmpResult, ReversedList),
   array.from_list(ReversedList, ResultArray).

terms_to_palette([Term|Rest],TmpResult,Result) :-
   (if Term = functor(atom("single"),
                     [functor(integer(R),_,_),
                      functor(integer(G),_,_),
                      functor(integer(B),_,_)],
                     _) then
       terms_to_palette(Rest, [{R,G,B}|TmpResult], Result)
     else
      (if Term = functor(atom("range"),[
                            functor(atom("from"),
                                    [functor(integer(R1),_,_),
                                     functor(integer(G1),_,_),
                                     functor(integer(B1),_,_)],_),
                            functor(atom("to"),
                                    [functor(integer(R2),_,_),
                                     functor(integer(G2),_,_),
                                     functor(integer(B2),_,_)],_),
                            functor(integer(Count),_,_)
                         ], _) then
           int_interpolate_funcs(R1, R2, 1, Count, _, R2RFunc),
           int_interpolate_funcs(G1, G2, 1, Count, _, G2GFunc),
           int_interpolate_funcs(B1, B2, 1, Count, _, B2BFunc),
           gen_colors_for_range(1, Count, R2RFunc, G2GFunc, B2BFunc, [], RangeList),
           list.append(TmpResult, RangeList,  NewTmpResult),
           terms_to_palette(Rest, NewTmpResult, Result)
       else
           Result = error("Problem reading palette configuration"))
).

Given the following configuration:

...
   palette(
       range(from(20,244,100), to(200,0,56), 15),
       single(0,0,0)
...

We can generate the following palette:

1. {20, 244, 100}
2. {32, 226, 96} 
3. {45, 209, 93} 
4. {58, 191, 90} 
5. {71, 174, 87} 
6. {84, 156, 84} 
7. {97, 139, 81} 
8. {110, 122, 78} 
9. {122, 104, 74} 
10. {135, 87, 71} 
11. {148, 69, 68}
12. {161, 52, 65} 
13. {174, 34, 62} 
14. {187, 17, 59}
15. {200, 0, 56}
16. {0, 0, 0}

Sunday, April 10, 2016

Determinism categories in Mercury

When you define a predicate in Mercury, you have to specify if it can fail or succeed more than once. This is called a determinism category.

The category is specified as part of a predicate (or function) declaration. For example:

:- pred get_partitions(
          list(int)::in,
              list(list(int))::out) is multi.

The is multi section says that this predicate belongs to the multi category.

The following tables describe the main determinism categories.

Maximum number of solutions:

Mode 1 > 1
det x
multi x
semidet x
nondet x

Failure:

Mode Can fail?
det no
multi no
semidet yes
nondet yes

(Based on information from https://mercurylang.org/information/doc-latest/mercury_ref/Determinism-categories.html#Determinism-categories ).

Other categories exist : erroneous , failure, cc_mult and cc_nondet. These are not discussed in this post, see the link above for more info.

Now a some of each category:

det

These predicates in must always succeed . For example:

:- pred list_length(list(int)::in, int::out) is det.

list_length([_|R], Length) :-
   list_length(R,SubListLength),
   Length = SubListLength + 1.
list_length([], 0).

In this case this predicate is going to calculate the size of a list. It should not fail.

semidet

These predicates can either succeed or fail. For example the following code shows the definition of a first_even_number predicate.

:- pred first_even_number(list(int)::in, int::out) 
           is semidet.

first_even_number([X|R], N) :-
   (if (X mod  2) = 0 then
       N = X
    else
       first_even_number(R,N)).

Notice that this predicate can fail for a couple of reasons:

  • The input list may be empty
  • The input list may not contain an even number

I think it's pretty nice that you these situations will be handled without explicitly writing code for it.

A use of this predicate looks like this:

...
   (if first_even_number([3,41,5,32,342], EvenNumber) then
       io.write_string("First even number: ", !IO),
       io.write(EvenNumber, !IO)
    else
       io.write("Not even number found", !IO)
    ).

Another think that's pretty nice about Mercury is that, the compiler is going to detect inconsistent determinism annotations. For example, if I change the declaration of the predicate to:

:- pred first_even_number(list(int)::in, int::out) 
           is det.

The compiler is going to fail with the following error:

testsolutions.m:093: In `first_even_number'(in, out):
testsolutions.m:093:   error: determinism declaration not satisfied.
testsolutions.m:093:   Declared `det', inferred `semidet'.

multi

The multi category is used for predicates that succeed in multiple ways. At least one solution exists.

For example the following predicate is used to get a pair of lists that, when concatenated together, result in the input list (ex. [1,2,3] result in { [1],[2,3] }) .

:- pred partitions(list(int)::in, 
           {list(int), list(int)}::out) is multi.


partitions(InputList,Output) :-
(
   Output = {[] ,  InputList}
 ;
   InputList = [A|TMP],
   partitions(TMP,{R,B}),
   Output = { [A|R], B }
).

This predicate is multi because we can get several different pairs of lists that from an input list. For example:

main(!IO) :-
   solutions(partitions([513,242,355,4]),Pairs1),
   io.write(Pairs1,!IO),
   io.nl(!IO).

Running this program result in the following output:

[{[], [513, 242, 355, 4]}, {[513], [242, 355, 4]}, {[513, 242], [355, 4]}, {[513, 242, 355], [4]}, {[513, 242, 355, 4], []}]

In the case we use the solutions/2 Mercury predicate to get a list from the generated solutions.

It's important to note that multi predicates must not fail.

nondet

Finally the nondet category is used for predicates that result on 0 or many solutions.

For example the following predicate result on an even number that is part of the input list.

:- pred even_member(list(int)::in, int::out) is nondet.

even_member([X|_], X) :-
   (X mod  2) = 0.
even_member([_|Rest], X) :-
  even_member(Rest,X).

This predicate is nondet because :

  • The list may be empty
  • It may not contain an even number

Here's an example of how to use it:

main(!IO) :-
   solutions(even_member([513,242,18,355,12,4]),Pairs1),
   io.write(Pairs1,!IO).

Running this program result in the following output:

[4, 12, 18, 242]

Sunday, February 28, 2016

A couple of quick notes on Mercury #1

I'm trying to learn about Mercury programming language. Here's a quick list of things I learned recently about it.

Getting a Windows version of the Mercury compiler

I was able to get a version of the Mercury compiler by downloading a "Release of the day" version from here: http://dl.mercurylang.org/index.html.

I just followed the instructions from INSTALL file. The only requirement is to have a Cygwin version installed.

Operations are associated with a data type

For example the predicate for performing an operation N times is defined in the int module (int.fold_up, int_fold_down).

Predicates can be partially applied

Mercury supports a mechanism similar to currying in other languages.

For example, in the following invocation to int.fold_up we're not specifying values or variables for the last three arguments(these are resolved by the application inside int.fold_up):


:- pred write_pixel_data(
        io.binary_output_stream::in,
 array(int)::in,
        int::in,  % width
        int::in,  % height
        int::in,  % padding
        int::in,  % idx
        io::di, io::uo) is det.
.
.
.
int.fold_up(write_pixel_data(Stream,
            ImgData,
            Width * 3,
            Height,
            (RowWidth - (Width * 3))),0,(RowWidth*Height) - 1,!IO)

Bitwise operators look nice!

For example bitwise "and" look like:

   io.write_byte(Stream, (IntValue >> 8) /\ 0xFF,!IO), 

Code for this post can be found here

Sunday, October 18, 2015

Using traits to reuse code in Pharo

While working on a small Tetris-like clone in Pharo, I found an opportunity to reuse some code.

It's common for Tetris implementations to show a small preview of the tetrimino that comes next. Here's how it looks:

I wanted to create a separate Morphic component to show this preview. But I didn't want to duplicate the code required to paint the tetriminos. Sharing this code will allow me to have just one place to change the way tetriminos look. Also I didn't want to create a base class on top of both the game and the preview Morphs.

While reading about Pharo and Squeak I found that it supports Traits. The Pharo collaborActive book provides a nice explanation of how to use traits. Here's a quick definition from this document:

... traits are just groups of methods that can be reused in different classes.

This is exactly what I needed for reusing the tetrimino matrix drawing code. The following code shows the code defined in the game matrix Morph component.

drawOn: canvas
   "Draws the current game state"
   |rows columns currentValue rectangle currentColor cellWidth cellHeight|

   rows := gameState size x.
   columns := gameState size y.

   super drawOn: canvas.

   cellWidth :=   ((self width) / columns) asFloat truncated.
   cellHeight :=   ((self height) / rows) asFloat truncated.
   1 to: rows do: [ :row |
      1 to: columns do: [ :column|
         currentValue := gameState at: row at: column .
         currentValue ~= 0 ifTrue: [
                 currentColor := (colors at: currentValue).
                 rectangle := Rectangle left: (self bounds left)



                 canvas frameAndFillRectangle: rectangle
                                 fillColor:  currentColor
                                 borderWidth:  1
                                 borderColor: (Color white).
                  ]
          ]
       ].

Moving this code to a trait implies that I have to pass all instance state variables as an argument of the draw method.

The definition of the trait looks like this:

Trait named: #TTetriminoDrawing
    uses: {}
    category: 'TryTrix'

And the definition of the method inside this trait looks like this:

drawGameStateOn: canvas 
      width: areaWidth height: areaHeight 
      columns: columns rows: rows 
      morphBounds: morphBounds
      matrix: contentMatrix
      colors:  colorPalette
   "Draw the contents of the specified matrix on the given canvas"
   |cellWidth cellHeight currentValue currentColor rectangle|
      cellWidth :=   (areaWidth / columns) asFloat truncated.
   cellHeight :=   (areaHeight / rows) asFloat truncated.
   1 to: rows do: [ :row |
      1 to: columns do: [ :column|
         currentValue := contentMatrix at: row at: column .
         currentValue ~= 0 ifTrue: [ 
            currentColor := (colorPalette at: currentValue).
            rectangle := Rectangle left: (morphBounds left) + ((column - 1)*cellWidth) 
                                    right: (morphBounds left) + ((column - 1)*cellWidth) + cellWidth
                                    top: (morphBounds top) + ((row - 1)*cellHeight )
                                    bottom: (morphBounds top) + ((row - 1)*cellHeight ) + cellHeight.
            canvas frameAndFillRectangle: rectangle
                  fillColor:  currentColor
                  borderWidth:  1
                  borderColor: (Color white).
             ]
          ]
       ].

Now I can use this trait inside each morph definition:

Morph subclass: #TryTrixMorph
    uses: TTetriminoDrawing
    instanceVariableNames: 'gameState colors eventHandlers'
    classVariableNames: ''
    category: 'TryTrix'


Morph subclass: #TetriminoPreview
    uses: TTetriminoDrawing
    instanceVariableNames: 'matrix'
    classVariableNames: ''
    category: 'TryTrix'

This code can be found in here.