Monday, November 23, 2009

Adding automatic semicolon insertion to a Javascript parser

A couple of weeks ago I wrote a blog post about a Javascript parser written using the Newspeak parsing combinators. As mentioned in that post, no semicolon insertion was supported. This post shows how the feature was added.

Automatic semicolon insertion



As detailed in section 7.9 of the ECMA 262 document[PDF], in Javascript you can use a newline as a statement separator in some scenarios. For example, a semicolon is "implicitly inserted" if expression statements are separated by line terminators:


if (condition) {
print("A")
print("B")
}


This code snippet is equivalent to:



if (condition) {
print("A");
print("B");
}


Solution



In the original post about the parser, espin pointed me to a paper[PDF] by A. Warth that mentions how the semicolon insertion problem was solved in a Javascript parser written in OMeta. The solution presented in this post is based on the one from the paper.


I wanted to isolate the code that performs this function, so I created a subclass that overrides the productions involved in the process. This way we can have a parser both with and without the feature. Here's the code:


class JSGrammarWithSemicolonInsertion = JSGrammar (
"Parser features that add automatic semicolon insertion"
|

specialStatementTermination = ((( cr | lf ) not & whitespace ) star,
(semicolon | comment | lf | cr | (peek: $})) )
wrapper: [ :ws :terminator | | t | t:: Token new. t token: $;. t].

returnStatement = return, (specialStatementTermination |
(expression , specialStatementTermination)).

breakStatement = break, (specialStatementTermination |
(identifier , specialStatementTermination)).

continueStatement = continue, (specialStatementTermination |
(identifier , specialStatementTermination)).

whitespaceNoEOL = (( cr | lf ) not & whitespace ) star,
(((peek: (Character cr)) | (peek: (Character lf))) not) .

throwStatement = throw, whitespaceNoEOL , expression , specialStatementTermination.

expressionStatement = (((function | leftbrace) not) & expression), specialStatementTermination.

variableStatement = var, variableDeclarationList, specialStatementTermination.
|
)
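To make the idea behind specialStatementTermination easier to follow, here is a minimal Python sketch of the same rule: skip whitespace that is not a line terminator, then accept a semicolon, a line terminator, or a closing brace that is only peeked at. The function name and the position-based interface are my own assumptions, and the comment alternative of the original production is left out.

```python
# Hypothetical sketch: names and interface are illustrative,
# they are not taken from the Newspeak code.
INLINE_WS = " \t"
LINE_TERMINATORS = "\r\n"


def statement_termination(src, pos):
    """Return the position after a statement terminator, or None if absent."""
    # Skip whitespace that is not a line terminator.
    while pos < len(src) and src[pos] in INLINE_WS:
        pos += 1
    if pos >= len(src):
        return None  # end of input is not handled in this sketch
    ch = src[pos]
    if ch == ";" or ch in LINE_TERMINATORS:
        return pos + 1  # consume the explicit or implicit semicolon
    if ch == "}":
        return pos  # peek only: the brace belongs to the enclosing block
    return None


print(statement_termination('print("A")\nprint("B")', 10))
```

The closing brace is not consumed, so the enclosing block production can still match it afterwards.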


The result of parsing the following code:


var x = 0
while (true) {
x++
document.write(x)
if ( x > 10)
break
else continue
}


... is presented using the utility created for the previous post:



Code for this post is available here.

Monday, October 19, 2009

A quick look at J

In this post I'm going to show a small overview of the J programming language. An example of polynomial multiplication is examined.

J


J is an array programming language derived from APL, which means it is good for manipulating arrays and matrices. Its syntax and semantics are very different from other languages, which makes it an interesting topic to study.

There's a lot of documentation and examples available from the J software website. Two complete tutorials are "Learning J" by Roger Stokes and "J for C programmers" by Henry Rich.

The J distribution also includes examples and tutorials. The examples will be presented using J's REPL called jconsole.

The example



In order to give an overview of the language I'm going to start from the definition of a function that performs polynomial multiplication and examine how it was constructed.

The function is defined as follows:

polymulti =: dyad : '+/ (((_1 * i. #y) (|. "0 1) ((x (*"_ 0) y) (,"1 1) (((#y) - 1) $ 0))) , 0)'


Writing this example helped me understand some of J's basic concepts (I'm completely sure there's a better/more efficient way to do this!).

Polynomial multiplication



The basic technique for polynomial multiplication consists of multiplying each term of one of the polynomials by the other polynomial and then simplifying the result. For example:


(4x^3 - 23x^2 + x + 1) * (2x^2 - x - 3)

= ((4x^3 - 23x^2 + x + 1) * 2x^2) +
((4x^3 - 23x^2 + x + 1) * -x) +
((4x^3 - 23x^2 + x + 1) * -3)

= (8x^5 - 46x^4 + 2x^3 + 2x^2) +
(-4x^4 + 23x^3 - x^2 - x) +
(-12x^3 + 69x^2 - 3x - 3)

= (8x^5 - 50x^4 + 13x^3 + 70x^2 - 4x - 3)
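As a cross-check of the hand calculation, here is a small Python sketch that multiplies two polynomials term by term. It assumes each polynomial is stored as a list of coefficients with the lowest degree first; the function name poly_mul is just an illustration.

```python
def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists, lowest degree first."""
    result = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            # The term of degree i times the term of degree j has degree i + j.
            result[i + j] += ai * bj
    return result


p1 = [1, 1, -23, 4]   # 4x^3 - 23x^2 + x + 1
p2 = [-3, -1, 2]      # 2x^2 - x - 3
print(poly_mul(p1, p2))  # [-3, -4, 70, 13, -50, 8]
```

Read from right to left, the printed list gives the coefficients of the final polynomial in the example.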


Now I'll start creating polymulti from the bottom up to the definition.

1. Defining polynomials



As mentioned above, J is a nice language for manipulating arrays. We're going to use J arrays to define polynomials. In fact J supports some operations on arrays as polynomials, which will be described below.

In order to write a new array literal containing 1, 1, -23 and 4 in J we write (here using jconsole):


1 1 _23 4
1 1 _23 4


As you can see, the elements of the array are separated by spaces. Also, negative numbers are prefixed by an underscore '_'. This array is going to be used to represent the coefficients of the "(4x^3 - 23x^2 + x + 1)" polynomial, starting with the lowest degree.

We're going to define two variables with the polynomials shown above to use them for examples:


p1 =: 1 1 _23 4
p2 =: _3 _1 2


Here the '=:' operator is used to bind the specified arrays with p1 and p2.

2. Multiply each element of one of the polynomials



We can proceed to apply the first step in the process, which is to multiply each element of the second operand by the first operand.

In J we can operate on arrays easily. For example, if we want to multiply each element of the above array by 2 we write:


1 1 _23 4 * 2
2 2 _46 8


In J an operation that receives two arguments is called a dyad and an operation that receives only one is called a monad. Here we're using the '*' dyad to perform the multiplication.

We cannot directly go and type "p1 * p2" because we will get the following error:


p1 * p2
|length error
| p1 *p2


This happens because the lengths of the two arrays (4 and 3) do not match, so the operation cannot be applied element by element. We can apply '*' to two arrays of the same size and J will multiply the corresponding elements. For example:


p1 * p1
1 1 529 16


Now what I want is to multiply each element of 'p2' by 'p1'. In order to do this we can change the behavior of the '*' dyad by specifying its rank (more details on how to do this can be found in "Verb Execution -- How Rank Is Used (Dyads)"). To change the rank of '*' and say that we want to multiply the complete array on the left by each of the cells of the right array, we write (*"_ 0). For example:


p1 (*"_ 0) p2
_3 _3 69 _12
_1 _1 23 _4
2 2 _46 8


By specifying (*"_ 0) we say that the left operand has "infinite rank" (_), which means the array is treated as a single unit. We also say that the right argument is taken one cell at a time (by using rank 0). Notice that in order to modify the rank we use a double quote ("), which in J doesn't have to be paired with another double quote as in most programming languages.

You can also see that the result of this operation is an array of arrays, one for each element of the right argument.
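For readers more used to mainstream languages, the effect of (*"_ 0) on these two arrays can be sketched in Python with a nested list comprehension. This is only an analogy for this particular case, not a general model of J's rank mechanism.

```python
p1 = [1, 1, -23, 4]
p2 = [-3, -1, 2]

# One row per element of p2: the whole of p1 multiplied by that element.
rows = [[a * b for a in p1] for b in p2]

for row in rows:
    print(row)
# [-3, -3, 69, -12]
# [-1, -1, 23, -4]
# [2, 2, -46, 8]
```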

2. Sum each element of one of the polynomials

Now that we have an array of polynomials with the result of multiplying the coefficients, we need to change the degree of each of the polynomials. First we need to make more space to increment the degree of the polynomials. We're going to use the ',' dyad which lets you concatenate two arrays. For example:


p1 , 0 0
1 1 _23 4 0 0


We need to append an array that is the size of the second polynomial minus one. In order to create a new array of this size we use '$' which lets you create an array or matrix by specifying the size and the value of the elements of the new array. For example:


10 $ 0
0 0 0 0 0 0 0 0 0 0


The size of the array can be obtained with the '#' monad. For example:


# p1
4


Now we can combine all these elements to resize each array resulting from the multiplication of the coefficients like this:


(p1 (*"_ 0) p2)
_3 _3 69 _12
_1 _1 23 _4
2 2 _46 8
(((#p2) - 1 ) $ 0)
0 0
(p1 (*"_ 0) p2) (,"1 1) ( ((#p2) - 1 ) $ 0)
_3 _3 69 _12 0 0
_1 _1 23 _4 0 0
2 2 _46 8 0 0


Here we use (((#p2) - 1 ) $ 0) to generate an array of zeros of the desired size. Then we use (,"1 1) which is the ',' dyad with a modified rank saying that each element of the left matrix (an array) will be concatenated with the second array (since the second argument is a single dimension array we could have said (,"1 _) ).
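The same resizing step, sketched in Python under the assumption that the matrix is a list of rows (the helper name pad_rows is made up for this illustration):

```python
def pad_rows(rows, extra):
    """Append `extra` zeros to every row, like (,"1 1) with an array of zeros."""
    return [row + [0] * extra for row in rows]


p2 = [-3, -1, 2]
rows = [[-3, -3, 69, -12], [-1, -1, 23, -4], [2, 2, -46, 8]]

# (#p2) - 1 zeros are enough to hold the highest degree of the product.
padded = pad_rows(rows, len(p2) - 1)
for row in padded:
    print(row)
```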

With the arrays resized we can change the degree of each polynomial in the array. In order to do this we're going to use the '|.' dyad, which lets you rotate an array by a specified number of positions. For example:

 
1 |. 1 2 3
2 3 1
_1 |. 1 2 3
3 1 2


As shown in the example, by specifying a positive number the array will be rotated to the left and a negative number to the right.

Now the question is how to rotate each element of the polynomial array by a different amount. Before showing that, I'm going to introduce the 'i.' primitive, which lets you create an array with a sequence of numbers. For example, to create an array of 10 numbers (starting with 0) we write:


i. 10
0 1 2 3 4 5 6 7 8 9


We can use this primitive in conjunction with the rotate primitive to rotate each row by a different amount:


m =: (p1 (*"_ 0) p2) (,"1 1) ( ((#p2) - 1 ) $ 0)
i. #p2
0 1 2
_1 * i. #p2
0 _1 _2
(_1 * i. #p2) (|."0 1) m
_3 _3 69 _12 0 0
0 _1 _1 23 _4 0
0 0 2 2 _46 8


As you can see, we use the array generated by (_1 * i. #p2) to specify how many positions each row is moved. We change the rank of '|.' by writing (|."0 1), which means: apply '|.' with each single cell of the left array to the corresponding row of the right array.
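Here is a rough Python equivalent of this step. The rotate_left helper is my own stand-in for the '|.' dyad, and pairing each row with its index plays the role of (_1 * i. #p2).

```python
def rotate_left(row, n):
    """Rotate a list n positions to the left (a negative n rotates right)."""
    if not row:
        return row
    n %= len(row)
    return row[n:] + row[:n]


padded = [
    [-3, -3, 69, -12, 0, 0],
    [-1, -1, 23, -4, 0, 0],
    [2, 2, -46, 8, 0, 0],
]

# Row i is rotated i positions to the right, which raises its degree by i.
shifted = [rotate_left(row, -i) for i, row in enumerate(padded)]
for row in shifted:
    print(row)
```

Because every row was padded with zeros first, the rotation only moves zeros around the end of the row and no coefficient wraps to the wrong side.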

The previous step generated an array of polynomials to be summed, so the only thing left is to generate the final polynomial. In order to do this we use '+/' (the '+' verb combined with the '/' adverb, applied as a monad), which lets us sum the contents of an array. For example:


+/ 5 3 4 1
13


We can also apply +/ to a matrix; J then adds the rows together element by element:


+/ (_1 * i. #p2) (|."0 1) m
_3 _4 70 13 _50 8


As a workaround I'm appending a zero row to this matrix, just in case the p2 array is a 0 degree polynomial.


+/ ((_1 * i. #p2) (|."0 1) m) , 0
_3 _4 70 13 _50 8
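In Python terms, applying +/ to the matrix amounts to summing its columns. The sketch below (the name column_sums is hypothetical) also shows that appending a row of zeros leaves the result unchanged:

```python
def column_sums(rows):
    """Add the rows of a matrix together, element by element."""
    return [sum(column) for column in zip(*rows)]


shifted = [
    [-3, -3, 69, -12, 0, 0],
    [0, -1, -1, 23, -4, 0],
    [0, 0, 2, 2, -46, 8],
]

print(column_sums(shifted))                 # [-3, -4, 70, 13, -50, 8]
print(column_sums(shifted + [[0] * 6]))     # same result with the extra zero row
```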


And that's it, we have the complete process for multiplying the polynomials.

3. Create the function

Now, putting together all the elements described above, we can write the polynomial multiplication dyad:

polymulti =: dyad : '+/ (((_1 * i. #y) (|. "0 1) ((x (*"_ 0) y) (,"1 1) (((#y) - 1) $ 0))) , 0)'


The 'x' and 'y' names are the implicit names of the left and right operands.

We can use it with any pair of polynomials. For example:


2 34 3 polymulti 4 3
8 142 114 9
2 34 3 polymulti 4 0 3
8 136 18 102 9
2 34 3 polymulti 1
2 34 3
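To compare against a more conventional language, here is a Python sketch that follows the same steps as polymulti, one line per step. It is only a transliteration of the J expression under my reading of it, with coefficient lists stored lowest degree first.

```python
def rotate_right(row, n):
    """Rotate a list n positions to the right."""
    n %= len(row) if row else 1
    return row[-n:] + row[:-n] if n else row


def polymulti(x, y):
    rows = [[a * b for a in x] for b in y]                  # x (*"_ 0) y
    padded = [row + [0] * (len(y) - 1) for row in rows]     # (,"1 1) (((#y) - 1) $ 0)
    shifted = [rotate_right(row, i)                         # (_1 * i. #y) (|."0 1)
               for i, row in enumerate(padded)]
    shifted.append([0] * (len(x) + len(y) - 1))             # , 0
    return [sum(column) for column in zip(*shifted)]        # +/


print(polymulti([2, 34, 3], [4, 3]))     # [8, 142, 114, 9]
print(polymulti([2, 34, 3], [4, 0, 3]))  # [8, 136, 18, 102, 9]
print(polymulti([2, 34, 3], [1]))        # [2, 34, 3]
```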


4. Use the polynomials

J already has built-in support for polynomials. For example, by using the 'p.' dyad we can evaluate a given polynomial at a point.


3 4 5 p. 34
5919
3 + (4*34) + (5*34*34)
5919
_2 p. 34
_2
(5 3 _2 polymulti 3 0 0 _2)
15 9 _6 _10 _6 4
(5 3 _2 polymulti 3 0 0 _2) p. 3
204
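The evaluations above can be reproduced in Python with Horner's rule. This is just a sketch of what the evaluation computes for a coefficient list (lowest degree first); I'm not claiming this is how 'p.' is implemented.

```python
def poly_eval(coeffs, x):
    """Evaluate a polynomial given as a coefficient list, lowest degree first."""
    result = 0
    # Horner's rule: start from the highest-degree coefficient.
    for c in reversed(coeffs):
        result = result * x + c
    return result


print(poly_eval([3, 4, 5], 34))                 # 5919
print(poly_eval([-2], 34))                      # -2
print(poly_eval([15, 9, -6, -10, -6, 4], 3))    # 204
```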


Final words



It was very interesting to read about J. It offers a different perspective on programming which is definitely worth studying. I have to admit it was difficult at the beginning because of the number of concepts to learn (only a couple are presented in this post) and the syntax. In future posts I'm going to try to explore more J features.

Tuesday, October 6, 2009

AS3 Getter/Setter support in AbcExplorationLib

Recently I added initial support for reading and writing AS3/Avm2 getters and setters to AbcExplorationLib.

Given the following ActionScript class:


class Complex {
public var radius:Number;
public var angle:Number;
public function Complex(ar:Number,aa:Number):void {
radius = ar;
angle = aa;
}

public function set imaginary(newImaginary:Number):void
{
var oldReal = this.real;
angle = Math.atan(newImaginary/oldReal);
radius = Math.sqrt(oldReal*oldReal + newImaginary*newImaginary);
}
public function get imaginary():Number
{
return radius*Math.sin(angle);
}

public function set real(newReal:Number):void
{
var oldImaginary = this.imaginary;
angle = Math.atan(oldImaginary/newReal);
radius = Math.sqrt(newReal*newReal + oldImaginary*oldImaginary);
}
public function get real():Number
{
return radius*Math.cos(angle);
}
}


We compile it using the Flex SDK:
java -jar c:\flexsdk\lib\asc.jar complexclasstest.as


Then we can load it using the library:


Microsoft F# Interactive, (c) Microsoft Corporation, All Rights Reserved
F# Version 1.9.6.16, compiling for .NET Framework Version v2.0.50727

Please send bug reports to fsbugs@microsoft.com
For help type #help;;

> #r "abcexplorationlib.dll";;

--> Referenced 'abcexplorationlib.dll'

> open System.IO;;
> open Langexplr.Abc;;
> let f = using (new FileStream("complexclasstest.abc",FileMode.Open)) (fun s -> AvmAbcFile.Create(s));;

val f : AvmAbcFile

> let complexClass = List.hd f.Classes;;

val complexClass : AvmClass


> complexClass.Properties |> List.map (fun p -> p.Name);;
val it : QualifiedName list =
[CQualifiedName (Ns ("",PackageNamespace),"real");
CQualifiedName (Ns ("",PackageNamespace),"imaginary")]
> let realGetter = complexClass.Properties |> List.map (fun p -> p.Getter.Value) |> List.hd;;

val realGetter : AvmMemberMethod:
> realGetter.Method.Body.Value.Instructions |> Array.map (fun x -> x.Name);;
val it : string array =
[|"getlocal_0"; "pushscope"; "getlocal_0"; "getproperty";
"findpropertystrict"; "getproperty"; "getlocal_0"; "getproperty";
"callprop"; "multiply"; "returnvalue"|]





The library can be found here.

Thursday, October 1, 2009

Parsing Javascript using Newspeak parsing combinators

I've been working on a parser for Javascript/Ecmascript using Newspeak parsing combinators. The parser is based on the grammar presented in the ECMAScript Language Specification [PDF] document. It is still incomplete, however it can parse simple statements.

The grammar looks like this:


class JSGrammar = ExecutableGrammar (
"Experiment for JS grammar based on the description from http://www.ecma-international.org/publications/standards/Ecma-262.htm"
|
doubleQuote = (char: $").
backslash = (char: $\).
str = doubleQuote,((backslash, ( char: $" )) |
(backslash, ( char: $/ )) |
(backslash, backslash) |
(backslash, ( char: $r )) |
(backslash, ( char: $n )) |
(backslash, ( char: $t )) |
(charExceptFor: $")) star, doubleQuote.
string = tokenFor: str.

tilde = char: $~.
exclamation = char: $!.
starChar = char: $*.
slash = char: $/.
modulo = char: $%.
pipe = char: $|.
amp = char: $&.
cir = char: $^.
question = char: $?.
colon = char: $:.
semicolon = char: $;.

negSign = (char: $-).
plusSign = (char: $+).
digit = (charBetween: $0 and: $9).
dot = (char: $. ) .
lt = char: $<.
gt = char: $>.
eq = char: $=.
num = negSign opt, digit, digit star, dot opt,digit star, ((char: $e) | (char: $E)) opt, (plusSign | negSign) opt,digit star.
number = tokenFor: num.

tQuestion = tokenFor: question.
tColon = tokenFor: colon.
tplusSign = tokenFor: plusSign.
tnegSign = tokenFor: negSign.
tmodulo = tokenFor: modulo.
tslash = tokenFor: slash.
tstarChar = tokenFor: starChar.
texclamation = tokenFor:exclamation.
tdot = tokenFor:dot.
tLt = tokenFor: lt.
tGt = tokenFor: gt.
tEq = tokenFor: eq.
tAmp = tokenFor: amp.
tPipe = tokenFor: pipe.
tCir = tokenFor: cir.
tSlash = tokenFor: slash.
tSemicolon = tokenFor: semicolon.

tStarEq = tstarChar,eq.
tModEq = tmodulo,eq.
tSlashEq = tSlash,eq.
tPlusEq = tplusSign,eq.
tMinusEq = tnegSign,eq.
tAmpAmp = tAmp,amp.
tPipePipe = tPipe,pipe.
tLtEq = tLt,eq.
tGtEq = tGt,eq.
tleftShift = tLt,lt.
trightShift = tGt,gt.
tsRightShift = tGt,gt,gt.
tEqEq = tEq,eq.
tEqEqEq = tEq,eq,eq.
tNotEq = texclamation,eq.
tNotEqEq = texclamation,eq,eq.
tleftShiftEq = tleftShift,eq.
trightShiftEq = trightShift,eq.
tsRightShiftEq = tsRightShift,eq.
tAmpEq = tAmp,eq.
tPipeEq = tPipe,eq.
tCirEq = tCir,eq.

lineTerminator = (char: (Character lf)) | (char: (Character cr)).

regularExpressionLiteral =
tslash,
( ((backslash, ( charExceptForCharIn: { (Character lf). (Character cr). })) |
(charExceptForCharIn: { (Character lf). (Character cr). $/.})) plus),
slash, (identifierStart star).

leftparen = tokenFromChar: $(.
rightparen =tokenFromChar: $).

leftbrace = tokenFromChar: ${.
rightbrace =tokenFromChar: $}.
comma = tokenFromChar: $,.
propertyName = string | identifier | number.
propertyNameAndValue = propertyName,tColon,value.
obj = leftbrace, (propertyNameAndValue starSeparatedBy: comma),rightbrace.
object = obj.

leftbracket = tokenFromChar: $[.
rightbracket = tokenFromChar: $].
arr = leftbracket, (value starSeparatedBy: comma), rightbracket.
array = tokenFor: arr.

comment = (slash,starChar,blockCommentBody,starChar,slash) | (slash,slash, lineCommentBody).


ttrue = tokenFromSymbol: #true.
tfalse = tokenFromSymbol: #false.
null = tokenFromSymbol: #null.
function = tokenFromSymbol: #function.
tnew = tokenFromSymbol: #new.
break = tokenFromSymbol: #break.
case = tokenFromSymbol: #case.
catch = tokenFromSymbol: #catch.
continue = tokenFromSymbol: #continue.
default = tokenFromSymbol: #default.
delete = tokenFromSymbol: #delete.
do = tokenFromSymbol: #do.
else = tokenFromSymbol: #else.
finally = tokenFromSymbol: #finally.
for = tokenFromSymbol: #for.
if = tokenFromSymbol: #if.
in = tokenFromSymbol: #in.
instanceof = tokenFromSymbol: #instanceof.
return = tokenFromSymbol: #return.
switch = tokenFromSymbol: #switch.
this = tokenFromSymbol: #this.
throw = tokenFromSymbol: #throw.
try = tokenFromSymbol: #try.
typeof = tokenFromSymbol: #typeof.
var = tokenFromSymbol: #var.
void = tokenFromSymbol: #void.
while = tokenFromSymbol: #while.
with = tokenFromSymbol: #with.



letter = (charBetween: $a and: $z) | (charBetween: $A and: $Z).
identifierStart = letter | (char: $$) | (char: $_).
identifier = accept: (tokenFor: (identifierStart), (identifierStart | digit) star) ifNotIn: keywords .

value = assignmentExpression .

literal = null | ttrue | tfalse | number | string | regularExpressionLiteral.

primaryexpression = this | literal | identifier | array | object | parenthesized.

parenthesized = leftparen,expression,rightparen.

functionexpression = function , identifier opt,
leftparen,formalParameterList , rightparen ,
leftbrace,sourceElements,rightbrace.
formalParameterList = identifier starSeparatedBy: comma.

memberexpression = (simplememberexpression ),
(( leftbracket, expression, rightbracket) | ( tdot, identifier)) star.

simplememberexpression = primaryexpression |
functionexpression |
simpleNewExpression.
simpleNewExpression = tnew,memberexpression, arguments.



callExpression = (simplememberexpression ),
( arguments |
( leftbracket, expression, rightbracket) |
( tdot, identifier) ) star.
simpleCallExpression = memberexpression , arguments.
arguments = leftparen ,
(assignmentExpression, (comma, assignmentExpression) star) opt,
rightparen.

newExpression = memberexpression | simpleNewMemberExpression.
simpleNewMemberExpression = tnew, memberexpression.

plusPlus = plusSign,plusSign.
minusMinus = negSign,negSign.

leftHandSideExpression = callExpression | newExpression .
postfixExpression = leftHandSideExpression ,
((plusPlus | minusMinus) star).


unaryExpression = postfixExpression | complexUnaryExpression.

complexUnaryExpression =
(typeof, unaryExpression) |
(delete, unaryExpression) |
(void, unaryExpression) |
(plusPlus, unaryExpression) |
(minusMinus, unaryExpression) |
((tokenFor: ( plusSign | negSign | tilde | exclamation )), unaryExpression).

multiplicativeExpression =
unaryExpression, ((tstarChar | tslash | tmodulo), unaryExpression) star.

additiveExpression =
multiplicativeExpression, ((tplusSign | tnegSign), multiplicativeExpression) star.

shiftExpression =
additiveExpression, ((tsRightShift | tleftShift | trightShift), additiveExpression) star.

relationalExpression =
shiftExpression, (( tLtEq | tGtEq | tLt | tGt | instanceof | in) , shiftExpression) star.

relationalExpressionNoIn =
shiftExpression, (( tLtEq | tGtEq | tLt | tGt | instanceof ) , shiftExpression) star.

equalityExpression =
relationalExpression, ((tEqEqEq | tEqEq | tNotEqEq | tNotEq ), relationalExpression) star.

equalityExpressionNoIn =
relationalExpressionNoIn, ((tEqEqEq | tEqEq | tNotEqEq | tNotEq ), relationalExpressionNoIn) star.

bitwiseANDExpression =
equalityExpression,(tAmp, equalityExpression) star.

bitwiseANDExpressionNoIn =
equalityExpressionNoIn,(tAmp, equalityExpressionNoIn) star.

bitwiseXORExpression =
bitwiseANDExpression,(tCir, bitwiseANDExpression) star.

bitwiseXORExpressionNoIn =
bitwiseANDExpressionNoIn,(tCir, bitwiseANDExpressionNoIn) star.

bitwiseORExpression =
bitwiseXORExpression,(tPipe, bitwiseXORExpression) star.

bitwiseORExpressionNoIn =
bitwiseXORExpressionNoIn,(tPipe, bitwiseXORExpressionNoIn) star.

logicalAndExpression =
bitwiseORExpression, (tAmpAmp,bitwiseORExpression) star.

logicalAndExpressionNoIn =
bitwiseORExpressionNoIn, (tAmpAmp,bitwiseORExpressionNoIn) star.

logicalOrExpression =
logicalAndExpression, (tPipePipe,logicalAndExpression) star.

logicalOrExpressionNoIn =
logicalAndExpressionNoIn, (tPipePipe,logicalAndExpressionNoIn) star.

assignmentOperator =
tEq | tStarEq | tSlashEq | tModEq | tPlusEq | tMinusEq |
tleftShiftEq | tsRightShiftEq | trightShiftEq | tAmpEq | tPipeEq |
tCirEq.

conditionalExpression =
logicalOrExpression, (tQuestion, assignmentExpression,tColon,assignmentExpression) opt.

assignmentExpression =
conditionalExpression, (assignmentOperator,conditionalExpression) star.

conditionalExpressionNoIn =
logicalOrExpressionNoIn, (tQuestion, assignmentExpressionNoIn,tColon,assignmentExpressionNoIn) opt.

assignmentExpressionNoIn =
conditionalExpressionNoIn, (assignmentOperator,conditionalExpressionNoIn) star.

expression = assignmentExpression, (comma , assignmentExpression) star.

expressionNoIn = assignmentExpressionNoIn, (comma , assignmentExpressionNoIn) star.


statement = block | variableStatement | emptyStatement | expressionStatement |
ifStatement | iterationStatement | withStatement | switchStatement |
labelledStatement | tryStatement | throwStatement |
breakStatement | returnStatement.

block = leftbrace, statementList , rightbrace.

statementList = statement star.
variableStatement = var, variableDeclarationList, tSemicolon.

variableDeclarationList = (variableDeclaration plusSeparatedBy: comma).
variableDeclaration = identifier, (tEq,assignmentExpression) opt.

variableDeclarationListNoIn = (variableDeclarationNoIn plusSeparatedBy: comma).
variableDeclarationNoIn = identifier, (tEq,assignmentExpressionNoIn) opt.

emptyStatement = tSemicolon.

expressionStatement = (((function | leftbrace) not) & expression), tSemicolon.

ifStatement = if, leftparen,expression,rightparen,statement,(else,statement) opt.

iterationStatement = doStatement | forStatement | forStatementNoVar |
whileStatement | forInStatement | forInStatementNoVar.

doStatement = do, statement, while, leftparen,expression,rightparen,tSemicolon.

forStatement = for, leftparen,
var, variableDeclarationListNoIn ,
tSemicolon,
(expression opt),
tSemicolon,
(expression opt),
rightparen, statement.

forStatementNoVar = for, leftparen,
(expression opt),
tSemicolon,
(expression opt),
tSemicolon,
(expression opt),
rightparen, statement.


forInStatement = for, leftparen,
var, variableDeclarationListNoIn ,
in,
expression ,
rightparen, statement.

forInStatementNoVar = for, leftparen,
leftHandSideExpression ,
in,
expression ,
rightparen, statement.
whileStatement = while,leftparen,expression,rightparen,statement.

continueStatement = continue, (identifier opt), tSemicolon.

breakStatement = break, (identifier opt), tSemicolon.

returnStatement = return, (expression opt), tSemicolon.

withStatement = with, leftparen, expression ,rightparen, statement.

switchStatement = switch ,leftparen,expression,rightparen, clauseBlock.

clauseBlock = leftbrace,(clause star),(defaultClause opt),rightbrace.

clause = case, expression, tColon,statementList.

defaultClause = default,tColon, statementList.

labelledStatement = identifier,tColon,statement.

throwStatement = throw,expression,tSemicolon.

tryStatement = try, block, (catchBlock opt), (finallyBlock opt).

catchBlock = catch, leftparen, identifier,rightparen, block.

finallyBlock = finally, block.

functionDeclaration = function,identifier,
leftparen,formalParameterList,rightparen,
leftbrace,sourceElements,rightbrace.



sourceElements = (statement | functionDeclaration ) star.

program = sourceElements.
|
)

...


I really like the way you can separate the grammar from the AST creation (as described in the Executable Grammars[PDF] paper). As you can see, there's no code specified for this purpose. Along with the source code there's a 'testing AST' and a parser that inherits from this grammar, which is used by the unit tests.

Almost all the grammar was written using the parser combinators from the library. Only charExceptFor: and accept: ifNotIn: were created.
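To give an idea of what these two extra combinators do, here is a tiny Python sketch. A parser here is a function that takes a string and a position and returns a (value, next position) pair or None. All the names and this interface are assumptions for illustration, not the Newspeak library's API, and the keyword set is shortened.

```python
import re

# Shortened keyword list, for illustration only.
KEYWORDS = {"var", "while", "if", "else", "break", "continue", "return", "function"}


def char_except_for(excluded):
    """Parser that accepts any single character other than `excluded`."""
    def parse(src, pos):
        if pos < len(src) and src[pos] != excluded:
            return src[pos], pos + 1
        return None
    return parse


def regex_token(pattern):
    """Parser that accepts whatever `pattern` matches at the current position."""
    compiled = re.compile(pattern)
    def parse(src, pos):
        match = compiled.match(src, pos)
        return (match.group(), match.end()) if match else None
    return parse


def accept_if_not_in(parser, rejected):
    """Run `parser`, but fail when the parsed text is one of `rejected`."""
    def parse(src, pos):
        result = parser(src, pos)
        if result is None or result[0] in rejected:
            return None
        return result
    return parse


identifier = accept_if_not_in(regex_token(r"[A-Za-z$_][A-Za-z$_0-9]*"), KEYWORDS)

print(identifier("x1 = 0", 0))    # ('x1', 2)
print(identifier("while (", 0))   # None, because 'while' is a keyword
```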

There is still a lot of work to do on this parser:

  1. Clean up the code
  2. Work on performance issues
  3. Find a solution for the "Automatic Semicolon Insertion" feature (see section 7.9 of the Ecma document)
  4. Get rid of some repetition (for example the 'NoVar' productions which are also present in the document)
  5. Better AST creation
  6. See if unicode support is possible
  7. More tests!


In order to see the result of using this parser I created a little program to display the 'testing AST'. For example:



The parser with tests and the other code mentioned can be found here .

Monday, August 3, 2009

Creating Dynamic JSON array finders using the DLR

One of the things that really impressed me while reading about Ruby on Rails was the use of method_missing to implement dynamic finders. The technique is described in "How dynamic filters work". In this post I'm going to show a little experiment that builds a similar technique for querying JSON arrays using .NET's Dynamic Language Runtime infrastructure.

Code for this post was created using Visual Studio 2010 Beta 1, IronRuby for .NET 4 beta 1 and IronPython 2.6 beta 4 for .NET 4.

JSON.NET



For this post I'm using the JSON.NET library for loading the JSON data. This library provides several ways of loading JSON data. Here I'll be using a set of predefined classes: JObject for JSON objects, JArray for arrays, JValue for literal values, etc. All these classes inherit from JToken.
Code in this post uses the JSON data returned by the Twitter REST API. An example of this data:


[
{"in_reply_to_screen_name":null,
"text":"...",
"user": { "following":null,
"description":"...",
"screen_name":"...",
"utc_offset":0,
"followers_count":10,
"time_zone":"...",
"statuses_count":155,
"created_at":"...",
"friends_count":1,
"url":"...",
"name":"...",
"notifications":null,
"protected":false,
"verified":false,
"favourites_count":0,
"location":"...",
"id": ...,
...
},
"truncated":false,
"created_at":"...",
"in_reply_to_status_id":null,
"in_reply_to_user_id":null,
"favorited":false,
"id":...,
"source":"...."
},
...
]



Dynamic queries on JSON data



Finders will be implemented for JSON arrays. As with Rails' dynamic finders, the names of the required fields will be encoded in the name of the invoked method.

The following C# 4.0 code shows an example of this wrapper class in conjunction with the dynamic keyword.


JsonTextReader reader = new JsonTextReader(rdr);
JsonSerializer serializer = new JsonSerializer();
JArray o = (JArray)serializer.Deserialize(reader);
dynamic dArray = new FSDynArrayWrapper(o);

string name = "ldfallas";
foreach (var aJObject in dArray.FindAllByFavoritedAlsoByUserWithScreen_Name("false",name))
{
dynamic tobj = new FSDynJObjectWrapper(aJObject);
Console.WriteLine("========");
Console.WriteLine(tobj.user.screen_name);
Console.Write("\t'{0}'",tobj.text);
}




A small definition of the syntax used for names is the following.


method-name = "FindAllBy" ,
property-name , ("With", property-name)? ,
("AlsoBy", property-name , ("With", property-name)? ) *
property-name = valid json property name


In order to be more flexible the following syntax will also be allowed:


method-name = "find_all_by_" ,
property-name , ("_with_", property-name)? ,
("_also_by", property-name , ("_with_", property-name)? ) *
property-name = valid json property name


A sample name for one of these query methods looks like this:


array.FindAllByFavoritedAlsoByUserWithScreen_Name("true","ldfallas")


This method accepts two parameters and is going to:


Find all the object elements from the array that have a 'Favorited' property equal to 'true' and also have a 'User' property associated with an object which has a 'Screen_Name' property equal to 'ldfallas'
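To illustrate how such a name can be split into its parts, here is a small Python sketch. It is not the code used by the library; the function name is made up, the snake case separator is assumed to be "_also_by_", and the naive splitting breaks if a property name itself contains "With" or "AlsoBy".

```python
def parse_finder_name(name):
    """Return a list of (property, sub_property) pairs, or None if not a finder."""
    styles = (("FindAllBy", "AlsoBy", "With"),
              ("find_all_by_", "_also_by_", "_with_"))
    for prefix, also, with_ in styles:
        if not name.startswith(prefix):
            continue
        elements = []
        for part in name[len(prefix):].split(also):
            # "UserWithScreen_Name" -> ("User", "Screen_Name")
            prop, _, sub = part.partition(with_)
            if not prop:
                return None
            elements.append((prop, sub or None))
        return elements
    return None


print(parse_finder_name("FindAllByFavoritedAlsoByUserWithScreen_Name"))
# [('Favorited', None), ('User', 'Screen_Name')]
print(parse_finder_name("find_all_by_user_with_time_zone"))
# [('user', 'time_zone')]
```

Each pair then needs one argument at the call site, which is why the sample method takes two parameters.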


Interoperability



One of the nice things about using the DLR infrastructure to create this feature is that it can be used from other DLR languages. The following example is an IronRuby snippet:

require 'FsDlrJsonExperiments.dll'
include Langexplr::Experiments

while true do
print "Another try\n"
str = System::Net::WebClient.new().download_string("http://twitter.com/statuses/public_timeline.json")
json = FSDynArrayWrapper.CreateFromReader(System::IO::StringReader.new(str))

for i in json.find_all_by_user_with_time_zone('Central America') do
print i.to_string()
end
sleep(5)
end



The following IronPython code shows a little example of this wrapper class.


fReader = StreamReader(GetTwitterPublicTimeline())
jReader = JsonTextReader(fReader)
serializer = JsonSerializer()

json = FSDynArrayWrapper( serializer.Deserialize(jReader) )

for i in json.FindAllByFavoritedAlsoByUserWithScreen_Name("false","ldfallas"):
print i


Implementation



In order to implement the functionality presented here, the IDynamicMetaObjectProvider interface and the DynamicMetaObject class were used. By using these we can generate the code for the call site as an expression tree. For more information on how to use this interface see the "Getting Started with the DLR as a Library Author" document (available here).

The code generated to do the filtering is an expression which uses the Where method from System.Linq.Enumerable. The generated expression, written as pseudo C#, looks like this:


{
object tmp;
array.Where(c => (((c Is JObject) &&
CompareHelper(
GetJObjectPropertyCI(((JObject)c), "Favorited"),
"true"))
&&
((((tmp = GetJObjectPropertyCI(((JObject)c), "User")) As JObject) != null) &&
CompareHelper(
GetJObjectPropertyCI(((JObject)tmp), "Screen_Name"),
"ldfallas"))))

}


Here GetJObjectPropertyCI is a helper method that gets a property from a JObject by case-insensitive name, and CompareHelper is a helper method that does the comparison.
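For comparison, the same filter can be sketched in Python over plain dictionaries and lists. The two helpers below are stand-ins I wrote for this illustration; in particular, comparing the lowercased string form of the value is an assumption, since the post doesn't show how CompareHelper compares values.

```python
def get_property_ci(obj, name):
    """Look up `name` in a dict ignoring case; return None when absent."""
    wanted = name.lower()
    for key, value in obj.items():
        if key.lower() == wanted:
            return value
    return None


def compare_helper(value, expected):
    """Assumed comparison: match on the lowercased string form of the value."""
    return str(value).lower() == expected.lower()


def find_favorited_by_user(array, favorited, screen_name):
    result = []
    for c in array:
        if not isinstance(c, dict):
            continue  # the 'c Is JObject' check
        if not compare_helper(get_property_ci(c, "Favorited"), favorited):
            continue
        tmp = get_property_ci(c, "User")
        if isinstance(tmp, dict) and compare_helper(
                get_property_ci(tmp, "Screen_Name"), screen_name):
            result.append(c)
    return result


statuses = [
    {"favorited": False, "user": {"screen_name": "ldfallas"}, "text": "a"},
    {"favorited": True, "user": {"screen_name": "ldfallas"}, "text": "b"},
    {"favorited": False, "user": {"screen_name": "other"}, "text": "c"},
    "not an object",
]
print(find_favorited_by_user(statuses, "false", "ldfallas"))
```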

The implementation of FSDynArrayWrapper was written in F#, mainly because it's a nice language for implementing this kind of feature. However, there's no easy way to consume this feature from F# since it doesn't use the DLR.

Here's the definition:


type FSDynArrayWrapper(a:JArray) =
member this.array with get() = a
static member CreateFromReader(stream : System.IO.TextReader) =
...
static member CreateFromFile(fileName:string) =
...
interface IDynamicMetaObjectProvider with
member this.GetMetaObject( parameter : Expression) : DynamicMetaObject =
FSDynArrayWrapperMetaObject(parameter,this) :> DynamicMetaObject


As you can see, the interesting part is in the implementation of FSDynArrayWrapperMetaObject. The CreateFromReader and CreateFromFile methods are only utility methods to load data from a document.

The implementation of FSDynArrayWrapperMetaObject looks like this:


type FSDynArrayWrapperMetaObject(expression : Expression, value: System.Object) =
inherit DynamicMetaObject(expression,BindingRestrictions.Empty,value)

...

override this.BindInvokeMember(binder : InvokeMemberBinder, args: DynamicMetaObject array) =
match QueryInfo.GetQueryElements(binder.Name) with
| Some( elements ) ->
(new DynamicMetaObject(
this.GenerateCodeForBinder(
elements,
Array.map
(fun (v:DynamicMetaObject) ->
Expression.Constant(v.Value.ToString()) :> Expression) args),
binder.FallbackInvokeMember(this,args).Restrictions))
| None -> base.BindInvokeMember(binder,args)


The BindInvokeMember method creates the expression tree for the code that will be executed for a given invocation of a dynamic finder method. Here the QueryInfo.GetQueryElements method is called to extract the elements of the name as described above. The value returned by this method is a QueryElement list option, where:
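For readers who haven't used the DLR, the closest everyday analogue I can think of is Python's __getattr__ hook, which is also called when a member is not found. The class below is a hypothetical, much reduced sketch: it only handles top-level properties with the snake case syntax, compares values by their string form (an assumption), and filters directly instead of generating an expression tree.

```python
class DynArrayWrapper:
    """Hypothetical Python analogue of a wrapper with dynamic finders."""

    def __init__(self, array):
        self.array = array

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, like method_missing.
        prefix = "find_all_by_"
        if not name.startswith(prefix):
            raise AttributeError(name)  # comparable to falling back to the base binder
        fields = name[len(prefix):].split("_also_by_")

        def finder(*values):
            if len(values) != len(fields):
                raise TypeError("expected %d values" % len(fields))
            return [item for item in self.array
                    if isinstance(item, dict)
                    and all(str(item.get(field)) == value
                            for field, value in zip(fields, values))]
        return finder


wrapper = DynArrayWrapper([
    {"lang": "en", "source": "web"},
    {"lang": "es", "source": "web"},
])
print(wrapper.find_all_by_lang("en"))
print(wrapper.find_all_by_lang_also_by_source("es", "web"))
```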


type QueryElement =
| ElementQuery of string
| SubElementQuery of string * string


ElementQuery corresponds to the "Favorited" part of FindAllByFavoritedAlsoByUserWithScreen, and SubElementQuery corresponds to the "ByUserWithScreen" part.

If the name of the invoked method is a supported finder name, the GenerateCodeForBinder method is called to generate the expression tree. The last argument of this method is a collection of the arguments provided for this invocation.


member this.GenerateCodeForBinder(elements, arguments : Expression array) =
let whereParameter = Expression.Parameter(typeof<JToken>, "c") in
let tmpVar = Expression.Parameter(typeof<JToken>, "tmp") in
let whereMethodInfo =
(typeof<System.Linq.Enumerable>).GetMethods()
|> Seq.filter (fun (m:MethodInfo) -> m.Name = "Where" && (m.GetParameters().Length = 2))
|> Seq.map (fun (m:MethodInfo) -> m.MakeGenericMethod(typeof<JToken>))
|> Seq.hd
let queryElementsConditions =
elements
|> Seq.zip arguments
|> Seq.map
(fun (argument,queryParameter) ->
this.GetPropertyExpressionForQueryArgument(queryParameter,argument,whereParameter,tmpVar)) in
let initialCondition = Expression.TypeIs(whereParameter,typeof<JObject>) in

let resultingExpression =
Expression.Block(
[tmpVar],
Expression.Call(
whereMethodInfo,
Expression.Property(
Expression.Convert(
this.Expression,this.LimitType),"array"),
Expression.Lambda(
Seq.fold
(fun s c -> Expression.And(s,c) :> Expression)
(initialCondition :> Expression)
queryElementsConditions,
whereParameter))) in
resultingExpression



The most important parts of this method are the definitions of queryElementsConditions and resultingExpression. The resulting expression specifies the invocation of the Where method.


The queryElementsConditions value takes each query element extracted from the name of the method and generates the condition that compares it with the value provided as an argument. In order to do this the GetPropertyExpressionForQueryArgument method is used:


member this.GetPropertyExpressionForQueryArgument(parameter:QueryElement,argument,cParam,tmpVar) : Expression =
match parameter with
| ElementQuery(propertyName) ->
this.CompareExpression(
this.GetJObjectPropertyExpression(cParam,propertyName) ,
argument)
| SubElementQuery(propertyName,subPropertyName) ->
Expression.And(
Expression.NotEqual(
Expression.TypeAs(
Expression.Assign(
tmpVar,
this.GetJObjectPropertyExpression(cParam,propertyName)),
typeof<JObject>),
Expression.Constant(null)),
this.CompareExpression(
this.GetJObjectPropertyExpression(
tmpVar,
subPropertyName),argument)) :> Expression



This method generates a different expression depending on the kind of query element that is requested.
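To make the shape of the generated code easier to see, here's a rough Python sketch of the predicate that the expression tree corresponds to. It is not the code from the post: the tuple representation of the query elements, the function names, and the string comparison (standing in for CompareHelper, whose body is not shown) are all assumptions made for illustration.

```python
def get_property_ci(obj, name):
    """Case-insensitive property lookup on a dict; None when missing."""
    for key, value in obj.items():
        if key.lower() == name.lower():
            return value
    return None


def matches(item, elements, arguments):
    """Hypothetical equivalent of the lambda passed to Where.

    elements holds tuples shaped like the QueryElement cases:
    ("element", name) or ("sub_element", name, sub_name).
    """
    if not isinstance(item, dict):          # the initial TypeIs(c, JObject) check
        return False
    for element, argument in zip(elements, arguments):
        if element[0] == "element":
            value = get_property_ci(item, element[1])
        else:
            tmp = get_property_ci(item, element[1])
            if not isinstance(tmp, dict):   # the TypeAs(...) != null check
                return False
            value = get_property_ci(tmp, element[2])
        # Assumption: compare as strings; the real CompareHelper may differ.
        if str(value) != str(argument):
            return False
    return True


def find_all(items, elements, arguments):
    """Keeps only the items that satisfy every condition."""
    return [item for item in items if matches(item, elements, arguments)]
```

All conditions are combined with a logical "and", which mirrors the fold over Expression.And in GenerateCodeForBinder.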

Considerations for IronPython



As a curious note, in IronPython the BindInvokeMember method is not called in the FSDynArrayWrapperMetaObject when a method is invoked. It seems that IronPython calls the BindGetMember method and then tries to apply the result of getting the method.

So, to make this object work with IronPython, an implementation of the BindGetMember method was created that returns a lambda expression tree with the generated Where invocation.


override this.BindGetMember(binder: GetMemberBinder) =
match QueryInfo.GetQueryElements(binder.Name) with
| Some( elements ) ->
let parameters =
List.mapi ( fun i _ ->
Expression.Parameter(
typeof<string>,
sprintf "p%d" i)) elements
(new DynamicMetaObject(
Expression.Lambda(
this.GenerateCodeForBinder(
elements,
parameters
|> List.map (fun p -> p :> Expression)
|> List.to_array ),
parameters),
binder.FallbackGetMember(this).Restrictions))
| None -> base.BindGetMember(binder)
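
The "get the member, then call it" sequence is how Python evaluates obj.method(args) in general, which fits the behavior observed above. The following plain Python sketch (no DLR involved) shows the same idea with __getattr__ returning a callable. The FinderWrapper class and its single-property finder names are hypothetical and much simpler than the F# version.

```python
class FinderWrapper:
    """Hypothetical wrapper over a list of dicts with dynamic finders."""

    def __init__(self, items):
        self.items = items

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, so regular
        # attributes such as self.items never reach this method.
        prefix = "find_all_by_"
        if not name.startswith(prefix):
            raise AttributeError(name)
        prop = name[len(prefix):]

        def finder(value):
            # The returned function plays the role of the lambda
            # produced by BindGetMember.
            return [item for item in self.items if item.get(prop) == value]

        return finder
```

Here wrapper.find_all_by_lang("es") first fetches find_all_by_lang (which triggers __getattr__) and then applies the result to "es".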


Accessing using different names



The QueryInfo.GetQueryElements method is what allows both the "FindAllBy..." and the "find_all_by..." method names.


module QueryInfo = begin

...

let GetQueryElements(methodName:string) =
match methodName with
| str when str.StartsWith("FindAllBy") ->
Some(ExtractQueryElements(str.Substring("FindAllBy".Length),"AlsoBy","With"))
| str when str.StartsWith("find_all_by_") ->
Some(ExtractQueryElements(str.Substring("find_all_by_".Length),"_also_by_","_with_") )
| _ -> None
end
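
Since ExtractQueryElements is elided above, here's a small Python sketch of how such a name could be split. This is only one possible approach under simple assumptions (plain string splitting, tuples instead of the QueryElement union), not the implementation from the post.

```python
def extract_query_elements(rest, separator, sub_separator):
    """Splits the part of the name after the prefix into query elements."""
    elements = []
    for part in rest.split(separator):
        if sub_separator in part:
            prop, sub_prop = part.split(sub_separator, 1)
            elements.append(("sub_element", prop, sub_prop))
        else:
            elements.append(("element", part))
    return elements


def get_query_elements(method_name):
    """Returns the query elements, or None if the name is not a finder."""
    pascal_prefix = "FindAllBy"
    snake_prefix = "find_all_by_"
    if method_name.startswith(pascal_prefix):
        return extract_query_elements(
            method_name[len(pascal_prefix):], "AlsoBy", "With")
    if method_name.startswith(snake_prefix):
        return extract_query_elements(
            method_name[len(snake_prefix):], "_also_by_", "_with_")
    return None
```

A limitation of this naive split is that a property whose own name contains "With" or "AlsoBy" would be cut in the wrong place.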


Code


Code for this post can be found here.

Sunday, July 26, 2009

Using the DynamicObject class

In this post I'm going to show a little example of using the DynamicObject class from .NET's Dynamic Language Runtime. This class is an easy way to provide dynamic dispatch to your objects in a DLR language.

Code samples presented here were created using Visual Studio 2010 Beta 1 so there may be differences with the final version.


The experiment



The example is a class that allows access to the properties of JSON objects using the property syntax of a DLR language. In order to do this, a wrapper to JSON.NET's JObject was created. This wrapper is called FSDynJObjectWrapper.

The following code shows an example of using FSDynJObjectWrapper with C# 4.0 :


// Load the document
JsonTextReader reader = new JsonTextReader(GetTwitterPublicTimeLine());
JsonSerializer serializer = new JsonSerializer();
JArray topArray = (JArray)serializer.Deserialize(reader);

// Access the document
dynamic aObj = new FSDynJObjectWrapper((JObject)topArray[0]);
Console.WriteLine("========");
Console.WriteLine(aObj.user.screen_name);
Console.Write("\t'{0}'", aObj.text);


Given that the JSON returned by the Twitter REST api looks like this:


{"in_reply_to_screen_name":null,
"text":"...",
"user": { "following":null,
"description":"...",
"screen_name":"...",
"utc_offset":0,
"followers_count":10,
"time_zone":"...",
"statuses_count":155,
"created_at":"...",
"friends_count":1,
"url":"...",
"name":"...",
"notifications":null,
"protected":false,
"verified":false,
"favourites_count":0,
"location":"...",
"id": ...,
...
},
"truncated":false,
"created_at":"...",
"in_reply_to_status_id":null,
"in_reply_to_user_id":null,
"favorited":false,
"id":...,
"source":"...."
}



DLR and DynamicObject



The Dynamic Language Runtime provides a common infrastructure to build dynamic languages on the CLR. The System.Dynamic.DynamicObject class is an easy way to override the behavior of access to an object from a dynamic language. A partial definition of this class looks like this:


public abstract class DynamicObject : IDynamicMetaObjectProvider {
public virtual bool TryGetMember(GetMemberBinder binder,
out object result)
public virtual bool TrySetMember(SetMemberBinder binder,
object value)
public virtual bool TryDeleteMember(DeleteMemberBinder binder)

public virtual bool TryConvert(ConvertBinder binder,
out object result)
public virtual bool TryUnaryOperation
(UnaryOperationBinder binder, out object result)
public virtual bool TryBinaryOperation
(BinaryOperationBinder binder, object arg,
out object result)
public virtual bool TryInvoke
(InvokeBinder binder, object[] args, out object result)
public virtual bool TryInvokeMember
(InvokeMemberBinder binder, object[] args,
out object result)
...
}


As you can see, this class has methods to override things like what happens when a property is accessed or a method is invoked. A detailed description of this class is available in the "Getting Started with the DLR as a Library Author" document.

The FSDynJObjectWrapper class



As presented above the FSDynJObjectWrapper inherits from DynamicObject and provides the required functionality. For this example the class is implemented using F# (although any .NET language could be used) and looks like this:


open System.Dynamic
open System.Reflection
open System.Linq.Expressions
open Newtonsoft.Json.Linq
open Newtonsoft.Json


type FSDynJObjectWrapper(theObject:JObject) =
inherit DynamicObject() with
override this.TryGetMember(binder : GetMemberBinder, result : obj byref ) =
match theObject.[binder.Name] with
| null -> false
| :? JObject as aJObject ->
result <- FSDynJObjectWrapper(aJObject)
true
| theValue ->
result <- theValue
true


The TryGetMember method is called when accessing a property. It tries to look up the name of the requested property in the JObject instance. A new instance of the wrapper is created if the property value is another JObject instance, allowing expressions like aObj.user.screen_name.
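For comparison, the same wrapping idea can be sketched in plain Python, where __getattr__ plays a role similar to TryGetMember. The DynJsonWrapper name is made up for this sketch, and it wraps the dicts produced by the standard json module rather than JSON.NET objects.

```python
import json


class DynJsonWrapper:
    """Hypothetical wrapper exposing JSON object keys as attributes."""

    def __init__(self, data):
        self._data = data

    def __getattr__(self, name):
        # Only called for names that are not regular attributes.
        try:
            value = self._data[name]
        except KeyError:
            # Similar to returning false from TryGetMember.
            raise AttributeError(name) from None
        if isinstance(value, dict):
            # Wrap nested objects so attribute access can be chained.
            return DynJsonWrapper(value)
        return value


document = json.loads('{"text": "hello", "user": {"screen_name": "ldfallas"}}')
status = DynJsonWrapper(document)
print(status.user.screen_name)
```

As in the F# version, only nested objects are wrapped; other values are returned as they are.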

In the next posts I'm going to show the use of other DLR classes and examples using other DLR languages.

Code for this post can be found here.

Thursday, June 25, 2009

Creating a calendar using Newspeak and Hopscotch

For this post I'm going to show the code for a little calendar UI fragment created using the Newspeak programming language and the Hopscotch framework.

Calendar


Here's how the calendar looks:

Hopscotch calendar experiment

(As you can see, I'm focusing on the functionality for the moment).

The code



As described in "Hopscotch: Towards User Interface Composition", this framework promotes the separation between data (the subject) and the UI elements (the presenter). For this calendar fragment the data part will be the given date and the UI part will be a series of UI elements that represent the days of a month.

Here's an overview of the definition for the HCalendar class:


class HCalendar usingLib: platform = NewspeakObject (
|
...
|
)
(

class CalendarSubject for: date = Subject (
|
...
|
)
(
...
)

class CalendarPresenter = Presenter (
|
...
|
)
(
...
))




The subject



As mentioned above the subject only holds a given date. Some operations are added to make it easy to manipulate it.


class CalendarSubject for: date = Subject (
|
private date = date.
|
)
('as yet unclassified'
changeDayTo: newDay <Number> = (
date:: Date year: year
month: (month name)
day: newDay.
)

createPresenter = (
^CalendarPresenter new subject: self.
)

day ^ <Number> = (
^date dayOfMonth.
)

month ^ <Month>= (
^date month.
)

moveToNextMonth = (
date:: date addMonths: 1.
)

moveToPreviousMonth = (
date:: date addMonths: -1.
)

year ^ <Number> = (
^date year.
))

The presenter


The presenter class is more interesting. It takes the date from the subject and tries to create a representation of the month using Hopscotch fragments. The following snippet shows an overview of the presenter class.


class CalendarPresenter = Presenter (
"Calendar presenter, shows the days of the a month."
|

protected weeksRow
protected monthHolder
|
)
('as yet unclassified'


definition = (
monthHolder::
holder: [
row: {
link: '<' action: [ subject moveToPreviousMonth.
refreshHolders. ].
blank: 1.
column: { header .
weeks. }.
blank: 1.
link: '>' action: [ subject moveToNextMonth.
refreshHolders. ].

}.].
^monthHolder.
)

fragmentForDaysNotInCurrentMonth = (
^label: ' '.
)

header = (
|headerRow|
headerRow:: row: {
filler.
label:: subject month name, ' ' , subject year asString.
filler.
}.
^headerRow.
)


refreshHolders= (
monthHolder refresh.
)

weekDayFragmentFor: dayNumber = (
|dayNumberText|
dayNumberText:: dayNumber asString.

^link: dayNumberText
action: [ subject changeDayTo: dayNumber.
highlightSelectedDay.
].
)

highlightSelectedDay = (
...
)


weeks = (
...
)

addFirstWeekTo: result withDaysFromPreviousMonth: previousMonthDays = (
...
)

addLastWeekTo: result weeksToShow: weeksToShow lastDayAdded: lastDayAdded daysInNextMonthToShow: nextMonthDays= (
...
)

addMonthWeeksTo: result weeksToShow: weeksToShow lastDayAdded: lastday= (
...
)

columnSeparator = (
...
)

createColumnsFromWeekArray: weekArray = (
...
)

)



The weeks method is where most of the work of creating the calendar is done. For space reasons I'm not including it here, see the link at the end of the post for the complete code.
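To give an idea of the kind of computation involved, here's a Python sketch that builds the week rows of a month using the standard calendar module. It is not a translation of the Newspeak weeks method (which is not shown here); the function names and the choice of Sunday as the first column are assumptions.

```python
import calendar


def weeks_for(year, month, first_weekday=calendar.SUNDAY):
    """Returns the month as a list of 7-element weeks.

    Days that belong to the previous or next month are None, which
    corresponds to the blank label used for days outside the month.
    """
    cal = calendar.Calendar(firstweekday=first_weekday)
    return [
        [day if day != 0 else None for day in week]
        for week in cal.monthdayscalendar(year, month)
    ]


def render_week(week):
    """Formats one week as fixed-width text, leaving blanks for None."""
    return " ".join("  " if day is None else f"{day:2d}" for day in week)


for week in weeks_for(2009, 6):
    print(render_week(week))
```

In the presenter, each number would become a link fragment (as in weekDayFragmentFor:) instead of a piece of text.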

Reusing the calendar



Now that we have the definition of the calendar presenter and subject we can reuse it to create more interesting fragments. For example the following definition could be used to create a date range picker.

class HDateRange usingLib: platform = (
"Date range selector."
|
...
|
)
(

class DateRangePresenter = Presenter (
"Presenter for date range."
|
dateRangeTextHolder
|
)
(
calendar: dateSubject = (
|aCalendar|
aCalendar:: dateSubject createPresenter.
aCalendar onChange: [ dateRangeTextHolder refresh ].
^aCalendar.
)

definition = (
dateRangeTextHolder::
holder: [label: subject initialDate selectedDate asString, ' - ',
subject finalDate selectedDate asString].
^heading: dateRangeTextHolder
details: [
row: {
calendar: subject initialDate .
blank:49.
calendar: subject finalDate.
}]
)

)

class DateRangeSubject from: initial to: final= Subject (
"Data for the date range."
|
private initialDate = (HCalendar usingLib: platform) CalendarSubject for: initial.
private finalDate = (HCalendar usingLib: platform) CalendarSubject for: final.
|
)
(
createPresenter = (
^ (DateRangePresenter new) subject: self.
)))




Code for this post can be found here.

Sunday, June 21, 2009

Using libcurl with Newspeak FFI (continued)

The previous post presented a small low level interface to libcurl using the Newspeak programming language. In this post I'm going to show the HttpServiceClient class, which was created to give a simple interface to the LibCurlHelper class.

The definition of the class looks like this:


Newsqueak2
'LangexplrExperiments'

class HttpServiceClient usingLib: platform withCurlPath: curlLibraryPath = (
"This class is used to access services provided by the HTTP protocol"
|
LibCurlHelper = platform LibCurlHelper.
ByteString = platform ByteString.
platform = platform.
Transcript = platform Transcript .
private curlLibraryPath = curlLibraryPath .
|
)
(

class HttpRequestResult curlErrorCode: curlErrorCode httpResponse: httpResponse data: data= (
...
)
(
...
)

createNewCurlInstance = (
...
)

get: url <String> ^ <HttpRequestResult> = (
...
)

get: url <String> withHeaders: headers <Array> ^ <HttpRequestResult> = (
...
)

private isHttpsUrl: url <String> ^ <Boolean> = (
...
)

postForm: formData <Dictionary> to: url <String> ^ <HttpRequestResult> = (
...
)

) : (
...
)


The get:, get:withHeaders: and postForm:to: methods provide the functionality to perform very simple GET and POST requests.

The HttpRequestResult class encapsulates the result of calling these methods: the libcurl result code, the HTTP response code and, if the request succeeded, the text of the requested data.

The code for the GET methods looks like this:

get: url <String> ^ <HttpRequestResult> = (
| curl data tmpBuffer bufferLength response|
^ get: url withHeaders: {}.
)


get: url <String> withHeaders: headers <Array> ^ <HttpRequestResult> = (
| curl data tmpBuffer bufferLength curlCallResult response|
data:: ''.
curl:: createNewCurlInstance.
curl writeCallback:
[:args :result|
bufferLength:: ((args datasize) * (args nmemb)).
tmpBuffer:: ByteString new: bufferLength.
args data copyInto: tmpBuffer
from: 1 to: bufferLength
in: (args data) startingAt: 1.
data:: data,tmpBuffer.
result returnInteger: bufferLength.
].

headers size > 0 ifTrue: [curl headers: headers].

(isHttpsUrl: url)
ifTrue: [curl noSslVerification.].
curl url: url.

curlCallResult:: curl performOperation.

response:: curl responseCode.
curl cleanup.
^HttpRequestResult
curlErrorCode: curlCallResult
httpResponse: response
data: data.
)


The code for the POST operation looks like this:


postForm: formData <Dictionary> to: url = (
| curl data tmpBuffer bufferLength curlFormData response curlCallResult|
data:: ''.
curl:: createNewCurlInstance.

curl post: formData.
curl writeCallback:
[:args :result|
bufferLength:: ((args datasize) * (args nmemb)).
tmpBuffer:: ByteString new: bufferLength.
args data copyInto: tmpBuffer
from: 1 to: bufferLength
in: (args data) startingAt: 1.
data:: data,tmpBuffer.
result returnInteger: bufferLength.
].

(isHttpsUrl: url)
ifTrue: [curl noSslVerification.].

curl url: url.
curlCallResult: curl performOperation.

response:: curl responseCode.
curl cleanup.
^HttpRequestResult
curlErrorCode: curlCallResult
httpResponse: response
data: data.
)


Code for this post can be found here.

Saturday, June 13, 2009

Using libcurl with Newspeak FFI

In this post I'm going to show a little example of using libcurl from the Newspeak programming language.

Newspeak FFI



Newspeak provides a nice mechanism to call C code. This mechanism is described in the Newspeak Foreign Function Interface User Guide. The AlienDemo example provided with the Newspeak prototype has some nice small examples of the FFI.

The experiment presented in this post was created using the Newspeak prototype from February 2009. Due to some limitations of this release, this code only works with the Windows version of the prototype.

libcurl



libcurl is a C library that provides client access to several networking protocols with a common interface. For this post I'm going to implement a wrapper for a very small subset of the functionality provided by libcurl in order to perform simple HTTP/HTTPS GET and POST requests.

The simple.c example shows how to do a simple GET request.


#include <curl/curl.h>

int main(void)
{
CURL *curl;
CURLcode res;

curl = curl_easy_init();
if(curl) {
curl_easy_setopt(curl, CURLOPT_URL, "curl.haxx.se");
res = curl_easy_perform(curl);
curl_easy_cleanup(curl);
}
return 0;
}


The LibCurlHelper class



A class named LibCurlHelper will be used to encapsulate calls to libcurl. As you will notice, the interface of this class is pretty low level. In future posts I'll try to create a better interface using more Newspeak features.


class LibCurlHelper usingLib: platform = (
"This class wraps an implementation of the libcurl library"
|
Transcript = platform Transcript.
Alien = platform Alien.
UnsafeAlien = platform UnsafeAlien.
Callback = platform Callback.
CurlWriteCallback = platform CurlWriteCallbackNs1.
CurlDebugCallback = platform CurlDebugCallback.
ByteString = platform ByteString.
OrderedCollection = platform OrderedCollection.
...
public libcurlPath = ''.
public errorBuffer = ''.
protected CURL_OPT_URL = 10002.
protected CURLOPT_WRITEFUNCTION = 20011.
...

internalDebugCallback = nil.
internalWriteCallback = nil.
formPostData = nil.
private curlInstance = nil.
private aliensToRelease = nil.


CURLFORM_NOTHING = 0 .
CURLFORM_COPYNAME = 1 .
CURLFORM_PTRNAME = 2.
...


libcurl uses a lot of constants prefixed with "CURL"; this class contains definitions for some of them.

Initialization



The initializeCurl method calls curl_easy_init (as in the simple.c example shown above) and stores the returned pointer in a slot called curlInstance, which is used in further calls.


initializeCurl = (
|curl|
ensureLibrariesLoaded .
(Alien lookup: 'curl_easy_init' inLibrary: curlLibName )
primFFICallResult: (curl:: Alien new: 4).
curlInstance: curl.
)


The ensureLibrariesLoaded and curlLibName methods are defined as follows:


curlLibName = (
^libcurlPath, 'libcurl.dll'
)
ensureLibrariesLoaded = (
Alien ensureLoaded: libcurlPath, 'libidn-11.dll'.
Alien ensureLoaded: libcurlPath, 'libeay32.dll'.
Alien ensureLoaded: libcurlPath, 'libssl32.dll'.
Alien ensureLoaded: curlLibName.
)


Setting the URL



In order to set the URL for the request we need to call the curl_easy_setopt function, passing the CURL_OPT_URL option and the URL string.

The code looks like this:


url: url <String> = (
|result|
(Alien lookup: 'curl_easy_setopt' inLibrary: curlLibName )
primFFICallResult: (result:: Alien new:4)
withArguments: { curlInstance.
CURL_OPT_URL.
(addAlienToRelease: (url asAlien)) pointer. }.
^result.
)


The addAlienToRelease: method was added in order to keep track of resources allocated in the C heap that need to be manually released when they are no longer needed. The asAlien method of the String class creates a resource of this kind.

The implementation of this method looks like this:


addAlienToRelease: anAlien = (
aliensToRelease isNil ifTrue: [ aliensToRelease:: OrderedCollection new. ].
aliensToRelease add: anAlien.
^anAlien.
)


Setting the write callback



Callback functions are used by libcurl to process the data coming from the network. The Newspeak FFI provides a nice way to add this kind of callback.


writeCallback: callback <Block>= (
|result|
internalWriteCallback:: Callback
block: callback
argsClass: CurlWriteCallback.

(Alien lookup: 'curl_easy_setopt' inLibrary: curlLibName )
primFFICallResult: (result:: Alien new: 4)
withArguments: { curlInstance.
CURLOPT_WRITEFUNCTION.
internalWriteCallback thunk. }.

^result.
)



The writeCallback: method sets the block in callback as the libcurl write callback. In order to do this it creates an instance of the Callback class with the block and the arguments class. An instance of this class is used to create a function pointer which is passed to the curl_easy_setopt function.

The "arguments class" is defined using the NS1 Newspeak syntax as follows:


Newsqueak1
'LangexplrExperiments'
CurlWriteCallbackNs1 = Alien (
"Class used to represent arguments of the LibCurl write function."
'as yet unclassified'
data = (
^Alien forPointer: (self unsignedLongAt: 1)
)
datasize = (
^(self unsignedLongAt: 5)
)
nmemb = (
^(self unsignedLongAt: 9)
)
writerData = (
^Alien forPointer: (self unsignedLongAt: 13)
)
) : (
'as yet unclassified'
dataSize = (
^16
))



An instance of this class is used to represent the arguments of a callback call. An example of its use is presented below.
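The offsets 1, 5, 9 and 13 (Alien offsets are 1-based) suggest that the four callback arguments are read as consecutive 4-byte values, 16 bytes in total. As an illustration of that layout, the following Python sketch decodes such a block with the struct module. It assumes a 32-bit little-endian process, which matches the Windows prototype used here but is still an assumption, and the function name is made up.

```python
import struct

# Assumption: four consecutive unsigned 32-bit little-endian values,
# in the order data pointer, datasize, nmemb, writer data pointer.
ARGS_FORMAT = "<IIII"


def decode_write_callback_args(raw):
    """Decodes the 16-byte argument block of a write callback call."""
    data_ptr, datasize, nmemb, writer_data_ptr = struct.unpack(ARGS_FORMAT, raw)
    return {
        "data": data_ptr,               # address of the received bytes
        "datasize": datasize,           # size of each item
        "nmemb": nmemb,                 # number of items
        "writer_data": writer_data_ptr,
        "length": datasize * nmemb,     # total bytes available in this call
    }
```

The length computed here is the same datasize * nmemb product used by the write callback blocks in these posts.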

Performing the request



The curl_easy_perform function is used to start the operation. The following code shows the call to this function:


performOperation = (
|r|
(Alien lookup: 'curl_easy_perform' inLibrary: curlLibName )
primFFICallResult: (r:: Alien new: 4)
withArguments: { curlInstance. }.
^r signedLongAt: 1.
)


Cleanup



Finally the following method is used to release the resources allocated by libcurl.


cleanup = (
(Alien lookup: 'curl_easy_cleanup' inLibrary: curlLibName )
primFFICallResult: nil
withArguments: { curlInstance } .

aliensToRelease do: [:anAlien | anAlien free ].
)


Example of using the library



As mentioned above, the LibCurlHelper class provides a low level interface to libcurl, so something needs to be created to encapsulate this functionality.

The following code shows a method that performs a simple GET request and returns the downloaded data as a string.


class HttpServiceClient usingLib: platform withCurlPath: curlLibraryPath = (
"This class is used to access services provided by the HTTP protocol"
|
LibCurlHelper = platform LibCurlHelper.
ByteString = platform ByteString.
platform = platform.
Transcript = platform Transcript .
private curlLibraryPath = curlLibraryPath .
|
)

(
simpleGet: url ^ = (
| curl data tmpBuffer bufferLength response|
curl:: (LibCurlHelper usingLib: platform).
curl libcurlPath: curlLibraryPath .
curl initializeCurl.

data:: ''.
curl:: createNewCurlInstance.

curl writeCallback:
[:args :result|
bufferLength:: ((args datasize) * (args nmemb)).
tmpBuffer:: ByteString new: bufferLength.
args data copyInto: tmpBuffer
from: 1 to: bufferLength
in: (args data) startingAt: 1.
data:: data,tmpBuffer.
result returnInteger: bufferLength.
].
curl url: url.
curl performOperation.
curl cleanup.
^data
)
)


Notice that here the callback function modifies a local variable every time the data arrives. Also notice that args is an instance of CurlWriteCallbackNs1.
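The accumulation pattern can be shown without libcurl at all. In the following Python sketch, simulate_transfer is a made-up stand-in for the perform call that simply feeds chunks to the callback; the closure collects the data and returns the number of bytes it handled, which is what libcurl expects from a write callback.

```python
CURLE_OK = 0
CURLE_WRITE_ERROR = 23


def simulate_transfer(chunks, write_callback):
    """Hypothetical stand-in for curl_easy_perform (no network involved)."""
    for chunk in chunks:
        handled = write_callback(chunk, 1, len(chunk))
        if handled != len(chunk):
            # libcurl treats a mismatching count as a write error.
            return CURLE_WRITE_ERROR
    return CURLE_OK


def simple_get(chunks):
    """Collects all the chunks delivered to the callback."""
    received = []

    def on_data(data, size, nmemb):
        length = size * nmemb
        received.append(data[:length])  # modifies state of the enclosing scope
        return length

    code = simulate_transfer(chunks, on_data)
    return code, b"".join(received)
```

As in the Newspeak block, the callback may be invoked several times for a single request, so the data has to be appended rather than assigned.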

Final words


The experiment of using libcurl from Newspeak was a nice way to learn about its foreign function interface. Having access to libcurl gives access to useful things such as HTTPS requests.

There's already a nice Squeak wrapper for libcurl called CurlPlugin.

Code for this post can be found here.

Sunday, May 17, 2009

Modifying an AS3 class with AbcExplorationLib

Recently, I made a couple of changes to AbcExplorationLib to allow the modification of a compiled ActionScript 3 (AS3) class.

The following code will be used to illustrate this feature:


class Shape {
public function foo() {


print("Base foo");
}
public function paint():void {
foo();
}
}

class Rectangle extends Shape{
public override function paint():void {
super.paint();
print( "Rectangle");
}
}

class Circle extends Shape{
public override function paint():void {
super.paint();
print( "Circle");
}
}

var shapes = [new Rectangle(),new Circle()];

for each(var s:Shape in shapes) {
s.paint();
}



Compiling this program using the Flex SDK and running it using the Tamarin binaries shows the following output:


$ java -jar asc.jar -import builtin.abc Shapes.as

Shapes.abc, 678 bytes written
$ avmshell Shapes.abc
Base foo
Rectangle
Base foo
Circle


Say that we want to create a definition of the foo method in the Rectangle class that overrides the definition from Shape.

1. First we load the class file:


let abcFile = using (new FileStream(sourceFile,FileMode.Open)) (
fun stream -> AvmAbcFile.Create(stream))


2. Then we need a definition for the new foo implementation. The following function creates a foo override that prints a message to the screen:


let newFooMethod(message:string) =
AvmMemberMethod(
CQualifiedName(Ns("",NamespaceKind.PackageNamespace),"foo"),
AvmMethod(
"",
SQualifiedName("*"),
[||],
Some <|
AvmMethodBody(
2,1,4,5,
[|
GetLocal0;
PushScope;
FindPropertyStrict(
MQualifiedName(
[|Ns("",
NamespaceKind.PackageNamespace)|],
"print"));
PushString message;
CallProperty(
MQualifiedName(
[|Ns("",
NamespaceKind.PackageNamespace)|],
"print"),
1);
Pop;
ReturnVoid
|],[||],[||]
)
),
AbcTraitAttribute.Override
)


3. We need a function to add a method to an existing class. Notice that the modification only consists of creating a new instance of AvmClass with the same values as the original, plus the new method.


let addClassMethod(aClass,newMethod) =
match aClass with
| AvmClass(name,
superclassname,
init,
cinit,
slots,
methods,
pns) ->
AvmClass(name,
superclassname,
init,cinit,
slots,
newMethod::methods,
pns)



4. The following method is used to locate the rectangle class and apply the modification:


let modifyFileToAddMethod(f:AvmAbcFile) =
AvmAbcFile(f.Scripts,
f.Classes |>
List.map (fun (c:AvmClass) ->

match c.Name with
| CQualifiedName(_,"Rectangle") ->
addClassMethod(
c,
newFooMethod("New foo for Rectange!!!"))
| _ -> c))



5. Finally we write the modified file back to disk:


let modifiedAbcFile = modifyFileToAddMethod(abcFile)
let abcFileCreator = AbcFileCreator()
let file = modifiedAbcFile.ToLowerIr(abcFileCreator)
using (new BinaryWriter(new FileStream(targetFileName,FileMode.Create)))
(fun f -> file.WriteTo(f))


After running this program we can execute the bytecode again to get the new results:


$ mono modify.exe Shapes.abc
...
$ avmshell Shapes_t.abc
New foo for Rectange!!!
Rectangle
Base foo
Circle
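
The modification in steps 3 and 4 never mutates the loaded class: it builds a new class value that shares everything with the original except the method list. The same idea can be sketched in Python with frozen dataclasses. The ClassDef type and its fields are hypothetical and only loosely modeled on AvmClass.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class ClassDef:
    """Hypothetical, much reduced stand-in for AvmClass."""
    name: str
    superclass: str
    methods: Tuple[str, ...] = ()


def add_class_method(cls, new_method):
    """Returns a copy of cls with new_method added in front."""
    return replace(cls, methods=(new_method,) + cls.methods)


def modify_classes(classes, target_name, new_method):
    """Rebuilds the class list, changing only the class named target_name."""
    return [
        add_class_method(c, new_method) if c.name == target_name else c
        for c in classes
    ]
```

Because nothing is modified in place, the originally loaded definitions remain available, for example to compare them with the modified ones.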


One area that really needs work is name handling. A particular challenge is to find a good way to represent multinames (name references in a set of namespaces).

In general, the way of defining a method from scratch (for example in newFooMethod) needs some work, since it requires lots of details that might not be interesting for the developer.

Finally, another area that really needs improvement is the output file generation. Right now it requires the user to write three instructions to write the file to disk. This will be changed to be similar to the load process.


Code for this program can be found as part of the AbcExplorationLib samples.