niXforums Forum Index
 FAQFAQ   SearchSearch   MemberlistMemberlist   UsergroupsUsergroups   RegisterRegister 
 ProfileProfile   PreferencesPreferences   Log in to check your private messagesLog in to check your private messages   Log inLog in 
·  nixdoc.net ·  man pages ·  Linux HOWTOs ·  FreeBSD Tips ·  Forums
navigation Forum index » Programming » C
Good Representation for an Abstract Syntax Tree
Post new topic   Reply to topic Page 1 of 1 [10 Posts] View previous topic :: View next topic
Author Message
Keith Thompson
*nix forums Guru


Joined: 28 Feb 2005
Posts: 5173

PostPosted: Thu Jul 20, 2006 3:21 pm    Post subject: Re: Good Representation for an Abstract Syntax Tree Reply with quote

"Johan Tibell" <johan.tibell@gmail.com> writes:
Quote:
Simon Biber wrote:
[...]
struct Exp
{
enum Type type;
union {
struct Lam *lam;
struct Var *var;
struct App *app;
struct Lit *lit;
} form;
};
[...]

When dynamically allocating memory for such a struct using malloc will
sizeof(Exp) be enough to get sufficient memory allocated for the
biggest union member?

Yes. sizeof for a union yields at least the size of its largest
member. (It's typically the size of its largest member rounded up to
the alignment of its most strictly aligned member.)

Quote:
Will this work:

struct Exp *exp;
exp = malloc(sizeof(struct Exp));
if (!exp)
fprintf(stderr, "out of memory\n");

exp->form.lam = malloc(sizeof(struct Lam));
if (!exp->form.lam)
fprintf(stderr, "out of memory\n");

(Also, is it good style?)

Yes, and mostly.

Rather than
exp = malloc(sizeof(struct Exp));
...
exp->form.lam = malloc(sizeof(struct Lam));
I'd write:
exp = malloc(sizeof *exp);
...
exp->form.lam = malloc(sizeof *(exp->form.lam));

(The parentheses on the second malloc aren't strictly necessary, but I
find it easier to read with them.)

--
Keith Thompson (The_Other_Keith) kst-u@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Back to top
johan.tibell@gmail.com
*nix forums beginner


Joined: 18 Jul 2006
Posts: 3

PostPosted: Thu Jul 20, 2006 9:06 am    Post subject: Re: Good Representation for an Abstract Syntax Tree Reply with quote

Simon Biber wrote:
Quote:
Yes, a series of struct within a union would be reasonably efficient.
Here is one way to lay them out:

struct Exp;

struct Lam
{
char *id;
struct Exp *exp;
};

struct Var
{
char *id;
};

struct App
{
struct Exp *exp1;
struct Exp *exp2;
};

struct Lit
{
int value;
};

enum Type
{
LAM,
VAR,
APP,
LIT
};

struct Exp
{
enum Type type;
union {
struct Lam *lam;
struct Var *var;
struct App *app;
struct Lit *lit;
} form;
};

For those types of expression that only contain a single data member
(Var only contains a string and Lit only contains an integer), you may
choose to place their data directly into the union rather than adding a
level of indirection.

--
Simon.

When dynamically allocating memory for such a struct using malloc will
sizeof(Exp) be enough to get sufficient memory allocated for the
biggest union member? Will this work:

struct Exp *exp;
exp = malloc(sizeof(struct Exp));
if (!exp)
fprintf(stderr, "out of memory\n");

exp->form.lam = malloc(sizeof(struct Lam));
if (!exp->form.lam)
fprintf(stderr, "out of memory\n");

(Also, is it good style?)
Back to top
neutron*star
*nix forums Guru


Joined: 21 Feb 2005
Posts: 2039

PostPosted: Wed Jul 19, 2006 11:09 am    Post subject: Re: Good Representation for an Abstract Syntax Tree Reply with quote

jacob navia said:

Quote:
Chris Torek wrote:
snip

Some non-C languages (including Plan 9 C) do have a way to insert
anonymous items that "bubble out" to an outer enclosing struct,
but neither C89 nor C99 support this.

lcc-win32 supports this.

That might be relevant in an lcc newsgroup, but what matters in comp.lang.c
is that the construct is non-standard and non-portable, and should thus be
avoided by those who need to write portable code.

Please stop confusing "Navia's implementation" with "the real world of
portable programming".

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at above domain (but drop the www, obviously)
Back to top
Richard Bos
*nix forums Guru


Joined: 21 Feb 2005
Posts: 1031

PostPosted: Wed Jul 19, 2006 8:37 am    Post subject: Re: Good Representation for an Abstract Syntax Tree Reply with quote

jacob navia <jacob@jacob.remcomp.fr> wrote:

Quote:
Chris Torek wrote:

[ SNIP! over 60 lines ]

Quote:
Some non-C languages (including Plan 9 C) do have a way to insert
anonymous items that "bubble out" to an outer enclosing struct,
but neither C89 nor C99 support this.

spam> supports this.

You just quoted _seventy lines_ just to add a oneliner that says that
<spammy spammy spam> is non-conforming (which, btw, we already know all
too well)? Grief, have you so few customers that you need such tactics?

Richard
Back to top
Chris Dollin
*nix forums Guru


Joined: 24 Feb 2005
Posts: 373

PostPosted: Wed Jul 19, 2006 7:59 am    Post subject: Re: Good Representation for an Abstract Syntax Tree Reply with quote

Chris Torek wrote:

Quote:
In article <e9ip53$e3s$1@malatesta.hpl.hp.com
Chris Dollin <chris.dollin@hp.com> wrote:
struct ExpStruct
{
enum { LAM, VAR, APP, LIT } type;
union
{
struct LamStruct lam;
struct VarStruct var;
struct AppStruct app;
struct LitStruct lit; /* or perhaps `int lit` */
}
};

Note that the union-member of a "struct ExpStruct" needs a name.
For instance, the second-to-last line above should perhaps read
"} un;".

ARGH. Thanks, Chris. Sorry, Johan.

It was a typo. Well, a braino.

--
Chris "Brain-O, Bray-ay-ay-ayn-O, daylight come & ..." Dollin
"Who are you? What do you want?" /Babylon 5/
Back to top
jacob navia
*nix forums Guru


Joined: 13 Mar 2005
Posts: 596

PostPosted: Tue Jul 18, 2006 8:30 pm    Post subject: Re: Good Representation for an Abstract Syntax Tree Reply with quote

Chris Torek wrote:
Quote:
In article <e9ip53$e3s$1@malatesta.hpl.hp.com
Chris Dollin <chris.dollin@hp.com> wrote:

struct ExpStruct
{
enum { LAM, VAR, APP, LIT } type;
union
{
struct LamStruct lam;
struct VarStruct var;
struct AppStruct app;
struct LitStruct lit; /* or perhaps `int lit` */
}
};


Note that the union-member of a "struct ExpStruct" needs a name.
For instance, the second-to-last line above should perhaps read
"} un;".

I occasionally wish that C had un-named union-members of outer
"struct"s. So I sometimes fake them, using code like this:

struct exp {
enum { LAM, VAR, APP, LIT } exp_type;
union {
struct lam un_lam;
struct var un_var;
struct app un_app;
struct lit un_lit;
} exp_un;
};
#define exp_lam exp_un.un_lam
#define exp_var exp_un.un_var
#define exp_app exp_un.un_app
#define exp_lit exp_un.un_lit

Then, given a "struct exp *ep" pointing to a valid "exp", I can
write:

switch (ep->exp_type) {
case LAM:
do something with ep->exp_lam;
...
case VAR:
do something with ep->exp_var;
...
...
}

The "#define"s make these expand to ep->exp_un.un_lam, ep->exp_un.un_var,
and so on.

Without the intermediate "#define"s (and using the names in Chris
Dollin's example code, with the small correction), this becomes:

switch (ep->type) {
case LAM:
do something with ep->un.lam;
...
...
}

You have to name the union member, then write the name of the
union member.

Some non-C languages (including Plan 9 C) do have a way to insert
anonymous items that "bubble out" to an outer enclosing struct,
but neither C89 nor C99 support this.

lcc-win32 supports this.
Back to top
Chris Torek
*nix forums Guru


Joined: 26 Feb 2005
Posts: 481

PostPosted: Tue Jul 18, 2006 7:01 pm    Post subject: Re: Good Representation for an Abstract Syntax Tree Reply with quote

In article <e9ip53$e3s$1@malatesta.hpl.hp.com>
Chris Dollin <chris.dollin@hp.com> wrote:
Quote:
struct ExpStruct
{
enum { LAM, VAR, APP, LIT } type;
union
{
struct LamStruct lam;
struct VarStruct var;
struct AppStruct app;
struct LitStruct lit; /* or perhaps `int lit` */
}
};

Note that the union-member of a "struct ExpStruct" needs a name.
For instance, the second-to-last line above should perhaps read
"} un;".

I occasionally wish that C had un-named union-members of outer
"struct"s. So I sometimes fake them, using code like this:

struct exp {
enum { LAM, VAR, APP, LIT } exp_type;
union {
struct lam un_lam;
struct var un_var;
struct app un_app;
struct lit un_lit;
} exp_un;
};
#define exp_lam exp_un.un_lam
#define exp_var exp_un.un_var
#define exp_app exp_un.un_app
#define exp_lit exp_un.un_lit

Then, given a "struct exp *ep" pointing to a valid "exp", I can
write:

switch (ep->exp_type) {
case LAM:
do something with ep->exp_lam;
...
case VAR:
do something with ep->exp_var;
...
...
}

The "#define"s make these expand to ep->exp_un.un_lam, ep->exp_un.un_var,
and so on.

Without the intermediate "#define"s (and using the names in Chris
Dollin's example code, with the small correction), this becomes:

switch (ep->type) {
case LAM:
do something with ep->un.lam;
...
...
}

You have to name the union member, then write the name of the
union member.

Some non-C languages (including Plan 9 C) do have a way to insert
anonymous items that "bubble out" to an outer enclosing struct,
but neither C89 nor C99 support this.
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.
Back to top
Chris Dollin
*nix forums Guru


Joined: 24 Feb 2005
Posts: 373

PostPosted: Tue Jul 18, 2006 1:53 pm    Post subject: Re: Good Representation for an Abstract Syntax Tree Reply with quote

johan.tibell@gmail.com wrote:

Quote:
I'm in the process of writing an interpreter for lambda calculus (i.e.
a small functional programming language) in C. I've previously written
one in Haskell so I understand at least some of the concepts involved
in writing an interpreter and I know how to define a grammar and use
lex/yacc to parse it. What I don't know is how to represent the
abstract syntax tree (AST) of my language in C. In Haskell I used
what's called an algebraic data type like so:

data Exp = Lam Id Exp
| Var Id
| App Exp Exp
| Lit Int

Legend:
Exp - expression (i.e. one of Lam, Var, App or Lit)
Lam - lambda abstraction (i.e. a function)
App - function application
Lit - integer literal
Id - variable name (i.e. a string)

How would I represent such a structure in C, perhaps with a union with
some sort of tag?

I would (and, for a different language, have) do it in two layers.

I'd have a bunch of functions expressing the expression structure, eg,

isLambda(Exp), lambdaId(Exp), lambdaBody(Exp)
isVar(Exp), varId(Exp)
isApp(Exp), appFun(Exp), appArg(Exp)
isLit(Exp), litInt(Exp)

In the C file that defines those functions, I'd have whatever representation
I choose. It might be

struct ExpStruct
{
enum { LAM, VAR, APP, LIT } type;
union
{
struct LamStruct lam;
struct VarStruct var;
struct AppStruct app;
struct LitStruct lit; /* or perhaps `int lit` */
}
};

(And this /wouldn't/ be in the public header file, which would have
the minimum needed to give the function signatures. As a first
cut, it would say things like:

typedef struct ExpStruct *Exp;

While some people argue against typedefing struct pointers in this
way, it allows you, if necessary, to /completely/ change your
representation later, eg:

typedef int Exp;

and have expressions encoded as integers, which might be indexes
into an array of expression objects or immediate values or
whatever. The access-via-functions means that the code doesn't
care.
)

If this representation turned out to be naff, I'd pick a different one,
but all being well, the rest of the code would be protected from the
representation change by the functional API.

If performance of the API turned out to be a problem, I might be able to
judiciously inline or macroise the API.

[For the language I did, I had a fiendish representation that played
not-everso-portable games with pointers-as-integers so that I could
reduce the space taken up by nodes, since there was some assumption
that the code would be running in smallish devices with "sane" pointer
semantics. Don't do that unless you /have/ to, and I bet you won't.]

--
Chris "seeker" Dollin
"Reaching out for mirrors hidden in the web." - Renaissance, /Running Hard/
Back to top
Simon Biber
*nix forums Guru Wannabe


Joined: 19 May 2005
Posts: 216

PostPosted: Tue Jul 18, 2006 1:52 pm    Post subject: Re: Good Representation for an Abstract Syntax Tree Reply with quote

johan.tibell@gmail.com wrote:
Quote:
I'm in the process of writing an interpreter for lambda calculus (i.e.
a small functional programming language) in C. I've previously written
one in Haskell so I understand at least some of the concepts involved
in writing an interpreter and I know how to define a grammar and use
lex/yacc to parse it. What I don't know is how to represent the
abstract syntax tree (AST) of my language in C. In Haskell I used
what's called an algebraic data type like so:

data Exp = Lam Id Exp
| Var Id
| App Exp Exp
| Lit Int

Legend:
Exp - expression (i.e. one of Lam, Var, App or Lit)
Lam - lambda abstraction (i.e. a function)
App - function application
Lit - integer literal
Id - variable name (i.e. a string)

How would I represent such a structure in C, perhaps with a union with
some sort of tag? I guess I would use a class hierarchy in an
object-oriented language but I really want to stay in C since I
consider this an educational experience for me (i.e. the goal is not to
write an interpreter as quickly as possible but rather learn how to do
it in an efficient way in C).

Yes, a series of struct within a union would be reasonably efficient.
Here is one way to lay them out:

struct Exp;

struct Lam
{
char *id;
struct Exp *exp;
};

struct Var
{
char *id;
};

struct App
{
struct Exp *exp1;
struct Exp *exp2;
};

struct Lit
{
int value;
};

enum Type
{
LAM,
VAR,
APP,
LIT
};

struct Exp
{
enum Type type;
union {
struct Lam *lam;
struct Var *var;
struct App *app;
struct Lit *lit;
} form;
};

For those types of expression that only contain a single data member
(Var only contains a string and Lit only contains an integer), you may
choose to place their data directly into the union rather than adding a
level of indirection.

--
Simon.
Back to top
johan.tibell@gmail.com
*nix forums beginner


Joined: 18 Jul 2006
Posts: 3

PostPosted: Tue Jul 18, 2006 1:28 pm    Post subject: Good Representation for an Abstract Syntax Tree Reply with quote

I'm in the process of writing an interpreter for lambda calculus (i.e.
a small functional programming language) in C. I've previously written
one in Haskell so I understand at least some of the concepts involved
in writing an interpreter and I know how to define a grammar and use
lex/yacc to parse it. What I don't know is how to represent the
abstract syntax tree (AST) of my language in C. In Haskell I used
what's called an algebraic data type like so:

data Exp = Lam Id Exp
| Var Id
| App Exp Exp
| Lit Int

Legend:
Exp - expression (i.e. one of Lam, Var, App or Lit)
Lam - lambda abstraction (i.e. a function)
App - function application
Lit - integer literal
Id - variable name (i.e. a string)

How would I represent such a structure in C, perhaps with a union with
some sort of tag? I guess I would use a class hierarchy in an
object-oriented language but I really want to stay in C since I
consider this an educational experience for me (i.e. the goal is not to
write an interpreter as quickly as possible but rather learn how to do
it in an efficient way in C).

Once I have an AST I can begin applying transformations such as closure
and CPS (continuation-passing style) conversions to it and I think I
can achieve that on my own but right now finding a good AST
representation is holding me back.

P.S. For those who are interested my plan is to write an all C
interpreter for a strict functional language and use the Boehm garbage
collector.

Thanks in advance,

Johan Tibell
Back to top
Google

Back to top
Display posts from previous:   
Post new topic   Reply to topic Page 1 of 1 [10 Posts] View previous topic :: View next topic
The time now is Fri Nov 21, 2008 7:30 pm | All times are GMT
navigation Forum index » Programming » C
Jump to:  

Similar Topics
Topic Author Forum Replies Last Post
No new posts Can any body guide me a for a good tutorial for BIND (DNS) SHERDIL networking 3 Sat Aug 12, 2006 9:40 am
No new posts Bug#379104: ITP: complearn-mpi -- parallel quartet tree s... Rudi Cilibrasi devel 0 Fri Jul 21, 2006 11:30 am
No new posts Invalid syntax with STD() function when more than one fie... William Bronsema MySQL 1 Thu Jul 20, 2006 2:18 pm
No new posts dk-milter: syntax error in signature data Tuan Van Postfix 1 Wed Jul 19, 2006 9:58 pm
No new posts text representation of HTML Ksenia Marasanova python 5 Wed Jul 19, 2006 10:09 am

Advertising | Myspace Layouts | El libro de los nombre | Mobile Phone deals | Big Brother 9
Copyright © 2004-2005 DeniX Solutions SRL
 
Other DeniX Solutions sites: Unix/Linux blog |  electronics forum |  medicine forum |  science forum | 
Privacy Policy


Powered by phpBB © 2001, 2005 phpBB Group
[ Time: 0.4330s ][ Queries: 20 (0.3061s) ][ GZIP on - Debug on ]