Narsi: Not A Real Scheme Interpreter
Britt H. Park
April 2, 2004
Contents
Chapter 1
Introduction1
Narsi is a not quite Scheme interpreter. By “not quite” I mean that it basically
smells, tastes and feels like Scheme but doesn’t follow precisely the R5RS
specification. This is on purpose; R5RS makes a number of bad syntactic and
semantic choices. Chapter 3 deals with the ways in which Narsi diverges from
the standard. It is compatible with The Structure and Interpretation of
Computer Programs. That is, one should be able to work one’s way through the
book without running up against more than a few Narsi shortcomings. These
have nothing to do with the standard, per se, but simply the lack of
implementations of the picture language, threading, and the amb operator. A
motivated individual should be able to write two of these extensions in
Narsi. Threading, however, would take considerable modification of the
Narsi interpreter.
The guiding design principle for Narsi over the years has been to give me the
most fun possible. For the most part it has done that. However, it has reached a
stage where I think it might just be useful to others. So recently its guiding
design principle has been to make it as useful as possible. For instance I’ve added
perl style regular expressions and changed string parsing and printing to make
using those regular expressions as easy as possible. Keep tuned and see
what else I do, or better yet let me (britt@sciencething.org) know what
you’d like to see. The latest addition is a set of primitives for using TCP
sockets. As a long term and certainly unreachable goal I’d like to see
Narsi become as powerful as perl without introducing similar levels of
ugliness. Stop laughing. Perl style regular expressions have been a first
step.
Chapter 2
How to Run Narsi
To run Narsi you first need to compile it. To compile it you need to get the tar
ball which can be found at
http://www.sciencething.org/geekthings/index.html.
Narsi was first written for the Atari ST but later moved through Minix to linux
as its primary platform. Now I do most of my development on OS X.
Narsi should compile and run on most unixoid platforms (including Windows,
with a full cygwin installation). First make sure that you have a working libgd,
libjpeg, libpng, libreadline, and libgmp (version 4 or greater). This should be true
of any recent linux installation. Type make, copy bnscheme somewhere in your
execution path, copy startup.scm anywhere you want, and set the “SCHEME”
environment variable to point to the directory that contains it. To run, simply
type bnscheme.
If you have any problems along the way they should be fairly simple to fix.
The Makefile is trivial and easy to modify. You can compile Narsi without
readline support by simply commenting out the READLINE define in scheme.h.
With a little minor surgery you can remove gdprim.c from the compile and
eliminate Narsi’s dependency on libgd, libjpeg, and libpng. It’s essentially
impossible to remove the dependency on libgmp since Narsi’s entire arithmetic
system is built around it. Feel free to consult me at britt@sciencething.org.
Once you have Narsi running, take note of the following things. If an
expression goes hay-wire or you just want to stop a long running program hit
Ctrl-c. Narsi, if you compiled it with readline, has line editing and history. To
terminate Narsi evaluate (exit). To get the best performance out of your
scheme use (macro-expand-all).
The command line syntax for Narsi is:
bnscheme [schemefile]* [-a arg*]? [<schemeinput]
|
Schemefiles are loaded at startup and land you at the interactive prompt,
unless schemeinput is specified. Then Narsi is run in batch mode, reading its
input silently and exiting at end schemeinput. Alternatively schemeinput can
be piped into Narsi. If there is a -a argument, all following arguments are
collected into the scheme variable ARGV.
Narsi, on startup will also load files listed in a .narsirc file in the user’s home
directory. .narsirc is simply a list of paths to scheme files, one to a line. Each
file will be loaded in order. The paths are relative to the user’s home
directory.
Chapter 3
R5RS Differences
The differences between Narsi and real Scheme can be divided into two classes:
major and minor. The major divergences are few but fundamental and the minor
differences are numerous but insignificant.
3.1 Major Divergences
-
Continuations
- Narsi does support continuations, as in call/cc. It is
implemented as a function for people who insist. I won’t vouch for it’s
complete correctness. It passes a handful of tests I’ve thrown at it.
Call/cc is a mechanism for a non-linear execution model that only a
masochist could like. Narsi supports coroutines, a superior means for
achieving the same ends as call/cc. Here’s a trivial example of Narsi’s
coroutines:
(define (f1)
(define c1 (make-empty-context))
(define c2 (make-new-context '(f2 c2 c1) (current-environment)))
(define (iter n)
(if (>= n 1000)
'done
(begin
(display "f1: ")
(print n)
(print (transfer c1 c2 (sqrt n)))
(iter (+ n 1)))))
(iter 1))
(define (f2 me other)
(define (iter n)
(if (>= n 1000)
'done
(begin
(display "f2: ")
(print n)
(print (transfer me other (* n n)))
(iter (+ n 1)))))
(iter 1)
(transfer me other 0))
|
One thing to remember about coroutines is that after any coroutine has
returned to the top level the context associated with it is no longer valid.
You can use such an invalid context in a transfer call but the results are
undefined. It won’t crash the interpreter, but you’ll possibly get into an
infinite loop (which can be stopped with a ^c) or get other inexplicable
results and error messages.
-
Booleans
- There is no boolean primitive data type in Narsi. Instead Narsi uses
the old fashioned convention that the empty list is false and anything else is
true. This is analogous to the C language convention of anything
that casts to 0 being false. Booleans are a waste of effort and can
only degrade performance in a purely interpreted environment like
Narsi’s.
-
Characters
- There is no character primitive data type in Narsi. Narsi only
knows about strings. For the purpose of the char operations that
Narsi implements it expects to get strings of length one for its character
arguments. (Actually the character argument can be any string;
only the first character is used.) I chose to leave out the character
data type because R5RS character constants are grotesque and
unnecessary.
-
Numbers
- Complex number is not a primitive data type in Narsi. Integers,
rationals and reals are supported. Integers and rationals are always
considered exact and reals inexact. The absurdly involved numeric scheme
specified in R5RS serves no useful purpose in practical applications and
would just drag down Narsi’s performance.
-
Arithmetic Operators
- The arithmetic operators (+, -, *, and /) are strictly
binary as opposed to R5RS’s variadic operators. This is done for efficiency’s
sake. If you want the R5RS behavior you can write replacements for the
operators in lisp using the _+, _-, _*, and _/ primitives, which are just
aliases for the binary operators. I chose this route again for efficiency’s
sake.
-
define
- Defines can come anywhere in Narsi. Their placement is restricted in
R5RS to make compiler writers’ lives easier, I imagine. Narsi is strictly an
interpreter and doesn’t care where a define appears. The semantics of
internal defines is different from R5RS also. Defines are executed
sequentially rather than simultaneously. If you want the R5RS behavior use
letrec instead.
-
Macros
- Macros in Narsi are decidedly non-hygienic. This is because I’m lazy
and because hygiene is overrated. Narsi’s macro system is based on macro
constructs which are completely analogous to lambda expressions. Here’s an
example macro definition in Narsi:
(define let
(macro (vars . body)
`((lambda ,(map car vars)
,@body) ,@(map cadr vars))))
|
The only difference between a macro and a lambda is that macros
must take one and only one argument which is bound to the list
of unevaluated arguments at call time; the body of the macro is
evaluated just as a lambda body is but the result of the evaluation is
then evaluated again. First pass: syntax transformation; second
pass: execution. One advantage of this system is that macros are
truly first class and can be handed to higher order procedures like
map.
You may have noted in previous versions of this document the absence of ,
in the example macro above. That was simply because I hadn’t gotten
around to implementing it properly. Well now I’ve sorta’ gotten around to
it by implementing quasiquote (actually backquote in Narsi) in scheme.
The result is a significant slow down. However there is a way around the
performance penalty. There are a macro and a procedure, (macro-expand
symbol) and (macro-expand-all) respectively which expand either one
named procedure or macro’s macros in place. (Is that an awful
sentence or what?) Advantage: generally much quicker execution.
Disadvantage: you lose the ability to change macros and have the changes
automatically propagate to those procedures and macros that have been
expanded.
Of course some of you are laughing at my stone-age implementation
technique of not expanding macros in place. But having macros be as first
class as lambdas has its uses.
-
Case Sensitivity
- Narsi identifiers are case sensitive which is not true of R5RS.
Case insensitivity is unforgivable in any programming language (or
filesystem for that matter). Perhaps scheme has a history of case
insensitivity; I see no reason to carry on the tradition.
-
I/O
- Input and output in Narsi are pretty much completely different from and (I
believe) better than that in R5RS. The syntax for I/O is mostly derived
from C and should be easy to master by anyone familiar with that
language. All that being said I have implemented a number of R5RS I/O
functions for those who actually like them.
-
Mutability
- In Narsi, as opposed to R5RS, all data objects created by constant
expressions like '(a b c) or "foobar", are mutable. My rationale for
this decision is that first, in most of the scheme programming I’ve
done the issue has never come up, and, second, that checking for
“constness” for data access would take up run time for Narsi’s
implementation, which, as a strict interpreter, needs every cycle it can
find.
3.2 Minor Divergences
The minor differences of Narsi from R5RS are covered in Chapter 5
which is a listing of all primitives and basic syntax. There are a few
R5RS expressions that are left out and a few Narsi-specific expressions
included.
Chapter 4
Data Types
Narsi has 18 primitive data types. That may sound scary, but many are opaque
or even invisible from the programmer’s point of view. Here is a listing of each
of the 18 with short descriptions. One thing to note is that Narsi as
opposed to R5RS has no constant data types. For example, (define a
'(a b c)) defines a mutable list. Constant data types make sense in a
compiled scheme but don’t add anything useful to and interpreted scheme,
IMHO.
-
PAIR
- A PAIR is a cons cell as in any lisp. An expression can be tested for
PAIRness by calling pair?. Pairs in Narsi are pretty much identical
to PAIRs in any other lisp.
-
SYMBOL
- A SYMBOL is a name, usually used for referring to variables.
SYMBOLS in Narsi are essentially identical to those in other lisps.
The rules for whether a string of characters is a legal symbol name
are unique to Narsi but shouldn’t cause any confusion for those
familiar with other lisps. The best way of describing what is a legal
SYMBOL name in Narsi is by a regular expression which is roughly
"[^ \t0-9\(\[{][^ \t)\]}]*". In addition to this, a legal SYMBOL
name can’t begin with #(. For a just exactly perfect understanding
of legal SYMBOL names you’ll need to look at the source code in
lexer.c.
-
STRING
- A STRING in Narsi, when you type it, behaves very much like a
string in C. A string is delimited by double quotes. You can use several
escape sequences for special characters. \" stands for a " character,
\t stands for a tab, \n for a newline, \r for a carriage return, and
\\ for a backslash character. Any other \ followed by a character is
treated literally as a \ followed by that character. This does have the
odd effect of showing, on printing, the string "\\b\\c\\d" when the
original string you typed was "\b\c\d". I chose to implement string
reading this way in order to make typing regexes easier. If you’ve typed
"\b" in a non regex context, seeing "\\b" is an immediate indicator
that \b is not a special escape sequence.
-
INT
- An INT is an integer. It can be as many digits as memory allows.
It is always considered exact?. When you divide two INTs you end
up with an INT as result if the numerator is exactly divided by the
denominator or a RATIONAL if not.
-
RATIONAL
- A RATIONAL is a rational number, e.g. 32/3. Internally
they are always represented in lowest terms. If you type 30/4 you
get 15/2 in response. RATIONALs are always considered exact. An
arithmetic operation on a pair of RATIONALs will return an INT if
the result is an exact integer and a RATIONAL otherwise.
-
FLOAT
- A FLOAT is a floating point number. It is always considered
inexact and has a precision that varies. Any floating point number
that you type will be as precise as you type it give or take decimal
conversion imprecision. Divisions present a special situation, however.
When you divide one floating point number by another the result
is sometimes not precisely representable and is therefore cutoff at a
certain number of bits precision. By default that precision is 320 bits
but can be changed by set-precision!.
-
LOCALENV
- A LOCALENV is a context in which scheme code can be
executed. It is created whenever a closure is created. You’ll see them
dangling along at the end of a closure if you print one out. When
programming in Narsi you’ll never need to pay attention to it. So
“pay no attention to the man behind the curtain.” You can play tricks
with LOCALENVs and closures but I’ll let you figure that out for
yourself. The printed representation of a LOCALENV is <ENV:i j>,
for example, if you have a LOCALENV with two local variable i and
j. There is no simple programmatic way of creating a LOCALENV.
Again I’ll leave that to the curious and persistent.
-
GLOBALENV
- A GLOBALENV is also a context within which scheme
code can be executed; it’s the top level environment, the one that top
level expressions are evaluated relative to. It is an opaque data type
and there is nothing you can do with one except pass it to eval. Once
again “pay no attention to the man behind the curtain”.
-
PRIMITIVE
- A PRIMITIVE is just that, a primitive built in procedure.
These to are opaque. Just use them wherever you would use a
procedure.
-
VECTOR
- A VECTOR is a constant time access list. See Chapter 5
for procedures that create, and manipulate VECTORs. In Narsi as
opposed to R5RS vectors are self-evaluating.
-
GD
- A GD is a drawing frame. You create them with gd-create and can
allocate colors, draw lines, draw dots, and export them as jpeg or png
files. See Chapter 5.
-
GDFONT
- A GDFONT is a font for drawing strings on a GD. See
Chapter 5. GDFONTS are not user creatible. There are a fixed
number of predefined GDFONTS.
-
PORT
- A PORT is a file descriptor. They can be created by fopen and
there are three predefined PORTs, stdin, stdout, and stderr Several
I/O primitives use them. See Chapter 5.
-
CONT
- Completely user invisible. Just here for completeness
-
REGEX
- A REGEX is a compiled regular expression created by the
make-regex primitive. They can be used for string matching and
manipulation. See Chapter 5.
-
CONTEXT
- A CONTEXT is the state of execution of a coroutine thread.
CONTEXTS are created by calls to make-empty-context which
produces, not surprisingly, an empty CONTEXT, ready to be the
“from” argument in a transfer call, or to make-new-context which
produces a CONTEXT ready to be a “to” argument in a transfer
call.
-
STACK
- Completely user invisible. Just here for completeness.
-
SOCKET
- A standard TCP socket. They come in three flavors, server
sockets, created by make-server-socket , client sockets, created
by make-client-socket, and server connection sockets, created by
socket-accept when a client makes a new connection to a server
socket.
Chapter 5
Primitives and Basic Syntax
The following is a listing of all the primitives and basic syntax of Narsi.
-
(! n)
- Calculates the factorial of a non-negative integer. Undefined behavior
for other arguments.
-
(!= a b)
- Numerical not equals operator.
-
#f
- Evaluates to '().
-
#t
- Evaluates to the symbol t which is true in a boolean sense.
-
(
x y) - Raises its first argument to the second argument power.
-
(_* a b)
- Multiplies two numbers. Alias for *.
-
(_+ a b)
- Adds two numbers. Alias for +.
-
(_- a b)
- Subtract two numbers. Alias for -.
-
(_/ a b)
- Divides two numbers. Alias for /.
-
(abs x)
- Returns the absolute value of its argument.
-
(accumulate f init l)
- Takes a function of two arguments, a starting value
and a list as arguments and returns the result of applying the function
repeatedly to the values in the list. e.g.
(accumulate + 0 '(3 4 5 6)) => 18
|
This is a lisp procedure.
-
(add-streams s1 s2)
- Produces a new stream which is the sum of two streams.
See SICP for information about delayed evaluation and streams. This is a
lisp procedure.
-
(alarm seconds)
- Takes one argument which is a timeout in seconds. When the
alarm runs out, the Narsi interpreter is interrupted and sent back to the
top level prompt.
-
(and exp1 exp2 ...)
- Returns the logical and of its arguments. This is a macro
and does short circuit evaluation.
-
(append l1 l2)
- Takes two lists as arguments and appends the second to the
first non-destructively.
-
(append! l1 l2)
- Takes two lists as arguments and appends the second to the
first. The first list is modified in the process. It’s the equivalent
of:
-
(apply f args)
- Applies the first function argument to the arguments contained
in the second list argument. Narsi does not support the variadic form of
apply.
-
(arccos x)
- Calculates the cos-1 of its argument.
-
(arcsin x)
- Calculates the sin-1 of its argument.
-
(arctan x)
- Calculates the tan-1 of its argument.
-
(assoc m alist)
- Searches the second argument, which is an alist, for a pair
that matches it’s first argument and returns it. If there is no match in the
alist then '() is returned. Equivalence is determined by equal?. An alist is
simply a list of pairs.
-
(assq m alist)
- Searches the second argument, which is an alist, for a pair that
matches it’s first argument and returns it. If there is no match in the alist
then '() is returned. Equivalence is determined by eq?. An alist is simply a
list of pairs.
-
(atom? exp)
- Returns true if its argument is a number, symbol, or string.
-
(backquote exp) usually abbreviated `exp
- Just like a ' except that it
recursively searches the backquoted expression for unquotes whose
arguments are evaluated and placed into the result returned by backquote.
Unlike R5RS backquote does not recurse through vectors. I don’t have a
good justification for this and am reconsidering it
-
(begin exp1 exp2 ...)
- Evaluates all of its arguments in sequence and returns
the result of the last argument. This is a special form that does not
create a new environment frame. Kind of weird but there you have
it.
-
(block)
- Takes no arguments and allocates a new block of 2000 nodes for the
interpreter. It’s never necessary to call this but sometimes you can get
improved performance by allocating extra blocks over and above the blocks
automatically allocated by Narsi.
-
(call/cc func)
- I won’t try to explain this. I can’t even vouch for its
correctness, although it has passed all the tests I could find to throw at
it.
-
(call-with-current-continuation func)
- Longhand for call/cc.
-
(call-with-input-file string proc)
- As in R5RS.
-
(call-with-output-file string proc)
- As in R5RS.
-
(capitalize string)
- Returns a string with the first character capitalized.
-
(cxr exp) (cxxr exp) (cxxxr exp)
- Narsi provides all car and cdr
combinations up to three deep.
-
(cd directory)
- Changes the current working directory to directory.
-
(ceiling x)
- Returns
x
of its argument x.
-
(char->integer c)
- Accepts a character and converts it into an integer
representing its ascii code.
-
(char-read port)
- Reads a character from a port. Ports are objects returned by
a call to fopen.
-
(char-write port c)
- Writes a character to a port. Argument 1 is the port and
argument 2 is the character.
-
(chomp string)
- Removes the last character of string if the last character is a
newline.
-
(chop string)
- Removes the last character of string.
-
(close-input-port port)
- As in R5RS.
-
(close-output-port port)
- As in R5RS.
-
(command . args)
- Executes a command composed of args which must all be
strings.
-
(cond ((pred exp ...) ...)
- Behaves exactly as in R5RS.
-
(cons a b)
- Behaves exactly as in R5RS.
-
(cons-stream a d)
- Behaves as described in SICP.
-
(context? c)
- Returns true if c is a CONTEXT.
-
(copy-context c)
- Performs a deep copy of a context. It allows you to save a
context for later reuse. It cannot be called on the currently active context
(the current thread of execution). Contexts in Narsi are mutated on use.
This is unlike the behavior of call/cc in R5RS in which a continuation is
immutable.
-
(cos x)
- Calculates the cosine of it’s single argument.
-
(create-gd w h)
- Creates a gd drawing frame. It accepts two arguments which
specify the width and height of the frame.
-
(current-environment)
- Returns the value of the current evaluation
environment.
-
(cwd)
- Return the current working directory as a string.
-
(define a b) or (define (a x ...) exp ...)
- Works as in R5RS.
In Narsi this is a macro. The real defining primitive syntax is
set-env!.
-
(define-macro (name args ...) body)
- Syntactic sugar for (define name
(macro (args ...) body). Identical to non-hygienic macro facilities in
most other scheme implementations.
-
(delay exp)
- Returns its argument expression wrapped up for later execution by
force. See SICP.
-
(display val) (display val port)
- Prints its argument to the screen or to
port with no trailing newline. An alias of princ.
-
(display-stream stream)
- Print out its one stream argument. See SICP.
-
(do ...)
- This is a looping construct. See R5RS for it’s precise syntax. This is a
macro which uses the symbol *do* internally. Watch out for name
collisions.
-
(enclosing-environment env)
- Returns the environment one scope
out from env. If env is already the-environment it just returns
the-environment.
-
(environment? exp)
- Returns t if its argument evaluates to an environment.
-
eof
- Special symbol used to designate end of file. Typing eof at top level
terminates Narsi.
-
(escape message)
- This escapes from the current point of execution to an
arbitrary place further up the execution stack. Heres’s a simple
example:
(define (test-escape)
(let [(r (make-escape))]
(if (null? r)
(target)
r)))
(define (target)
(if (< (random 10) 5)
(escape "Abnormal Termination")
"Normal Termination"))
|
In this example r would be replaced with error handling code.
-
(eq? a b)
- Returns t if its two arguments are the same object.
-
(equal? a b)
- Returns t if its two argument are equivalent. It works by
recursively comparing compound objects. See R5RS.
-
(error string obj)
- Takes two arguments, a string and an object. It
prints those arguments and returns to the top level interpreter
loop.
-
(eval exp env)
- Evaluates its first argument, an arbitrary expression, in the
context of its second, an environment. See the-environment and
current-environment.
-
(even? n)
- Returns t if its argument is divisible by 2.
-
(exact? n)
- Of no real use in Narsi. Integers and rationals are alway exact and
floats are always not.
-
(exec program arg ...)
- Execs program, a string value, passing all of the
string valued args to it.
-
(exit)
- Terminates Narsi.
-
(exp x)
- Returns ex where x is the first and only argument.
-
(expmod x i m)
- If x is the first argument, i is the second argument, and m
is the third argument, returns xi mod m. Only works for integer
i.
-
(fclose port)
- Closes port. Note, unreferenced ports get closed on garbage
collected.
-
(fflush port)
- Flushes the file associated with its single port argument.
-
(filter f l)
- Returns a list of the elements of its second argument (a list) for
which it’s first argument (a function taking one argument returning true or
false) is true.
-
(find-var res)
- Finds all global variables whose names match the regular
expression in the res string.
-
(flatten l)
- Recursively turns an arbitrarily nested list into a flat (no nested
sublists) list.
-
(floor n)
- Takes one argument x and returns
x
.
-
(flush)
- Flushes stdout.
-
font-giant
- Variable that points to the giant font for gd drawing.
-
(font-height font)
- Returns the height of a gd font in pixels.
-
font-large
- Variable that points to the large font for gd drawing.
-
font-medium-bold
- Variable that points to the medium-bold font for gd
drawing.
-
font-small
- Variable that points to the small font for gd drawing.
-
font-tiny
- Variable that points to the tiny font for gd drawing.
-
(font-width font)
- Returns the width of a gd font in pixels.
-
(fopen name mode)
- Returns a port or '() if the file could not be opened with
the given mode. The mode argument is the same as for the standard C
library fopen call. There is no fclose function. Files are closed when they
are garbage collected.
-
(force delayed)
- Evaluates a delayed expression.
-
(for-each f l1 ...)
- Works like map except for side-effects only. Does not
return a list of the function evaluation results. (It returns the symbol done
actually.)
-
(foreach item l exp ...)
- Not to be confused with for-each. This executes
exp ... with item bound to each member in l in order. Kind of like perl’s
foreach.
-
(fork)
- Executes a fork and returns the child pid to the parent process and 0 to
the child process.
-
(fpart n)
- Returns the fractional part of its argument.
-
(fprinc port obj)
- Prints obj to the given port.
-
(fprint port obj ...)
- Prints its obj arguments to the given port and
appends a newline.
-
(fprintf port format exp ...)
- Works like printf but prints to the given
port.
-
(fread port)
- Does the same as read except from a port.
-
(fread-block port count)
- Reads a block of count bytes from port. Returns a
vector of ints that may be shorter than count or eof on end of
file.
-
(fread-byte port)
- Reads a byte from port and returns it as an integer. -1 is
returned on end of file.
-
(fterpri port)
- Print a newline to port.
-
(fwrite-block out vector)
- Writes the vector of ints (bytes), vector, to
out.
-
(fwrite-byte port int)
- Write a byte specified by the integer int to the
specified port.
-
(gc)
- Forces garbage collection. Returns the number of free nodes available.
-
(gcd a b)
- Calculates the greatest common denominator of its two arguments.
-
(gd->jpeg gd name quality)
- Creates a jpeg file from a gd drawing
object. Quality runs from 0 to 100. 95 is about the largest useful
setting.
-
(gd->png gd name)
- Creates a png file from a gd drawing object.
-
(gd-alloc-color gd r g b)
- Allocates a color with r, g, and b value for the gd
drawing object gd. Returns the color index.
-
(gd-line gd color x1 y1 x2 y2)
- Draws a line in a gd drawing object.
-
(gd-point gd color x y)
- Draws a pixel of the given color at the given
position.
-
(gd-string gd font color x y)
- Draws a string at the specified locations.
-
(getenv s)
- Gets the value of the environment variable specified by
s.
-
(get-file name)
- Slurps up the entire contents of name into a single
string.
-
(grep regular-expression-string lines)
- Works the way the command line
grep works but takes and returns a list of strings.
-
(grep-file regular-expression-string namein nameout)
- Works the way
the command line grep works taking input from the file named namein and
writing output to the file named nameout.
-
(heap-sort-list l cmp)
- Sorts the list, l non-destructively using the
comparison function of two arguments, cmp.
-
(inexact? x)
- Returns true if x is inexact, i.e., a float number.
-
(info . args)
- Runs info on its string arguments.
-
(integer->char i)
- Converts an integer into a character.
-
(integer? n)
- Returns true if its argument is an integer.
-
(ipart n)
- Returns the integer part of a number.
-
(isqrt i)
- Returns the integer square-root of an integer.
-
(iview . args)
- Runs Imagemagick’s display on its string arguments.
-
(lcasl string)
- Returns a string that has all upper case characters converted to
lowercase.
-
(length l)
- Returns the length of a list.
-
(let ((a exp) ...) exp ...)
- Let construct. Binds a set of variables
locally and executes a sequence of expressions. Named let is also
supported.
-
(let* ...)
- Just like let, but guarantees that variables are bound left to right.
No named variant.
-
(letrec ...)
- Just like let, but binds the local variables simultaneously. No
named variant.
-
(list a b ...)
- Converts its arguments into a list.
-
(list->vector l)
- Returns a vector containing the members of its list
argument
-
(list? exp)
- Returns true if its argument is a list.
-
(load name)
- Loads the file in its argument as lisp. Can be used recursively.
-
(ls . args)
- Runs ls on its string arguments..
-
(macro-expand name)
- This is a macro that expands all macros in a
top-level defined procedure or macro. Speeds heavily macro’d code up
significantly.
-
(macro-expand-all)
- This macro-expands all macros and procedures at top
level. Running this command great improves Narsi performance for an
application.
-
(make-client-socket servername port)
- Creates a new socket with a
connection to servername on port port.
-
(make-empty-context)
- Makes an empty context object ready to be the “from”
argument of a transfer call. Basically this creates the dummy context
necessary for the coroutine initiating thread.
-
(make-escape)
- Creates a point in an execution to which calls further down the
call chain can escape. See the entry for escape for an example.
make-escape always returns '(). Therefore escape calls should always
return a non-nil value to distinguish an error return.
-
(make-new-context 'expression environment)
- Creates a new context ready
to be the “to” argument in a transfer call. This is the call that makes a
new coroutine thread.
-
(make-regex s)
- Compiles a regular expression (pcre) for use by match-regex or
match-regex-spans.
-
(make-server-socket port)
- Create a server socket ready to accept connections
on port port.
-
(make-string len opt-char)
- Makes a string of the given length containing the
given character or ‘ ’ if the character is left out.
-
(make-string-reader s)
- Creates a function which can be used to read
s-expressions from a string.
-
(make-vector len opt-value)
- Creates a new array object and fills it with
whatever value is passed in or null if no value is passed.
-
(man . args)
- Runs man on its string arguments.
-
(map func l1 ...)
- Passes each member of each of the lists (first list,
first argument, etc. to func and returns a list composed of the
results.
-
(match-regex regex s)
- Tries to match regex to s. If it is successful it returns
a list containing as its first member the full match and as its further
members matches to any parenthesized sub-expressions of regex. Returns
nil if there is no match.
-
(match-regex-g regex s)
- Returns all the substrings matching regex in
s.
-
(match-regex-spans regex s)
- Tries to match regex to s. If it is successful it
returns a list containing as its first member the span of the full match and
as its further members the spans of any parenthesized sub-expressions of
regex. A span is a pair of integers, the first of which is the position of the
starting character of the match and the second of which is the
position of the last character in the match. Returns nil if there is no
match.
-
(match-regex-spans-g regex string)
- Works the way match-regex-spans
does but globally, finding all matches to regex.
-
(max a b ...)
- Returns the maximum value of all of its arguments.
-
(mem)
- Prints out memory usage information. Specific to Narsi.
-
(memq val l)
- Returns the tail of l starting at the first member of l which is
eq? to val or null if it doesn’t find it.
-
(memv val l)
- Just like memq but uses equal? for matching.
-
(min a b ...)
- Returns the minimum value of all its arguments.
-
(modulo n m)
- Returns n mod m.
-
(more . args)
- Runs more on its string arguments.
-
(n-times n exp1 ...)
- Executes the sequence of expressions exp ... n
times.
-
(newline) or (newline port)
- Prints a newline. As in R5RS with (terpri).
-
nil
- Variable that evaluates to '().
-
(not exp)
- Inverts the value of a logical expression.
-
(notrace)
- Turns off debugging.
-
(null? exp)
- Returns true if exp is null. Identical to not in Narsi.
-
(number? exp)
- Returns true if exp is a number.
-
(number->string n)
- Returns a string representation of a number.
-
(odd? exp)
- Returns true if its argument is odd.
-
(one-map f l)
- Just like map but only takes monadic function. This is slightly
faster than map for this common case.
-
(open-input-file name)
- As in R5RS.
-
(open-output-file name)
- As in R5RS.
-
(optimize-one name)
- Optimizes the global function named name. Right now
this simply optimizes unnecessary begin clauses. I hope to add more good
stuff soon.
-
(optimize-all)
- Optimizes all global functions and macros.
-
(or exp ...)
- Returns the logical or “of” its arguments. This is a short
circuiting operator.
-
(pair? exp)
- Returns true if its argument is a pair.
-
(pload name)
- Loads a scheme file using the path variable by trying to load
the specified file for each string valued path in path.
-
(port? exp)
- Returns true if its argument is a port.
-
(primitive? exp)
- Returns true if its argument is a primitive procedure.
-
(princ arg)
- Prints its argument to the console. A synonym for display.
-
(print arg ...)
- Prints its arguments. No newlinesxs.
-
(printf format ...)
- Prints its arguments using format for pictured output.
There are for special symbol that can be used in format:
a,
t,
n,
and 
, which print an argument, a tab character, a newline, or a
,
respectively.
-
(popd)
- Changes the working directory to the most recently pushded
directory.
-
(pushd dirname)
- Changes the working directory to dirname and saves the
previous working directory on a stack which can be returned to by using
popd
-
(put-file name s)
- Writes the contents of string s to file named name.
-
(quotient a b)
- Calculates the quotient a/b dropping the remainder
-
(random n)
- Calculates a random integer between 0 and n - 1.
-
(read) or (read port)
- Reads an S-expression.
-
(read-file->strings name)
- Reads a (text) file from the file, name, into a list
of strings.
-
(readdir dirname)
- Returns a list of strings containing the names of all files
(and directories) in the directory dirname.
-
(real-random-bits bits)
- Returns an n-bit random number. The random
source is /dev/random and on some platforms suitable for generating true
random numbers usable by cryptographic functions.
-
(regex? exp)
- Returns true if its argument is a REGEX.
-
(regex-substitute regex string sub)
- regex is a previously compiled
regular expression, string is the string to match against, and sub is a list
of substitution commands, which are composed of either strings, which are
added to the substitution verbatim, integers which correspond to
submatches, a function of one argument, returning a string, expecting a list
of all the matching strings, or a pair whose car is a function of one string
argument, returning a string, and whose cdr is an integer corresponding to
one of the submatches.
-
(regex-substitute-g regex string sub)
- Just like regex-substitute but
acts globally and substitutes for all matches.
-
(remainder a b)
- Returns the remainder of a/b.
-
(rename pattern sub)
- A bulk regular expression based file renamer. Renames
each file in the current working directory that matches pattern to the
string resulting from sub.
-
(reverse l)
- Reverses a list.
-
(round x)
- Rounds a float or a rational to the nearest integer.
-
(sed pattern subs lines)
- Kind of like its namesake command-line command.
Lines is a list of strings, pattern is a regular expression string, and subs is
a list of substitution commands as in regex-substitute (p. 32). Sed
performs a global substitution on each line of the list.
-
(sed-file pattern subs namein nameout)
- Kind of like its namesake
command-line command. namein is the input file name, nameout is the
output file name, pattern is a regular expression string, and subs
is a list of substitution commands as in regex-substitute (p.
32). Sed performs a global substitution on each line of the input
file.
-
(set! a val)
- Assigns a value to a variable, creates the variable in the
global environment if no enclosed environment contains the variable
name. Note that this is different from R5RS which requires that a
variable already be defined for set! to work. This is a special
form.
-
(set-car! pair value)
- Sets the car of a pair to the given value.
-
(set-cdr! pair value)
- Sets the cdr of a pair to the given value.
-
(set-env! a definition)
- Sets its variable, a, to definition in the current
environment. If the variable doesn’t exist it creates it in the current
environment. This is the real defining special syntax. Define is a macro
built around set-env!.
-
(set-precision! bits)
- Sets the precision of floating point operations in bits.
Narsi starts with a default of 320 bits which is approximately 100 decimal
digits.
-
(setenv name value)
- Sets the environment variable specified by name to
value.
-
(setpgrp)
- Makes the current process a process group leader, and therefore
makes the current process’s parent not have to call wait.
-
(shell)
- Runs an interactive shell. It defaults to bash but can be changed to
whatever shell you prefer. The definition is in startup.scm.
-
(sin theta)
- Calculates sin(
) to approximately 50 decimal places.
-
(socket-accept sock)
- Waits for a new connection on the server socket sock.
Returns a new socket.
-
(socket-read-byte sock)
- Reads a single byte from sock and returns it as an
integer.
-
(socket-read-string sock)
- Reads a newline terminated string from a
socket.
-
(socket-readn sock n)
- Reads n bytes from socket sock returning them as a
vector of integers. This will return a vector shorter than n on an error or
end of stream. Returns eof on a read count of 0.
-
(socket-write-byte sock byte)
- Writes a single byte to sock. Byte is an
integer.
-
(socket-writen sock vec)
- Writes the contents of vec to socket sock. The
members of vec must be integers or characters. Returns t if successful or
nil if not.
-
(socket-write-string sock string)
- Writes a string to sock.
-
(spawn . args)
- Runs a command specified by string arguments, args, in the
background. The child process does not have to be waited for. “This is
nonsense up with which I will not put.”
-
(sqrt x)
- Calculates
.
-
stdin, stdout, stderr
- These point to ports corresponding to standard input,
output and error.
-
(strcat . strings)
- Concatenates all of its STRING arguments.
-
(string->number s)
- Returns a number from a string representation of
one.
-
(string-append s1 s2)
- Returns s1 with s2 appended. s1 is unchanged.
-
(string-copy s)
- Returns a copy of its string argument.
-
(string-fill! s char)
- Replaces all characters in s with char.
-
(string-length s)
- Returns the length of its string argument.
-
(string-read port)
- Reads a full line of text from its port argument and
returns it as a string including a newline.
-
(string-ref string offset)
- Returns the character at the given position in
the given string. Error on out of range indeces. Indeces start at
0.
-
(string-set! s offset char)
- Sets the offsetth character to char. Error on
out of range indeces.
-
(string-write port string)
- Writes the string to the given port.
-
(string<? s1 s2) (string<=? s1 s2) (string=? s1 s2)
- Returns
true if s1 is less than, less-than or equal to, equal to, respectively,
s2.
-
(string>=? s1 s2) (string>? s1 s2)
- Returns true if s1 greater-than or
equal to, or greater-than, respectively, s2.
-
(string? arg)
- Returns true if arg is a string.
-
(strings->write-file sl name)
- Writes a list of strings to the given file
name.
-
(substring string start end)
- Returns the substring of string starting at
start and ending at end. To work gracefully in certain situations
substring will return an empty string if given an end argument that is
precisely one less than its start argument. Otherwise it gives an argument
range error.
-
(symbol->string symbol)
- Converts symbol to a string.
-
(symbol<? symb1 symb2)
- Returns true if symb1 is < symb2
-
(symbol>? symb1 symb2)
- Returns true if symb1 is > symb2
-
(symbol? arg)
- Returns true if arg is a symbol.
-
(system string)
- Runs the command line in string.
-
(system->strings string)
- Runs the command line in string and reads the
output into a list of strings.
-
(tan theta)
- Returns tan(
).
-
(terpri)
- Prints a newline. Synonymous with (newline)
-
the-environment
- Is a variable that always evaluates to the top level
environment. For use with eval.
-
(time exp)
- Prints the elapsed time taken to execute exp and returns the value
returned by exp.
-
(trace)
- Turns on debugging.
-
(transfer from to return-value)
- Transfers control from the current thread
of execution to the context designated by to, saving the current context in
from. The destination context receives return-value as the return value of
its call to transfer, if it had made one.
-
(truncate number)
- Truncates a number to its integer part.
-
(ucase string)
- Returns a string which has each character converted to upper
case.
-
(unquote exp) usually written ,exp
- Within a backquoted construct
evaluates exp.
-
(vector a b ...)
- Makes a vector of its arguments.
-
(vector-ref vector offset)
- Returns the offsetth member of a vector.
-
(vector-set! vector offset value)
- Sets the offsetth member of vector to
value.
-
(vector->list vector)
- Converts a vector to a list.
-
(vector-length vector)
- Returns the length of a vector.
-
(vector? arg)
- Returns true if arg is a vector.
-
(vim filename)
- Runs the VIM editor on filename.
-
(wait)
- Waits for a child process to exit. Returns a pair whose car is the child
pid and whose cdr is the exit status.
-
(while pred ret-exp body ...)
- While pred evaluates to true, executes body.
Returns ret-exp.
-
(write obj) or (write obj port)
- As in R5RS. Pretty much the same as
print and fprint in function.
-
(xx filename)
- Runs GNU Emacs in the terminal on filename.
-
(x filename)
- Runs GNU Emacs in its own window on filename.
Chapter 6
Strange and Amusing Features of Narsi
Narsi has a number of unique and interesting/horrifying features (depending on
your point of view).
6.1 First Class Macros
As mentioned earlier macro closures are just as first class as lambda closures. The
first practical benefit is that macro closures behave identically to lambda closures
with regard to higher order functions.
(map macro1 '(a b c d e))
|
is just as valid as
(map lambda1 '(a b c d e))
|
and behaves as expected, that is, as if macro closures were simple procedures.
6.2 First Class Environments
Environments in Narsi are first class. They can be created by the constant top
level environment the-environment, and the functions (current-environment)
and (enclosing-environment env). Take a look at the wacky implementation
of backquote in startup.scm to see an example application of this feature, used
to avoid inadvertent variable capture.
One can also create an environment with arbitrary bindings by using the
idiom:
(define env
(let ((a (+ 2 3)) (b 'a) (c (cons 1 2)))
(current-environment)))
|
6.3 Really First Class Closures
Since Narsi is strictly an interpreter, closures are constructed by consing the
lambda (or macro) expression with the definition time environment. Closures are
therefore s-exprs and can be manipulated as such. Among other things this
allows the straightforward implementation of the macro expander and simple
optimizer found in startup.scm.
Chapter 7
Using The Debugger
Narsi has a primitive but useful debugger. To start it, evaluate (trace). To turn
it off, evaluate (notrace). When debugging is on, every time you evaluate an
expression, you are presented with a prompt. You have two choices: type s with a
return and step into the evaluation, or type anything else and step over the
evaluation (and all its sub-evaluations). Here’s a transcript of a session debugging
“!” (the factorial function.)
(! 10)
Working...
EVAL [1] EXP= (! 10)
?> s
EVAL [2] EXP= !
?> n
EVAL [2] VAL= ((lambda (n) (define (iter p n) (if (= n 0) p ...
EVAL LIST [2] UNEV= (10)
?> s
EVAL [3] EXP= 10
?> n
EVAL [3] VAL= 10
EVAL LIST [2] VAL= (10)
EVAL SEQUENCE [1] UNEV= ((define (iter p n) (if (= n 0) p ...
?> s
EVAL [2] EXP= (define (iter p n) (if (= n 0) p (iter (* n p) ...
?> n
EVAL [2] VAL= iter
EVAL [1] EXP= (iter 1 n)
?> s
EVAL [2] EXP= iter
?> n
EVAL [2] VAL= ((lambda (p n) (if (= n 0) p (iter (* n p) (- n 1)))) ...
EVAL LIST [2] UNEV= (1 n)
?> s
EVAL [3] EXP= 1
?> n
EVAL [3] VAL= 1
EVAL [3] EXP= n
?> n
EVAL [3] VAL= 10
EVAL LIST [2] VAL= (1 10)
EVAL SEQUENCE [1] UNEV= ((if (= n 0) p (iter (* n p) (- n 1))))
?> s
EVAL [1] EXP= (if (= n 0) p (iter (* n p) (- n 1)))
?> s
EVAL [2] EXP= (= n 0)
?> n
EVAL [2] VAL= ()
EVAL [1] EXP= (iter (* n p) (- n 1))
?> s
EVAL [2] EXP= iter
?> n
EVAL [2] VAL= ((lambda (p n) (if (= n 0) p (iter (* n p) (- n 1)))) ...
EVAL LIST [2] UNEV= ((* n p) (- n 1))
?> s
EVAL [3] EXP= (* n p)
?> n
EVAL [3] VAL= 10
EVAL [3] EXP= (- n 1)
?> n
EVAL [3] VAL= 9
EVAL LIST [2] VAL= (10 9)
EVAL SEQUENCE [1] UNEV= ((if (= n 0) p (iter (* n p) (- n 1))))
?> n
EVAL [1] VAL= 3628800
3628800
|
Note that you are always presented with a prompt with lines beginning
EVAL, EVAL LIST, or EVAL SEQUENCE. These correspond to a simple
expression about to be evaluated, a list of arguments about to be evaluated or
sequence of expressions (as inside a lambda expression) respectively.
Two options may not seem like much and someday I’ll make the debugger
more sophisticated, but it’s quite useful as it stands. A good rule of thumb when
you’re starting with the debugger is to type ‘s’ if you have any doubt about
whether you want to step into an expression or over it.
Chapter 8
Extending Narsi
Extending Narsi is fairly easy. Look at bnscheme.pdf which is an annotated
listings of the Narsi codebase and comes with the distribution. That
should give you a clear enough idea of the way Narsi works to make
modifications.
Adding a new primitive to Narsi is a low-tech affair. Simply add a new
routine that takes an OBJ* list as an argument and register it by calling
add_primitive() on it. Model your code after an existing primitive. Things
become a little more complicated if you need to create a new primitive data type.
You’ll need to add your data to the OBJ union and write new accessor macros. If
your new data type holds a resource or allocates memory you’ll need to modify
the garbage collector to release the resource or memory at gc time. You will
also need to modify the printer in printer.c to print a user readable
form of your new data type, that is if the new data type is ever user
visible.
I partially apologize for not providing a shared library mechanism for
extensions, because its partly laziness on my part. It is also a justifiable design
decision, however. Narsi has a very small memory footprint, so shared
libraries wouldn’t save much virtual memory. Shared libraries add to build
complexity, and my goal with Narsi is to keep the Makefile simple enough
that a simple make will be all that is needed to compile on all supported
platforms. Personally I’m sick to death of automake, which works beautifully
if one’s platform is officially supported but leaves one in the lurch if
not.
Chapter 9
To Be Done
Here’s the current list of Things To Be Done.
-
Complete Unixoid System Call Interface
- I’m
making slow but deliberate progress. I’ve implemented readdir, chdir,
fork, exec, and wait. In addition through the system, fork, and exec
calls all unix command line tools are available. I will be adding, for
efficiency reasons, many unix system calls as primitives.
-
Socket Support
- The basics of socket support are in place for TCP. I’ll be
adding more bells and whistles in time.
-
Performance Optimization
- Narsi is already quite efficient, as fast
as some byte compiled schemes. But there’s always room for
improvement. I may even byte compile it someday, “RSN”.
-
R5RS Compatibility
- Narsi is already a subset of R5RS and there aren’t
any things that I’m aware of that I’d like, further from it, to include,
but you never know.
-
Better Reader
- Narsi’s reader is a recursive descent parser and is currently
more than a bit of a stack pig. So, on architectures with limited C run
time stacks even trivial data can cause a stack overflow. A simple 10000
element list causes the OS/X version to crash. (There is a workaround;
before running Narsi run unlimit in tcsh or ulimit -s 65536 in
bash. This only postpones the moment of pain to larger s-lists.) I plan
on fixing up the reader soon.
-
User Suggested Features
- I’m waiting.