======================================================================== >>> PLT makes it possible to easily implement new languages, ranging from ones that are similar to the PLT Scheme language to languages that are different in their semantics and their syntax. For example, the PLT documentation sources look like marked-up text files but are actually source code for the documentation language. Another example: ---- #lang planet jaymccarthy/datalog parent(eli,tomer). parent(regina,tomer). male(eli). female(regina). father(X,Y) :- male(X), parent(X,Y). mother(X,Y) :- female(X), parent(X,Y). mother(Who,tomer)? father(Who,tomer)? ---- ======================================================================== >>> Basic PLT Simple Scheme program: ----dr #lang scheme (define (random-element list) (list-ref list (random (length list)))) ---- Scheme is dynamically typed, so: ----dr #lang scheme (define (random-element container) (define (rand ref length) (ref container (random (length container)))) (cond [(list? container) (rand list-ref length)] [(vector? container) (rand vector-ref vector-length)] [(string? container) (rand string-ref string-length)])) ---- Files that begin with `#lang' are actually modules, we can make them provide bindings using `provide'. ----dr rand.ss #lang scheme (provide random-element) (define (random-element container) (define (rand ref length) (ref container (random (length container)))) (cond [(list? container) (rand list-ref length)] [(vector? container) (rand vector-ref vector-length)] [(string? container) (rand string-ref string-length)])) ---- ----mz #lang scheme (require "rand.ss") (random-element '("foo" "bar" "baz")) (random-element "blah") ---- We can of course use the (many) usual tools in PLT, everything that the `scheme' language provides. For example, we can provide a unicode name for the existing function: ----dr rand.ss #lang scheme (provide random-element) (define (random-element container) (define (rand ref length) (ref container (random (length container)))) (cond [(list? container) (rand list-ref length)] [(vector? container) (rand vector-ref vector-length)] [(string? container) (rand string-ref string-length)])) (provide ρ) (define ρ random-element) ---- Or rename the existing binding when providing it, avoiding the new definition: ----dr rand.ss #lang scheme (provide (rename-out [random-element ρ])) (define (random-element container) (define (rand ref length) (ref container (random (length container)))) (cond [(list? container) (rand list-ref length)] [(vector? container) (rand vector-ref vector-length)] [(string? container) (rand string-ref string-length)])) ---- We can also use macros: ----dr rand.ss #lang scheme (provide (rename-out [random-element ρ])) (define (random-element container) (define (rand ref length) (ref container (random (length container)))) (cond [(list? container) (rand list-ref length)] [(vector? container) (rand vector-ref vector-length)] [(string? container) (rand string-ref string-length)])) (define-syntax-rule (choose expr ...) (force (random-element (vector (delay expr) ...)))) (define-syntax-rule (let-rand ([x v ...] ...) body ...) (let ([x (choose v ...)] ...) body ...)) (define (random-noun) (let-rand ([adj "old" "red" "big"] [noun (choose "dog" "cake" "car") (random-noun)]) (string-append adj " " noun))) ---- And, of course, we get the usual benefits of hygiene. ======================================================================== >>> Macros and Modules PLT Scheme treats macros just as it does functions and values. For example, we can provide the macros in the last example and use them in a different file. Note that we don't have to provide the functions that macros expand to: ----dr rand.ss #lang scheme (define (random-element container) (define (rand ref length) (ref container (random (length container)))) (cond [(list? container) (rand list-ref length)] [(vector? container) (rand vector-ref vector-length)] [(string? container) (rand string-ref string-length)])) (provide choose let-rand) (define-syntax-rule (choose expr ...) (force (random-element (vector (delay expr) ...)))) (define-syntax-rule (let-rand ([x v ...] ...) body ...) (let ([x (choose v ...)] ...) body ...)) ---- ----dr #lang scheme (require "rand.ss") (define (random-noun) (let-rand ([adj "old" "red" "big"] [noun (choose "dog" "cake" "car") (random-noun)]) (string-append adj " " noun))) ---- ... and we can use the macro stepper to see how this code expands. But using the reflective capabilities of the language it seems that the module abstraction can be broken: ----mz #lang scheme (require "rand.ss") (syntax-e (car (syntax-e (cadr (syntax-e (expand-once #'(choose 1 2))))))) ---- Turning macros into backdoors into the implementation. But PLT protects syntax objects that are the results of expansion from such misuse: ----mz #lang scheme (require "rand.ss") (eval (car (syntax-e (cadr (syntax-e (expand-once #'(choose 1 2))))))) ---- ... We can also use the library in a macro. For example, to choose a random expression at compile-time we need to use `syntax-case' (PLT's low-level macro system): ----dr #lang scheme (require "rand.ss") (provide compile-choose) (define-syntax (compile-choose stx) (syntax-case stx () ;; `_' is a wildcard [(_ E1 E2) (choose E1 E2)] [(_ E1 E2 E3) (choose E1 E2 E3)])) ---- The first problem we run into is that PLT has "phase separation" -- the macro code lives in its own world, and there is no sharing between it and the runtime code. (This makes it possible to have the same semantics any way the code is compiled, Matthew's 2002 paper introduced this.) We therefore need to require the library at the syntax phase, which is where the functionality is needed: ----dr crand.ss #lang scheme (require (for-syntax "rand.ss")) (provide compile-choose) (define-syntax (compile-choose stx) (syntax-case stx () [(_ E1 E2) (choose E1 E2)] [(_ E1 E2 E3) (choose E1 E2 E3)])) ---- This means that `choose' is used at compile-time, but it is still a macro, therefore this code uses now three different running levels. Side note: the `scheme' language is intneded to be a "pretty big" language with lots of stuff in. There is also the minimal `scheme/base' language that is recommended for programs beyond the quick few-liners. In that language, the syntax phase has only `syntax-rules', and to use anything else you need to (require (for-syntax scheme/base)). A second problem here is that `E1' etc are not plain identifiers, they are "pattern bindings" that are bound by the pattern-matching aspect of `syntax-case', and can only be used in output patterns, denoted by `syntax' forms -- these forms are similar to `quote', and they have a similar reader shorthand -- #'E1: ----dr crand.ss #lang scheme (require (for-syntax "rand.ss")) (provide compile-choose) (define-syntax (compile-choose stx) (syntax-case stx () [(_ E1 E2) (choose #'E1 #'E2)] [(_ E1 E2 E3) (choose #'E1 #'E2 #'E3)])) ---- and this version is working fine: ----mz #lang scheme (require "crand.ss" "rand.ss") (define (foo1) (choose 1 2 3)) (define (foo2) (compile-choose 1 2 3)) (list (foo1) (foo1) (foo1) (foo1) (foo1) (foo1)) (list (foo2) (foo2) (foo2) (foo2) (foo2) (foo2)) ---- Say that we want to provide an interface similar to "rand.ss", except for `choose' which will perform the choice at compile-time. For this, we need to provide everything that "rand.ss" provides: ----dr #lang scheme (require (for-syntax "rand.ss")) (provide (all-from-out "rand.ss") compile-choose) (define-syntax (compile-choose stx) (syntax-case stx () [(_ E1 E2) (choose #'E1 #'E2)] [(_ E1 E2 E3) (choose #'E1 #'E2 #'E3)])) ---- except that we don't really have a runtime version of "rand.ss" to provide, we need to require it to the runtime phase too: ----dr #lang scheme (require (for-syntax "rand.ss") "rand.ss") (provide (all-from-out "rand.ss") compile-choose) (define-syntax (compile-choose stx) (syntax-case stx () [(_ E1 E2) (choose #'E1 #'E2)] [(_ E1 E2 E3) (choose #'E1 #'E2 #'E3)])) ---- Now drop the `choose': ----dr #lang scheme (require (for-syntax "rand.ss") "rand.ss") (provide (except-out (all-from-out "rand.ss") choose) compile-choose) (define-syntax (compile-choose stx) (syntax-case stx () [(_ E1 E2) (choose #'E1 #'E2)] [(_ E1 E2 E3) (choose #'E1 #'E2 #'E3)])) ---- and rename the new binding as `choose' ----dr #lang scheme (require (for-syntax "rand.ss") "rand.ss") (provide (except-out (all-from-out "rand.ss") choose) (rename-out [compile-choose choose])) (define-syntax (compile-choose stx) (syntax-case stx () [(_ E1 E2) (choose #'E1 #'E2)] [(_ E1 E2 E3) (choose #'E1 #'E2 #'E3)])) ---- `let-rand' still uses the original version of `choose' -- no conflicts. We could even use `choose' as the name of the new macro, which means that the code is trying to have two different versions of a runtime `choose' -- one that we require and one that we define: ----dr #lang scheme (require (for-syntax "rand.ss") "rand.ss") (provide (except-out (all-from-out "rand.ss") choose) choose) ; no renaming (define-syntax (choose stx) (syntax-case stx () [(_ E1 E2) (choose #'E1 #'E2)] [(_ E1 E2 E3) (choose #'E1 #'E2 #'E3)])) ---- But we don't need the runtime `choose', so we can remove `choose' from our own bindings to sort things out. ----dr #lang scheme (require (for-syntax "rand.ss") (except-in "rand.ss" choose)) (provide (all-from-out "rand.ss") ; no `except-in' choose) (define-syntax (choose stx) (syntax-case stx () [(_ E1 E2) (choose #'E1 #'E2)] [(_ E1 E2 E3) (choose #'E1 #'E2 #'E3)])) ---- ======================================================================== >>> `#lang' Syntax, and Tweaking Your Own Languge Programs in PLT always begin with a `#lang' line that specifies the language that is used for the module. ----mz #lang scheme (pretty-print (call-with-input-file "rand.ss" read)) ---- ----mz #lang scheme (parameterize ([read-accept-reader #t]) (pretty-print (call-with-input-file "rand.ss" read))) ---- So this reads as (module rand scheme ...) which is kind of like (module rand (require scheme) ...) except that `scheme' is an "initial import" which special in the way that it can affect further processing of the module body. Such modules are sometimes called "language modules", since they're used to define the "language" that the module is written in (they control the semantics, but not the concrete syntax -- that's where `#lang' comes in). Minor note: (require scheme) is equivalent to (require scheme/main), and the contents of that file is not surprising. Now that we know how the language is made, we can do the same thing we did above: require and re-provide everything from the `scheme' module. Actually, we don't even need to require, since it's already required as the initial import module: ----mz scheme-like.ss #lang scheme (provide (all-from-out scheme)) ---- To use it, we can write: ----mz (module foo "scheme-like.ss" '...code...) ---- but to make it more convenient there is a special `s-exp' language that can be used with `#lang'. This language reads the file using the default Scheme reader, and turns it into a module with the initial import specified after the `#lang s-exp' (unlike `#lang scheme' that always uses `scheme'): ----mz #lang s-exp "scheme-like.ss" '...code... ---- Now we can start tweaking our language, making it diverge from the `scheme' language in little steps. It is easy to extend the language by providing more bindings: ----dr scheme-like.ss #lang scheme (provide (all-from-out scheme) add) (define (add x . xs) (apply (cond [(number? x) +] [(list? x) append] [(string? x) string-append] ;; ... ) x xs)) ---- remove bindings: ----dr scheme-like.ss #lang scheme (provide (except-out (all-from-out scheme) +) add) (define (add x . xs) (apply (cond [(number? x) +] [(list? x) append] [(string? x) string-append] ;; ... ) x xs)) ---- and we can do that just as easily with macros and special forms: ----dr scheme-like.ss #lang scheme (provide (except-out (all-from-out scheme) + if when) add if*) (define (add x . xs) (apply (cond [(number? x) +] [(list? x) append] [(string? x) string-append] ;; ... ) x xs)) (define-syntax if* (syntax-rules () [(if* x y) (when x y)] [(if* x y z) (if x y z)] [(if* x y z . more) (if x y (if* z . more))])) ---- As usual, not everything needs to be defined in this file, ----dr scheme-like.ss #lang scheme (require "rand.ss") (provide (except-out (all-from-out scheme) + if when) choose let-rand ; from "rand.ss" add if*) (define (add x . xs) (apply (cond [(number? x) +] [(list? x) append] [(string? x) string-append] ;; ... ) x xs)) (define-syntax if* (syntax-rules () [(if* x y) (when x y)] [(if* x y z) (if x y z)] [(if* x y z . more) (if x y (if* z . more))])) ---- It's very common to implement parts of the new language in several files, then have the main language module only assemble the pieces. (E.g., see the `scheme' implementation in the "scheme/main.ss" module.) And of course you don't need to provide awkward names -- you can just as well provide your own bindings under the familiar names: ----dr scheme-like.ss #lang scheme (require "rand.ss") (provide (except-out (all-from-out scheme) + if when let) choose (rename-out [add +] [if* if] [let-rand let])) (define (add x . xs) (apply (cond [(number? x) +] [(list? x) append] [(string? x) string-append] ;; ... ) x xs)) (define-syntax if* (syntax-rules () [(if* x y) (when x y)] [(if* x y z) (if x y z)] [(if* x y z . more) (if x y (if* z . more))])) ---- The result is a language that can be very close or very far to the default `scheme' language, or any other language that you choose to tweak in this way: ----dr #lang s-exp "scheme-like.ss" (define (random-noun) (let ([adj "old" "red" "big"] [noun (choose "dog" "cake" "car") (random-noun)]) (+ adj " " noun))) ---- Again, the thing to note here is how hygiene makes it all just work. There's no need to worry about name clashes, or abstraction leaks. In this new language `+' is bound to our `add' function, which in its implementation uses the "builtin" addition. The quotes are intended: obviously, in this world there is no real difference between functions that are "builtins" and ones that are defined inside the language. This is directly in line with the usual Scheme tradition of complete flexibility: nothing is sacred, and everything can change -- except that this is now done robustly: no (set! + add) side-effects and worry about breaking library code, and not limited to just changing functions. ======================================================================== >>> PLT #%Magic Tricks PLT has a number of "special" macros that are implicit in the code. For example, there is a special `#%module-begin' macro that implicitly wraps a module body. For example, the `scheme' language provides a `#%module-begin' binding that prints out values of all non-definition expressions, which can be seen when we inspect what the code expands to: ----dr #lang scheme (+ 1 2) ---- One thing that this implicit macro can do is some initialization and/or finalization side-effects when needed: ----dr quiet.ss #lang scheme (provide (except-out (all-from-out scheme) #%module-begin) (rename-out [module-begin #%module-begin])) (define-syntax-rule (module-begin x ...) (#%module-begin (define old-printer (current-print)) (current-print void) x ... (current-print old-printer))) ---- ----mz #lang s-exp "quiet.ss" (+ 1 2) ---- Another common use for it is when there is some module-global transformation that is needed -- for example, some wrapping of each toplevel expression: ----dr verbose.ss #lang scheme (provide (except-out (all-from-out scheme) #%module-begin) (rename-out [module-begin #%module-begin])) (define-syntax-rule (module-begin x ...) (#%module-begin (display (banner)) (begin (printf "> ~s\n" 'x) x) ... (display "Good bye\n"))) ---- ----mz #lang s-exp "verbose.ss" (define x 100) (+ x 3) ---- Our new `#%module-begin' is expanding into a printout of the (quoted) expression, followed by the expression itself -- and all of that is done inside the `#%module-begin' that comes from the `scheme' language, which will then print out the results of evaluating non-definition expressions. Most `#%module-begin' definitions (including the one from the `scheme' language) splice expressions that appear in a `begin' form, which allows the "other role" of `begin' in Scheme as a macro-level splicing construct. Our version doesn't do that (which might be a desired feature): ----mz #lang s-exp "verbose.ss" (begin (define x 100) (define y 3)) (+ x y) ---- and doing so is usually easy to get with a simple recursive macro: ----dr verbose.ss #lang scheme (provide (except-out (all-from-out scheme) #%module-begin) (rename-out [module-begin #%module-begin])) (define-syntax-rule (module-begin x ...) (#%module-begin (wrap x) ...)) (define-syntax wrap (syntax-rules (begin) ; `begin' is a keyword [(wrap (begin x ...)) (begin (wrap x) ...)] [(wrap x) (begin (printf "> ~s\n" 'x) x)])) ---- ----mz #lang s-exp "verbose.ss" (begin (define x 100) (define y 3)) (+ x y) ---- Note: there is a `syntax/flatten-begin' library that provides a convenient function for this. *** Exercise: Make a language where each toplevel literal string gets `display'ed on the output, so: #lang s-exp "str-show.ss" "Definitions...\n" (define x 100) (define y 3) "Test: " (+ x y) "... should be " 103 will produce this output Definitions... Test: 103 ... should be 103 Hint: this should be implemented by a macro that uses `syntax-case' and fender expressions to identify syntaxes that are string literals, for example: (define-syntax (foo stx) (syntax-case stx () [(foo x) (string? (syntax-e #'x)) #''a-string] [(foo _) #''not-a-string])) *** Solution: ----dr str-show.ss #lang scheme (provide (except-out (all-from-out scheme) #%module-begin) (rename-out [module-begin #%module-begin])) (define-syntax-rule (module-begin x ...) (#%module-begin (wrap x) ...)) (define-syntax (wrap stx) (syntax-case stx (begin) [(wrap (begin x ...)) #'(begin (wrap x) ...)] [(wrap x) (string? (syntax-e #'x)) #'(display x)] [(wrap x) #'x])) ---- ----mz #lang s-exp "str-show.ss" "Definitions...\n" (define x 100) (define y 3) "Test: " (+ x y) "... should be " 103 ---- `#%module-begin' is not the only magic syntax in PLT, there are a few others. `#%top' is a syntax that gets implicitly wrapped around identifiers that are unbound in the module, and will usually lead to a syntax error when expanded. For example, use the syntax debugger with: ----dr #lang scheme (list x y z) ---- Redefining it is even easier than `#%module-begin', here's a simple example: ----dr quoted-vars.ss #lang scheme (provide (except-out (all-from-out scheme) #%top) (rename-out [top #%top])) ;; Note that `#%top' is wrapped as an implicit pair, not a list (define-syntax-rule (top . x) 'x) ---- ----mz #lang s-exp "quoted-vars.ss" (list x y z) ---- *** Exercise: Create a language where identifiers that look like `:foo' are self evaluating as in Common Lisp. Doing this via `#%top' means that in this language you can still use such identifiers as bindings -- only unbound occurrences will have this property. Hint: this should use a fender expression with `syntax-e' too, except that now the result is expected to be a symbol that matches a certain pattern (when converted into a string). The `identifier?' function is another useful syntax utility: it returns true for identifier syntaxes, where `syntax-e' is a symbol. *** Solution: ----dr keywords.ss #lang scheme (provide (except-out (all-from-out scheme) #%top) (rename-out [top #%top])) (define-syntax (top stx) (syntax-case stx () [(top . x) (and (identifier? #'x) (regexp-match? #rx"^:" (symbol->string (syntax-e #'x)))) #''x] [(top . x) #'(#%top . x)])) ---- ----mz #lang s-exp "keywords.ss" (list :foo 1 :bar 2) (let ([:foo :bar]) (list :foo 1 :bar 2)) ---- Probably the most useful magic syntax is `#%app' which is implicitly used in all function applications. Usually, an (#%app f x ...) "expands" to the built in notion of function application (actually, it's a primitive form that is compiled by mzscheme as a function call), but overriding it is useful for languages that have different semantics for function applications. ----dr traced.ss #lang scheme (provide (except-out (all-from-out scheme) #%app) (rename-out [app #%app])) (define-syntax-rule (app f x ...) ;; note that applications in the resulting pattern use our own ;; `#%app' implicitly (begin (printf " ~s\n" '(f x ...)) (f x ...))) ---- ----mz #lang s-exp "traced.ss" (+ 1 (* 2 3)) (define (fact n) (if (<= n 0) 1 (* n (fact (sub1 n))))) (fact 3) ---- *** Exercise: Lispers like when () evaluates to the empty list without a quote. "Fix" the `scheme' language by adding this feature. *** Solution: ----dr nil.ss #lang scheme (provide (except-out (all-from-out scheme) #%app) (rename-out [app #%app])) (define-syntax app (syntax-rules () [(app) '()] [(app f x ...) (f x ...)])) ; <-- implicit #%app here ---- ----mz #lang s-exp "nil.ss" (cons 1 ()) ---- Here is another language, where applications of non-procedure values returns a list: ----dr quoted-apps.ss #lang scheme (provide (except-out (all-from-out scheme) #%app) (rename-out [app #%app])) (define-syntax-rule (app f x ...) (if (procedure? f) (f x ...) (list f x ...))) ---- ----mz #lang s-exp "quoted-apps.ss" ('+ 1 ('+ 1 1) 3) ('html ('head ('title "A Title")) ('body (('bgcolor "black")) ('p "...text..."))) ---- This `app' macro has a common bug -- the `f' expression is evaluated twice, which is bad for side-effects, for runtime, and because it leads to a possible code size blowup. Fixing it is easy thanks to hygiene, and because a `let' doesn't add any runtime cost (part of Kent Dybvig's "Macro-Writer's Bill of Rights"): ----mz quoted-apps.ss #lang scheme (provide (except-out (all-from-out scheme) #%app) (rename-out [app #%app])) (define-syntax-rule (app f x ...) (let ([f* f]) (if (procedure? f*) (f* x ...) (list f* x ...)))) ---- However, the code blowup (not the runtime and the side-effects) problem holds for the `x ...' arguments too, as can be seen in the expansion of this bit: ----dr #lang s-exp "quoted-apps.ss" ('+ ('+ ('+ ('+ 1)))) ---- While there are ways to deal with this with the simple `syntax-rules' (the naive solution doesn't work, since `x* ...' will not "invent" new names), they tend to be awkward. It's much easier to switch to a procedural macro and use the `generate-temporaries' function to generate the needed names. ----mz quoted-apps.ss #lang scheme (provide (except-out (all-from-out scheme) #%app) (rename-out [app #%app])) (define-syntax (app stx) (syntax-case stx () [(app f x ...) (with-syntax ([(x* ...) (generate-temporaries #'(x ...))]) #'(let ([f* f] [x* x] ...) (if (procedure? f*) (f* x* ...) (list f* x* ...))))])) ---- And this version is now perfectly fine. ----dr #lang s-exp "quoted-apps.ss" ('+ ('+ ('+ ('+ ('+ ('+ ('+ ('+ ('+ ('+ ('+ ('+ ('+ 1))))))))))))) ---- Note that `with-syntax' is similar to a `let', except that it binds template variables on the LHS to the syntax that the RHS expressions evaluate to. *** Exercise: Combine these two tools -- the self-quoting free identifiers, and the application of values that are not procedures. Hint: No new code is needed, only plumbing. *** Solution: ----dr quoted.ss #lang scheme ;; this is only doing the identifier plumbing work (require (only-in "quoted-vars.ss" #%top) (only-in "quoted-apps.ss" #%app)) (provide (all-from-out scheme "quoted-vars.ss" "quoted-apps.ss")) ---- ----mz #lang s-exp "quoted.ss" (list x y z) (x y z) ---- *** Exercise: Something that Scheme newbies often wonder is how would the language look if multiple values were spliced into function application expressions. Implementing such a language is easy: you need a `vals->list' macro such that (vals->list E) evaluates to the list of values that `E' evaluates to, and then use it in a new `#%app'. Don't forget that the new evaluation rule should apply to all expressions in an application form, including the first one. *** Solution: ----dr vals.ss #lang scheme (provide (except-out (all-from-out scheme) #%app) (rename-out [app #%app])) (define-syntax-rule (vals->list expr) (call-with-values (lambda () expr) list)) (define-syntax-rule (app x ...) (let ([l (append (vals->list x) ...)]) (apply (car l) (cdr l)))) ---- ----mz #lang s-exp "vals.ss" (define (foo x) (values (sub1 x) x (add1 x))) (list (foo 10) (foo 20)) (define (bar) (values add1 3)) ((bar)) ---- Finally, here is a language where all functions and all applications are curried. Note that it defines its own `define' in addition to `lambda'. This is a common idiom when you want to change the meaning of functions: if the usual `define' is provided, then its syntactic sugar form will expand to its own `lambda'. This is done manually here, but note also the `syntax/define' library which provides a utility that does this transformation recursively for any number of parens, constructing the correct lambda expression, with an argument specifying the `lambda' identifier to use (this is of course not needed in this language). ----dr curried.ss #lang scheme (provide (except-out (all-from-out scheme) lambda λ define #%app) (rename-out [lambda* lambda] [lambda* λ] [define* define] [app* #%app])) (define-syntax define* (syntax-rules () [(_ (name x xs ...) e es ...) (define name (lambda* (x xs ...) e es ...))] [(_ name e) (define name e)])) (define-syntax lambda* (syntax-rules () [(_ (x) e es ...) (lambda (x) e es ...)] [(_ (x xs ...) e es ...) (lambda (x) (lambda* (xs ...) e es ...))])) (define-syntax app* (syntax-rules (+) ;; HACK! -- `+' is not curried [(_ + xs ...) (+ xs ...)] [(_ f x) (f x)] [(_ f x xs ...) (app* (f x) xs ...)])) ---- ----mz #lang s-exp "curried.ss" (define (add x y) (+ x y)) add (add 5) ((add 5) 7) (add 5 7) ; translates to a curried application (define add5 (add 5)) (add5 3) ---- This doesn't work: ----mz #lang s-exp "scheme-like.ss" (define (add x y) (+ x y)) (map (add 5) (list 1 2 3)) ---- Here is a non-solution: ----dr curried.ss #lang scheme (provide (except-out (all-from-out scheme) lambda λ define #%app) (rename-out [lambda* lambda] [lambda* λ] [define* define] [app* #%app])) (define-syntax define* (syntax-rules () [(_ (name x xs ...) e es ...) (define name (lambda* (x xs ...) e es ...))] [(_ name e) (define name e)])) (define-syntax lambda* (syntax-rules () [(_ (x) e es ...) (lambda (x) e es ...)] [(_ (x xs ...) e es ...) (lambda (x) (lambda* (xs ...) e es ...))])) (define-syntax app* (syntax-rules (+ map list) ;; MORE HACKS! [(_ + xs ...) (+ xs ...)] [(_ map xs ...) (map xs ...)] [(_ list xs ...) (list xs ...)] [(_ f x) (f x)] [(_ f x xs ...) (app* (f x) xs ...)])) ---- ----mz #lang s-exp "curried.ss" (define (add x y) (+ x y)) (map (add 5) (list 1 2 3)) ---- We could add more bindings, even all of `scheme', but even if we do that... ----mz #lang s-exp "curried.ss" (define (add x y) (+ x y)) (let ([lst list]) (map (add 5) (lst 1 2 3))) ---- A better way to do this is to make application forms use the value of the applied function rather than the identifier. Something like: (app f x ...) --> (if (memq f (list + map list ...many more...)) (f x ...) (curried-app f x ...)) ======================================================================== >>> Applicable Structs But an even better way is to somehow tag functions that are created in the "curried world", so that they can be reliably distinguished from "foreign" functions. A common way to do this is to use applicable structs: a struct that wraps a function and behaves like it when it is applied. ----dr curried.ss #lang scheme (provide (except-out (all-from-out scheme) lambda λ define #%app) (rename-out [lambda* lambda] [lambda* λ] [define* define] [app* #%app])) ;; tagged functions (define-struct curried (fun) #:property prop:procedure (struct-field-index fun)) (define-syntax define* (syntax-rules () [(_ (name x xs ...) e es ...) (define name (lambda* (x xs ...) e es ...))] [(_ name e) (define name e)])) ;; our `lambda*' creates these tagged functions (define-syntax lambda* (syntax-rules () [(_ (x) e es ...) (make-curried (lambda (x) e es ...))] [(_ (x xs ...) e es ...) (make-curried (lambda (x) (lambda* (xs ...) e es ...)))])) ;; no hacks (define-syntax curried-app (syntax-rules () [(_ f x) (f x)] [(_ f x xs ...) (curried-app (f x) xs ...)])) ;; dynamic dispatch on the type of the function (define-syntax (app* stx) (syntax-case stx () [(_ f x ...) (with-syntax ([(x* ...) (generate-temporaries #'(x ...))]) #'(let ([f* f] [x* x] ...) (if (curried? f*) (curried-app f* x* ...) (f* x* ...))))])) ---- ----mz #lang s-exp "curried.ss" (define (add x y) (+ x y)) (let ([lst list]) (map (add 5) (lst 1 2 3))) ---- [An alternative approach is to "somehow" wrap all foreign functions in a curried version, avoiding the runtime cost of a dispatch on every function call. This is difficult to do (eg, Typed Scheme). In this case it'll not help much in runtime since the curried versions of the foreign functions will consume space, but it will make it possible to use them as if they were curried too -- (map (add 5)), for example.] Note that in this implementation there is no actual need for applicable structs within the language -- when we know that a function is curried, we can just pull out its function value. However, it is still useful to provide these functions to foreign modules and still be able to use them there. (But again, this can be avoided by wrapping functions on their way out.) Applicable structs are useful in other cases for more than just simple tagging. For example, they can be used to package up functions with information about them. In fact, this is how keyworded functions are implemented in the `scheme' language (see the Scheme workshop paper). *** Exercise: Implement a language with a new `procedure-arguments' function that returns the list of argument names that procedures expect. For simplicity, ignore nested argument sugars, display the arguments as is (no need to pull out optional names), and have it apply only for definitions inside the language. *** Solution: ----dr args.ss #lang scheme (provide (except-out (all-from-out scheme) lambda λ define) (rename-out [lambda* lambda] [lambda* λ] [define* define]) procedure-arguments) (define-struct with-args (fun args) #:property prop:procedure (struct-field-index fun)) (define-syntax define* (syntax-rules () [(_ (name x xs ...) e es ...) (define name (lambda* (x xs ...) e es ...))] [(_ name e) (define name e)])) (define-syntax-rule (lambda* args e es ...) (make-with-args (lambda args e es ...) 'args)) (define (procedure-arguments proc) (and (with-args? proc) (with-args-args proc))) ---- ----mz #lang s-exp "args.ss" (define (foo x y z) 1) (procedure-arguments foo) ---- ======================================================================== >>> Breaking Hygiene This is a "short" version of the unhygienic story, and a few related tricks. Traditionally, Lisp implementations and early Schemes use an unhygienic macro system, where source code is made of plain S-expressions holding *symbols*. Newer Scheme implementations use a hygienic macro system, and in PLT this is the default mode (but there is a `mzlib/defmacro' library for legacy unhygienic code). Roughly speaking, this is done by using *identifiers* rather than plain symbols -- and an identifier is a kind of a symbol that also carries its lexical context with it (this is sometimes called its "color"). Hygienic macros avoid variable capture of two kinds: * names that are introduced by the macro do not clash with names in sub-expressions that are passed from user code to it -- for example, we use temporary variable names without worrying whether subexpressions use the name, * names that the macro references are coming from the lexical context of its definition, rather than the lexical context of its use. For example, if a macro uses `if', then the definition of `if' that was in effect at the context where the macro is defined is used. The first kind of capture is solved in an unhygienic macro system by using `gensym' to create "uninterned" symbols that are not `eq?' to any other symbol. This is a little inconvenient as you need to be careful to use it for all names that are introduced by the macro, and it can be made more convenient with a macro (which you would use to generate your macros). The second kind of capture is usually dismissed as not too important -- for example, having a different `if' at the lexical context of the macro use seems unreasonable (in fact, Common Lisp forbids redefining names that are part of the standard). However, this kind of hygiene is crucial for using macros in a module system. For example, in one of the earlier examples we have provided `choose' and `let-rand' from a module, and their implementation used an unprovided `random-element' function. If our macros were not hygienic, then we would be forced to provide `random-element' as well to have it bound where it is used -- and we would not be able to freely rename bindings as we did above. This aspect of hygiene is even more important for implementing languages as we do in this tutorial: the ability to define new meanings for the same name is one of the most important ways we defined new languages. "Same" here is in the "same concrete symbol" sense -- the actual *identifier* is different, of course, and that is exactly why hygiene solves this problem. At this point it should be clear that having hygiene as the default mode is a very desirable feature. Yet there are cases where unhygienic macros are useful. From a highlevel point of view, there are two kinds of unhygienic macros that are useful. The first kind is when a new name is based on a name given to the macro. For example, say that we want to define a `defvar' macro so that (defvar foo 9) defines a (hidden) thread cell that is initialized to 9, a `foo' function that returns the value of the parameter, and a `set-foo!' function that changes it. (Note: a thread cell is a kind of a box with a per-thread value.) Such macros must use the explicit transformer function form -- they cannot be done with just `syntax-rules'. ----mz #lang scheme (define-syntax (defvar stx) (syntax-case stx () [(_ id value) (identifier? #'id) (with-syntax ([set-id! (datum->syntax stx (string->symbol (format "set-~a!" (syntax-e #'id))))]) #'(begin (define cell (make-thread-cell value)) (define (id) (thread-cell-ref cell)) (define (set-id! new) (thread-cell-set! cell new))))])) (defvar foo 9) (foo) (set-foo! 99) (foo) ---- Note how we get the `set-foo!' name: we begin with the given identifier syntax, get its contents (which is the plain 'foo symbol), turn that into a `set-foo!' symbol, and finally we use `datum->syntax' to turn the resulting symbol back into an identifier: this function receives a first argument which is a piece of syntax to copy the lexical information from. We therefore generate a `set-foo!' identifier with the same lexical context as the identifier that we received. Note also that `cell' follows the usual hygiene rules: each use of our macro generates a new `cell' identifier, distinct from any other identifiers including other uses of `defvar'. This kind of a hygiene-breaking macro is fine: the new name that we introduce is based on a user-given name, and this should be used when the intention is to bind the new name. (Ryan Culpepper called such macros "Morally Hygienic" in a blog post.) The second kind of a hygiene breaking macro is when you want your macro to create a known name from scratch (it must be known to be accessible to the user). For example, a `while' macro that binds `abort'. A popular way to do this is similar to the above -- except that 'abort is used directly. There are, however, subtle scope problems that can come up and manifest in hard to find bugs. A much better solution is to provide the name as a keyword through the usual channels: provide it from your module too. This way, users can deal with the name as usual (e.g., rename it, etc), and the scope is as expected. PLT supports this using "syntax parameters". The idea is for us to bind `abort' as a macro that raises an appropriate error, and rebind it around the body of a `when' macro -- this is easy to do using `make-rename-transformer' which creates an alias. This rebinding is in effect while the body is expanded, which means that we get to change the meaning of the identifier. (This makes it similar to `parameterize' except that the rebinding happens at the time the body is expanded rather than at runtime.) Finally, we provide `abort' in addition to the `while' macro. ----dr while.ss #lang scheme ;; get syntax parameters to our syntax code (require scheme/stxparam) (provide abort while) (define-syntax-parameter abort (lambda (stx) (raise-syntax-error 'abort "can only be used in a `while' loop"))) (define-syntax (while stx) (syntax-case stx () [(_ cond body ...) #'(let loop () (let/ec escape (syntax-parameterize ([abort (make-rename-transformer #'escape)]) (when cond body ... (loop)))))])) ---- ----mz #lang scheme (require "while.ss") (define x 10) (while (> x 0) (printf "x = ~s\n" x) (when (= x 5) (abort)) (set! x (sub1 x))) ---- ======================================================================== >>> Custom Syntax *** Hacks using the paren-shape property *** Simple readtable extension, like "λ x y . (+ x y)" *** A full reader example, using the scribble inside reader, and the language that displays top-level strings as-is ======================================================================== >>> Other Features PLT has many features -- and any of them can be used to make defining your own language easier. Some features are more useful for this than others, and this tutorial covered the major tools. Additional features that are more relevant are: * inferred names, expansion contexts, definition contexts, and a *lot* more syntax functionality * readtables, custom readers, parser tools * units, packages, classes * continuations of all kinds, continuation marks * threads, synchronizations, cml ;; Local variables: ;; fill-column: 72 ;; mode:scheme ;; hide-local-variable-section: t ;; eval:(load (concat default-directory "tutorial.el")) ;; eval:(add-color-pattern "^=\\{72\\}\n" '*/blue4) ;; eval:(add-color-pattern "^ *>>>.*\n" 'yellow/red4-bold 0 t) ;; eval:(add-color-pattern "^ *\\*\\*\\* .*" 'underline-bold 0 t) ;; eval:(tutorial-parse-buffer) ;; mode: auto-fill ;; End: