Scheme Programming Language
Why R6RS is controversial
I've seen a few links about R6RS on c.l.s, LtU, reddit, and other forums, and many of the comments seem to be along the lines of "cool, it would be great if Scheme had those new features." If that is your opinion you can just choose any Scheme implementation and it will have all those features and more. If you are curious why R6RS is controversial at all, why there is a community vote, and why calls for boycotts and for alternatives have been made, read on.

[Disclaimer: I dropped out of the R6RS discussions fairly early because it was hopelessly far away from what I would even consider, and because the editors were not responsive to my suggestions. I am _not_ an unbiased reporter, but I doubt very much you'll be able to find one who actually knows what s/he is talking about. And despite an initial intention of writing this as a justification for a "no" vote, I found that those more patient than I have fought on and the editors became more responsive, and by the latest draft (r5.93rs) R6RS had improved quite a bit. As the editors are still working and this isn't even the final draft, I can't even be sure at this point I will vote "no."]

As a bit of background, the basis for much of the criticism can be found in the first sentence of the introduction to the report:

    Programming languages should be designed not by piling feature on top of feature, but by removing the weaknesses and restrictions that make additional features appear necessary.

Thus, as opposed to almost any other language where a rich feature set is considered a good thing, the goal in Scheme is to find the simplest and most general way to be able to express any given feature. For example, instead of specifying any core looping constructs like FOR or WHILE loops, Scheme made only the "proper tail recursion" requirement, allowing loops to be expressed as recursive procedure calls.
This is at the same time simpler and more general than any single loop syntax, and allows us to build any number of efficient loop syntaxes with macros. Likewise, Scheme does not provide any core forms for local variables because they can be implemented as syntactic sugar for procedure application.

In a sense, the point is to be the most axiomatic language possible. Like mathematicians who struggled to reduce geometry beyond Euclid's 5 postulates, Scheme tries to reduce programming to its most essential components (while still being high-level and generally useful and not reducing so far as Turing machines or SK combinators, which could be thought of as non-Euclidean programming :)

This does not mean all features should be rejected out of hand, but rather that any new feature should be viewed with initial suspicion. We should attempt to break that feature into the smallest and most general essential primitives, so that the new feature can be expressed as a library. If it's not clear yet how to reduce the feature, it probably requires more research and isn't ready for the official standard that all Schemes must conform to. If it is nonetheless essential it is usually best to provide a high-level interface with the low-level details left unspecified, so that it could later be re-defined in terms of some simpler primitive. For example, a simple TCP/IP interface could be specified that may later be defined in terms of BSD sockets.

Some desired features can be reduced in more than one way, none of which seems any more axiomatic than the others. A module system, for instance, can be expressed entirely with lexical scope and macros, usually with a few minor extensions to the macro system. Alternately it can be expressed as simple namespace management, with a few primitives for handling that. Alternately modules can exist outside the code, rather than being built within it. None of these are clearly better.
However, a high-level module syntax can be defined which can be implemented on top of any of these, and indeed the R6RS modules are so defined. [Note R6RS makes a core language vs. library distinction, but it isn't especially meaningful, as some of the libraries imply semantic changes to the language as a whole.]

A consequence of this axiomatic approach is not to over-specify things, because that prevents further exploration in new features and feature reduction, and restricts implementation strategy. As a result the Scheme community is rich in ideas and implementations. Many of the complaints against R6RS are precisely that it over-specifies.

Below, for the record, I summarize some of the more controversial issues people have with R6RS (based on the r5.93rs draft). I include both complaints I agree with and those I don't, but have undoubtedly missed many and misstated others, so if you have an issue not mentioned please reply with it (preferably in summarized, non-ranting form if you can restrain yourself).

------------------------------------------------------------------------

* IDENTIFIER-SYNTAX

  Identifier syntax means that the macro system can expand a single identifier even when not the first symbol in an expression. Thus when you see an identifier, it may not actually be a real variable reference, which can be confusing both for humans and other macros which want to analyze code. It is a substantial complication to the semantics of the language, with arguable benefits.

* Exceptions

  Many things specified simply as "errors" in R5RS (with unspecified behavior) are now required to signal exceptions. The exceptions themselves fit within a complex hierarchy.

* Module system

  - The enforced phase separation disallows some implementation strategies and extensions.
  - The versioning system is complex and not obviously necessary. Something like it could always be added later.
  - Libraries need to be wrapped in an extra layer of parentheses, as opposed to a single definition at the top of the file.

* Unicode

  The standard makes Scheme (non-optionally) Unicode-specific, and defines the character data-type as Unicode scalar values. This prevents small implementations which only want to deal with ASCII (e.g. in embedded systems), implementations which want to support Unicode but want a different meaning of character (e.g. grapheme clusters), and implementations which want to support a different character altogether. There are a number of alternative proposals to Unicode including Mojikyo, eKanji, TRON, and UTF-2000. Scheme has been around for a long time, Lisp even longer, and many are hesitant to wed themselves to a single character set forever. Behavior for Unicode for those implementations which use it could always be specified optionally (or even in a separate report).

* STRING-REF is recommended to be constant time

  This discourages a number of implementation strategies that use variable width character encodings or alternate string representations such as ropes or trees. It is easy to provide a string API that is convenient to use and efficient for both traditional and alternative string representations.

* Safety

  R6RS makes a (possibly too) strong claim about safety, and introduces an exception type for implementation restrictions.

* Square brackets

  Making [] identical to (), apart from all the arguments about which looks better, breaks the entire axiomatic spirit and prevents alternative extensions from using []. It also introduces a trivial stylistic distinction where none existed before, and puts Scheme among the ranks of languages where programmers need to agree on a style guideline (there are many variations already) before collaborating on a project.
* IEEE-754 values (-0.0, infinities, and NaN)

  Many disagree whether these values belong in the language, what their exactness is, and what the behavior of various operations on them should be. The latest draft makes them optional, so this isn't likely to be terribly controversial.

* CALL/CC

  A commonly enough used abbreviation for call-with-current-continuation used in talking about the operator, and already supported by some systems, some argue the operator is used rarely enough (and is supposed to be so) that the abbreviation isn't needed. At any rate, no other procedure in the language has two names.

* Comment Syntax

  #; expression comments and #| ... |# block comments have been added to the language, though are not needed.

* Bytevector Syntax

  #vu8(...) reads as a bytevector. Bytevectors themselves are not so controversial, though people disagree on the names and any external representation.

------------------------------------------------------------------------

The following are technically part of the library specification, though many of them imply core semantic changes.

* Ignored SRFI's

  The SRFI process was specifically intended as a testbed for new features for possible inclusion in further standards. R6RS was of course under no obligation to use any SRFI's, however in a few places it seemed to deliberately ignore the progress made by SRFI's. SRFI-1, in particular, is almost universally supported, is exceedingly popular, and in fact how to access SRFI-1 in a given implementation is one of the most frequent beginner questions asked. Yet despite this the R6RS draft uses gratuitously incompatible names and API's to many SRFI-1 procedures. As R6RS claims to emphasize the community role, it seems strange that it should ignore a previous community effort, and seems to discourage future SRFI work.
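* STRING-NORMALIZE-*

  Normalization is hideously complicated, and may require many manual conversions back and forth and after any operation that may not have preserved normalization. A huge simplification very much worth consideration is a system that maintains all internal strings with a single consistent normalization, but explicitly allowing conversion to any of a number of specific normalization forms prevents this approach.

  A simpler API could just provide a single STRING-NORMALIZE procedure, which would normalize to a preferred internal normalization form, and in the case of an automatically normalizing implementation would just be the identity function.

[...]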
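The opening point about loops and local variables being derivable rather than primitive can be sketched in a few lines of portable Scheme. This is only an illustration under stated assumptions: the names `sum-to`, `my-while`, and `my-let` are invented here and appear in no report, and the macros use nothing beyond plain `syntax-rules`.

```scheme
;; A counting loop written as a tail-recursive procedure. With proper
;; tail recursion the call to LOOP reuses the current frame, so this
;; runs in constant stack space.
(define (sum-to n)
  (define (loop i acc)
    (if (> i n)
        acc
        (loop (+ i 1) (+ acc i))))
  (loop 1 0))

;; A WHILE-style loop built on top of tail calls (hypothetical name).
;; It returns #f once TEST becomes false.
(define-syntax my-while
  (syntax-rules ()
    ((_ test body ...)
     (let loop ()
       (if test
           (begin body ... (loop))
           #f)))))

;; Local variables as sugar for procedure application (hypothetical name).
(define-syntax my-let
  (syntax-rules ()
    ((_ ((name value) ...) body ...)
     ((lambda (name ...) body ...) value ...))))
```

Here `(sum-to 100)` evaluates to 5050, and `(my-let ((x 2) (y 3)) (* x y))` expands to `((lambda (x y) (* x y)) 2 3)`.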
From a minimalist standpoint, which of these methods used to construct
a list is axiomatic and which could theoretically be relegated to a library?

(cons 1 (cons 2 (cons 3 ())))
(list 1 2 3)
'(1 2 3)
'(1 . (2 . (3 . ())))
'(1 2 3 . ())

An uneducated guess would be that the list function could be defined in a library using car and cdr. Being a Scheme newbie, I still haven't quite figured out how the dot pairing operator is distinct from car and cdr. And SICP seems to show us how to define car and cdr in terms of lambda, so perhaps they could be defined as library functions?

On May 29, 12:56 am, Alex Shinn <alexsh@gmail.com> wrote:
> We should attempt to break that feature into the smallest and most
> general essential primitives, so that the new feature can be
> expressed as a library.
On May 29, 10:41 am, Chris Rathman <Chris.Rath@tx.rr.com> wrote:
> An uneducated guess would be that the list function could be defined
> in a library using car and cdr. Being a Scheme newbie, I still
> haven't quite figured out how the dot pairing operator is distinct
> from car and cdr. And SICP seems to show us how to define car and cdr
> in terms of lambda, so perhaps they could be defined as library
> functions?
Doh! S/car and cdr/cons/ in all the above. Anyhow, the question I had is how the Scheme community goes about drawing the line between minimalism and compromise. From the outside looking in, I don't see Scheme as purely driven towards PLT axioms - and if I'm not mistaken, one doesn't have to go very deep into the language to realize that the language designers did not take a purist approach. I think minimalism is more what you'd call "guidelines" than actual rules.
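For anyone who wants to try the question at a REPL, here is a minimal sketch. The name `my-list` is hypothetical, and the code writes `'()` where the post has `()`, because an unquoted empty combination is not a valid expression in standard Scheme (some implementations accept it anyway).

```scheme
;; LIST needs no special support: a rest argument is already a newly
;; allocated list built out of pairs.
(define (my-list . items) items)

;; The five spellings from the question.
(define spellings
  (list (cons 1 (cons 2 (cons 3 '())))   ; explicit calls to cons
        (my-list 1 2 3)                  ; library-style procedure
        '(1 2 3)                         ; literal list
        '(1 . (2 . (3 . ())))            ; literal, fully dotted
        '(1 2 3 . ())))                  ; literal, dotted tail

;; #t when every element of ITEMS is EQUAL? to its neighbour.
(define (all-same? items)
  (or (null? items)
      (null? (cdr items))
      (and (equal? (car items) (cadr items))
           (all-same? (cdr items)))))
```

Evaluating `(all-same? spellings)` gives `#t`: the three quoted forms are just different external notations for the same structure, and the two procedure calls build that structure at run time.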
Chris Rathman wrote:
> From a minimalist standpoint, which of these methods used to construct
> a list is axiomatic and which could theoretically be relegated to a
> library?
> (cons 1 (cons 2 (cons 3 ())))
> (list 1 2 3)
> '(1 2 3)
> '(1 . (2 . (3 . ())))
> '(1 2 3 . ())
> An uneducated guess would be that the list function could be defined
> in a library using car and cdr. Being a Scheme newbie, I still
> haven't quite figured out how the dot pairing operator is distinct
> from car and cdr.
Dotted pairs are a literal syntax for (literal) pairs. On their own, they don't replace cons, since something like quasiquote would be needed to support constructing pairs from variables (e.g. `(,a . ,b)). An extreme minimal approach would typically only provide cons. Literal syntax for data types isn't necessary.

> And SICP seems to show us how to define car and cdr
> in terms of lambda, so perhaps they could be defined as library
> functions?
Exactly! And integers can also be defined in terms of lambda (e.g. Church numerals). So we can eliminate all that pesky literal syntax for integers that Scheme supports, along with that pointless and theoretically non-primitive integer datatype.

But after we do that, I suspect we'll soon start discovering the exceptions to the rule raised in another comment, "raw computation speed is very seldom important at all."

Anton
> On May 29, 12:56 am, Alex Shinn <alexsh@gmail.com> wrote:
>> We should attempt to break that feature into the smallest and most
>> general essential primitives, so that the new feature can be
>> expressed as a library.
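The reductions mentioned in this exchange can be written out directly. The following is a sketch only: every name in it (`my-cons`, `my-car`, `my-cdr`, `pair-of`, `church-zero`, `church-succ`, `church-add`, `church->integer`) is made up for illustration, and the pair encoding follows the well-known closure construction from SICP rather than anything in a Scheme report.

```scheme
;; Pairs as closures: a "pair" is a procedure that hands both
;; components to whatever selector it is given.
(define (my-cons a b) (lambda (pick) (pick a b)))
(define (my-car p) (p (lambda (a b) a)))
(define (my-cdr p) (p (lambda (a b) b)))

;; Quasiquote is what lets dotted-pair notation build pairs from
;; variables; this is equivalent to (cons a b).
(define (pair-of a b) `(,a . ,b))

;; Church numerals: the number n is "apply f n times".
(define church-zero (lambda (f) (lambda (x) x)))
(define (church-succ n)
  (lambda (f) (lambda (x) (f ((n f) x)))))
(define (church-add m n)
  (lambda (f) (lambda (x) ((m f) ((n f) x)))))

;; Convert back to a native integer by counting applications.
(define (church->integer n)
  ((n (lambda (k) (+ k 1))) 0))

(define church-three
  (church-succ (church-succ (church-succ church-zero))))
```

`(church->integer (church-add church-three church-three))` evaluates to 6, but only after six nested procedure calls, which is the performance point raised above about giving up the native integer type.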
Alex Shinn wrote:
> Below, for the record, I summarize some of the more controversial
> issues people have with R6RS (based on the r5.93rs draft). I
> include both complaints I agree with and those I don't, but have
> undoubtedly missed many and misstated others, so if you have an
> issue not mentioned please reply with it (preferably in summarized,
> non-ranting form if you can restrain yourself).

Wow. This is excellent work you've done here, collecting all this stuff in one place and explaining why it raises cause for concern. I agree with most of these criticisms, actually.

> * IDENTIFIER-SYNTAX
>   Identifier syntax means that the macro system can expand a single
>   identifier even when not the first symbol in an expression. Thus
>   when you see an identifier, it may not actually be a real variable
>   reference, which can be confusing both for humans and other macros
>   which want to analyze code. It is a substantial complication to
>   the semantics of the language, with arguable benefits.
This is one of the things that gave me misgivings, but I wasn't able to form a cogent argument against it. It is a powerful weapon in the "obfuscated scheme" programming contestant's arsenal, but it's not clear to me that most programmers will use it that badly.

> * Exceptions
>   Many things specified simply as "errors" in R5RS (with unspecified
>   behavior) are now required to signal exceptions. The exceptions
>   themselves fit within a complex hierarchy.
A complex and highly overspecified hierarchy. I am strongly of the opinion that a very different and much simpler method for handling such things is better. The one expressed in the R6RS candidate appears to have semantics mostly copied from other languages, and does not suit most of the other programming paradigms that Scheme otherwise supports.

> * Module system
>   - The enforced phase separation disallows some implementation
>     strategies and extensions.
>   - The versioning system is complex and not obviously necessary.
>     Something like it could always be added later.
>   - Libraries need to be wrapped in an extra layer of parentheses,
>     as opposed to a single definition at the top of the file.
Valid points all. An additional point is that the module system becomes an additional barrier to the use of scheme as a pedagogic language, because it's something that beginners have to deal with before much of anything else works, and long before it is possible to explain to them why.

> * Unicode
>   The standard makes Scheme (non-optionally) Unicode-specific, and
>   defines the character data-type as Unicode scalar values. This
>   prevents small implementations which only want to deal with ASCII
>   (e.g. in embedded systems), implementations which want to support
>   Unicode but want a different meaning of character (e.g. grapheme
>   clusters), and implementations which want to support a different
>   character altogether. There are a number of alternative proposals
>   to Unicode including Mojikyo, eKanji, TRON, and UTF-2000. Scheme
>   has been around for a long time, Lisp even longer, and many are
>   hesitant to wed themselves to a single character set forever.
I think that largely covers it. I do want to point out that the behavior of grapheme-cluster characters under most linguistic operations is *far* more reasonable, consistent, and logical, from the POV of actual linguistics and what a student of those natural languages would expect, than the codepoint characters selected by the committee. Further, I strongly feel that behavior which is more reasonable, consistent and logical to users of natural languages written in those characters is much more likely to be implementable in other representations of those characters.

The standard should specify binary I/O and primitives for using binary I/O to build character ports, and then have unicode I/O as a standard library - which need not be loaded for a particular implementation or application. Unicode case operations and other semantics should be another standard library, probably a superset of the unicode I/O library.

> * STRING-REF is recommended to be constant time
>   This discourages a number of implementation strategies that use
>   variable width character encodings or alternate string
>   representations such as ropes or trees. It is easy to provide a
>   string API that is convenient to use and efficient for both
>   traditional and alternative string representations.
Agree, again. Ropes with copy-on-write nodes are more efficient as the strings grow longer. Once you're doing corpus linguistics, there really is no alternative. This guarantees all atomic string operations in either constant or logarithmic time with respect to the length of the string, *and* automatically enables shared storage for the actual character sequences when new strings are created by minor modifications from old strings.

Array strings, as implied by this wording in the R6RS candidate, are more efficient only if your strings are mostly under three kilobytes long.

The standard should not forbid either of these implementation strategies; it should presume that the implementors (or the users, if the implementor gives them a choice) know what they're using the language for and can make a considered choice. It should specify an API for strings, period.

> * Safety
>   R6RS makes a (possibly too) strong claim about safety, and
>   introduces an exception type for implementation restrictions.
Exceptions again. Highly overspecified again.

> * Square brackets
>   Making [] identical to (), apart from all the arguments about
>   which looks better, breaks the entire axiomatic spirit and
>   prevents alternative extensions from using []. It also introduces
>   a trivial stylistic distinction where none existed before, and
>   puts Scheme among the ranks of languages where programmers need to
>   agree on a style guideline (there are many variations already)
>   before collaborating on a project.
Agree, again. I don't like them unless they mean something. Given my druthers, they'd mean a simple vector instead of a list in data and a syntax call instead of a procedure call in code. But that would be a very fundamental change indeed and I don't know if the resulting language would really be the same language.

> * CALL/CC
>   A commonly enough used abbreviation for
>   call-with-current-continuation used in talking about the operator,
>   and already supported by some systems, some argue the operator is
>   used rarely enough (and is supposed to be so) that the
>   abbreviation isn't needed. At any rate, no other procedure in the
>   language has two names.
I strongly suspect that the longer name will be disappearing with R7RS or R8RS. Moreover, both names are now incorrect: what the routine actually does could more accurately be expressed by call/wc or call-with-winding-continuation.

> * Comment Syntax
>   #; expression comments and #| ... |# block comments have been
>   added to the language, though are not needed.
The "need" for expression comments, as far as I'm concerned, just points out a(nother) limitation of our macrology, i.e., that one macro call can expand only into a single expression. What the expression comment does is expand to zero expressions. We ought to be able to define a macro that does that, or expands to multiple expressions, easily.

The "need" for block comments, on the other hand, is not really addressable by the language. You don't need them if you have an editor that understands comment prefixes, and you do if you don't.

> * Bytevector Syntax
>   #vu8(...) reads as a bytevector. Bytevectors themselves are not
>   so controversial, though people disagree on the names and any
>   external representation.
Actually I object to these on the grounds that they introduce de facto static typing to scheme. I think that type should be an annotation or assertion added to an otherwise correct procedure rather than something which changes or specifies semantics.

> * STRING-NORMALIZE-*
>   Normalization is hideously complicated, and may require many
>   manual conversions back and forth and after any operation that may
>   not have preserved normalization. A huge simplification very much
>   worth consideration is a system that maintains all internal
>   strings with a single consistent normalization, but explicitly
>   allowing conversion to any of a number of specific normalization
>   forms prevents this approach.
>   A simpler API could just provide a single STRING-NORMALIZE
>   procedure, which would normalize to a preferred internal
>   normalization form, and in the case of an automatically
>   normalizing implementation would just be the identity function.
Absolutely. It hugely overcomplicates things if your internal strings are other than "a sequence of characters," full stop. By overspecifying this, the R6RS candidate is setting up users and implementors for endless hair and bugs.

I had not considered a string-normalize! procedure; my thought was simply that normalization ought to have no semantics anywhere except in the code implementing character I/O ports or converting strings to/from bitvectors.

Seriously: a string is just a sequence of characters. Normalization doesn't mean anything on characters. Normalization only means something on a particular representation of characters, and nothing outside your I/O port code or conversion to/from binary code ought to have to deal with the idiosyncrasies of that particular representation.

If for any reason you want to write invalid data (a non-normalized string) to a character stream, you are clearly not using them as "characters" - you are doing something that would make more sense as binary I/O. Conversely, if you read something and want the exact binary sequence, as opposed to the sequence of characters in a normalized string, you are clearly not reading

[...]
Ray Dillinger wrote:
>> * Pair and string mutation moved to separate libraries
>>   SET-CAR!, SET-CDR! and STRING-SET! have been moved to separate
>>   libraries. Pairs and strings are still mutable, so this does
>>   nothing to change the semantics of the language or even to help
>>   optimizations (it would require a global compiler to detect that
>>   these modules were never imported, but at that point it's trivial
>>   for the compiler to simply detect that these individual procedures
>>   aren't used). It is thus simply a gesture of moving towards a
>>   more functional Scheme. Some people disagree, others think the
>>   gesture is silly.
> I think the gesture is silly. Oh, maybe there's a rationale
> in that if you want guarantees that code is purely functional
> you can just forbid the use of this library (and vectors, and
> several other things). But it's silly. If you want a functional
> lisp, you can do that. But that's not what scheme is for.
> Scheme is for "any paradigm you've got, you can use scheme to
> program in it."
I think this misses the real motivation. Pairs are quite central to Scheme, and as a result Scheme implementors are hamstrung by pairs being mutable by default. This is unfortunate, given that in a large proportion of cases, that mutability isn't actually needed or used.

PLT Scheme is currently experimenting with making pairs immutable by default. Here's a message about this by Matthew Flatt:

http://groups.google.com/group/plt-scheme/msg/482bcab20116530d

The goal is not to make a pure functional language, it's to make a better language. A language which forces you to pay for features that you're not using, as default-mutable pairs do, is not an ideal platform for implementing "any paradigm you've got".

Anton
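One way to picture the kind of cost being described is aliasing. The sketch below is an illustrative guess at the concern, not something taken from the linked message: the names `shared`, `tail`, and `sum-list` are hypothetical, and the code assumes `set-car!` is available without an import, as it is at an R5RS-style top level (under the R6RS draft it would come from the separate mutable-pairs library).

```scheme
;; Sum the elements of a list.
(define (sum-list items)
  (if (null? items)
      0
      (+ (car items) (sum-list (cdr items)))))

;; A freshly allocated list, and a second name for part of it.
;; (A quoted literal is deliberately not used: mutating a literal
;; constant is an error.)
(define shared (list 1 2 3))
(define tail (cdr shared))

;; Remember the sum before anything is mutated.
(define sum-before (sum-list shared))

;; Any code holding TAIL can change what SHARED contains.
(set-car! tail 20)

(define sum-after (sum-list shared))
```

After the `set-car!`, `shared` is `(1 20 3)` and the two sums differ (6 versus 24). Because any holder of a pair may do this, an implementation cannot in general assume that a list it has already inspected still has the same contents or shape.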
Ray Dillinger wrote:
>> * IDENTIFIER-SYNTAX
>>   Identifier syntax means that the macro system can expand a single
>>   identifier even when not the first symbol in an expression. Thus
>>   when you see an identifier, it may not actually be a real variable
>>   reference, which can be confusing both for humans and other macros
>>   which want to analyze code. It is a substantial complication to
>>   the semantics of the language, with arguable benefits.
> This is one of the things that gave me misgivings, but I
> wasn't able to form a cogent argument against it. It is a
> powerful weapon in the "obfuscated scheme" programming
> contestant's arsenal, but it's not clear to me that most
> programmers will use it that badly.
Identifier syntax is actually a good idea IMHO. It allows you, for example, to express object-oriented extensions where variables are automatically taken from an implicit message receiver, roughly like this:

(define-method print <person> ()
  (display this.name) (newline)
  (display this.address) (newline))

Here, this.name and this.address are supposedly taken from the implicit this argument for such a method. This is not easily expressible without identifier syntax.

The argument that this may make code obfuscated is the same argument other folks hold up against macros in general. The question is whether there are good uses of such a feature, and there are.

Pascal

--
My website: http://p-cos.net
Common Lisp Document Repository: http://cdr.eurolisp.org
Closer to MOP & ContextL: http://common-lisp.net/project/closer/
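A much-reduced version of this idea can be tried in an R6RS implementation. What follows is a sketch under assumptions, not the `define-method` design from the post: it assumes `(import (rnrs))` provides `define-record-type` and `identifier-syntax`, it uses a single global `this` instead of an implicit method argument, and the names `person`, `this`, and `describe` are invented for the example.

```scheme
(import (rnrs))

;; A record type with two fields; this defines make-person,
;; person-name and person-address.
(define-record-type person
  (fields name address))

;; Stand-in for the implicit receiver (a plain global here).
(define this (make-person "Ada" "12 Example Street"))

;; Each of these looks like a variable but expands into a field access.
(define-syntax this.name
  (identifier-syntax (person-name this)))
(define-syntax this.address
  (identifier-syntax (person-address this)))

;; The body reads as if this.name were an ordinary variable.
(define (describe)
  (string-append this.name ", " this.address))
```

Calling `(describe)` returns `"Ada, 12 Example Street"`. A real object system would bind `this` per method call; the point of the sketch is only that `this.name` appears in operand position with no surrounding parentheses.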
On May 29, 5:21 pm, Pascal Costanza <p@p-cos.net> wrote:
> Identifier syntax is actually a good idea IMHO. It allows you, for
> example, to express object-oriented extensions where variables are
> automatically taken from an implicit message receiver, roughly like this:
> (define-method print <person> ()
>   (display this.name) (newline)
>   (display this.address) (newline))
I implemented something like this just the other day (with a lot of help from Abdulaziz!).

(define-class point (x y z))

(define p (make-point 10 20 30))

(with-point p)

(list p.x p.y p.z)

;; expands to

(list (point-x p) (point-y p) (point-z p))

;; set the x

(p.x! 1)

;; expands to

(set-point-x! p 1)

;; The slots can be "typed"

(define-class airplane ((pos point) (vel point)))

(define a (make-airplane (make-point 1 2 3) (make-point 4 5 6)))

(with-airplane a)

(list a.pos a.vel)

;; expands to

(list (airplane-pos a) (airplane-vel a))

;; there is syntax for the components of the pos and vel as well:

(list a.pos.x a.pos.y a.pos.z)

;; expands to

(list (point-x (airplane-pos a))
      (point-y (airplane-pos a))
      (point-z (airplane-pos a)))

;; set the z of the vel

(a.vel.z! 10)

;; expands to

(set-point-z! (airplane-vel a) 10)

http://dharmatech.onigirihouse.com/scheme/class/class.scm

I've used it with Gambit-C.

Ed

"wayo.cava@gmail.com" <wayo.cava@gmail.com> writes:
> (define p (make-point 10 20 30))
> (with-point p)
> (list p.x p.y p.z)
> ;; expands to
> (list (point-x p) (point-y p) (point-z p))
What happens when you do:

(list (make-point 1 2 3).x)

--
Cheers,                                        The Rhythm is around me,
                                               The Rhythm has control.
Ray Blaak                                      The Rhythm is inside me,
rAYbl@STRIPCAPStelus.net                       The Rhythm has my soul.
On Wed, 29 May 2007, wayo.cava@gmail.com wrote:
> On May 29, 5:21 pm, Pascal Costanza <p@p-cos.net> wrote:
>> Identifier syntax is actually a good idea IMHO. It allows you, for
>> example, to express object-oriented extensions where variables are
>> automatically taken from an implicit message receiver, roughly like this:
>> (define-method print <person> ()
>>   (display this.name) (newline)
>>   (display this.address) (newline))
> I implemented something like this just the other day (with alot of
> help from Abdulaziz!).

Cool. :-) It works with sisc as well.
Pascal Costanza wrote:
> Identifier syntax is actually a good idea IMHO. It allows you, for
> example, to express object-oriented extensions where variables are
> automatically taken from an implicit message receiver, roughly like this:
> (define-method print <person> ()
>   (display this.name) (newline)
>   (display this.address) (newline))
> Here, this.name and this.address are supposedly taken from the implicit
> this argument for such a method. This is not easily expressible without
> identifier syntax.
> The argument that this may make code obfuscated is the same argument
> other folks hold up against macros in general. The question is whether
> there are good uses of such a feature, and there are.
That's only one of the arguments. The real problem is that it complicates the semantics of the language. Specifically, other macros cannot know when they see an identifier if it really is a variable reference or not.

As an example, consider a fast-math macro:

(fast-math (+ (* a b) (* a c)))
=> (* a (+ b c))

(fast-math (+ (* a a) (* a a a)))
=> (let ((t (* a a))) (+ t (* a t)))

That is, it takes an arithmetic expression and refactors and simplifies and performs common subexpression elimination to achieve the most optimal equivalent expression (compilers can't do this very well - even GCC doesn't).

Now, in the presence of identifier-syntax, we don't actually know which of the identifiers are simple variable references, or which may even expand into further arithmetic expressions, so our optimization assumptions are off. Worse, we don't know if any of them are actually side-effecting, so we can't safely do any rewriting at all.

This is from a real example I wrote a while back. Other examples include simple optimizations you may want to include in regexp syntax or pattern matchers.

So, in effect, by adding identifier-syntax you make _all_ macros less powerful because they know less about the language they are expanding.

Now, also considering that anything you do with identifier-syntax can be trivially implemented with normal syntax by just wrapping the identifier in a pair of parentheses, is it really worth including this as a required feature of all standard Scheme implementations and irreversibly complicating the core language?

--
Alex
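The side-effect hazard is easy to reproduce in miniature. This sketch assumes an R6RS implementation where `(import (rnrs))` supplies `identifier-syntax`; the names `ticks`, `tick`, `as-written`, and `as-rewritten` are hypothetical, and the "rewrite" is done by hand to stand in for what a macro such as fast-math might produce.

```scheme
(import (rnrs))

;; Hidden state bumped by every reference to TICK.
(define ticks 0)

;; TICK looks like a variable, but each reference expands into an
;; expression that increments TICKS and returns the new value.
(define-syntax tick
  (identifier-syntax
    (begin (set! ticks (+ ticks 1)) ticks)))

;; What the programmer wrote: two references, so two increments.
(define (as-written) (+ tick tick))

;; What an optimizing macro might produce if it assumed TICK were an
;; ordinary variable: one reference, so one increment.
(define (as-rewritten) (* 2 tick))
```

Starting from `ticks` equal to 0, `(as-written)` returns 3 (1 plus 2, in either evaluation order) while `(as-rewritten)` returns 2, so a rewrite that is valid for real variables changes the result here.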
Ray Dillinger wrote:
> Alex Shinn wrote:
> > * Bytevector Syntax
> >   #vu8(...) reads as a bytevector. Bytevectors themselves are not
> >   so controversial, though people disagree on the names and any
> >   external representation.
> Actually I object to these on the grounds that they
> introduce de facto static typing to scheme. I think that
> type should be an annotation or assertion added to an
> otherwise correct procedure rather than something which
> changes or specifies semantics.
Ideally perhaps yes, but we do want some common ground solution when working with binary I/O, and plain vectors would just be far too inefficient in many implementations.

I threw this in because I included all syntactic changes, but this and the new comment syntax are pretty minor - I don't think anyone is that opposed to either. But the ever-increasing amount of #foo syntax sometimes worries me, and if SRFI-10 had been adopted then we could have had #,(vu8 ...) portably without altering the reader.

What I forgot to mention was the #!r6rs syntax which is just hideous.

> > * Binary vs. Text port distinction
> > [...]
> I think the standard did the right thing, here. You've got to
> have text ports distinct from (or built by layering code on top
> of) binary ports in order to support more than one way of
> reading and writing characters.
Personally I was arguing to make this distinction from the beginning, and was quite happy when I saw it. I do think the standard should be a little clearer and say something along the lines of "it is an error to use a textual operation on a binary port or a binary operation on a textual port." Or say it's unspecified, or even specify the error. Right now it just says "a binary port [...] does not support textual I/O."

--
Alex
Alex Shinn gave us an excellent explanation of why the R6RS is controversial:
> Below, for the record, I summarize some of the more controversial
> issues people have with R6RS (based on the r5.93rs draft). I
> include both complaints I agree with and those I don't, but have
> undoubtedly missed many and misstated others, so if you have an
> issue not mentioned please reply with it (preferably in summarized,
> non-ranting form if you can restrain yourself).
Here is Alex's paragraph on records:

> * Records
> The records library is very large and complex, and cannot be
> implemented as a portable library.

I would add that, despite its complexity, the syntactic layer is strictly less general than the procedural layer, and there are two distinct failures of interoperability between the two layers. The editors' rationale for this is given in their response to Formal Comment #90 ( http://www.r6rs.org/formal-comments/comment-90.txt ).

On a more trivial level, I would add:

* disappearance of #\newline
  The #\newline character syntax of all previous reports is to be replaced by the #\linefeed syntax.

Will
On May 30, 7:18 pm, Alex Shinn <alexsh@gmail.com> wrote:
> That's only one of the arguments. The real problem is
> that it complicates the semantics of the language.
> Specifically, other macros cannot know when they
> see an identifier if it really is a variable
> reference or not.
That's a good point.

My "R6 counterproposal" (imprecisely stated though it is) suggests adding not just fexprs and environments, but also making the reader extensible.

In Pascal's example, he wanted to bind an identifier like "this.speed" to an identifier macro that would generate a reference to an object field rather than to a location made lexically apparent by lambda (including let, etc.).

An alternative is to do that expansion in the reader, so that programs might contain "#.speed" which is read as "(self speed)" -- with "self" defined as an ordinary macro.

That's a little bit awkward. For example, one would not expect "(set! #.speed 'full-ahead)" to work since the set! special form expects a named location in the first subexpression.

I wonder how Schemers would feel about getting into the habit of using

  (setf #.new-under-the-sun '())

-t
Tom Lord wrote:
> On May 30, 7:18 pm, Alex Shinn <alexsh @gmail.com> wrote:
>> That's only one of the arguments. The real problem is
>> that it complicates the semantics of the language.
>> Specifically, other macros cannot know when they
>> see an identifier if it really is a variable
>> reference or not.
> That's a good point.
No, it's not. Those macros also cannot know whether a regular macro is actually a variable reference or not.

The only way out here is to provide something like macroexpand with which you can check what a respective form actually expands into. That would be a general solution because macroexpand could be applied both to identifiers and regular macro invocations. (That's, at least, the case in Common Lisp.)

Pascal

--
My website: http://p-cos.net
Common Lisp Document Repository: http://cdr.eurolisp.org
Closer to MOP & ContextL: http://common-lisp.net/project/closer/
Tom Lord wrote:
> That's a little bit awkward. For example, one would
> not expect "(set! #.speed 'full-ahead)" to work since
> the set! special form expects a named location
> in the first subexpression.
If foo is an identifier macro, then it is the responsibility of foo to make sure (set! foo ...) works. The example in 12.4 shows what to do:

  (define p (cons 4 5))

  (define-syntax p.car
    (make-variable-transformer
      (lambda (x)
        (syntax-case x (set!)
          [(set! _ e) #'(set-car! p e)]
          [(_ . rest) #'((car p) . rest)]
          [_ #'(car p)]))))

  (set! p.car 15)
  p.car => 15
  p     => (15 . 5)

Identifier macros are not a new invention. They have been in the various syntax-case systems for a long time, and to my knowledge haven't caused any problems.

--
Jens Axel Søgaard
Alex Shinn wrote:
> Pascal Costanza wrote:
>>(define-method print <person> ()
>>  (display this.name) (newline)
>>  (display this.address) (newline))
>>Here, this.name and this.address are supposedly taken from the implicit
>>this argument for such a method. This is not easily expressible without
>>identifier syntax.
> That's only one of the arguments. The real problem is that it
> complicates the semantics of the language. Specifically, other macros
> cannot know when they see an identifier if it really is a variable
> reference or not.
The only real solution for this is to partition the set of identifiers into mutually exclusive "base" and "extended" identifiers, where base identifiers are literal expressions in themselves, and extended identifiers require identifier-syntax macros to transform into expressions.

For example, if our base identifiers could not naturally contain the period or dot character, then the reader could know, on reading an extended identifier like "this.name," that it was not looking at a base identifier - and then check it against its identifier macro patterns.

Indeed, this is something like what readers now do with the prefix octothorpe. The octothorpe means it cannot be read as a base identifier, so the reader has to look at its other definitions. The proposal amounts to expanding the number of ways something can be marked as "not a base identifier" and telling the reader what to do about it.

But, there is still some controversy in my mind about it. It does not extend language semantics at all, so it is in some sense "trivial" and "unnecessary." Also, murk arising from ambiguity in infix syntax-marker expansion would have to be clarified before this would be a hard enough proposal for a programming language. Consider for example if some well-meaning person defines FOO-BAR as identifier syntax for (- FOO BAR). Now what is the reader to make of FOO-BAR-BAZ or similar? The result is ambiguous depending on which *instance* of "-" the macroexpander expands first.

The abbreviations could be very handy, but would the resulting language have the clarity that is the virtue of Scheme?

Bear
Pascal Costanza wrote:
> Tom Lord wrote:
> > On May 30, 7:18 pm, Alex Shinn <alexsh @gmail.com> wrote:
> >> That's only one of the arguments. The real problem is
> >> that it complicates the semantics of the language.
> >> Specifically, other macros cannot know when they
> >> see an identifier if it really is a variable
> >> reference or not.
> > That's a good point.
> No, it's not. Those macros also cannot know whether a regular macro is
> actually a variable reference or not.
You misunderstand, I'm talking about those cases where you see a lone identifier in an evaluated position. Currently this couldn't possibly be anything other than a variable reference, but with identifier-syntax you don't even know that anymore.

A simpler example:

  (cond (foo (bar))
        (foo (baz)))

Right now COND could provide a warning that the second branch is unreachable. With identifier-syntax it can't.

Also, it's important to understand that identifier-syntax doesn't make the language any more expressive. All it does is shave off a pair of parens, letting you write

  this.name

instead of

  (this.name)

But I'm not interested in arguing the pros and cons of all these new features. Yes, obviously every new feature has uses, and many of them have been used previously in other languages or Schemes. But they involve tradeoffs, and people should stop and think seriously before adopting those tradeoffs into the core Scheme standard.

--
Alex
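The COND case can be sketched as follows, assuming an R6RS implementation with identifier-syntax. The counter `calls` and the definition of `foo` are hypothetical, chosen only so that two textually identical tests give different answers.

```scheme
(import (rnrs))

(define calls 0)   ; hypothetical counter, not from the original post

;; `foo` looks like a variable, but yields #f on odd-numbered references
;; and #t on even-numbered ones.
(define-syntax foo
  (identifier-syntax
    (begin (set! calls (+ calls 1)) (even? calls))))

;; cond evaluates its tests in order, so the first test sees calls = 1
;; and the second sees calls = 2.
(define result
  (cond (foo 'first)
        (foo 'second)
        (else 'neither)))

(display result) (newline)   ; second
```

With `foo` bound this way the second clause is reachable, so an "unreachable branch" warning based on the two tests being the same identifier would be wrong.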
On May 31, 9:41 pm, Alex Shinn <alexsh@gmail.com> wrote:
> Pascal Costanza wrote:
> > Tom Lord wrote:
> > > On May 30, 7:18 pm, Alex Shinn <alexsh @gmail.com> wrote:
> > >> That's only one of the arguments. The real problem is
> > >> that it complicates the semantics of the language.
> > >> Specifically, other macros cannot know when they
> > >> see an identifier if it really is a variable
> > >> reference or not.
> > > That's a good point.
> > No, it's not. Those macros also cannot know whether a regular macro is
> > actually a variable reference or not.
> You misunderstand, I'm talking about those cases where you see a
> lone identifier in an evaluated position. Currently this couldn't
> possibly be anything other than a variable reference, but with
> identifier-syntax you don't even know that anymore.
> A simpler example:
>   (cond (foo (bar))
>         (foo (baz)))
> Right now COND could provide a warning that the second branch is
> unreachable. With identifier-syntax it can't.
But the same argument applies to macros, and to procedures. If foo is a macro, then we can't do this warning:

  (cond [(foo) (bar)]
        [(foo) (baz)])

Because the foo's might expand differently. But even if foo was a procedure, we're still out of luck, because it might refer to mutable state, and return different values each time.

> Also, it's important to understand that identifier-syntax doesn't
> make the language any more expressive. All it does is shave off a pair
> of parens, letting you write this.name instead of (this.name)
This is not the case. All systems that have identifier macros (that I know of) allow you to define the identifier foo such that (set! foo bar) does the 'appropriate' thing. This can't be done with other features of the language (short of rebinding set!). sam th
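As a sketch of the set! behaviour described here, the two-clause form of identifier-syntax in the final R6RS lets one identifier act as both a reference and an assignment target. This assumes an R6RS implementation; note that set-car! lives in (rnrs mutable-pairs), which the composite (rnrs) library does not include.

```scheme
(import (rnrs) (rnrs mutable-pairs))

(define p (cons 4 5))

;; First clause: a plain reference to p.car.
;; Second clause: (set! p.car e) is rewritten to a set-car! on p.
(define-syntax p.car
  (identifier-syntax
    (_ (car p))
    ((set! _ e) (set-car! p e))))

(set! p.car 15)

(display p.car) (newline)   ; 15
(display p) (newline)       ; (15 . 5)
```

A parenthesized macro such as (p.car) can cover the reference case, but (set! (p.car) 15) is not valid in standard Scheme, so the assignment case needs either a separate setter form or a rebinding of set!.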
Ray Blaak wrote:
> What happens when you do:
>   (list (make-point 1 2 3).x)
Yeah... that's where things break down.
Ray Blaak wrote:
> "wayo.cava @gmail.com" <wayo.cava @gmail.com> writes:
>>   (define p (make-point 10 20 30))
>>   (with-point p)
>>   (list p.x p.y p.z)
>>   ;; expands to
>>   (list (point-x p) (point-y p) (point-z p))
> What happens when you do:
>   (list (make-point 1 2 3).x)
You get a "reference to undefined identifier: .x" error.

The macro call (with-point p) binds the three identifiers p.x, p.y and p.z to identifier macros. The identifier macro p.x expands to, say, (point-x p).

In (make-point 1 2 3).x the reader will first read the sexpr (make-point 1 2 3), and then read the identifier .x. Since .x is unbound, you get a "reference to undefined identifier: .x" error.

--
Jens Axel Søgaard
Jens Axel Søgaard <use@soegaard.net> writes:
> Ray Blaak wrote:
> > What happens when you do:
> >   (list (make-point 1 2 3).x)
> You get a "reference to undefined identifier: .x" error.
> The macro call (with-point p) binds the
> three identifiers p.x p.y and p.z to identifier macros.
Well, exactly. I guess my main point here is to show my dissatisfaction with these kinds of approaches for pretending Scheme can have an infix field selection notation for record-like values.

We either build it in properly to Scheme, or we do it the "proper" sexpr way.

So, what are the ways it can be properly done, and what are the tradeoffs? E.g.

  (point-x p)
  (field p x)

The last is perhaps more general.

--
Cheers,                                        The Rhythm is around me,
                                               The Rhythm has control.
Ray Blaak                                      The Rhythm is inside me,
rAYbl@STRIPCAPStelus.net                       The Rhythm has my soul.
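One possible reading of the (field p x) option, sketched under assumptions: it uses the record inspection procedures of the final R6RS (record-rtd, record-type-field-names, record-accessor), and the names field-ref and field are hypothetical. It looks the field up by name at run time, so it works on any expression that yields a non-opaque record, at the cost of a linear search per access. It does not handle fields inherited from a parent record type.

```scheme
(import (rnrs))

(define-record-type point (fields x y z))

;; Hypothetical helper: find the field called `name` in the record's own
;; type descriptor and apply the matching accessor.
(define (field-ref rec name)
  (let* ((rtd (record-rtd rec))
         (names (record-type-field-names rtd)))
    (let loop ((k 0))
      (cond ((= k (vector-length names))
             (error 'field-ref "no such field" name))
            ((eq? (vector-ref names k) name)
             ((record-accessor rtd k) rec))
            (else (loop (+ k 1)))))))

;; (field rec name) quotes the field name so callers can write it bare.
(define-syntax field
  (syntax-rules ()
    ((_ rec name) (field-ref rec 'name))))

(define p (make-point 10 20 30))

(display (list (point-x p) (field p y))) (newline)        ; (10 20)
(display (list (field (make-point 1 2 3) x))) (newline)   ; (1)
```

Compared with (point-x p), this trades a static, type-specific accessor for a generic one: the field name is only checked at run time, but the record can be the result of any expression, which the p.x identifier macros cannot express.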
Ray Blaak wrote:
> Jens Axel Søgaard <use @soegaard.net> writes:
>> Ray Blaak wrote:
>>> What happens when you do:
>>>   (list (make-point 1 2 3).x)
>> You get a "reference to undefined identifier: .x" error.
>> The macro call (with-point p) binds the
>> three identifiers p.x p.y and p.z to identifier macros.
> Well, exactly. I guess my main point here is to show my dissatisfaction with
> these kinds of approaches for pretending scheme can have an infix field
> selection notation for record-like values.
I was under the impression that the main theme of discussion was identifier macros in general and not a specific application of them.

It is not clear to me whether your argument is: "infix field notation is a bad idea" therefore "identifier macros are a bad idea". If so, I can't help but compare it to the classical argument against macros: "It is possible to write bad macros, hence they are a bad idea."

Of course identifier macros can be used to write unreadable code. Of course macros can be used to write unreadable code.

--
Jens Axel Søgaard
Alex Shinn wrote:
> * IDENTIFIER-SYNTAX
> Identifier syntax means that the macro system can expand a single
> identifier even when not the first symbol in an expression. Thus
> when you see an identifier, it may not actually be a real variable
> reference, which can be confusing both for humans and other macros
> which want to analyze code. It is a substantial complication to
> the semantics of the language, with arguable benefits.
Two points:

1) "Thus when you see an identifier, it may not actually be a real variable reference, which can be confusing both for humans and other macros which want to analyze code."

This would make a fine first paragraph in the "Identifier Macro Style Guide". There is one thing that identifier macros are perfect for: the implementation of various types of variable references. Today, if you write a macro that doesn't play a fair game (say, introduces variables unhygienically) you blame not macros, but the author of the macro. The same thing applies to identifier macros.

2) "It is a substantial complication to the semantics of the language, with arguable benefits."

Why do you see this as a "substantial complication"? After the macro expansion phase all uses of identifier macros are gone - so the standard semantics apply. Is the macro expansion process more complicated? Very little.

PS: I hope you don't find this post "extremely defensive". Although you didn't say identifier macros are bad, you did write "substantial complication", which in my book is worse than bad.

--
Jens Axel Søgaard
Jens Axel Søgaard quoting Alex Shinn:
> 2) "It is a substantial complication to the semantics of the language,
> with arguable benefits."
> Why do you see this as a "substantial complication"?
The problem with identifier macros is that they break a specific invariant on which R5RS can rely, and on which some R5RS macros have relied:

================================================================
R5RS invariant: A reference to an identifier has no side effects.
================================================================

With identifier macros as proposed for the R6RS, this R5RS-enforced invariant becomes a matter of programming style, which I can only hope programmers will continue to follow.

Continuing to use an R5RS hygienic macro that depends on identifier references having no side effects would therefore become a matter of hope in the R6RS. That seems odd, given the R6RS predilection for mandating the enforcement of stylistic issues that matter considerably less than whether references can have side effects.

Identifier macros introduce an entirely new way for programmers to break macros that, under the R5RS, were correct in all contexts. It is not unreasonable to ask whether the benefits of identifier macros justify their costs. In my opinion, they don't.

Will