If a brace - that is to say, the character { - appears in a Tcl comment inside a proc, the Tcl parser will behave in a way that you may not expect.

The ''#'' character is not a special preprocessor directive (as comments are in C/C++). In Tcl, comments act a lot like a command. If a command starts with the character ''#'', then the rest of the line is ignored.

When you define a proc in Tcl, you are actually running the ''proc'' command, and it expects three arguments: the name of the procedure, a list of procedure arguments and a procedure body. Both the argument list and the procedure body have to be properly formed strings, and they are usually quoted with braces. You can have nested quotes within these strings, but you can't have extra open braces or unbalanced quotation marks.

So as far as the ''proc'' command is concerned, this is a perfectly valid declaration.

    proc foo { args } {
        puts "executing procedure foo"
        # comment out this code block {
        foreach a $args {
            puts "argument: $a"
        }
        }
    }

Note that the procedure body is a well-formed Tcl string. All the nested braces balance. The ''proc'' command doesn't actually do anything with the procedure body. That happens later, when you execute the procedure.

When you execute the procedure foo, Tcl recognizes the ''#'' comment command, and ignores the rest of the characters on that line ''including the trailing open-brace''. It then processes the ''foreach'' command, which expects three arguments, the last of which is a multi-line string quoted by braces. It then attempts to process the closing brace as a command, and fails.

----

'''Commenting out Blocks of Code'''

You may have been trying to comment out a block of code, like this

    # comment out this block {
    proc foo {} {
        blah ; blah ; blah
    }
    }

but that doesn't work because of the unmatched braces in the comment.

The most thorough way to comment out that block of code is to place a ''#'' at the beginning of each line. But that is a lot of work.
Instead, if your block is well-formed, parsable Tcl code, remove the code with an ''if'' or ''proc'' command, like this:

    # comment out this block using 'if'
    if 0 {
    proc foo {} {
        blah ; blah ; blah
    }
    }

or

    # comment out this block using 'proc'
    proc donteverrunme {} {
    proc foo {} {
        blah ; blah ; blah
    }
    }

----

Comments from Paul Duffin, June 23, 1999

Not quite right. The following works

    proc a {} "
        # comment with unbalanced }
    "

but this doesn't

    proc a {} {
        # comment with unbalanced }
    }

And conversely this works

    proc a {} {
        # comment with unbalanced "
    }

but this doesn't

    proc a {} "
        # comment with unbalanced "
    "

Also the following doesn't work either

    proc a {} {
        puts "}"
    }

The rules:

   1. Parsing of a string only occurs when that string is evaluated and is lazy, which means that it does not parse the contents of words. That job is up to the particular command. (*)
   1. When parsing a {} enclosed word all characters apart from {} and \ are ignored.
   1. When parsing a "" enclosed word substitutions occur for $... and [[]] and \... but {} have no significance.
   1. When parsing a command if the command starts with a # then all characters up until the end of the line (or end of the script, whichever comes first) are ignored. This means that no substitutions are done in the comment.

These rules are relatively easy to follow in isolation but the problems come about when they are nested. Take for instance the following code.

    proc a {a b} {
        if {$a < 0} {
            puts "Negative \{"
        } elseif {$a > 0} {
            puts "Positive \}"
        } else {
            puts "Zero {}"
        }
    }

When the '''proc''' command is evaluated the parser has split its arguments into the following words (the | are delimiters and as such are not part of the words).

    1: |a|
    2: |a b|
    3: |
        if {$a < 0} {
            puts "Negative \{"
        } elseif {$a > 0} {
            puts "Positive \}"
        } else {
            puts "Zero {}"
        }
    |

The parser did not look inside word 3 because that is up to the command and so it only used rule 2.
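The word splitting described above can be observed directly. What follows is only a minimal sketch, not part of the original comments: ''show_words'' is a hypothetical helper that stands in for '''proc''' and prints each word it receives between | delimiters.

```tcl
# show_words is a hypothetical stand-in for proc: it prints every word
# the parser hands to it and returns how many words there were.
proc show_words {args} {
    set i 0
    foreach w $args {
        puts "[incr i]: |$w|"
    }
    return [llength $args]
}

# The same three words as in the proc example. The outer braces stop the
# parser from substituting variables or interpreting the quotes inside
# the last word; it only counts braces and skips backslashed characters.
set n [show_words a {a b} {
    if {$a < 0} {
        puts "Negative \{"
    } elseif {$a > 0} {
        puts "Positive \}"
    } else {
        puts "Zero {}"
    }
}]
puts "word count: $n"
```

The helper should report three words, with the third still carrying its backslashes, quotes and nested braces untouched, since only rule 2 was applied to it.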
(*) The byte compiler does invalidate this slightly as it 'knows' how to parse certain commands and therefore will parse before evaluation.

----

'''The [[switch]] command'''

Watch out for comments inside '''switch''' commands, since the switch command does not itself interpret # as anything other than an ordinary character. Thus, you must make sure that they are only ever put ''inside the body'' parts, as otherwise you will end up with either weird parsing or an out-and-out error. Thus, the following code is wrong:

    switch $foobar {
        # Match a first
        a { puts "Matched a" }
        # Match b second
        b { puts "Matched b" }
    }

This is because it is equivalent to ''(i.e. precisely the same as)'':

    switch $foobar {
        #      Match
        a      first
        a      { puts "Matched a" }
        #      Match
        b      second
        b      { puts "Matched b" }
    }

Which is the same as:

    switch $foobar {
        #      Match
        a      first
        b      second
    }

Not what was intended! The following is possibly even worse (depending on your point of view...)

    switch $foobar {
        #Match a first
        a { puts "Matched a" }
        # Match b second
        b { puts "Matched b" }
    }

This is equivalent to:

    switch $foobar {
        #Match                   a
        first                    a
        { puts "Matched a" }     #
        Match                    b
        second                   b
        { puts "Matched b" }
    }

In this case, the body of the switch has an unpaired word, and so the switch command moans conspicuously. The ''correct'' way of writing the above is:

    switch $foobar {
        a {
            # Match a first
            puts "Matched a"
        }
        b {
            # Match b second
            puts "Matched b"
        }
    }

'''DKF'''

----

What other surprises do we want to document in this area?

''Well, here's another case with similar troubles -- JC''

    array set config {
        # this comment is messing up things ...
        path /usr/local/bin
        user admin
    }

----

''Quoth an anonymous person via firewall.cmsis.com...'': This problem could be removed from the language completely by modifying the lexical analyzer portion of your interpreter to simply skip over any characters from a "#" to a "\n".
However, this could affect existing code you have, by forcing you to backslash-escape the "#" character in places where you didn't have to.

''DKF replies...'': This is actually very difficult to do, since Tcl doesn't have conventional lexical analyser and parser stages (which has tremendous advantages in terms of linguistic flexibility.) Comments are defined to act by causing the interpreter to ignore all characters through the next new-line. The thing is that comments only start in places where the interpreter is expecting a command to start, and the interpreter puts off making that decision for as long as possible. If you feel this is easy to fix (despite all the people telling you that it isn't) then perhaps you should have a go at "fixing" it to your satisfaction. The source code is freely available, so you can make whatever changes you want. If you can figure out a way of doing it that doesn't break thousands of scripts, we'll talk...

''LWV replies...'': One thing that often is not thought about when someone starts with '''it should be easy''' is that tcl treats data and commands identically. One is not required to surround data with quotes, etc. Thus, when the lex/parse code hits a #, all it can tell is '''is this the beginning of a command'''. It doesn't know things like '''am I in the middle of data'''.

''JC argues...'': I understand quoting (I wrote TinyTcl, so I better!), but the original poster does have a point. Just like it is possible for the scanner to replace a backslash-newline by a single space (everywhere), it could strip a "#" and the rest of its line (everywhere), unless the hash was preceded by a backslash. It would also affect list-to-string conversions. This can break scripts, and it can even break data-as-script, but it *is* a trade-off - a weekly thread on c.l.p versus a gotcha for everyone upgrading to the latest Tcl release.

''[Bryan Oakley] points out...'': ...that the "break data-as-script" argument is a pretty darn big thing to break.
It is for this reason more than any other that we absolutely ''can't'' preprocess comments and throw them away. It's just too hard to know what is a comment and what is data. For example, consider the following fragment. Is the data following the # a comment that should be thrown away? There is no way for the interpreter to know, and I sure as heck don't want it to guess:

    set foo {
        # is this a comment?
    }

''PSE notes...'': I wish Scriptics was as conscientious about breaking ''thousands of Tcl scripts'' as you folks are!

LV: Scriptics is that conscientious - or they would have changed this as well as several other things that people report as bugs dozens of times every day. I haven't had to change a Tcl script due to Tcl changing the script level language in about 5-7 years (whenever the change relating to geometry vs width and height occurred).

LV: In news:3923CCA2.DD966DC7@hursley.ibm.com I found a variety of interesting comments relevant to this page:

    Bob Techentin wrote:
    >
    > If a hash character (``#'') appears at a point
    > where Tcl is expecting the first character of the
    > first word of a command, then the hash character and
    > the characters that follow it, up until the next
    > newline, are treated as a comment and ignored.
      ^ not preceded by a continuation character, or end of script
    > The
    > comment character only has significance when it appears
    > at the beginning of a command.
    >

The above perfectly describes the behaviour of the Tcl comment; however, it is too concise and relies on the reader to completely understand the rest of the man page before they can really understand this.

I wish that Scriptics would improve the documentation to at least explain why it is the case and why it is unlikely to change.

Also, it is possible to provide much more useful error messages.
The indentation of the different blocks can be used to decide what the user probably meant: if a } is found whose indentation level is different from the line on which its matching { was found, then there could be an unbalanced { somewhere. Similarly, while parsing blocks Tcl could look for comment-like structures and possibly strings containing {}. If an error occurs (note that I am not using this information unless an error occurs), then Tcl would provide this extra information to the user, which may help them track down their problem.

How about adding something like this to the documentation, in a separate man page ''(rendered for Wiki between the next two horizontal rules by DKF)''

----

'''Consequences of Tcl's quoting rules'''

''Note:'' Tcl cannot distinguish between data and code until it is actually asked to execute a block. It is only at this point that it can determine what the commands are and hence what the comments are. Prior to being asked to execute a block Tcl's parsing is limited to counting unquoted {}s if the block starts with a {, or finding the next unquoted " if the block starts with a ". This means that it is not possible to place unbalanced and unquoted {}s anywhere in a {} delimited block, or unquoted "s in a " delimited block. Anywhere includes comments and strings.

The behaviour of switch is a prime example: the body of the switch, which contains all of the different cases to check, is not code; it is data to the switch command. switch decides what to do with the data itself, and switch does not support comments in its data. The body of each case is a block of code and as such can contain comments.

It is understood that the behaviour of the Tcl comment may be strange to people used to other languages, hence the reason for this lengthy explanation and the following examples. However, it is not possible to provide a completely backward compatible preprocessing step which can make comments behave as you may expect.
This is because it is not possible in general to tell just from looking at some Tcl code which blocks are data and which are code because Tcl treats all commands exactly alike.

As you can see the comment behaviour is an unfortunate side-effect of the structure of the Tcl language in much the same way as the Tcl programming style of

    if {...} {
        :
    }

rather than

    if {...} \
    {
        :
    }

is. You will find that you will be much more productive in Tcl if you modify your expectations and style to suit it.

If you need or want to use a preprocessing step of your own then it is very easy to write your own in Tcl. However, because it is not going to be completely backward compatible do not expect that everyone will want to use your version.

'''Examples.'''

    # This is a comment at global scope, note no enclosing
    # braces.

    # An unbalanced { in a comment works at global scope
    # because it is not surrounded by {}.

    set a "A string containing an unbalanced {"

                     v
    proc example1 {} {
        # However place an unbalanced { in here and you
        # will cause problems because it means that when
        # Tcl parses the block starting at "v" the block
        # will not be terminated by the brace at "^".
    }
    ^

                     v
    proc example2 {} {
        # A solitary { in one comment.
        if { ... } {
        }
        # Followed by a } in another comment inside
        # the same block does not cause a problem because
        # the block starting at "v" ends at "^" as you
        # expect.
    }
    ^

    proc example3 {} {
        # Unbalanced {
        #      v
        if {1} {
            # Balancing }
        }
        # ^
        # The outer block is balanced but when it is
        # executed the parser strips out the comment on the
        # first line ignoring the { and so when it comes
        # to parse the block starting at "v" it ends it
        # at the first } it comes across so the one at "^"
        # is assumed to be the name of a command and as
        # no such command exists an error is generated.
    }

                     v
    proc example4 {} {
        set "this problem is not really a comment
        problem because if I put an unbalanced { in
        here then I have exactly the same problems"
    }
    ^

                     v
    proc example5 {} "
        # If I change the delimiters from {} to double quotes then
        # I can have as many {{{{ and }}}}}}}} in here as I like
        # and it will not cause a problem because Tcl's parser is
        # not counting them any more, it is looking for an unquoted
        # \". It is not recommended that you use \"s as delimiters
        # to procedures as Tcl will perform substitution on the body
        # which may cause some problems if you are not prepared for
        # it.
    "
    ^

----

'''Stephen D. Cohen (SDC) Asks:'''

What I do not really understand is why # is afforded any special treatment at all. Why is it not simply a command that takes a variable number of arguments and does nothing? That does not seem to be too far from what it does now, and does not appear as if it would break any of the above. Then something like:

    proc foo { mire } {
        # This is a line that comments the rest of this muck... {
        set num [binary scan $mire H* muck]
        puts "The muck was: $muck"
        }
    }

Would do the nothing that I (and most reasonable people?) would expect. In data, the # would have the same effect it has now - basically nothing unless special quoting / extreme measures are taken.

I guess I am asking: Why has "#" been endowed with special powers that only seem to make it behave in a manner that is inconsistent with the rest of Tcl? Or am I missing something either deep / philosophical or intuitively obvious to the casual observer? Am I going to wish I never screwed up the courage to post on the Wiki? :-)

----

'''An unidentified poster responded:'''

''the answer may be that without the special treatment, you can't have comments with unbalanced quotation characters in them. With the special behavior you can. In practice, the current behavior works 99.9% of the time, so why change?
''

'''and SDC Responds:'''

So the answer may be that since this has been a problem since the beginning of time, it must remain a problem. Such thinking is sure to encourage progress. I would also argue that the current behavior does not work 99.9% of the time, unless the developer keeps the special rules for # in mind. Why then, I ask, are we using up precious programmer creativity on a command which does nothing?

I am not spoiling for a fight or anything, I am just looking for a good answer to why # is not simply a command that does nothing. One of the Tcl insiders must know...

----

'''Another unidentified poster responded (SDC cleaned this up so it prints correctly)'''

# is not treated as another command for at least this reason:

    % set a try
    try
    % # [puts $a]
    % set b [puts $a]
    try
    % puts $b

After the # , tcl code isn't executed. If it were treated like any other command, that would no longer be the case - so one wouldn't REALLY be commenting out lines...

'''To which SDC Responds:'''

OK, that certainly makes it obvious why # has to be endowed with *some* special powers. :-) I did not really mean that # should just be a command with *no* special powers, just that its parsing should be the same as every other command. I thought it went without saying that the arguments to # should not be evaluated.

I am still looking for the answer as to why the parsing of the # command is not the same as the parsing for every other command in Tcl, with the exception that parameters to # are not evaluated at all.

[Lars H]: There are a few character combinations which cannot be parsed as commands at all, and it is useful that comments can contain them. Consider

    # Some ASCII graphics: {}{}=={}{}
    # Some C: Tcl_DStringAppendElement(&cmd, (newName ? newName : ""));

Both of these are OK in comments, but neither can be part of a command:

    % list Some ASCII graphics: {}{}=={}{}
    extra characters after close-brace
    % list Some C: Tcl_DStringAppendElement(&cmd, (newName ? newName : ""));
    extra characters after close-quote

Adding e.g. $a (if there is no variable a) at the beginning of these commands will still produce the same error messages, thus confirming that these errors are detected before Tcl has done any substitutions.

----

As far as I am aware, there are no special exceptions for any command parsed.

----

'''Ingemar Hansson:''' Wouldn't also the comment

    #Comment without white space after the # sign

break SDC's suggestion? In the above example the command - as SDC sees it - isn't '#' anymore, but '#Comment'.

[FW]: Yeah, it would, but as an alternative it could automatically capture all commands starting with #. But anyway, we've already proved SDC's idea unusable, so it's a moot point.

[NEM]: I'm not entirely sure any of the arguments against having comments as commands are particularly convincing. Which is harder to explain: requiring people balance braces in comments, or requiring people properly quote comment strings, and put a space first? For instance, lots of people seem to do:

    if {0} {
        ...
    }

to comment out blocks of code. Why not simply make that:

    # {
        ...
    }

? Tcl won't touch anything between the braces anyway, and it would allow you to do cool things by redefining the # command to do e.g., automatic documentation generation, meta-data etc. I can see it would probably be confusing to people used to other languages that comments are just commands too, but I don't see that as a particularly huge problem. Still, probably too late to do anything about this now, as it would definitely break lots of scripts (and changing the behaviour of comments would be a particularly nasty way to break someone's script!). One more serious breakage of this scheme would be #! lines at the start (IIRC, some systems don't like a space between #! and the path). Ah well, perhaps best to leave # as it is, and write a new command for comments (like [XOTcl]'s @ stuff).

----

See also [Why can I not start a new line before a brace group].