null

If you have a null reference, then every bachelor whom you represent in your object structure will seem to be married polyandrously to the same person. Let's call her Nulla.

-- Edsger Dijkstra, related by Tony Hoare in his presentation, "Null References, the Billion Dollar Mistake"

Description

Many languages have a special "null" value (be it null or NULL or nil or ...) that is different from all other values and which usually indicates that something is missing. In Tcl everything is a string, which implies that nothing is null. There is the empty string, and lists can be empty, but there can be no value that is not a string, so there can be no null.

Null appears in a number of different contexts. In SQL, NULL is a special value that indicates that a particular field is empty in a result set, but NULL is also abused to indicate a field that, for whatever reason, is missing in the original table. When NULL is one of the operands, the result of a comparison for equality is neither true nor false, leading to so-called three-valued logic (3VL), where unknown is one possible result. Whereas in SQL NULL can appear in the place of any other value, other languages confine null to a particular type: that of references. For instance, in C, NULL is a pointer with a value of 0, i.e. it points to an inaccessible part of memory. In Java, null is a non-valid reference. In Python, Smalltalk and some other OO languages, null (or None or Nil) is an actual (singleton) object that defines no methods, or defines methods that quietly ignore all messages.

Perceived need for null in Tcl

Most of the discussions of null in Tcl that keep arising seem to be related to database transactions, since SQL (though not necessarily non-SQL relational database management systems) uses NULL to express missing data.

In Re: TCLCORE null handling and TIP #185 (alternate), 2008-06-17, DKF writes:

When dealing with the result of an SQL database query, you've got a result set (in effect a row of a view). The values in the row are only part of the information available (there's also the name of each column, the type of the columns, etc.) and so adding the ability to ask whether a column was NULL is no big deal. The information might not be exposed in some of the short-cut interfaces, but they're just syntactic sugar; people who really need to know about nullity should use the more detailed API. (I suppose you could even have a method on the result set to set the string representation of NULL...)

Why the concept of a NULL has no place in the universe of strings

  • Every value has the same type: It is a string.
  • If everything is the same type, then there really is no concept of type. Any procedure can assign to a value any meaning it chooses. In this sense each value is typeless.
  • The concept of null requires a separate data type so that it cannot be confused with any other value.
  • Therefore, there can be no null in Tcl. QED.

To put it another way, it is mathematically well-defined what it means to be a string: The set of all strings constitutes a free monoid. A certain set of operations can be performed on a string, and it should not be possible to perform this set of operations on null. Since null is distinguished from each and every possible string, it is not itself a string. Therefore, in a universe of strings, there simply is no null.

This argument, however, does not preclude individual procedures or sets of procedures from interpreting some particular string as null. See "Solutions" below for some examples of this.

McVoy's null

Larry McVoy has suggested a null by way of a "magic" Tcl_Obj holding an empty string. The idea is that this particular Tcl_Obj — at that particular memory address — is null, and no other Tcl_Obj is. The implementation of this, and of an isnull command for testing whether a value is the null Tcl_Obj or some other Tcl_Obj, is trivial. The problem with this idea is that it turns EIAS into a leaky abstraction — two values may be equal as strings, but isnull (and by extension any command that makes use of it) will treat them differently — and moreover this leak could never be fixed, since it is all there is to this feature.

The problem with making EIAS a leaky abstraction is that it breaks (or at least renders unreliable) all programming techniques that process values as strings, including:

  • Writing data to file (e.g. preferences) and then reading them back.
  • Sending data to another process or computer via a channel.
  • Passing data to another thread.

See finally L is getting out there, Tcl Core Team mailing list, 2016-04-16, for a spirited discussion.
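For contrast, a null that is encoded in the string itself survives these round trips. The sketch below uses the wrap-in-list encoding described under "Solutions" (an empty list means null, a one-element list holds a value). The variable names are illustrative only, and [string range] merely stands in for writing the value to a file or channel and reading it back.

```tcl
# Three fields: a name, a null phone number, and an empty-string note.
# Each field is wrapped: {} means null, a one-element list holds a value.
set record [list [list Alice] [list] [list {}]]

# Stand-in for a trip through a file or socket: only the characters survive.
set wire [string range $record 0 end]

# Unpack on the "receiving" side, working purely from the string.
lassign $wire name phone note
puts "phone is null: [expr {[llength $phone] == 0}]"   ;# 1
puts "note is null:  [expr {[llength $note] == 0}]"    ;# 0
```

The nullity of each field is part of the string representation, so no command can tell the received value from the original.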

Treatment of null by various systems

Languages that provide a null value sometimes give it special treatment. Some examples of such behaviour, which are sometimes expected also of a "null in Tcl", are:

null as a non-argument
Each null is skipped when forming a list of arguments.
null as missing argument
Passing null for a command argument is the same as not passing an argument, so if the argument has a default then that should be used. Similar to the above, but subsequent arguments aren't shifted. Sometimes requested as a way of specifying a value for the second argument with default without specifying a value for the first argument with default.
null as boolean
Null is a valid value for all boolean operations (&&, ||, etc.), where it is treated according to some three-value logic.

These expected behaviours often contradict each other. They typically also require any implementation to check every value for being null before acting on it. Such checks would slow down the language as a whole.
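To make the "null as boolean" expectation concrete, here is a minimal sketch of SQL-style (Kleene) three-valued logic written as ordinary Tcl procedures over the strings true, false and unknown. The command names and3, or3 and not3 are made up for this illustration; nothing like them is built into Tcl.

```tcl
# Three-valued AND: false dominates, then unknown, otherwise true.
proc and3 {a b} {
    if {$a eq "false" || $b eq "false"} { return false }
    if {$a eq "unknown" || $b eq "unknown"} { return unknown }
    return true
}

# Three-valued OR: true dominates, then unknown, otherwise false.
proc or3 {a b} {
    if {$a eq "true" || $b eq "true"} { return true }
    if {$a eq "unknown" || $b eq "unknown"} { return unknown }
    return false
}

# Three-valued NOT: unknown stays unknown.
proc not3 {a} {
    switch -exact -- $a {
        true    { return false }
        false   { return true }
        unknown { return unknown }
        default { return -code error "not a three-valued truth value: \"$a\"" }
    }
}

puts [and3 unknown false]   ;# false
puts [or3 unknown false]    ;# unknown
```

Note that every operand has to be inspected for the third value before anything else happens, which is the kind of per-value check mentioned above.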

Solutions

There are plenty of ways in Tcl to handle situations that other systems might use null for.

Out-of-domain data

If the domain is e.g. numbers, any non-numerical string (including null, NULL, nil, and more legibly data missing) can represent null. The string NaN in a numeric context means "not a number", which may suffice in some cases. See also the infinity trick.
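A minimal sketch of this idea, assuming the readings arrive as a Tcl list and that any entry which does not parse as a number should count as missing. The procedure name sumReadings is hypothetical.

```tcl
# Sum the numeric readings and count the placeholders.
# Anything that is not a valid number is treated as "data missing".
proc sumReadings {readings} {
    set total 0.0
    set missing 0
    foreach r $readings {
        if {[string is double -strict $r]} {
            set total [expr {$total + $r}]
        } else {
            incr missing
        }
    }
    return [list $total $missing]
}

puts [sumReadings {1.5 NULL 2.5 {data missing} 4}]   ;# 8.0 2
```

The -strict option matters here: without it the empty string would pass the test and then break the addition.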

Unicode null

To represent missing text, there are some Unicode codepoints that are explicitly set aside as illegal. Even though they can occur in a Tcl string, there is no sensible way to interpret them as text, since they're not even characters. Hence they are available [L1 ] for internal use in applications precisely to express situations such as "data missing".

The most easily remembered non-character is probably \uFFFF.
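As a sketch, an application might reserve that code point as its "data missing" marker. The variable MISSING and the procedure showField are names invented for this example, and the approach assumes that real input has been checked so that it never contains \uFFFF.

```tcl
# Application-wide marker for missing text (a Unicode non-character).
set MISSING \uFFFF

# Render a field for display, distinguishing "missing" from "empty".
proc showField {value} {
    if {$value eq $::MISSING} {
        return "(data missing)"
    }
    return "\"$value\""
}

puts [showField "hello"]     ;# "hello"
puts [showField ""]          ;# ""
puts [showField $MISSING]    ;# (data missing)
```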

Return codes

In some languages, a return value of null signals "no result" as a kind of error indication. In C a NULL pointer serves this purpose. To accomplish something similar in Tcl use a return code other than ok. Most often error is appropriate, but sometimes return, break or continue is a better match.
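A small sketch of the error variant: where a C function might return NULL for "not found", the Tcl command raises an error and the caller uses catch. The procedure findUser, its data, and the error code are invented for the illustration.

```tcl
# Look up a numeric user id; raise an error instead of returning a "null".
proc findUser {name} {
    set users [dict create alice 1001 bob 1002]
    if {![dict exists $users $name]} {
        return -code error -errorcode {LOOKUP USER} "no such user \"$name\""
    }
    return [dict get $users $name]
}

# The caller decides what "no result" means for it.
if {[catch {findUser carol} result]} {
    puts "lookup failed: $result"
} else {
    puts "uid is $result"
}
```

Because the "no result" signal travels in the return code rather than in the value, every string, including the empty string, remains available as a legitimate result.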

Missing dictionary/array keys

If the data are in a dictionary, then a missing datum can be expressed by not including an entry for that datum. dict exists tests for missing data, and dict merge or array set do the right thing. This approach usually also applies to data that might originally come as a C struct or similar (i.e., fixed collection of data), since these are often mapped to dicts in Tcl. Access is quite fast since hash keys are cached in the internal representation of the words of your command. This approach also works with arrays, where info exists tests for missing data. Indeed, it will work with any associative container.
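For instance, a database row whose email column was NULL could be delivered as a dictionary that has no email key at all. The sketch below assumes such a row; the procedure describe is a made-up helper.

```tcl
# A row in which the "email" column was NULL: the key is simply absent.
set row [dict create id 7 name Alice]

# Report a field, treating an absent key as missing data.
proc describe {row field} {
    if {[dict exists $row $field]} {
        return "$field = \"[dict get $row $field]\""
    }
    return "$field is missing"
}

puts [describe $row name]    ;# name = "Alice"
puts [describe $row email]   ;# email is missing
```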

Unset variables

A Tcl variable can be unset, and whether a variable is set or not can be tested with info exists. This information can also be used to represent missing data. This also works with the upvar command when passing variable references to other commands:

proc mycmd varName {
    upvar 1 $varName var
    if {[info exists var]} {
        puts "$varName is $var"
    } else {
        puts "$varName is missing!"
    }
}
mycmd foo ;# foo is missing!
set foo 12
mycmd foo ;# foo is 12

Empty list as null

An empty list can represent no value, and any other value, including the empty string, can be expressed as a list containing one item. An example of this is the special value "args" in the argument specification for a procedure. Compare

proc foo {{arg {}}} {
    return "My arg is: $arg"
}

to

proc foo args {
    if {[llength $args] == 0} {
        puts "Called without arg."
        set arg {}
    } elseif {[llength $args] == 1} {
        puts "Called with arg."
        set arg [lindex $args 0]
    } else {
        error "Wrong # args: foo ?arg?"
    }
    return "My arg is: $arg"
}

Some languages, e.g. Maple, treat null as an "ignore me" in argument sequences and lists, and this maps very well to this wrap-in-list idiom. With the advent of {*}, this merely becomes a matter of changing your API spec from "returns an X" to "returns a list of Xs" for affected commands and adding a {*} in front of calls to them.
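A sketch of that API change, assuming Tcl 8.5 or later for {*}. The command lookup is hypothetical: it returns a list of zero or one elements instead of a bare value, so a missing entry simply contributes nothing when the result is expanded.

```tcl
# Return a one-element list holding the value, or an empty list if absent.
proc lookup {table key} {
    if {[dict exists $table $key]} {
        return [list [dict get $table $key]]
    }
    return [list]
}

# Key b maps to the empty string; key c is absent altogether.
set table [dict create a 1 b {}]

set found [list]
foreach key {a b c} {
    # {*} splices in zero or one words, so absent keys vanish.
    lappend found {*}[lookup $table $key]
}
puts [llength $found]   ;# 2
```

The empty string stored under b is kept as a real value, while the absent c is skipped, which is the distinction a reserved string could not make.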

Tag with type

This is a variant on "wrap in list" which extends to a more general mechanism for handling data where "type" matters.

NEM has written an extension (Maybe package) which does this with minimal storage overhead. He describes it thus:

In the belief that actions speak louder than words, and code even more so, I hereby present the "maybe" package for Tcl that provides complete support for handling missing/unknown data in much the same way as a NULL pointer does in C, only nicer. The package comes with both Tcl and C implementations -- the C implementation efficiently represents such values as a single pointer, which is either NULL or points to a valid Tcl_Obj. The interface provided is simple:
     [Nothing]   -- creates a NULL pointer, string rep: "Nothing"
     [Just $foo] -- creates a non-NULL pointer, string rep: {Just $foo}
     if {[maybe exists $val var]} {
         puts "exists: $var"
     } else {
         puts "doesn't exist"
     }
The [maybe exists] command both tests for whether a value is not Nothing and extracts the value into a variable in one operation. If this command returns true then the variable var is guaranteed to contain a valid value (without any Just wrapper around it).

jmn 2008-06-16: Sounds great - but your choice of names for this doesn't seem helpful to me. Naming can make all the difference as far as getting people to understand and use something new, I think.

'maybe exists' is surely going to cause confusion with the standard notion of Tcl variables 'existence' as reported by 'info exists'.

'Just' seems like a bizarre name.. I can only assume you meant it to be like the southpark policeman saying 'nothing to see here' - indicating of course that something extra is indeed going on. Cute .. but it would feel strange to be lying to myself like this whilst programming.

How about something (IMHO) more intuitive like the following?

 [Nothing] -> [NewNull]
 [Just $foo] -> [NewNullable $foo]
 [maybe exists $val var] -> [NotNull $val var]

(or even just Null, Nullable & NotNull )

NEM: The names come from Haskell's Maybe. maybe exists is equivalent to dict exists, so I don't see it causing confusion (indeed, you can think of maybe as being like a 0/1-element dictionary).
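The interface described above can be approximated in a few lines of pure Tcl. This is only a sketch of the idea, not the code of the actual package: the command maybeExists is a made-up stand-in for [maybe exists], and values are plain tagged lists rather than the single-pointer representation of the C implementation.

```tcl
# Constructors: a tagged list whose first element names the case.
proc Nothing {} { return [list Nothing] }
proc Just {value} { return [list Just $value] }

# Test for a value and, if present, store it in the caller's variable.
proc maybeExists {maybe varName} {
    switch -exact -- [lindex $maybe 0] {
        Just {
            upvar 1 $varName var
            set var [lindex $maybe 1]
            return 1
        }
        Nothing { return 0 }
        default { return -code error "not a maybe value: \"$maybe\"" }
    }
}

foreach val [list [Just "some text"] [Just ""] [Nothing]] {
    if {[maybeExists $val var]} {
        puts "exists: \"$var\""
    } else {
        puts "doesn't exist"
    }
}
```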

Discussion

AMG: Everything is a string is nice and all, but for many applications it's important to have a special value that's outside the allowable domain.

If the domain of values is numbers, any non-numeric string (e.g., "") will do, so "" can be used to signify that the user didn't specify a number. C strings can't contain NUL and therefore are free to reserve NUL as a terminator or field separator. Unix filenames reserve / and NUL, so / is available to separate path components and NUL can be used with find -print0, xargs -0, and cpio -0 to separate filenames in a list. (The more common practice of separating filenames with whitespace breaks whenever whitespace is used in filenames.)

But if the allowable domain of values is any string at all, no string can be reserved for a special purpose.

Since Tcl has nothing that is not a string, the only remaining solution is to have a separate, out-of-band way of tracking the special case. Returning to the C example, if a program needs to support having NUL in the middle of a string, it must either encode the string using a possibly fragile quoting scheme, or it can use a separate variable to track its length. As for the Unix filename example, if a filename needs to contain a /, it absolutely must be encoded, for instance as %2F, but then the quote character must also be encoded (%25). This is because Unix filenames have no room for an out-of-band channel. (By the way, KDE uses this encoding scheme to support / in filenames.) In Tcl, a separate variable can be used, such as a variable that's false when the user didn't specify a string.
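The in-band quoting scheme mentioned for filenames can be sketched with string map. The procedure names are invented here, and only the two characters discussed above (/ and the quote character %) are handled.

```tcl
# Encode "/" so the name fits in one path component; "%" must be
# encoded too, or the decoder could not tell data from escapes.
proc encodeName {name} {
    return [string map {% %25 / %2F} $name]
}

# Reverse the mapping. string map never rescans its own output,
# so "%252F" decodes to the literal text "%2F", not to "/".
proc decodeName {encoded} {
    return [string map {%2F / %25 %} $encoded]
}

puts [encodeName "a/b%c"]                  ;# a%2Fb%25c
puts [decodeName [encodeName "%2F/"]]      ;# %2F/
```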

This can be very cumbersome and isn't always viable (again, when the domain is all strings). Two examples are default arguments and SQL nulls. Foolproof tracking of the former requires the proc to accept args and do its own defaulting (you can use the trick from ML); [llength $args] serves as the out-of-band channel. Tracking the latter may require asking the database to prepend a special character to all non-null string results; basically the first character is the out-of-band communication channel identifying the nullity of the result. A more straightforward option is to SELECT the NOTNULL of the string columns whose values could be null.


jhh proposes a possible solution in TIP 185. Basically, {null}! is recognized by the parser as a null, which is not a string; it is distinct from all possible strings. "{null}!" is, of course, a seven-character-long string, and it's also a one-element list whose sole element is a null.

Note: TIP 185 was rejected in 2008.

I (AMG) have several strong comments regarding the TIP:

  • I prefer to say "null" instead of "null string" because I feel that a null is not a string at all. It's the one thing that isn't a string! I guess we'll need to change our motto. :^)
  • Likewise, I'd rather not tack the null management functionality onto the [string] command.
  • I think I'd prefer a [null] command for generating nulls and testing for nullity. It's best not to use the == and != expr operators for this purpose; null isn't equal to anything, not even null.
  • We can ditch the {null}! syntax in favor of using the [null] command to generate nulls, but then [null] cannot be implemented in pure script. This might be an important concern for safe interps.
  • Automatic compatibility with "null-dumb" commands is a mistake; it's the responsibility of the script to perform this interfacing.
  • When passed a null, the Tcl_GetType() and Tcl_GetTypeFromObj() functions should return TCL_ERROR or NULL (in the case of Tcl_GetString() and Tcl_GetStringFromObj()).
  • Most commands should be "null-dumb". Only make a command handle nulls when it is clear how they should be interpreted.
  • The non-object Tcl commands can probably represent nulls as null pointers ((void*)0 or NULL). If for some reason that can't work, reserve a special address for nulls by creating a global variable.

Feel free to argue. :^)


AMG: Here's a silly and inefficient proc to help me play around with the ideas presented above:

proc foobar {varname {value {null}!}} {
    upvar 1 $varname var
    if {![null $value]} {
        set var $value
    }
    return $var
}

This proc should behave the same as [set].

You will notice that I used {null}! even though in my above comments I suggested removing it in favor of always using [null] to obtain nulls. But it turns out that's not feasible in the above code; it would only result in $value defaulting to the string "[null]". To get the desired behavior, I'd have to write [list varname [list value [null]]], which is far from readable. (With Tcl 9.0 Wishlist #67, it becomes (varname (value [null])), which I can live with.)

That's one black mark against my idea...

A more worrying problem is that [foobar] can't be used to set a variable to null! Why? Because the domain of $value includes all strings and null, there is (once again) no possible value outside the domain that can be used to indicate that a special condition occurred and cannot be "forged" by the caller. So what are nulls good for again?

I'm up to two black marks now. It's not looking good.

It seems nulls aren't as useful as originally hoped. (Notice the use of the passive voice.) But are they still good for something? The reason [foobar] doesn't work in the above case is that it is being driven by the script, and the script is capable of producing nulls. If its input instead came from a file or socket, it would be just fine because reading from a channel will never result in a null. Of course, at this point I'm reminded of tainting, which might be a better solution.


wdb: When switching from Lisp to Tcl, the lack of some special value such as NULL was one of the drawbacks that I decided I could live with. It is the price of simplicity, and one I am willing to pay. There is more than one case where something similar is resolved by some trade-off:

  • In the switch statement, the word default impacts the value "default".
  • In proc's arg list, the word args impacts the choice of argument names.
  • In Snit and Itcl, the argument #auto or %AUTO% impacts the choice of instance name.
  • And so on.

Extending the value range of the string type means abandoning the EIAS principle. It is possible, and sometimes even desirable, to extend it. If so, ask yourself whether Tcl is still the right choice of programming language for you.

If you ask me: I prefer the state as is. The drawbacks are known, and as mentioned above, I can live with them.

AMG: switch can select on the value "default" if "default" is not the last option given. proc can accept an argument named "args" if it's not the last one in the list (although see Tcl 9.0 Wishlist #77). I'm just pointing out that these "keywords" only have special meaning when in combination with some other out-of-band data, which in these cases is list position. One more example is the use of - to signify an option. To disambiguate, we have -- to partition the argument list into options and non-options (see '--' in Tcl).

Yes, it's totally true we can live without nulls. The real problem comes when interfacing with systems that do have nulls. Tcl has no easy and safe way to represent them. Reserving a string will work most of the time, but the Tcl script becomes confused when the reserved string collides with valid data. This may happen by accident or as part of a malicious attack, which means even nonsense strings like "ßÿÑâRI'" aren't safe.

All the other stuff I said about nulls is just cute, sugary things we can do with them if they were added.


wdb (again): But if really necessary, it is possible to introduce typed data to Tcl. Just put them in a list, the first element of which contains the type, and the second the data, as follows:

 set typed_value1 {allowed {hello world}}
 set typed_value2 {disallowed {bye bye}}

This example shows the use of two data types, allowed and disallowed. It makes it easy to construct a null value by choosing the type disallowed.

AMG: This is like jhh's method of prepending a special character to all non-null SQL results, except of course it's cleaner.

NEM: Tagged data is also how functional programming languages like ML and Haskell handle optional data/NULLs:

 # data Maybe a = Just a | Nothing
 proc Just data { return [list Just $data] }
 proc Nothing {} { return [list Nothing] }
 set val1 [Just "some data including Just and Nothing"]
 set val2 [Nothing]

Then you can test for missing data (NULL/Nothing) using a switch:

 switch -exact [lindex $data 0] {
     Just    { do stuff with [lindex $data 1] }
     Nothing { handling missing data }
 }

Alternatively, in many cases you can use the (non-)existence of a variable or dictionary/array element to test for nullity. e.g. in a database-like interface:

 $db query $query row {
     if {![info exists row(name)]} { # name is NULL }
 }

Lars H: In the original discussion of TIP#185, the following methods were proposed for interfacing Tcl with systems that have NULL values:

  1. If the external function returns a value or NULL, then have the corresponding Tcl command return a list of one element for non-NULL values or an empty list for a NULL value. In Tcl 8.5, {*} greatly simplifies using such commands.
  2. If the external function returns a "record" where some of the entries may be NULLs, then have the corresponding Tcl command return a dictionary which only has entries for the fields with non-NULL values.

Type-tagging values using lists as shown above may also be necessary when interacting with other systems, as some indeed take different actions for data of different types (even if the values are the same). tcom apparently has some troubles in this area, as it does not provide for specifying the type of data to pass on. In TclAE, the types are instead explicitly specified.

What NULL proponents should take note of is that Tcl values, as a consequence of the dodekalogue, constitute a monoid [L2 ] with the empty string as identity element and string concatenation (cconcat, for those who require a command name) as operation. The Everything is a string principle says that the monoid of Tcl values is in fact a free monoid (currently the free monoid of words in the alphabet of all BMP Unicode code-points), and I think it is an extremely good principle, but the dodekalogue does not explicitly proclaim it. Hence one could imagine a Tcl where there in addition to the strings exists a NULL value, but then it would have to be sorted out how this NULL should act under concatenation. What is passed on to A in the following commands?

  A [null][null]
  A [null]somestring
  A somestring[null]

Another problem with introducing special values like NULL is that there's no reason to believe that one special value is always going to be sufficient: once in widespread use, someone will come up with a situation where NULLs should be handled as an ordinary value, but at the same time needs a SUPERNULL that isn't! On the whole, it is much simpler to avoid introducing any special values.

AMG: Existence checking can work. Maybe sqlite eval's two- and three-argument forms can unset the variable or array element to signify that its value is null. The script already knows all the variable names, so it shouldn't need to be explicitly told what's null. But on the other hand, maybe the array, dictionary, or whatever can be accompanied by a list of all fields whose values turned up null.

Encoding the data as a list seems clever. If [llength] is zero, the data is null. If [llength] is one, the data is stored in [lindex 0]. Use {*} to get at it most easily. I imagine it's possible to recursively apply this encoding to dictionaries and nested lists.

Regarding the combination of nulls and non-nulls, jhh's TIP suggested that concatenating a null with a string resulted in a null. "Nulls propagate. A null combined with any nonnull is null. Appending a null to a string, or substituting a null into a string nulls the entire string." By this rule, A will receive null in all three cases.

My [foobar] example shows a case where SUPERNULL would at first glance appear to help, but of course it's a ridiculous thing to ask for, especially since it would still not allow setting a variable to SUPERNULL. What's asked for is a value outside the input domain, but no such value can exist because a variable can be set to anything (string or otherwise). Therefore the only solution is the out-of-band channel, as in [llength $args] indicating how many arguments were passed. With some sugar it might be possible to add a command to check if an argument was explicitly passed or if it was left at its default; this seems like a halfway point because [llength $args] is being used internally but to the programmer it's no different than checking for null. I don't propose such a thing; I'm just giving examples.

Lars, as you say, null would only be useful for this purpose so long as the command is intended only to interface with stuff that cannot generate its own nulls. But this is a funky reason to advocate null--- the original impetus was the desire to interface with stuff that does generate nulls.
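The out-of-band approach described here for [foobar] can be sketched as follows, with [llength $args] telling the procedure whether a value was passed at all. The name setOrGet is made up for this example; the behaviour is meant to mirror [set].

```tcl
# Behaves like [set]: with one argument it reads the variable,
# with two it writes it. No value is reserved as "not given".
proc setOrGet {varName args} {
    upvar 1 $varName var
    switch -exact -- [llength $args] {
        0 { }
        1 { set var [lindex $args 0] }
        default {
            return -code error \
                "wrong # args: should be \"setOrGet varName ?newValue?\""
        }
    }
    return $var
}

setOrGet x 12
puts [setOrGet x]       ;# 12
setOrGet x {}
puts "[setOrGet x]|"    ;# |
```

Since the signal is the argument count rather than a special value, the caller can store any string whatsoever, including whatever string might otherwise have been reserved to mean null.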


jhh: Let me try to clarify what I was proposing in TIP 185 [L3 ]. (I see more comments have come in since I looked 2007-01-06 14:42GMT, so this is not up to date on the discussion.) Forgive me if I screw up this funky wiki markup language.

I agree that everything is, and should be, a string, but I regret Tcl has no way of representing a null string, unlike most other languages, even Java and lowly Visual Basic. There are really two proposals here, and I probably confused people by lumping them together in one TIP. The reason I did was that the two together are synergistic and compelling (at least to me). They are:

  1. Extend the meaning of a string to include a null string. I will call this TIP 185a.
  2. Extend the meaning of list and dicts to allow the representation of unknown elements. I will call this TIP 185b.

TIP 185a might be helpful on its own. For example, SQLite's loop construct (e.g., "db eval {select * from accts} {} { ...}") can return null information without using a contrived query statement and similarly contrived decoder code, thus opening the door to the creation of general purpose packages to integrate databases.

Either proposal could exist alone, but together they allow Tcl, the preeminent system glue language, to transparently manipulate system communications without kludging them as they enter and leave. A lot of my coding time is spent on this silly matter; if the system involved two or more database engines, as is common, the time is thus multiplied. And every programmer is doing the same thing, over and over. Aside from that, those of us who think Tcl is the best medium for exploring ideas and algorithms would be grateful at having this gap filled, and those unfamiliar with nulls would soon find nulls quite useful in their own right.

TIP 185 was not well received, perhaps partly due to my poor presentation of the idea -- I would do it differently now, having seen the response. My recent thinking is to wait for Tcl 9, or perhaps submit TIP 185a separately. I think the most difficult issues are implementation and performance, and the handling of legacy, null-dumb commands: should they see an empty string and proceed, or should they fail? Currently I am leaning toward the latter, more conservative direction.

Most voices against the idea have argued from misunderstanding, or have been vague, so I am not yet convinced the idea has no merit. Little else has been presented that would really help the matter, aside from endless accounts of the workarounds that we all have to invent in the absence of true null handling. These are invariably presented as reasons for why we don't need the feature. The sheer number and variety seems to argue just the opposite. I think if we can recognize the lack of null handling as a real shortcoming and get people working on the problem, we can come up with a fine solution. Imagine if, back at Tcl 6, people had said "speed isn't an issue, just use C for that, then wrap it in Tcl," and did not pursue the performance issue.

At the recent conference in Naperville, Illinois, there were two papers on relational algebra packages that might be helpful, though neither package currently supports nulls. I approached Andrew Mangogna, author of TclRAL: A Relational Algebra for Tcl [L4 ], about adding the feature, and he explained that he is a follower of C.J.Date, who opposed the use of nulls. I explained my pragmatic argument: perhaps the largest use for Tcl is interfacing with databases, nearly all of which, including SQLite, routinely represent null data. The relational algebra packages presented might provide an interface that could help compensate for Tcl's weakness in this area. He was not interested. Theoretical orthodoxy seemed to be more important than practical programming.


GAM: There has been a lot of churn on this page lately, and it has prompted me, since my name has been thrown about, to clarify my perspective on the above comment. Although I do vaguely remember the conversation referred to above (it having been more than two years now), the primary reason that I did not wish to get into a discussion of nulls is that it has no real effect on TclRAL. TclRAL is not a database management system and was not designed to be a front end for one (although I am aware of at least one person using it in that manner -- good luck with that) nor is it a Tcl interface to some other data storage mechanism. TclRAL is a Tcl extension that brings formal relational values to the Tcl language. Those values are not different, conceptually, than dict or list values. The motivations behind TclRAL are all about relation-oriented (or table-oriented, if you wish) programming and the belief, guided by experience, that a single unified data theory is better than the collection of data structuring techniques that we currently typically employ. TclRAL uses Tcl_Obj structures to hold all of its attribute values and uses expr to evaluate all its expressions. If by some means there were a Tcl_Obj implementation of NULL and all the byzantine nonsense of three-valued logic could be implemented in expr then TclRAL would just work because it builds strictly on the mechanisms already in Tcl.

That being said, I still believe nulls are just wrong in so many ways that have been discussed and written about by those much more capable than I. I can't believe anyone would design a data schema using them. However, I am sympathetic to the legacy problem; there is just nothing I can do about it. And that has nothing to do with how practical a programmer I may or may not be.


Later I talked to Jean-Claude Wippler (jcw), author of Vlerq + Ratcl = Easy Data Management [L5 ], and a very practical programmer, and he found the pragmatic argument more compelling, and promised to look into it. I hope he was in earnest. While such a package is not a full answer to the problem, it could provide

  • A common interface for relational databases, so database APIs could be more uniform.
  • Null handling -- a sort of "standard workaround."
  • A more complete implementation of relational algebra than can be offered by most database engines, optimized for large databases and high query volumes.

A few miscellaneous points:

1. If I revise TIP 185, I will change the nomenclature to "unknown" ({unk}!) instead of "null," because the word is more accurate, and because the English meaning of "null" as "nothing," instead of "unknown," was such a stumbling block for so many people. I originally chose "null" because it is used in SQL. C.J.Date (1995, An Introduction to Database Systems, 6th ed) uses "UNK" for unknown values in relational algebra, but it wasn't adopted in SQL. I have used the "null" nomenclature here, for consistency with the present discussion.

2. A null concatenated with a string is a null. Again, the "null" nomenclature is confusing. One envisions something like "The quick \000 fox", but that thing in the middle might be "brown" or "" or the full contents of the 1957 Encyclopedia Britannica, we just don't know, thus making the whole string unknown, or null.

This example seems to suggest the programmer might have been better off modeling the phrase as a list of words:

   set phrase [list The quick {null}! fox]

or

   set color [null]

   set phrase [list The quick $color fox]

Note that if he then joins the list, it will collapse into a null, like a black hole.

3. A null command can generate a null (TIP 185a), but can not replace the {null}! syntax. Its purpose is to implement TIP 185b. Sticking null in the middle of a list serialization string turns it into a null (by point 2, above), instead of a serialization of a list with a null element.

4. Three-valued logic is widely used and well standardized, particularly in relational databases. Yes, a null is not equal or unequal to anything, including null: the result is never true or false, it is null. However, null && false is false, etc. There is a whole set of tautologies; I think I included most of them in the TIP write-up. These will propagate sensibly through expressions, just as IEEE floating point NaNs do.

I mentioned above that C.J.Date argues against the use of nulls. This is not because of problems with three valued logic, but because of unavoidable paradoxical result sets in formal relational algebra. I reiterate my pragmatic argument to Andrew Mangogna, above.

5. One of the most common misunderstandings in previous discussions is that nulls can not be serialized. This is an example of the synergism of TIP 185a and b: 185b enables this -- merely encapsulate a transmission in a list. Now you can send any data structure representable by a string or list, containing nulls in whole or embedded in any part. It can be handled on the other end by general purpose code, without any need for a special protocol or understanding of the transmitted data structure. This data format might well be appreciated, like Tk, by a broader audience than just the Tcl community. We would be leading, instead of lagging behind, in the area of data management.

Sorry for the long exposition. Hope some of you look it over and find it useful.

NEM: From the discussion above, it appears that your main point for including a special null non-value into Tcl is that NULLs are widespread in the world of SQL databases (your pragmatic argument). Can you describe how the alternatives outlined by others, which do not require such a radical alteration of Tcl's EIAS philosophy, fail to address this issue? In particular, I see at least three separate means of encoding the concept of missing data into a run-of-the-mill string:

  1. Using a list encoding: llength $data == 0 for NULL, otherwise lindex $data 0 is the real data.
  2. Using a tagged list encoding: Nothing vs {Just $data} (a variant of the above).
  3. Using a dict or array encoding where missing data is represented by a missing key, so that dict exists can be used to check for null/missing data.

To me, these are all much nicer options than introducing a special non-string non-value.


slebetman I'd just like to point out that Tcl is not the only language that doesn't have NULL data. Neither C nor C++ has NULL data. NULL pointers yes, nul byte yes, but not NULL data. This is not just pedantic but fundamental in C. C is like Tcl (or should it be that Tcl is modeled like C?): "everything is a number". C, freestanding in itself, doesn't recognise the nul terminator in strings as anything other than ordinary data. It is the standard library that cares, not C itself (though it is part of the C standard). Even for a NULL pointer which C does fuss about, it is only special when used as a pointer. The NULL pointer if used as data will be treated by C as a regular integer. In C NULL data is just a convention, nothing more.

NULL data is useful for determining the difference between the user entering an empty string and the user not entering anything though. But in this case I don't see why we can't steal the C convention and signify NULL data with \0. I don't know much about SQL, am I missing something? This is all very confusing because it starts out with a misleading C example (A nul in the middle of a "standard" C string is by definition the end of the string hence is not in the middle) then goes on to talk about a completely different concept of NULL data in SQL. Note that when sending data to an external program, Tcl already supports C nul. There is absolutely no problem in this department so I have no idea what the C compatibility complaint is about in this discussion:

  # Saves a file which C can read as containing 3 strings:
  set f [open test.dat w]
  puts -nonewline $f "this\0is a\0test"
  close $f

  # It's also easy to parse C strings:
  set f [open test.dat r]
  set strlist [split [read $f] \0]
  close $f

Indeed for me, Tcl has one mechanism not found in C/C++ (at least not usable in runtime code): the nonexistent variable! For true NULL values I normally simply don't set the variable at all unless needed. Then use info exists to check the difference between an empty (possibly binary) string and a nonexistent string. Though you need to be careful to always unset the variable at the end of the block lest you accidentally use the previously set value. There are already commands in Tcl that use this convention. regexp is one example which simply doesn't set the match variables if the values don't exist.

AMG: I wasn't trying to be misleading when I wrote about NULs, but I see that it worked out that way despite my best intentions. Let me reiterate my very first point at the top of this page. I said that it's often useful to have a value outside the acceptable domain to signify an exceptional circumstance. My second example of this is that C strings are arrays of bytes in the range 1 through 255, so 0 (NUL) is available for this purpose, and it is defined to be a terminator (standard C string), but some systems (find/xargs/cpio) use it as a separator. (If you think about it, these are really the same use.) SQL was merely another example of this general concept. What concept? The concept is sending control data in the same stream as regular data. This can be implemented in one of three ways:

  1. Have a separate, parallel data stream. [llength $args], [info exists], and the "contrived" SQL queries are examples.
  2. Encode with some kind of quoting. The backslash character is an example. Note how the backslash must itself be quoted; null would not have the same problem since it's outside the domain.
  3. Signal special data with a value that cannot appear in the normal data stream. {null}! can be used when the normal data stream can be any string. When the data stream is numbers, any non-numbers will do.

There's some overlap between the three. But since I probably haven't succeeded in clearing anything up, I'll try to make this all really simple by saying that it's dumb to use null for this purpose and that I'm sorry I brought it up. Instead let's focus on interfacing with other systems that use nulls. Also consider why other systems use nulls (to represent unknowns); maybe Tcl can benefit from using that same feature internally.

Yes, it's true that C doesn't really have null either, but it does have the NULL pointer because it reserves one memory address for that purpose. More memory addresses can be reserved simply by creating globals or defining functions. The meaning of the NULL pointer (or any other reserved pointer) is application-defined, and it's often used to signal something exceptional, such as "data not available", "cached string representation invalid", "error", or "SQL query returned null".

By the way, I use several distinct terms that are all pronounced the same way. They mean different things. Let me clarify:

  • null: A non-string, non-value object. As jhh says, it is used to indicate an unknown.
  • NULL: (void*)0, the value assumed by a C pointer when it doesn't point to anything in particular.
  • NUL: (char)0, an ASCII value used by C to terminate a string.

None of this discussion is about interfacing with NUL or NULL, only with null. (Man, that sounds stupid when you say it out loud! I can see why this got confusing.)

slebetman: Again you're using C as an example and again I must point out that C does not have a feature to signify a non-entity when working with binary data, only when working with text strings. So again I'm asking what's wrong with using \0? This does not fall into either 1. or 2. but fits perfectly for 3. which is what you want. If you're only working with strings, simply return the nul character like C if you find C's behavior acceptable. Like C, if you decide to work with binary data then the null representation is unavailable.

Regarding the other meaning of null, that is something similar to C's NULL, where C has the convention of using the NULL pointer as a signal, Tcl has traditionally used the empty string as the same signal. Some people find this unacceptable* since they don't consider the empty string as being outside the domain of valid data. I think this is the real issue. Not the "interfacing" with other systems part because Tcl can do that already since it can handle binary data transparently.

* Note: That includes me. I do personally wish that Tcl would differentiate between an empty string and "null" (or "undef" in Perl). But this kind of null is different from what is described by TIP 185. It's more like the half-way point in the existence of a variable: the variable exists but the value doesn't.

AMG: Pointers can be used for more than just text strings. An SQL query result can be formatted as an array of pointers to values, for example {int* user_id, char* account_name, char* nickname, int* tel_exchange, int* tel_extension}. If any of these values is null for a particular row, the pointer can be NULL.

A C string consisting only of \0 (NUL, as I call it) is not a null; it's an empty string.

Since I've been a really poor communicator, I think I must say again that I too think the real issue is that using the empty string (or any other string) is not an acceptable method of representing nulls. This is because no matter what string is chosen, that string might possibly show up in valid, non-null data. I then spewed some gobbledegook about how this method works just fine when using limited domains like numbers or arrays of bytes 1 through 255 instead of all strings, but that was just me trying to be complete.

Interfacing with other systems still remains an issue because some systems really do try to send nulls in addition to strings*. On the Tcl side, methods 1 and 2 (above) must be used to do this safely, but usually method 3 is used even though the possibility of ambiguity exists. (*any array of binary data is a string, which can include any number of NULs)

"The variable exists but the value doesn't." Yes, this pretty much sums up what I'd like to see in a true null implementation. The variable definitely does exist, but its value isn't known.

slebetman: This is where you've lost me. It is impossible to send/transmit null, nor is it possible to save nulls to a file, unless you're talking about a file that doesn't exist representing null or a zero length TCP packet representing null. Once data is sent or saved it has to have a physical representation. What represents non-data in such cases is application specific. For example, Tcl code represents such "nulls" as any string which begins with "#" until the end of line (in other words, a comment). Lots of text files represent such "nulls" using a single byte "0x0c". Another common representation of "null" in binary is 0xffffffff which when interpreted as a 2's complement integer is -1. Of course 0 (zero) is also a very common representation of "null".

In C, a NULL pointer only exists at the source code level. In running code, a NULL pointer can never be portably transmitted or saved into a file without explicitly first converting it to 0 (zero). Yes you can do that as a hack. And yes the C standard does specify that:

  int *a = 0;
  int *b = NULL;

both point to NULL. But only at the source code level. Once compiled, NULL is not guaranteed to be equal to 0. Indeed there exist platforms in the real world where a NULL pointer is represented in binary as not "all bits zero".

This is why I equate the null concept to a variable state. Null data is application specific. What happens when you try to save/send a null variable depends on the language. The C standard decided to let you do it but they have a "we don't define the result and don't call us if anything goes wrong" attitude. Perl on the other hand does it the same on all platforms by decaying undef into an empty string or the number 0 depending on context if treated as data.

AMG: It is not impossible to send/save null. Every SQL database in the world manages to do exactly this. It has to use methods 1 or 2 to do it, though. All the null encodings you mention are examples of method 3, but method 3 only works when the domain is limited. For general-purpose data, method 3 can't work.

Perhaps a concrete example might help to explain how to safely send/save null. Here's a database record structure that supports nulls:

struct record {
    enum record_type type;  /* Type selector code. */
    union {
        int integer;        /* Signed integer. */
        struct {
            char* address;  /* Address of first byte. */
            int length;     /* String length in bytes. */
        } string;           /* UTF-8 or binary string. */
        double real;        /* Floating-point. */
    } data;                 /* Record data. */
    int null;               /* Nonzero if null. */
};

With a little bit of work, this can be serialized. (1) Write a byte indicating the type. (2) Write a byte indicating null or non-null. (3) If integer or real and not null, write the value out in a standard format. (4) If string and not null, write the length in a standard integer format, then write length bytes at the address pointed to by address.

This structure is an example of method 1. The data and the nullity are interleaved in the bitstream.
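The same idea can be sketched on the Tcl side. This is only an illustration under stated assumptions: encodeField and decodeField are hypothetical helper names, every field is treated as a UTF-8 string, and the wire layout (one flag byte, then a 32-bit big-endian length, then the bytes) is invented for this example rather than taken from any real database protocol.

```tcl
# Hypothetical helper: encode one field as a null flag byte followed,
# for non-null fields only, by a 32-bit length and the UTF-8 bytes.
proc encodeField {isNull {value ""}} {
    if {$isNull} {
        return [binary format c 1]
    }
    set bytes [encoding convertto utf-8 $value]
    return [binary format cIa* 0 [string length $bytes] $bytes]
}

# Hypothetical helper: decode one field starting at the byte offset held
# in the caller's variable, advancing that offset past the field.
# Returns a two-element list: {isNull value}.
proc decodeField {data offsetVar} {
    upvar 1 $offsetVar offset
    binary scan $data @${offset}c isNull
    incr offset
    if {$isNull} {
        return {1 {}}
    }
    binary scan $data @${offset}I length
    incr offset 4
    set bytes [string range $data $offset [expr {$offset + $length - 1}]]
    incr offset $length
    return [list 0 [encoding convertfrom utf-8 $bytes]]
}

# An empty string, a null, and the string "NULL" stay distinct on the wire.
set wire [encodeField 0 ""][encodeField 1][encodeField 0 NULL]
set pos 0
set fields {}
while {$pos < [string length $wire]} {
    lappend fields [decodeField $wire pos]
}
```

Because the flag travels next to the data instead of inside it, no string value can be mistaken for a null, which is the property method 3 cannot offer for general-purpose data.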

A nonexistent file might be able to represent null, but it goes too far. It's like representing a null in Tcl by unsetting the variable. Neither the variable nor the value exists, which isn't a fair way to describe what null really means. By the same token, Perl's implementation seems wrong: neither 0 nor "" is null. Null is not a number, and it is not a string.


LV: There's lots of theoretical discussion regarding null vs empty strings above. The best place to look, in terms of practical application, is interfacing with true SQL. How do extensions like oratcl and others currently provide the Tcl developer with the ability to distinguish a row/column entry which has an empty string as a value vs having no value set (i.e. the NULL situation)?

AMG: sqlite has a nullvalue subcommand on database connection objects which is used to set the string representation for nulls. It defaults to the empty string. [L6 ]

slebetman: Ah, so this discussion is really about SQL then? Not the general idea of null data? Because, as I've mentioned above, the binary representation of "null" in a data stream is application defined and I have a feeling it's even different between different database systems (or indeed different versions of SQL).

LV: I certainly wouldn't say that SQL is what everyone's concerned about. However, it is probably the most common practical example of what is being discussed. A lot of the above discussion, as well as discussion elsewhere, has tried, in the past, to avoid mentioning a specific extension because people get side tracked into solving the problem in one specific extension, while the advocates are trying to discuss the base problem. Think of it as the kinds of discussions that went on before ZERO became a part of the number system...

AMG: Yes, the binary representation of null is up to the application. We're not trying to standardize this. If TIP 185 or a descendant is accepted, Tcl scripts will still have to implement their own serialization of null because it is (or should be) an error to call puts with a null argument. Most scripts will never need to serialize null; it's primarily useful for communicating with systems that do have null. It also may be good for communicating between different parts of the application, as is the case when null is used to signal that nothing was specified. But writing a null into a file or over the network requires that it be encoded as bytes, same as a string or number or anything else, and that takes methods 1, 2, or (when possible) 3.

NEM: This flags up one of the biggest problems with null: not just puts, but every Tcl command would need to be adapted to handle being passed a null. If a null is allowed everywhere that a string is, then all commands need to be able to deal with it (either throwing an error or performing some other default handling). Given that there already exist various ways of encoding "missing data" in Tcl (see my previous response some way up this page), it seems unnecessary to introduce such a special mechanism that mostly just causes trouble. The lack of nulls in Tcl is a great feature, leading to much more principled ways of dealing with missing data. Can you describe a single concrete example (preferably with sketch code) of where a special null value would provide any advantage over one of the existing approaches that have been described on this page?

RS: Isn't the non-number NaN somehow comparable to the more general null?

LV: RS, it seems functionally comparable to me, though literally means something different.

As for the request for concrete examples where special null values would provide advantage... I don't understand NEM's "tagged" option, but I certainly see where the array/dict alternative, if actually used by extension writers, probably would be sufficient if done correctly. I don't know how the llength option would work - the results of a SELECT * from my_table, where a specific row/column can take on either:

  1. a string value
  2. an empty string
  3. a null

seems like it would have problems with llength - would the second and third options be distinguishable with that technique? However, if the extension was set up so that only columns that had non-nulls were returned, then one could code to see if the column existed and take appropriate action in that case.

As for the argument that this approach means that special code has to be written for each column that within the database can be NULL, well, you'd frequently have to do something special for them if the variables took on a special Tcl NULL object as well.

On the other hand, one _could_ have the Tcl code treat NULLs as some unique string as was suggested above, when dealing with said object. It would, however, require a "type" to be maintained along with the string value, so that one could, via perhaps a "string is NULL" ensemble, code special action when needed...

NEM: I don't know much about NaN, so can't really comment, except to say that it seems to be used as a sort of exception to indicate that the result is outside of the range of what can be represented (e.g. taking the square root of a negative number in a system which cannot represent complex numbers). The purpose of NULL in SQL, and in this discussion (from what I gather), is to represent missing or unknown data. These seem to be somewhat different notions.

The tagged list representation is simply borrowed from functional languages like ML and Haskell, which also have no notion of NULL. The idea is that instead of introducing a special "null" value, which is a member of every type (and thus can appear anywhere), you instead introduce a special "option" type (called Maybe in Haskell, and option in ML). There are two cases to consider with this type: either the data is present, or it is absent, and this is what the two constructors directly represent:

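# Constructors for the two cases of the option type.
proc Just value {list Just $value}
proc Nothing {} {list Nothing}

set val1 [Just 12]
set val2 [Nothing]
proc maybeDouble x {
    switch [lindex $x 0] {
        Just    {expr {[lindex $x 1] * 2}}
        Nothing {error "No value!"}
    }
}
maybeDouble $val1   ;# returns 24
maybeDouble $val2   ;# raises the error "No value!"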

You can think of it as a little wrapper which adds some information about whether the value is present or not. You can use a list for this too, where Nothing == the empty list, and Just == a list of length 1. Thus, if we consider a database API that returns result rows as lists, and we want to return the following three rows (Larry's three alternatives, above):

 Command Name | Arguments
 ===============================
 "proc"       | "name args body"
 "pwd"        | ""                (empty string)
 "someCmd"    | <NULL>            (we don't know the arguments accepted)

We can encode this by making the second element of each row list use the Just/Nothing constructors:

 {proc        {Just "name args body"}}
 {pwd         {Just ""}}
 {someCmd     Nothing}

The equivalent list encoding would be:

 {proc        {"name args body"}}
 {pwd         {""}}
 {someCmd     {}}

And the dictionary encoding would be:

 {name proc   args "name args body"}
 {name pwd    args ""}
 {name someCmd} ;# args entry simply doesn't exist

These are all good and natural ways of encoding possibly missing data. As LV points out, a NULL from the database has to be converted to something when passed to Tcl, so a built-in notion of NULL doesn't have any particular advantage over any other approach, and several disadvantages.
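For what it's worth, here is a rough sketch of how consuming code might tell the three rows apart under each encoding. The describeTagged, describeList and describeDict procs are hypothetical names made up for this illustration; no existing database extension is claimed to return rows in these shapes.

```tcl
# Tagged list: the second element is {Just value} or Nothing.
proc describeTagged {row} {
    lassign $row name maybe
    switch -- [lindex $maybe 0] {
        Just    {return "$name takes \"[lindex $maybe 1]\""}
        Nothing {return "$name: arguments unknown"}
    }
}

# Plain list: an empty list means null, a one-element list holds the value.
proc describeList {row} {
    lassign $row name maybe
    if {[llength $maybe] == 0} {
        return "$name: arguments unknown"
    }
    return "$name takes \"[lindex $maybe 0]\""
}

# Dict: a missing key means null.
proc describeDict {row} {
    set name [dict get $row name]
    if {![dict exists $row args]} {
        return "$name: arguments unknown"
    }
    return "$name takes \"[dict get $row args]\""
}

# Walk the three example rows in the tagged encoding.
foreach row {
    {proc    {Just "name args body"}}
    {pwd     {Just ""}}
    {someCmd Nothing}
} {
    puts [describeTagged $row]
}
```

In all three procs the empty argument string of pwd and the unknown arguments of someCmd take different branches, which is the distinction LV asked about.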

AMG: Wow, I had an edit conflict with NEM! I'm happy that I was able to spark such a lively discussion. I apologize if my comments below seem a bit repetitive; I'm stitching multiple replies together into some unholy Franken-reply.

NEM, if I read you correctly, the disadvantages against built-in null are: (1) inability to serialize, and (2) difficulty interfacing with existing code.

I see (1) as a fundamental property of null--- it doesn't have a string representation, so when written out as bytes or characters it must be somehow encoded. We don't aim to standardize this encoding since we know it's an application-specific thing. Python standardizes its encoding of None: it just prints None. But it's able to do this because it surrounds strings with quote characters to signify that they're strings, so None is null and 'None' is a string. That's an example of an encoding.

We shouldn't trouble ourselves too much with (2) because the point of null isn't interfacing with (the majority of) existing code. Null is for interfacing with systems that internally have null. Cramming a null down the throat of code that wasn't set up to handle it should result in an error raised by Tcl itself. Simply make the Tcl_GetType() and Tcl_GetTypeFromObj() functions generate an error when passed a null object (or NULL when using the old string-based API). I feel this is the right way to handle it. Relatively few commands can actually do anything meaningful with a null; they should handle these cases by explicitly checking for nullity.

My example of a true null being helpful is interfacing with any SQL system. Take sqlite:

sqlite3 db filename.db
db nullvalue [null]
db eval {select id, name from users} {
    if {[null $name]} {
        puts "user \"$id\" has no name!"
    } else {
        puts "user \"$id\" is named \"$name\""
    }
}

This should be robust even in the face of users having names like "", "null", or "NULL". If this code is part of a Web forum, the probability that someone will call himself "NULL" is pretty much 100%.

Here's a version of the above that works without any null support in Tcl:

sqlite3 db filename.db
db eval {select id, name, name is null as name_is_null from users} {
    if {$name_is_null} {
        puts "user \"$id\" has no name!"
    } else {
        puts "user \"$id\" is named \"$name\""
    }
}

It works (I think; I can't test right now), but I think it's burdensome.

Now for RS's question. According to jhh, NaN and null are not all that different; it's possible to use both in (numeric, logical, string, list) algebra. If null means "unknown", then [expr {[null] || true}] should evaluate to true because nothing to the left of the || can make it false, and [expr {[null] && true}] should evaluate to null because it's unknown whether it's true or false. Nullity propagates. Concatenating a string or list with null should result in null. Numerically adding to null should also result in null. Multiplying null by zero yields zero, but I'm not totally sure on this point.

jhh says a standard exists for the propagation of NaN; I'm just going to have to trust him on this point since I don't know the standard myself. He says that by the same coin there is a well-defined three-valued logic system that supports the concept of unknown. See his exposition for more.
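The truth tables described above can be approximated in pure script. The following is a minimal sketch, not TIP 185 itself: it assumes the ordinary string "unknown" stands in for null (exactly the kind of in-band sentinel the TIP wants to avoid), and and3, or3, not3, eq3 and isUnknown are hypothetical names chosen for this example.

```tcl
# The ordinary string "unknown" stands in for null in this sketch.
proc isUnknown {v} {
    expr {$v eq "unknown"}
}

# AND: a known false operand decides the result, otherwise unknown propagates.
proc and3 {a b} {
    foreach v [list $a $b] {
        if {![isUnknown $v] && !$v} {
            return false
        }
    }
    if {[isUnknown $a] || [isUnknown $b]} {
        return unknown
    }
    return true
}

# OR: a known true operand decides the result, otherwise unknown propagates.
proc or3 {a b} {
    foreach v [list $a $b] {
        if {![isUnknown $v] && $v} {
            return true
        }
    }
    if {[isUnknown $a] || [isUnknown $b]} {
        return unknown
    }
    return false
}

# NOT: the negation of unknown is still unknown.
proc not3 {a} {
    if {[isUnknown $a]} {
        return unknown
    }
    expr {$a ? "false" : "true"}
}

# Equality: comparing anything with unknown, including unknown, is unknown.
proc eq3 {a b} {
    if {[isUnknown $a] || [isUnknown $b]} {
        return unknown
    }
    expr {$a eq $b ? "true" : "false"}
}

puts [or3 unknown true]     ;# nothing on the left can make this false
puts [and3 unknown true]    ;# depends entirely on the unknown operand
```

A script-level implementation like this cannot distinguish the sentinel from real data that happens to be the word "unknown", so it only illustrates the propagation rules, not the representation question.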

I think it's a mistake for nulls to automatically decay to any predefined string or number. Null isn't "", and it's not 0. If it's not clear how a null should be handled in a given situation, an error should be raised. Scripts and extensions shouldn't depend on nulls turning into "" when looked at from the right angle. If the code isn't written to support nulls, its caller shouldn't be trying to pass nulls to it. If the code is written to support nulls, it won't need to rely on any such compatibility mechanism.

NEM: Item (2) above is a lot more serious than you seem to think. Not all code calls Tcl_Get<type>FromObj, particularly older code, and some code implements its own string parsing functions and internal reps. All of this code would have to be able to detect and handle a possible NULL. I don't see how you could make this work. What happens when I try to take the string rep of a NULL? An error? This is the problem with NULLs -- they introduce lots of new error conditions and edge cases that simply aren't present in the other solutions. As you say, NULLs propagate, which means they tend to turn up where you least expect (e.g. in places you didn't think a NULL was possible).

Your SQLite example is a good one. However, any of the other techniques given work fine for this. In particular, SQLite's usual eval interface could simply not set array keys for NULL columns:

proc null? varName {expr {![uplevel 1 [list info exists $varName]]}}
sqlite3 db filename.db
db eval {select id, name from users} row {
    if {[null? row(name)]} {
        puts "user \"$row(id)\" has no name!"
    } else {
        puts "user \"$row(id)\" is named \"$row(name)\""
    }
}

So, my arguments against NULL are that (1) it is unnecessary, and (2) it is unworkable! :)

DKF: Moreover, with a database you can let the sql engine take the strain anyway:

db eval {select id, name, name isnull as noname from users} {
    if {$noname} {
        puts "user \"$id\" has no name!"
    } else {
        puts "user \"$id\" is named \"$name\""
    }
}

AMG: DKF, that was my example. :^)

NEM, indeed code may exist which assumes it will never be passed (void*)NULL. To handle that, Tcl itself can raise an error when a script attempts to pass a null to a string-based function. I know this sounds like a horrible incompatibility, but consider that older, non-object code isn't set up to handle null anyway, and even if it did properly check for NULL arguments, it would only respond by panicking or raising an error.

Here's another possibility:

 /* Special, reserved address for null string objects.  Not to be confused with NULL!
  * This variable is in BSS so it will be initialized to 0. */
 static char tcl_null;

 /* Return a Tcl null string object. */
 static char* Tcl_get_null_pointer(void) {return &tcl_null;}

 /* Check to see if a string object is null. */
 int Tcl_is_pointer_null(char* pointer)  {return pointer == &tcl_null;}

This way the pointer will not be NULL and dereferencing it will only yield an empty string, preventing segfaults. But null-savvy string-based Tcl extensions can call Tcl_is_pointer_null() to check if passed a null argument.

Which approach we go with (if we take any action) depends on how committed we are to making new features available through the old, string-based API.

Code never has to "take the string rep" of NULL, because the only way code can be passed a NULL is through a char* argument, meaning that the code is expecting to receive the string representation directly. "Taking the string rep" of null, on the other hand, is a matter worth discussion. Null doesn't have any valid representation since it's not a value, so I say that all Tcl_GetTypeFromObj() functions should generate errors when passed null objects. Code that is designed to handle nulls should always check the nullity of an object it knows might be null; only after it is known to be non-null should it attempt to look at the object's value (string or otherwise). Also scripts should not be trying to pass null to other code that isn't designed for it. Hiding the error by supplying a default value would be negative help in debugging this situation.

slebetman: I think this is where it really rubs Tclers the wrong way. When we say "everything is a string" we really mean that everything is "ordinary" and can easily be manipulated. A very fundamental part of that "ordinariness" comes from the fact that everything has a string representation (which is the natural way of saying binary representation in Tcl). The array is the first and only thing so far that is opaque in Tcl and even this very uncontroversial addition has since become very controversial for it. But people have learned to cope with arrays because string rep can be faked using array get/array set.

Your (or rather, TIP 185's) null would be an abomination even worse than arrays since its semantics completely disallows a string representation. This is in direct conflict with the fundamental transparency of data in Tcl.

I would now like to note that the more you describe your null (with the NaN-like behavior of not decaying at all) the more it seems like it already exists in Tcl: the non-existent variable. It seems to me that the real problem is that all implementations of database APIs in Tcl so far are flawed the same way foreach is flawed: they assign to the variable even though the value doesn't exist! Think about it, Tcl already treats non-existent variables mostly the way you want null to behave. Wouldn't a more natural API be something like:

db eval {select id, name from users} {
    if {[info exists name]} {
        puts "user \"$id\" is named \"$name\""
    } else {
        puts "user \"$id\" has no name!"
    }
}

Or if the slightly less controversial Perl-undef-like null gets implemented:

db eval {select id, name from users} {
    if {[info isnull name]} {
        puts "user \"$id\" has no name!"
    } else {
        puts "user \"$id\" is named \"$name\""
    }
}

The difference here is that "existence" never decays. Trying to use a non-existing variable always results in an error. On the other hand "nullness"/"undefness" decays silently and would not generate errors when used as data.

NEM: AMG, let's leave aside implementation options. The greater point is that NULL is unnecessary.

AMG: slebetman, I find your comparison between arrays and null to be very apt, and I appreciate it. It brings some much needed perspective to this discussion.

 % array set a {1 2 3 4}
 % puts $a
 can't read "a": variable is array

This is very similar to what I had in mind for null-valued variables. Trying to read them directly will cause the script to 'splode, so they need to be handled with care. Treat them like nitroglycerin. Sounds awful, doesn't it? That's the nature of null; if it's known that a value can possibly be null (i.e. unknown), never assume it's always non-null. This is as much a mistake in SQL as it is in Tcl+null. Or C, for that matter, when it comes to dereferencing pointers. I consider C's behavior concerning NULL pointers to be exactly right--- crash immediately upon dereferencing.

 % set b [null]
 % puts $b
 can't convert null value to a string

But don't think this landmine behavior is the ultimate goal; it's just the only sensible way to handle a script's attempt to query the value of something whose value isn't known. It's like how Tcl raises an error when a nonexistent variable is read, whereas FORTRAN (at least some versions of it) will silently create the variable on the spot and report garbage for its contents.

Also don't think that introducing null would be like introducing barbed wire to the open prairie, signaling the end of freedom and turning Tcl into the same minefield that C becomes when pointers are overused. Because of Tcl's excellent behind-the-scenes memory management, Tcl null would be needed in far fewer places than C NULL. Its prime use would be interfacing with systems that do use null, but it's also good for representing unknowns within a script.

Using a variable's existence to represent its nullity is a very good approximation, one that I think would work just fine if added to the several Tcl<->SQL interfaces we have. It works great when the response data is communicated back to the script via variables, arrays, or dicts, and I'd like to see this implemented. However, it does not work when the data is stored in a list indexed by position, which is the default mode of operation for sqlite's eval subcommand. This is one advantage of true null over variable existence; nulls can be stored in lists.
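A minimal sketch of the difference between the two storage styles, using plain Tcl data rather than a real database interface. The row contents and the describe helper are hypothetical, made up for illustration only.

```tcl
# Hypothetical result rows: each row is a dict, and a NULL column is
# represented by leaving its key out of the dict.
set rows {
    {id 1 name Alice email alice@example.com}
    {id 2 name Bob}
}

# Hypothetical helper: describe one column of one row, using key
# existence as the test for nullity.
proc describe {row column} {
    if {[dict exists $row $column]} {
        return "$column = [dict get $row $column]"
    }
    return "$column is NULL"
}

foreach row $rows {
    puts [describe $row email]
}

# The same data flattened into one positional list. With the missing
# column simply dropped, the list has 5 elements instead of 6 and the
# row boundaries can no longer be recovered by position.
set flat {1 Alice alice@example.com 2 Bob}
puts [llength $flat]
```

With keyed storage the absence itself carries the information; with positional storage something has to occupy the slot, which is where the pressure for a null value comes from.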

Here's a non-SQL example of using existence to track whether or not a value is specified. I have written many scripts that work like this.

 foreach filename $filenames {
     set chan [open $filename]
     unset -nocomplain password
     while {[gets $chan line] != -1} {
         if {[regexp {^password=(.*)} $line _ password]} {
             break
         }
     }
     if {[info exists password]} {
         puts "file \"$filename\" contains password \"$password\""
     } else {
         puts "file \"$filename\" does not contain a password"
     }
     close $chan
 }

A null-ized version of the above would replace unset -nocomplain password with set password [null] and [info exists password] with ![null $password].

What's the difference? In the null version, the variable's existence isn't being questioned; instead the question is about whether or not its value is known. This is a fine point, perhaps a point too fine to be worth worrying about. But it is the primary difference.

NEM, slebetman has pointed out that we already have null in the form of [info exists], and there are many places where this idiom is used (at least in my code, heh). This shows that null (the concept) is not unnecessary. Whether explicit support for null needs to be added to Tcl is a separate matter. At this point the only advantages I see that explicit null support has over variable existence are list storage, null propagation (e.g. tri-state logic), and keeping a variable's existence orthogonal to its value. All three are (of course!) highly debatable. The first can be dodged by using dicts instead. The second may be a mistake and might better be implemented in pure script. The third is just a quibble for purists.

NEM: You seem to have moved the goalposts. Are we discussing being able to represent the concept of missing/unknown data, or are we specifically discussing adding a special NULL value-which-isn't-a-string to Tcl? If you are merely concerned with the former, then yes that is an important concept, but one which is amply provided for by Tcl. (I mentioned [info exists] right back at the top of this page!) My point is that a NULL value is entirely unnecessary, precisely because Tcl can already deal with all of these situations. As you say, list storage can be achieved by using dicts, but also by using either the Maybe monad that I showed, or one of several other encodings (search up for my Command Name/Arguments example, showing three separate safe encodings of NULL).
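For readers who skipped the earlier part of the page, here is a minimal sketch of one tagged encoding that can live inside an ordinary list. It is not necessarily the exact encoding NEM showed above; the command names just, nothing, isNothing and fromJust are assumptions chosen for illustration.

```tcl
# Hypothetical constructors: every value is wrapped in a small list
# whose first element is a tag.
proc just {value} { list just $value }
proc nothing {}   { list nothing }

# Hypothetical accessors.
proc isNothing {m} { expr {[lindex $m 0] eq "nothing"} }
proc fromJust {m} {
    if {[isNothing $m]} { error "no value present" }
    lindex $m 1
}

# A positional row with a missing second field. Note that the third
# field is a present-but-empty string, which stays distinguishable
# from "nothing".
set row [list [just foo] [nothing] [just {}]]

foreach field $row {
    if {[isNothing $field]} {
        puts "field is missing"
    } else {
        puts "field = \"[fromJust $field]\""
    }
}
```

Everything here is an ordinary string, so the row can be printed, stored, and passed around like any other list.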

To try one final time to convince you that NULL (the value) isn't a good idea, what would you expect the following to print in your scheme?

 set a [null]
 set b [list foo $a bar]
 puts "b(2) = [lindex $b 2]"
 catch {puts "b = $b"}
 puts "b(2) = [lindex $b 2]"

You have said that NULL can be stored in a list. However, you have also said that NULLs propagate: a string that contains a NULL is itself considered NULL. These two views are incompatible -- in Tcl a list is a string. So either a NULL cannot be put into a list (without the entire list becoming NULL), or NULLs do not propagate.
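The "a list is a string" premise can be checked in any current tclsh. The following is a small sketch using only standard commands, with the empty string standing in for the element in question:

```tcl
# A list built with [list] and a string typed by hand are the same value.
set b [list foo {} bar]
set s "foo {} bar"
puts [string equal $b $s]   ;# 1

# List commands work on the hand-written string...
puts [lindex $s 2]          ;# bar

# ...and string substitution works on the list.
set c "$b baz"
puts [llength $c]           ;# 4
```

Any value that can sit inside a list therefore has to survive a round trip through the list's string representation.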

slebetman: Indeed, according to AMG's own definition, a list containing a null would necessarily erase all other data in the list, since he defines concatenating a null with data (a string) as resulting in null. The goalposts haven't really been moved since what is being asked is exactly what TIP 185 proposes. What we're seeing here is that people are starting to realise that there are two different semantics for null which are probably incompatible with each other. Especially since AMG wants data null to behave like NaN and not decay to a string rep.

Like I said before, the more we discuss this the more it seems to me that the various database APIs in Tcl are flawed and people want to modify Tcl instead to handle this flaw. People tried their best to frame it in general terms and avoid mentioning SQL but in the end it turns out that what is really being asked is compatibility with existing SQL API.

I do think it is useful to distinguish a variable that doesn't exist and a variable that has no value. I just can't get my head around data that has no value.

DKF: It's been said before, but here it is again. Tcl has exactly two natural representations of a NULL value. One is a variable that is unset (or list item that is not present, or dict mapping that is not present; the key concept here is of absence), and the other is the empty string.

AMG: NEM and slebetman, it's not my definition that a list containing null would entirely collapse to null. I was quoting the TIP: "A null combined with any nonnull is null. Appending a null to a string, or substituting a null into a string nulls the entire string." The TIP also gives examples of non-null lists that contain null elements: "This string represents a list whose second element is a list containing a null." Perhaps this is contradictory, perhaps the propagation of nulls isn't being defined correctly, or perhaps the first quote is making an overgeneralization.

Regarding "goalposts", we're discussing both the concept of null and how to implement it in Tcl. We have established that the concept of null is useful, that Tcl mostly* supports it, and that existing Tcl scripts use it. DKF names two implementations, and TIP 185 is another. Tagged data and other encodings are also valid implementations. We have also established that TIP 185 is flawed as written because it says that concatenating a null with a string results in null but also contains language describing lists containing null elements. As NEM pointed out, strings are lists are strings! While it may be possible to implement both of these attributes of TIP 185, it will create a script-visible distinction between strings and lists. This is certainly the wrong way, as the below code demonstrates.

(*) By "mostly" I meant (past tense) that Tcl doesn't support lists containing nulls, but then I remembered that tagged data can be used in lists. So this means the deficiency isn't in Tcl but rather in the SQL interfaces for failing to offer unambiguous null representations; this is precisely what slebetman is saying. Also see below for more interesting developments on this matter.

Addressing NEM's code question: I expect the code to print "bar" twice. The script inside the catch will fail because no string representation exists for a list containing a null. But the list itself is not null, only certain elements, so it can be probed and manipulated using the list commands. However, this leads to several sticking points:

 % set data [list happy happy [null] joy]  ;# Success, but tclsh can't print [set]'s return value.
 % lappend data joy                        ;# Success, but again the result can't be printed.
 % lindex $data 0                          ;# Success.
 happy                                     ;# Result is as expected.
 % lindex $data 2                          ;# Success, but the return value isn't printable because it's null.
 % llength $data                           ;# Success; the list's length is known even if element #2 isn't.
 4                                         ;#
 % set data [list {*}$data joy!!]          ;# Success, but result isn't printable.

 % set data [list happy happy [null] joy]  ;# Reinitialize.
 % set data [concat $data [list joy]]      ;# Success, since all arguments to [concat] are pure lists.
                                           ;# This works just like the [lappend] and [list {*}...] lines.

 % set data [list happy happy [null] joy]  ;# Reinitialize.
 % set data "$data joy"                    ;# Failure!!! because string representation isn't available.

 % set data [list happy happy [null] joy]  ;# Reinitialize.
 % set data [concat $data joy]             ;# Failure!!! because [concat] is forced to fall back on string reps.

The four above methods of appending to a list should all work the same, yet they don't. That's a serious problem, I think. This means that even if Tcl gets explicit support for nulls, it cannot support lists containing null. In sqlite eval's default mode (return results as a list), nulls cannot be used. This leaves its other two response methods, by-array and by-variables, both of which already support null via existence testing. The conclusion is that explicit null support in Tcl does not benefit sqlite or similar SQL interfaces.

That's all, folks. After this point it no longer makes sense to talk about TIP 185 or any other such modification to Tcl itself. We should coalesce all the concept discussion, viable null implementations, and sugary wrappers; all this extraneous discussion can go into a separate page.

Thank you everyone for working tirelessly to get this through my thick skull. :^)


LV: Conversations on this topic have arisen on the TCT mailing list as TIP 185 is being called for a vote. One author, Twylite, summarizes the arguments on that list as:

I think this is a summary of the proposed approaches (from everyone):
(1) A "NULL" value (TIP #185) - a Tcl_Obj that explodes when touched.
(2) Tcl_NullObj - a distinguished/well-known value with an appropriate
string rep (possibly empty) that can be treated as a string but also
explicitly tested for nullity.  Breaks EIAS.
(3) Explicitly tagged values / Maybe monad / pairs / cons / object with
isNull & toString / tuples - treat the value as a pair
(type/state/isnull, value), allowing explicit handling of nullness.
Various interfaces possible including OO.
(4) Unset variable - typically a dict or an array that is missing a
particular name/key to indicate nullness
(5) Nil object - an object that throws an exception when accessed,
requires an API that returns objects rather than strings.
(6) NullObject pattern - an "empty" object with appropriate behaviour
for the given context (and therefore context-specific).
(7) Sparse lists - lists with "holes" but which can have a string rep.
Breaks EIAS.

(That this breaks EIAS was later shown to be false, but just that something doesn't break EIAS does not mean that it is also feasible.)

(8) API-specific approach to ask if a particular value/variable is null
(typically assumes that the API translates null into some appropriate
preset or configurable string, similar in that respect to NullObject).

Any corrections/omissions?

Observations:
- Anything that breaks EIAS should be rejected unless there is a clear
use case and no EIAS-compatible alternative - this should rule out (1)
NULL, (2) Tcl_NullObj and (7) Sparse-lists.
- Since EIAS, and TclOO is not "design[ed] ... to be *that* fast" [DKF],
requiring an interface to return objects which must (almost always) be
turned into strings would appear to be the wrong direction - this places
(5) Nil and (to some degree) (6) NullObject under suspicion.
- The remaining options (3) Maybe/pairs, (4) unset-var, and (8)
API-specific all require the API user to check explicitly for nullness,
and impose similar requirements on code structure (assuming you are not
adopting the Maybe monad throughout your source tree).  In each case you
must retrieve from a data set each component that is of interest and
test to see if it is null/empty before attempting to extract or use the
component's value.
Each of these options has pros and cons:
* Maybe/pairs can make it more difficult to handle data when you are
reasonably sure there are no nulls (worst case: loop over each element
extracting the value and building an ordered list of values)
* unset-var (e.g. dict with missing key) will cause exceptions if you
attempt to access the "null" field accidentally, but could also silently
lead to bad data if it is assumed there are no nulls (e.g. you use [dict
values] to get an ordered list of values).
* API-specific approach is more likely to cause a programmer to fail to
handle null when it should have been handled, because the API translates
null into some context-appropriate string and the explicit test for
nullity may be forgotten.
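The unset-var hazard in Twylite's summary can be reproduced with a few lines of standard Tcl. This is only a sketch; the column names and the row are made up for illustration.

```tcl
# Hypothetical row in which the "name" column is null, i.e. its key
# is simply absent from the dict.
set row [dict create id 2 email bob@example.com]

# Accidental access to the missing field fails loudly...
if {[catch {dict get $row name} msg]} {
    puts "error: $msg"
}

# ...but [dict values] silently yields a shorter list, so positional
# unpacking shifts every later column one place to the left.
lassign [dict values $row] id name email
puts "id=$id name=$name email=$email"
```

After the lassign, name holds the e-mail address and email is the empty string, which is exactly the kind of silent bad data the summary warns about.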

DKF: Expanding on the "TclOO is not that fast" comment, I note that TclOO is a way of constructing commands with encapsulated state, not a replacement value system for Tcl. Of course, that's how almost everyone uses objects in Tcl anyway.


slebetman: FB makes the comment:

  • If everything is a string, everything has the same data type.
    - FB: This is inexact even in Tcl. EIAS means that everything has the same representation, however the data types may differ. Tcl uses so-called duck typing (http://en.wikipedia.org/wiki/Duck_typing), meaning that the data type is implicit to the operation (hence shimmering). Case in point: list vs. dict. Consequently, the following points don't hold.

The sentiment unfortunately confuses Tcl the language with Tcl the implementation. From the language point of view the first point is (mostly) true. To be precise, Tcl has three native "datatypes": strings, arrays and functions. So, not counting functions and arrays, everything in Tcl has the same datatype. When one talks about representation one is really talking about the implementation, the internals of Tcl. From the point of view of the language as documented, a list is merely a specially formatted string and a dict is merely a specially formatted list. It is this that conflicts with the concept of null: the language, not the implementation.

Granted, the reasoning above could be better phrased. The incompatibility of null with Tcl is not because it requires its own type. Instead it is because Tcl requires everything to have a string representation: all values are merely different types of strings (the unfortunate implementation of arrays notwithstanding). And null does not want to be a string.

DKF: Tcl has named and unnamed entities. Named entities are commands, variables, namespaces, interpreters, channels, etc. Unnamed entities are values (including the names of named entities). The fundamental datatype of values is that of a string (implemented as a Tcl_Obj because Tcl_Value was taken for something else that's now obsolete); all other value datatypes (numbers, lists, dicts, etc.) are effectively subtypes of string; the implementations might be a bit complex, but that's the principle. That's why, for example, the C API has Tcl_GetString which operates on all values.

Nulls represent values that are not "values". In a reference-based language, they're relatively sane. In a language like Tcl where values are literal absolutes, they're completely crazy. (They are easy to do in the space of named entities, either by making a variable unset or through using metadata.)

AM 2008-06-23: During a refreshing, but windy, bicycle ride to work I thought of two ways of dealing with nulls within the constraints of current Tcl. I can not pretend to have followed the discussion (I have not even read this page in full yet), but I do know that "nulls" can be used for many things in the world of databases - such as: the value is simply not known, the value is not known yet, the value is of no relevance in this case, the value we got is completely unreliable (reconstructing from an article I read many years ago :)). All represented by a single value that is not even a value.

The ways I thought of are these:

  • Represent a null by an undefined variable:
     foreach var $varlist column $columnlist {
         if {[hasValue $column]} {
             set $var [columnValue $column]
         } else {
             # -nocomplain: the variable may not exist yet
             unset -nocomplain $var
         }
     }
  • Represent a null by a read/write trace: the read trace throws an error whenever you try to use the value of the variable, whereas the write trace will delete the read trace, when the variable gets a perfectly ordinary value. Auxiliary procs: hasValue, setNull or the like.

(But I am the first to admit that these ideas are not new or not practical or that they do not solve the issue at hand in any way)
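The trace idea in the second bullet can be sketched in pure Tcl as follows. The proc names setNull, hasValue, nullReadTrap and nullWriteClear are assumptions made for illustration (AM only names hasValue and setNull as possible auxiliary procs), and this sketch handles scalar variables only.

```tcl
# Read trace: any attempt to read a "null" variable raises an error.
proc nullReadTrap {name1 name2 op} {
    error "value is null"
}

# Write trace: assigning an ordinary value removes both traces, so
# the variable behaves normally again afterwards.
proc nullWriteClear {name1 name2 op} {
    upvar 1 $name1 var
    trace remove variable var read  nullReadTrap
    trace remove variable var write nullWriteClear
}

# Mark a scalar variable as null.
proc setNull {varName} {
    upvar 1 $varName var
    set var {}   ;# fires and clears any old traces, then we re-arm
    trace add variable var read  nullReadTrap
    trace add variable var write nullWriteClear
}

# Report 0 if the variable is currently marked null, 1 otherwise.
# Uses [trace info] so that the check itself does not read the variable.
proc hasValue {varName} {
    upvar 1 $varName var
    foreach t [trace info variable var] {
        if {[lindex $t 1] eq "nullReadTrap"} { return 0 }
    }
    return 1
}

setNull x
puts [hasValue x]                 ;# 0
puts [catch {string length $x}]   ;# 1, the read trace raised an error
set x 5
puts [hasValue x]                 ;# 1
puts $x                           ;# 5
```

Note that the nullness lives on the variable, not on the value: it cannot be copied into another variable or stored in a list, so this approach has the same limits as the unset-variable representation.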


xk2600: I think I have come to this page hundreds of times over the years, and while I won't pretend that I am as apt as others in this arena, the two opposing ends of the arguments that seem to echo the loudest in my short term memory are:

  1. In Tcl EIAS, and null|NULL|NUL are not, therefore there is no need for them in Tcl. More importantly, keep it simple: the implementations that would require this functionality can be accommodated through means involving info exists, an empty string, out-of-band tracking, defining a script-local value, etc.
  2. The rest of the world has null|NULL|NUL, and Tcl is primarily used to interface to languages/implementations that the script writer has no control over, making for complexities in integration.

So I believe the various cases are:

  1. null: A non-string/non-value object -- simply 'unknown'
  2. NULL: (void*)0, the value assumed by a C pointer when it doesn't point to anything in particular.
  3. NUL: (char)0, an ASCII value used by C to terminate a string.

I fail to see the difference between case 1 and 2. They both represent effectively the same thing in Tcl: a defined variable which has no value. Specifically in Tcl, the underpinnings don't currently support this natively. As Everything is a String, and a string is a value, a variable which holds no value doesn't currently work. Case 3, Tcl already fully supports through utilization of binary format, binary scan, Unicode escape sequences, etc. After all, \0, \000, \uffff, etc. are all in fact characters which can be stored in a string.
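A short sketch of case 3, using only standard Tcl commands, to show that NUL is an ordinary character in a Tcl string:

```tcl
# A NUL in the middle of a string does not terminate it.
set s "a\0b"
puts [string length $s]            ;# 3

# The different escape spellings denote the same character.
puts [string equal "\0" "\u0000"]  ;# 1

# binary scan exposes the NUL as a plain byte value.
binary scan $s c* codes
puts $codes                        ;# 97 0 98

# binary format pads with NULs when asked for a fixed width.
set padded [binary format a3 x]
puts [string length $padded]       ;# 3
```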

However if the facility existed to set a Tcl_Obj via script to a non-value and the representation of that non-value's string result be functionally the same as today while extending visibility of that non-value in ways Tcl does not provide today, it seems feasible that EIAS can still be true.

There are several ways in which Tcl does this today. The biggest one is in the ability for a Tcl_Obj to 'shimmer'. We say 'Everything is a String' but what we really mean is, everything has a string representation and that representation is always in sync with the underlying data type that is hidden behind the interpreter in libtcl's C implementation.

The second way this is done is, as AMG noted, how the result of a procedure provides acknowledgment or indication of state. For example, the result of gets allows one to test for -1. This is actually a way to indicate an exceptional condition (typically that the end of the channel has been reached) without requiring a catch. Granted this can also be done with eof. We also see similar return values when searching strings with string first or when using string compare. I assume the reason has a lot to do with Tcl's roots in C, as it's strikingly similar to the read() and strcmp() functions, wherein the number that is returned indicates the number of chars pulled from a channel/socket, a lack of index in the string, or how the left side orders relative to the right. In this scenario -1 means "unknown, empty, or simply not found".
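A small sketch of these in-band sentinel results, restricted to the string commands so it runs without any channel setup:

```tcl
# string first: -1 means "not found"; any other result is an index.
puts [string first z "hello"]    ;# -1
puts [string first l "hello"]    ;# 2

# string compare: here -1 is not "not found" but "sorts before".
puts [string compare abc abd]    ;# -1
puts [string compare abc abc]    ;# 0

# Typical use: the caller has to know the convention and test for it.
set idx [string first = "key=value"]
if {$idx >= 0} {
    puts [string range "key=value" 0 [expr {$idx - 1}]]   ;# key
}
```

The meaning of -1 depends entirely on which command returned it, which is why such sentinels only work as a per-command convention and not as a general null.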

So from my perspective, given EIAS, we need the ability to assign a non-value to a variable without destroying the variable, to test whether a defined variable in fact has a value or a non-value, and possibly to add behavior to commands which should act differently when dealing with a variable that has no value ... my suggestion would be:

  • Define in libtcl a single static instance of Tcl_StringObj named Tcl_NonValueObj which will be referenced by all nonvalue variables in the Tcl interpreter.
  • Implement [nonvalue ?value?] command, which:
    • when called without the value argument returns a reference to Tcl_NonValueObj.
    • when called with a value, evaluates the reference to the value to determine if it is bound to Tcl_NonValueObj and returns 1, otherwise returning 0.
  • Modify [string is] implementation to return 0 for any variable referencing Tcl_NonValueObj.
  • Modify [expr] to resolve !=, ==, <, >, <=, >= Tcl_NonValueObj to always be boolean false.
  • Modify Tcl_GetBooleanFromObj to return false when referencing Tcl_NonValueObj.

As has been done with Integer and Double, with the oversight of those that have come before us we should as a community decide what the String representation of a non-value should look like, because there will be scenarios in which serialization of a non-value variable will be needed by Tcl's userbase at large.

I'm not married to the following, but my gut tells me a non-value should be an empty string. This works when the value is referenced by a scalar, an array key, or a dict key.

It is also worth noting that since Tcl_NonValueObj is effectively a single struct referenced by variables without values assigned, the string representation is universally defined. This could also allow configuration or out-of-band programmability of what the string representation should be, through constructs similar to trace, unknown, or tcl_prompt1. I'm okay with the serialization of null into an empty string because the ability to store or transmit a lack of something (null) should be dealt with from implementation to implementation. The only way I can imagine one would universally serialize lack of value or null is to insert exactly nothing where the lack of value existed. So for the most part, in the interpreter when dealing with real Tcl_Obj objects you could know the value of something, but once it leaves memory and is written to disk or transmitted, it should be the sole responsibility of the developer to handle the exceptional case of 'lack of value' in one of the many ways that have been used historically, be it the use of special characters as in Case #3 or some other form of general debauchery.

This keeps Tcl functionally the same as it has always been, while providing the facilities that have been asked for for going on 15 years. From my perspective, this does not step outside of EIAS and provides a very simple way to extend the functionality to accommodate the requests.

It seems like I run into this issue a lot more recently, as I have been implementing a lot of Tcl extensions which bring systems lib functionality directly into Tcl. And while I haven't found anything I can't work around, the lack of non-valued references really does make you have to think in depth on how your userbase will interpret the various methods for representing 'lack of value'.

I have nearly written the above implementation about 5 times now, only to remind myself that I don't want to have to distribute and support a custom Tcl binary. I would be happy to provide a patch should the Tcl core team and the community at large be comfortable with the concept.