TclXML

TclXML is Explain's XML extension for Tcl.

See Also

tDOM
C-based XML extension for Tcl

Attributes

current release
3.2
contact
Steve Ball

Documentation

Programming XML and Web services in TCL, Part 1: An initial primer, Cameron Laird, 2001-04-01
TclXML: The Next Generation, Steve Ball, Ninth Annual Tcl/Tk Conference, 2002

Description

TclXML covers a range of specifications and tools for processing and manipulating XML documents using Tcl; collectively these tools are known as TclXML. It includes a parser written in Tcl, known as the "native" TclXML parser, whose functionality is equivalent to that of TclExpat (now also available for download). Check the code for some introductory work on an XML DTD parser. A mailing list is available. Binary versions for Mac OS X and Windows are available; see the combo HTML page above.

TclXML v3.2 now includes TclDOM and TclXSLT.

As of September 2001, TclXML has a pure-Tcl XPath parser.

This package is part of the ActiveTcl Batteries Included distribution.

Components

xmlgen / htmlgen
tkxmllint
A GUI for libxml's xmllint and a component of TclXML. Checks XML documents for well-formedness, validity, etc.
tkxsltproc
A GUI for xsltproc. Transforms XML documents using XSL stylesheets. Available from [L1 ]

Examples, Tutorials, etc

Please let SRB know what XML-related tasks you are having trouble understanding. I'd like to start a series of pages that provide a tutorial on using TclXML.


Some people have had problems loading the TclDOM and TclXML packages into Safe interpreters. The problem is that the safe package restricts filenames to 14 characters, and TclXML has tclparser-8.1.tcl as one of its files, which exceeds this limit. Here is the TclSOAP solution, which removes the restriction by overriding the filename check:

proc SOAP::CGI::createInterp {interp path} {
    set slave [safe::interpCreate $interp]
    safe::interpAddToAccessPath $slave $path
    # override the safe restrictions so we can load our
    # packages (actually the xml package files)
    proc ::safe::CheckFileName {slave file} {
        if {![file exists $file]} {error "file non-existent"}
        if {![file readable $file]} {error "file not readable"}
    }
    return $slave
}

This returns a safe interpreter for which interp eval "package require xml" should work. PT


[As people keep turning up asking for tutorials on getting started with TclXML--the most basic things, like retrieval of one value from one tag of a single document--I want to make a point at least of putting a few links here which point to Steve's explanations in the mailing list.]

Dave Griffin offers a sample retrieval of values from a DOM tree:

dmg 2004-08-19 Note: This example seems to have stopped working! Did the getElementsByTagName method become "unimplemented" at some point?

SRB 2004-08-20: The API changed some time ago. getElementsByTagName is a "live-list"; i.e., its return value is the name of a variable that contains the list of matching nodes. This variable must be dereferenced to get the actual node tokens. In addition, the result is a set (list) of nodes so you must handle that, but in this simple case just getting the first node is sufficient. E.g.

set sr [dom::document getElementsByTagName $d SampleRequest]

should be

set sr [lindex [set [dom::document getElementsByTagName $d SampleRequest]] 0]

Sounds complicated? That's because it is. A simpler and better way is to use XPath:

set sr [dom::document selectNode $d /SampleRequest]
set purl [$sr selectNode PostingURL]

package require xml
package require dom

set xmlSrc {
    <?xml version="1.0" encoding="UTF-8"?>

    <SampleRequest>
        <PostingURL url="http://foo.com/some/service/url"/>
        <Password>FooBar</Password>
    </SampleRequest>
}

# First you parse the XML, the result is held in token d.
set xmlSrc [string trim $xmlSrc] ;# v2.6 barfed w/o this
set d [dom::DOMImplementation parse $xmlSrc]

# One way in is to parse it by the assumed structure and
# use the Document interface -style of query.  This code isn't
# flexible at all and only highlights how the dom methods
# can be used.

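# First find the SampleRequest element in the DOCUMENT.
# getElementsByTagName returns the name of a variable holding the list
# of matching nodes, so dereference it and take the first node.
set sr [lindex [set [dom::document getElementsByTagName $d SampleRequest]] 0]

# Next retrieve the two sub-elements in SampleRequest
set purl  [lindex [set [dom::element getElementsByTagName $sr PostingURL]] 0]
set pword [lindex [set [dom::element getElementsByTagName $sr Password]] 0]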

# Now we will retrieve the url attribute of PostingURL
set url [dom::element getAttribute $purl url]
# url == "http://foo.com/some/service/url"
puts "url = $url"

# Finally, we want to retrieve the password.  This is non-obvious.
# The value "FooBar" is actually in a "textNode" child of pword,
# so you have to ferret it out with generic node commands.
set pwordv [dom::node children $pword]
# You could have also used: dom::node cget $pword -firstchild

# dom::node cget $pwordv -nodeType  -> textNode
set password [dom::node cget $pwordv -nodeValue]
# password == "FooBar"
puts "password = $password"

Note: The above example did not work (at least not with my TclXML 3.0). It was missing the second "set" command in order to access the data returned by the dom::document and dom::element instructions.
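For comparison, here is a sketch of the same retrieval using the XPath approach SRB suggests above. It assumes that dom::document selectNode is available in your TclDOM version and returns a list of matching node tokens, as SRB's snippet implies; this has not been checked against every release, so treat it as a starting point rather than a reference.

```tcl
package require xml
package require dom

# Same sample document as above, with no whitespace before the XML declaration.
set xmlSrc {<?xml version="1.0" encoding="UTF-8"?>
<SampleRequest>
    <PostingURL url="http://foo.com/some/service/url"/>
    <Password>FooBar</Password>
</SampleRequest>}

set d [dom::DOMImplementation parse $xmlSrc]

# Assumption: selectNode returns a list of node tokens, so take the
# first match in each case.
set purl  [lindex [dom::document selectNode $d /SampleRequest/PostingURL] 0]
set pword [lindex [dom::document selectNode $d /SampleRequest/Password] 0]

# Attribute lookup is the same as in the example above.
set url [dom::element getAttribute $purl url]

# The password text lives in a text node child of the Password element.
set textNode [lindex [dom::node children $pword] 0]
set password [dom::node cget $textNode -nodeValue]

puts "url = $url"
puts "password = $password"
```

The absolute paths avoid having to hold on to the SampleRequest node at all, which is most of what makes the XPath route shorter.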


Steve's starting to make courseware available through http://www.zveno.com/courses/samples/XML-App-Dev-Tcl/ (this link is broken :-( )

escargo 2005-05-05: As of this date, there is nothing there, perhaps because Steve isn't at zveno any more.

SRB 2005-11-07 - that's right, I'm now at Explain. We'll try and get some of these tutorials available ASAP.


SRB 2005-11-07 - See above. My move from Zveno has resulted in a few pages being dropped. I'm working on getting these restored.


LV 2008-01-15:

A few notes while building tclxml for Tcl 8.5. I obtained the cvs head tclxml tar file from the tcl xchange's netcvs mirror. I extracted the tar so that it was a sibling to tcl 8.5's directory. I then had to run autoconf so that a configure file was available. I typically run configure and make in a subdirectory named after the type of OS I am using (for instance, unix). The configure was successful, but included a warning about tclxml's Makefile.in seeming to ignore the --datarootdir setting. At this point, there were warnings about PACKAGE_NAME and a few other macros being redefined, but the .o files were generated. Next, the makefile built the libTclxmlstub3.1g.a and .so files. Then, I tried to run the test suite. However, the Makefile attempts to use tclsh8.4 for the test suite, but the install directory supplied only has tclsh8.5, so that failed.

I'll submit a bug report on this error.

SRB 2008-12-17 - The v3.2 release has been tested against Tcl 8.5. Things should be much better now.


HaO 2009-01-26:

(please remove this remark when anything changes) The Readme on the web site is quite outdated with regard to the Windows makefile instructions. The script-only install described there (./configure; make install) did not work for me. I did a compiled install on Linux and took the script files from it to make a Windows script-only version. IMHO it would be helpful for a quick start/try to provide a script-only version on the web site.


filker0 2009-12-03:

I'm attempting to use the TclXML package in a tool I'm writing. The XML document is automatically generated by another program (in C), and looks like:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<assets name="base">
    <image name="banner" src="banner.png" class="ui" />
</assets>

The actual file is over 300 lines long, and contains quite a few different element types.


Steve Ball 2009-12-07: That's only a few kilobytes. Why not just read the entire contents of the file into memory and then parse it in one go? Very few applications truly need to stream-parse XML.


My stripped down example code is:

package require xml

set edata(level) 0
set edata(0) {}

proc elementStart {type attrs args} {
    global edata
    puts stdout "<$type> attrs=\{$attrs\}, args=\{$args\}"
    incr edata(level)
    set edata($edata(level)) {}
    return default
}

proc elementEnd {type args} {
    global edata
    puts stdout "</$type> args=\{$args\}, data=\{$edata($edata(level))\}"
    incr edata(level) -1
    return default
}

proc elementData {data} {
    global edata
    lappend edata($edata(level)) $data
    return default
}

proc xmlComment {data} {
    puts stdout "Comment: $data"
    return default
}

The following does the expected thing:

set parser [::xml::parser \
              -elementendcommand elementEnd \
              -elementstartcommand elementStart \
              -characterdatacommand elementData \
              -commentcommand xmlComment \
              -reportempty 1]

set infile [open sample.xml r]
set buffer [read $infile 100000]
$parser parse $buffer
close $infile
$parser free

However, when I try this with a line-by-line parse:

set parser [::xml::parser \
              -elementendcommand elementEnd \
              -elementstartcommand elementStart \
              -characterdatacommand elementData \
              -commentcommand xmlComment \
              -reportempty 1 \
              -final 0]

set infile [open sample.xml r]
while {[gets $infile buffer] >= 0} {
    $parser parse $buffer
}
close $infile
$parser configure -final 1
$parser parse ""
$parser free

It fails to parse even the XML decl on the first line with the error:

unexpectedtext {unexpected text " version="1.0" encoding="UTF-8" standalone="yes"?" in document prolog around line 0}

I'm using ActiveState's latest Tcl8.5 package on WindowsXP. I used "teacup" to get the packages that were not bundled in the free download.

The XML file has Unix style line termination.

Does anyone have insights as to what I'm doing wrong? I want to process this file in an event (SAX-like) manner rather than use DOM or XPath.

Thanks in advance for any help offered.

Steve Ball 2009-12-07: Setting the -final option to 0 (i.e., streaming the parse) was never finished properly. I should just remove the option until such time as it is fully debugged.
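Until streaming works, one workaround is to read the whole file and hand it to the parser in a single call, as suggested earlier on this page. The sketch below is illustrative only: parseWholeFile and countElement are hypothetical helper names, and the UTF-8 channel encoding is an assumption based on the sample file's XML declaration.

```tcl
package require xml

# Hypothetical callback: count every element start tag seen.
proc countElement {name attrs args} {
    global elementCount
    incr elementCount
}

# Hypothetical helper: read the entire file, then parse it in one call,
# so the -final 0 code path is never exercised.
proc parseWholeFile {parser filename} {
    set infile [open $filename r]
    # Assumption: the file is UTF-8, as its XML declaration states.
    fconfigure $infile -encoding utf-8
    set document [read $infile]
    close $infile
    $parser parse $document
}

set elementCount 0
set parser [::xml::parser -elementstartcommand countElement]

# sample.xml is the file name used in the question above.
if {[file exists sample.xml]} {
    parseWholeFile $parser sample.xml
    puts "elements seen: $elementCount"
}
$parser free
```

The callbacks still fire in document order, so the processing remains event-driven (SAX-like); only the reading is done up front.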

dzach 2011-12-1: It might be worthwhile to implement the -final 0 option with a coroutine.


I have downloaded the source package for v3.2. I am just trying to install the pure-Tcl version on MinGW, as I only have a simple XML file to parse, but there seems to be no way to do this. If I try ./configure, the build system looks for xml2-config and fails. Other than through configure and make, is there an easy way to install the Tcl-only version? I see previous versions used a Tcl installer for this.


JohnnyS: A simple and boring question. Why do I always get "method getElementsByTagName not yet implemented" when I run SRB's code?


KCW 2011-1-12:

I'm using the libxml2 parser class in TclXML 3.2 under Tcl 8.4 on cygwin. The parser objects I'm creating don't seem to respond to "configure -startcdatasectioncommand" or "configure -endcdatasectioncommand" calls. That is, the proc name I pass doesn't get set, is not returned when I make a corresponding cget call to the parser object, and the configure call raises TCL_ERROR. For example:

package require xml
proc cmd {} {}
set p [::xml::parser -parser libxml2]
$p configure -startcdatasectioncommand cmd

Leads to script termination. Catch'ing the configure command gives an empty string. This behavior is in contrast to pretty much all of the other parser object options I've tried. I am new to TclXML, so apologies if this is an obvious error on my part. Any feedback greatly appreciated.


HK 2011-12-02 11:52:24:

Can somebody please upload a simple example that reads the tag names (variables) and corresponding values from an XML file? There is not much tutorial help on the web about using TclXML.
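A minimal sketch of one way to do this with the event-based parser, using only the callback options that appear earlier on this page (-elementstartcommand, -elementendcommand, -characterdatacommand). The sample document and the proc names onStart, onData and onEnd are made up for illustration; substitute the contents of your own file.

```tcl
package require xml

# Hypothetical sample document.
set xmlSrc {<?xml version="1.0"?>
<config>
    <host>example.org</host>
    <port>8080</port>
</config>}

set stack {}          ;# names of the currently open elements
array set values {}   ;# element name -> accumulated character data

# Called for each start tag: remember the element and reset its text.
proc onStart {name attrs args} {
    global stack values
    lappend stack $name
    set values($name) ""
}

# Called for character data, possibly several times per element,
# so append rather than set.
proc onData {data} {
    global stack values
    if {[llength $stack]} {
        append values([lindex $stack end]) $data
    }
}

# Called for each end tag: the parent becomes the current element again.
proc onEnd {name args} {
    global stack
    set stack [lrange $stack 0 end-1]
}

set parser [::xml::parser \
              -elementstartcommand onStart \
              -elementendcommand onEnd \
              -characterdatacommand onData]
$parser parse $xmlSrc
$parser free

foreach name [lsort [array names values]] {
    puts "$name = [string trim $values($name)]"
}
```

Keying the array by element name keeps the sketch short, but repeated element names overwrite each other; a real tool would probably key on the full path held in the stack instead.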

Historical

website, circa 2006