Learn XML Programming

What are attributes?

When creating DTDs, you sometimes find yourself using attributes instead of elements to create usage rules for your XML documents.  Attributes are a lot like elements, and are declared in an attribute list (as mentioned previously).  There is a lot of debate about when attributes should be used and when elements should be used instead.  As a basic rule, though, you should keep your actual content (text and other types of data) in elements, and use attributes to handle metadata, that is, the information about the text or real data that’s contained within the elements.

Let’s look at our sample attribute list from before (shown here in outline, with the attribute types left out for now):

<!ATTLIST cat_characteristics
    hair_length #IMPLIED
    hair_color #REQUIRED
    name #REQUIRED
>

As you can see, this list defines various attributes of the element cat_characteristics.  The attributes given are hair_length, hair_color, and name.  Two of these are required (#REQUIRED), and one is implied (#IMPLIED), which means it is optional and has no default value.  What all of this means is that when a validating parser reads your document and finds an element named cat_characteristics, it checks that element’s attributes against this list.  As long as it finds the two required attributes, it can go on to the next bit of the document with no problem, whether or not hair_length is present.  If a required attribute is missing, the document is not valid, and the parser reports an error.
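The check described above can be sketched in a few lines of Python.  This is only an illustration, not real DTD validation: Python’s standard library does not validate against a DTD, so the REQUIRED set and the missing_required helper below are hypothetical stand-ins for the rules in the attribute list.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule table that mirrors the attribute list:
# hair_color and name are required, hair_length is optional.
REQUIRED = {"hair_color", "name"}


def missing_required(xml_text):
    """Return the required attributes that the root element lacks, sorted."""
    element = ET.fromstring(xml_text)
    return sorted(REQUIRED - set(element.attrib))


complete = '<cat_characteristics hair_color="black" name="Tom"/>'
incomplete = '<cat_characteristics hair_length="short" name="Tom"/>'

print(missing_required(complete))    # []
print(missing_required(incomplete))  # ['hair_color']
```

The first element passes even though hair_length is absent, while the second would be rejected by a validating parser because hair_color is missing.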

Of course, you could do things a bit differently if you wish.  There’s technically nothing wrong with storing your data in attributes and using elements mainly to provide structure.  Unfortunately, this can get confusing in the long run.

Much like elements, attributes come with their own set of jargon.  In addition to #REQUIRED and #IMPLIED, you can use #FIXED (which means that the attribute has one definite value that can’t be changed) or supply a default value in quotes, and you give each attribute a type using one of a set of keywords.  Note that all of these keywords are case-sensitive and must be written in uppercase.  The list of attribute types is:

CDATA The value is character data
(en1|en2|..) The value must be one from an enumerated list
ID The value is a unique id
IDREF The value is the id of another element
IDREFS The value is a whitespace-separated list of other ids
NMTOKEN The value is a valid XML name token
NMTOKENS The value is a whitespace-separated list of valid XML name tokens
ENTITY The value is the name of an unparsed entity
ENTITIES The value is a list of unparsed entity names
NOTATION The value is a name of a notation
xml: Not a type but a reserved name prefix; attributes such as xml:lang and xml:space have meanings predefined by XML
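To see how #FIXED and default values behave in practice, here is a small sketch in Python.  It relies on the expat parser behind the standard library’s ElementTree, which fills in attribute defaults declared in an internal DTD subset even though it does not validate.  The species attribute and the "unknown" default are made up for this example.

```python
import xml.etree.ElementTree as ET

# Internal DTD subset with a #FIXED attribute, a default value,
# and a #REQUIRED attribute. "species" and "unknown" are illustrative only.
DOC = """<!DOCTYPE cat_characteristics [
  <!ELEMENT cat_characteristics EMPTY>
  <!ATTLIST cat_characteristics
    species    CDATA #FIXED "cat"
    hair_color CDATA "unknown"
    name       CDATA #REQUIRED
  >
]>
<cat_characteristics name="Tom"/>"""

element = ET.fromstring(DOC)

# Only "name" is written in the document; the parser supplies the other two.
for attribute in ("name", "species", "hair_color"):
    print(attribute, "=", element.get(attribute))
```

Because the parser is non-validating, it would not complain if name were left out; catching that takes a validating parser.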

An example of the format for using these keywords is as follows:

<!ATTLIST cat_characteristics
    hair_length CDATA #IMPLIED
    hair_color CDATA #REQUIRED
    name IDREF #REQUIRED
>

As you can see, both hair_length and hair_color are declared as character data, whereas name is a reference to the ID of another element: its value must match an attribute of type ID somewhere else in the same document.
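The IDREF rule can be sketched as a simple lookup.  This is a hand-rolled approximation in Python rather than real validation, and it assumes a hypothetical owner element whose id attribute has been declared with type ID; neither appears in the attribute list above.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: <pets>, <owner> and its "id" attribute are assumptions.
# We pretend "id" is declared as type ID and "name" as type IDREF.
DOC = """<pets>
  <owner id="Tom"/>
  <cat_characteristics hair_color="black" name="Tom"/>
  <cat_characteristics hair_color="white" name="Felix"/>
</pets>"""


def dangling_references(xml_text):
    """Return every name value that does not match any id in the document."""
    root = ET.fromstring(xml_text)
    known_ids = {el.get("id") for el in root.iter() if el.get("id") is not None}
    return [
        el.get("name")
        for el in root.iter("cat_characteristics")
        if el.get("name") not in known_ids
    ]


print(dangling_references(DOC))  # ['Felix']
```

A validating parser would report the second cat_characteristics element as an error, since no element carries the id Felix.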

What are the limitations of DTDs?

DTDs are not the be-all and end-all of working with XML, however.  Powerful as they may be, DTDs have their limitations, like most anything in the programming world.  Some of these limitations are minor, but in the right circumstances they can create some major headaches.

First of all, a document has to carry a reference to its DTD.  This isn’t much of a problem with small files that have the DTD built in, but larger documents with an external DTD need a DOCTYPE declaration with either a SYSTEM or a PUBLIC identifier pointing at it.  Unfortunately, dynamically generated pages (such as those produced with XSL, for example XSL-FO output) can have problems finding and following these references.
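As a rough illustration of the two kinds of reference, the Python sketch below reads the identifiers back out of a DOCTYPE declaration with the standard library’s minidom.  The file name cats.dtd, the public identifier, and the example.com URL are all invented for the example, and minidom only records the identifiers; it does not fetch or apply the external DTD.

```python
from xml.dom import minidom

# Both documents point at a hypothetical external DTD.
SYSTEM_DOC = '<!DOCTYPE cats SYSTEM "cats.dtd"><cats/>'
PUBLIC_DOC = (
    '<!DOCTYPE cats PUBLIC "-//Example//DTD Cats 1.0//EN" '
    '"http://example.com/cats.dtd"><cats/>'
)


def external_ids(xml_text):
    """Return the (public identifier, system identifier) pair from the DOCTYPE."""
    doctype = minidom.parseString(xml_text).doctype
    return doctype.publicId, doctype.systemId


print(external_ids(SYSTEM_DOC))
print(external_ids(PUBLIC_DOC))
```

A SYSTEM reference carries only a location, while a PUBLIC reference adds a formal name that a tool may be able to resolve without following the location at all.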

Secondly, DTDs don’t work well with namespaces, because a DTD isn’t what we like to call “namespace aware”: it matches element and attribute names exactly as they are written, prefix included, and doesn’t know what the namespaces mean.  Larger documents that make extensive use of namespaces can run into problems with DTDs.
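The following Python sketch shows the difference.  The namespace URI and the prefixes are made up, and raw_root_name is a deliberately crude helper that just pulls the literal name out of the markup to mimic what a DTD declaration would be matched against.

```python
import xml.etree.ElementTree as ET

# Same (hypothetical) namespace, two different prefixes.
DOC_A = '<a:cat xmlns:a="http://example.com/pets"/>'
DOC_B = '<b:cat xmlns:b="http://example.com/pets"/>'


def raw_root_name(xml_text):
    """Literal root name as written in the markup (crude, for illustration only)."""
    return xml_text.lstrip("<").split()[0]


# A namespace-aware parser resolves both prefixes to the same expanded name.
tag_a = ET.fromstring(DOC_A).tag
tag_b = ET.fromstring(DOC_B).tag
print(tag_a == tag_b)  # True

# A DTD only sees the literal names, which differ.
print(raw_root_name(DOC_A), raw_root_name(DOC_B))  # a:cat b:cat
```

A DTD that declares a:cat would therefore reject the second document, even though namespace-aware software treats the two elements as the same thing.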

Another drawback involves mixed-content models.  A DTD can’t constrain the order or number of the elements grouped together in mixed content, so they may legally appear in any order.  Many of the programs that process such documents aren’t as liberal, though, and can become confused when they encounter elements out of the order they expect.

Also, DTDs assign fixed characteristics to elements based solely upon their name, so an element with a given name follows the same rules wherever it appears.  Unfortunately, there are many instances where there’s more to the content of an element or attribute than its name, and the DTD just doesn’t see it.

Of course, there are other limitations to DTDs; these are just a few of the most common.  This isn’t to say that DTDs shouldn’t be used, however.  They can be quite useful and powerful tools when creating XML documents.  You just need to keep in mind that no system of programming is perfect, especially not one that can be used in so many different ways.