Learn XML Programming

Section I: An Introduction to XML

Welcome to the fast-paced and exciting world of XML programming!  Surely you don’t want to waste any time with unneeded things such as code or syntax, so… oh, wait, maybe you do want to cover syntax and code.  Pretty fortunate, really, considering that XML is code and syntax… but more on that in a minute.

So what is XML, anyway?

XML is one of the latest in a long line of languages that have been used to describe documents and data on the World Wide Web.  The letters “XML” stand for “eXtensible Markup Language”… and yes, it should technically be “EML” instead of “XML”, but the “X” makes it look a bit more cutting edge, so it stuck.  But what does “Extensible Markup Language” mean?

First, let’s define the term “Markup Language”, which is the “ML” in many of the common (and some of the uncommon) languages that you’ll find used with the Internet.  At its most basic, a Markup Language is simply a type of code that tells a program how to structure and display information.  The programs that are being told how to display the information are known as “Browsers,” and they really do need to be told how to display information or they just start shoving as much information into as small a space as possible.

Here’s an example of how a Markup Language helps to display information.  Let’s say that you want to tell the world about your two cats, Tooter and Shade.  You type three sentences in a file, about like this:

Tooter is a fat, fuzzy black cat.

Shade is a slightly thinner short-haired gray cat.

Both of the cats are spoiled rotten.

When you load the file into a browser, though, what you see looks more like this:

Tooter is a fat, fuzzy black cat.Shade is a slightly thinner short-haired gray cat.Both of the cats are spoiled rotten.

The browser didn’t know how to format the information, so it just shoved it all together.  Instead of simply listing the information, try putting in a few extra characters, known as tags.

<p>Tooter is a fat, fuzzy black cat.</p>

<p>Shade is a slightly thinner short-haired gray cat.</p>

<p>Both of the cats are spoiled rotten.</p>

When you load this new file into your browser, you’ll end up with the original result that you intended.  The tags you added tell the browser that each line should be a paragraph of its own, and should have a blank line after it.

The basic language used to create most web pages (and the one we used in the example about Tooter and Shade) is known as HTML, or the “HyperText Markup Language.”  Though it sounds like something out of a bad 1950s science fiction movie, all that HyperText means is text in an electronic form that can link to other documents.  HTML uses text documents to tell the browser exactly how it should display the various files that the HTML document references.  These files can be images, other HTML documents, or even movies or music files.  (Oh, and for you trivia buffs out there… HTML was invented around 1990 by Tim Berners-Lee, a researcher at the CERN physics laboratory who was looking for a way to cross-reference text and information through hypertext links on the early Internet.)

Unfortunately, HTML has its limitations.  These limitations have been greatly eased by the introduction of other languages and technologies, such as JavaScript, DHTML (Dynamic HTML), CSS (Cascading Style Sheets, which you’ll hear more about later), and ASP (Active Server Pages).  Even with all of these expansions, though, there’s still only so much that you can do with the basic tags and code of HTML.  That’s where XML comes in.

XML, like HTML before it, is a Markup Language… but it’s also so much more.  It’s the “X” that makes it special; XML isn’t limited to a fixed set of tags, and can be expanded with custom tags to describe things that would sometimes take several lines of code in HTML (if they could be expressed at all).

Alright, XML is an advanced Markup Language.  But what does it look like?

Though it may sound complicated, XML actually shares a lot in common with HTML.  In fact, there’s even a language called XHTML, which was created to bridge the gaps between the two and make converting to XML easier for those who’d been using HTML for years… it still has some of the limitations of HTML, but the fact that it exists shows that there is common ground between the two.

The first part of an XML document is the XML declaration, which tells the program reading the file (such as your browser) that it’s going to be looking at XML code and which version of XML the code is written in.  If you’re using XML version 1.0, then your declaration is going to look like this:

<?xml version="1.0"?>

This will go at the very top of your page, and will be followed by the rest of the page information.  One rule to keep in mind: an XML document must have exactly one outermost tag (the “root”) that contains everything else.  If the page that you created previously about Tooter and Shade were an extremely basic XML document, with a made-up root tag named <page>, then the code for it would look a little something like the following:

<?xml version="1.0"?>

<page>

<p>Tooter is a fat, fuzzy black cat.</p>

<p>Shade is a slightly thinner short-haired gray cat.</p>

<p>Both of the cats are spoiled rotten.</p>

</page>

Of course, you wouldn’t be using XML for something as simple as this, but it helps to give you an idea of the sort of thing that you’ll be looking at.  A lot of people, upon hearing the word “code,” think that they’ll be looking at a bunch of numbers and letters that no one but programmers can make heads or tails of.  While some web page code does look like that, the majority of it is a lot closer to the Tooter and Shade example.

You said that XML isn’t limited by its tags, and has custom tags.  What do you mean?

The main benefit of XML is the fact that it allows for the creation of custom tags within the code itself.  Whereas HTML and other languages are more or less stuck with the limitations of their tags and commands, XML is able to move beyond those limitations.  While it’s true that creative HTML users are able to skirt around some of the hang-ups of the language, the problems are still there.  In XML, the problem points can pretty much be taken away by the addition of new code to the page, which defines what the browser should do when it encounters the problem.

Of course, another benefit of custom tags is that large amounts of code can be condensed into a single tag, thus saving a lot of potential problem points.  As an example, let’s say that you wanted to make the names of your cats appear large, and bold, and red.  To do this in HTML, you’d need to add tags like the following every time you came to a cat’s name:

<b><font size="7" color="red">

Then you’d type Tooter or Shade’s name, and have to close all of your tags:

</font></b>

That’s a lot of typing just for a single word!  Using XML, though, you can create a custom tag called <cats>, and use a style sheet to attach both the bolding instructions and the font size and color to it.  Now, when you’re typing and come to Tooter’s name, you can put this:

<cats>Tooter</cats>

Obviously, this is a lot simpler, and greatly reduces the chance of mistakes made by typos or cats wandering across your keyboard.
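
One way to give <cats> its formatting is with an XSLT style sheet, which is itself written in XML.  The sketch below is only an illustration, not the one true method.  It assumes that the page is wrapped in a made-up root tag named <page> and that paragraphs use <p>, and it uses size 7 because that is the largest value the HTML font tag accepts.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Wrap the whole page (assumed root tag: <page>) in a minimal HTML shell. -->
  <xsl:template match="/page">
    <html>
      <body>
        <xsl:apply-templates/>
      </body>
    </html>
  </xsl:template>

  <!-- Each <p> in the XML page becomes an HTML paragraph. -->
  <xsl:template match="p">
    <p><xsl:apply-templates/></p>
  </xsl:template>

  <!-- Every <cats> tag becomes large, bold, red text. -->
  <xsl:template match="cats">
    <b><font size="7" color="red"><xsl:apply-templates/></font></b>
  </xsl:template>

</xsl:stylesheet>
```

Saved under a hypothetical name such as cats.xsl, the style sheet can be attached to the page by adding the line <?xml-stylesheet type="text/xsl" href="cats.xsl"?> directly beneath the XML declaration.  The formatting now lives in one place, so changing red to blue means editing a single line instead of every cat’s name.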

There are other benefits to XML, as well.  Increased use of Style Sheets, Document Type Definitions, and the integration of elements from other languages all help to make XML one of the most powerful and adaptive markup languages ever created.

Why is XML so adaptable?

If XML is a new generation, then SGML is its mother.  SGML stands for the Standard Generalized Markup Language, which was actually conceived in the 1960s and 1970s.  SGML is likely one of the most adaptable languages of all time, allowing the use of constructs that even XML won’t allow.  Unfortunately, SGML is more complex and not as universally supported as XML, so the use of SGML instead of XML isn’t really recommended.

XML has inherited many of the key features of SGML, however, and puts them to good use; in many cases, the ways that it differs from its predecessor are pretty much inconsequential.  While you may occasionally run across strange circumstances that would work out better with SGML, it’s best to focus on XML since that’s where most of the support and interest lies.

Ok, so what can I use XML for?

It might be a bit easier to list what you can’t use it for!  In addition to making simple web pages about your cats, you can use XML to create more complex applications such as online databases, custom-built pages, and more.  By combining XML with Style Sheets and dynamic elements, you can even create a storefront for online shopping!  The possibilities are nearly endless.
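
To give a feel for the “database” side of things, here is a small sketch of XML used to hold records rather than paragraphs.  Every tag and attribute name in it (<pets>, <cat>, <name>, <coat>, <color>, and id) is invented for this example; XML itself doesn’t define any of them.

```xml
<?xml version="1.0"?>
<!-- A tiny "database" of pets. All tag and attribute names are made up. -->
<pets>
  <cat id="1">
    <name>Tooter</name>
    <coat>fuzzy</coat>
    <color>black</color>
  </cat>
  <cat id="2">
    <name>Shade</name>
    <coat>short-haired</coat>
    <color>gray</color>
  </cat>
</pets>
```

Because each fact sits in its own tag, a program can pull out just the names or just the colors without having to read full sentences.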

Of course, by adding Web Services into the mix, things become even more interesting…

Web Services?  What are those?

Glad you asked.  XML web services are a relatively new development, designed to become the fundamental building block in distributed computing on the Internet.  In other words, web services allow business and personal users to interface online without the need for a third-party program.

Imagine having an application running on your computer.  You enter information into that application, while a business partner in another part of the country does the same in an application they’re working with.  Instead of you having to use an Instant Messenger or some other third-party program so that you can communicate with your partner (even simple e-mail is considered a third party in this example), the applications that the two of you are using share the information directly, providing both of you with the results.  Now imagine that the applications that the two of you are using are simply built into a web page.  That’s the integration of web services with XML.

When it comes down to it, web services are simply a new way for users and programs to interface with applications.  A web services application uses an XML messaging system to make itself available over the Internet, and because the core of the communication is XML, it isn’t limited to talking with the systems that it would normally be compatible with.  Windows can communicate with Unix, and Java can send messages to Perl.  Web services use XML as a sort of “universal translator.”
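
As a rough illustration, a weather service might answer a request with a message like the one below.  The tag names, the city, and the readings are all invented for this sketch; real web services each publish their own message format, often wrapped in a standard envelope such as SOAP.

```xml
<?xml version="1.0"?>
<!-- Hypothetical reply from a weather web service; every name and value here is made up. -->
<weatherReport>
  <location>Springfield</location>
  <temperature units="F">72</temperature>
  <conditions>Partly cloudy</conditions>
</weatherReport>
```

Nothing in the message depends on the operating system or programming language at either end, which is what lets a Windows program and a Unix program read the same reply.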

Ok, so what are some common web services?

Web services are still a relatively new technology, with new web services being developed by programmers every day.  Some of the web services that have already been created, though, are nothing short of amazing.  Examples of these web services are:

  • News syndication services that present the most recent headlines to users
  • Stock market analysis, with up-to-date tickers of the ups and downs of the market
  • Weather reports giving current information for the user’s area
  • Shipping systems, giving up-to-date tracking information
  • Traffic reports for different localities
  • Interactive sites offering product troubleshooting
  • Up-to-date currency exchanges
  • Applications that need to be used by large numbers of users, some of whom are behind firewalls

Because web services are still a growing technology, advances that seem minor at first stand a chance of becoming ground-breaking work.