This topic very briefly introduces XSLT. It covers:
• An Introduction
• A listing of all Basic XSLT Functionality
• Basic XSLT Algorithm
Introduction
XSLT (Extensible Stylesheet Language Transformations) sets out the rules for transforming an XML document into HTML, plain text or another XML document. The basic capabilities of XSLT include:
• Generating constant text
• Filtering out content - only displaying information that is relevant to the reader
• Changing tree ordering - remember XML is a hierarchical data definition in which the nodes can be reorganized
• Duplicating nodes
• Sorting nodes
• Any programmatic interpretation and processing of the input data
Fig. 1. XSLT Processing
In the simplest terms, the XSLT processor uses the XSLT stylesheet to interpret the input XML file and produce a specific output file. The following example shows how the data contained in the original file is modified to produce the final XHTML file. Code samples sourced from the posting on Wikipedia at http://en.wikipedia.org/wiki/XSLT (accessed September 2005).
Example XSLT
h1 { padding: 10px; padding-width: 100%; background-color: silver }
td, th { width: 40%; border: 1px solid silver; padding: 10px }
td:first-child, th:first-child { width: 20% }
table { width: 650px }
The following host names are currently in use at
<!--'Used by' column-->
Input XML File
www World Wide Web site
java Java info
www World Wide Web site
validator web developers who want to get it right
Output XHTML File
h1 { padding: 10px; padding-width: 100%; background-color: silver }
td, th { width: 40%; border: 1px solid silver; padding: 10px }
td:first-child, th:first-child { width: 20% }
table { width: 650px }
Sun Microsystems Inc.
The following host names are currently in use at
sun.com
The World Wide Web Consortium
The following host names are currently in use at
w3.org
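To make the relationship between the three files concrete, here is a minimal sketch in the same spirit as the sample above. It is not the Wikipedia listing: the source element names (`domains`, `domain`, `host`, `use`), the `name` and `ownedBy` attributes, the page title and the column headings are all assumptions chosen for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!--
  Assumed input (hypothetical structure):
  <domains>
    <domain name="sun.com" ownedBy="Sun Microsystems Inc.">
      <host name="www"><use>World Wide Web site</use></host>
      <host name="java"><use>Java info</use></host>
    </domain>
  </domains>
-->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.w3.org/1999/xhtml">

  <xsl:output method="xml" indent="yes" encoding="UTF-8"/>

  <!-- Root rule: build the page skeleton, then process each domain. -->
  <xsl:template match="/">
    <html>
      <head><title>Host names</title></head>
      <body>
        <xsl:apply-templates select="domains/domain"/>
      </body>
    </html>
  </xsl:template>

  <!-- One heading, one sentence and one table per domain. -->
  <xsl:template match="domain">
    <h1><xsl:value-of select="@ownedBy"/></h1>
    <p>The following host names are currently in use at
      <strong><xsl:value-of select="@name"/></strong></p>
    <table>
      <tr><th>Host name</th><th>Used by</th></tr>
      <xsl:apply-templates select="host"/>
    </table>
  </xsl:template>

  <!-- One table row per host. -->
  <xsl:template match="host">
    <tr>
      <td><xsl:value-of select="@name"/></td>
      <td><xsl:value-of select="use"/></td>
    </tr>
  </xsl:template>

</xsl:stylesheet>
```

The literal XHTML elements are constant output, while each `xsl:value-of` pulls a value from the source tree. Reordering or filtering the output is a matter of changing the `select` expressions.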
Basic XSLT Functionality
XSLT functionality is accessed using:
• XSLT elements
• XPath Node Axes
• XSL Functions
The tables listed below have been extracted almost exactly as presented in:
Benz, B.; Durant, J.R.; XML Programming Bible; Wiley Publishing Incorporated; Indianapolis, USA; 2003
XSLT Elements
The table below lists all the elements available to XSLT stylesheet developers:
W3C XSLT Elements
Element Description
stylesheet Defines the root element of a stylesheet. Can be used interchangeably with transform
transform Defines the root element of a stylesheet. A synonym for stylesheet; the two can be used interchangeably
output Defines the format of the output document. The xml, html and text methods are defined. If no method is specified, the XSLT processor defaults to xml unless the root element of the result is html, in which case html is used. Various options can be specified, including encoding, document indentation, etc. (Must be a child of the stylesheet element)
namespace-alias Replaces a namespace used in the stylesheet with a different namespace in the output node tree. (Must be a child of the stylesheet element)
preserve-space Defines whitespace preservation for elements. (Must be a child of the stylesheet element)
strip-space Defines whitespace removal for elements. (Must be a child of the stylesheet element)
key Declares a named key: an index over the nodes matched by a pattern, which the key() function uses to look nodes up. (Must be a child of the stylesheet element)
import Imports an external stylesheet into the current stylesheet. The current stylesheet takes precedence if there is any conflict. (Must be a child of the stylesheet element and must precede all its other children)
apply-imports Processes the current node using template rules from imported stylesheets that the current template rule has overridden. It has no select attribute
include Includes an external stylesheet in the current one. Included rules have the same precedence as those of the including stylesheet, so conflicts are resolved by the usual template priority rules. (Must be a child of the stylesheet element)
template Defines a template rule. Optional attributes select nodes by match pattern, give the template a name, set its processing priority, and assign a mode (a QName) so that the same nodes can be processed in more than one way
apply-templates Applies the best matching template to each child of the current node, or to a specified node set using the optional select attribute. Parameters can be passed using the with-param element
call-template Calls a template by name. Parameters can be passed using the with-param element. Results can be assigned to a variable
param Defines a parameter and default value in a stylesheet template. A global parameter can be defined as a child of the stylesheet element
with-param Passes a parameter value to a template when call-template or apply-templates is used
variable Defines a variable in a template. A global variable can be defined as a child of the stylesheet element
copy Copies the current node and its namespace nodes only, without its attributes or children. Output matches the type of the current node (element, attribute, text, processing instruction, comment or namespace)
copy-of Copies the nodes selected by the required select attribute, together with their namespaces, descendant nodes and attributes
if Conditionally instantiates its content if the test attribute expression evaluates to true
choose Makes a choice based on multiple options. Used with when and otherwise
when Defines a tested action for the choose element. (Must be a child of the choose element)
otherwise Defines the default action for the choose element. (Must be a child of the choose element)
for-each Iteratively processes each node in a node-set defined by an XPath expression
sort Defines a sort key. Used as a child of apply-templates or for-each to specify the order in which the nodes of a node-set are processed
element Adds an element to the output node tree. The details of the element can be set using the name, namespace and use-attribute-sets attributes
attribute Adds an attribute to the output node tree. (Must be a child of an element)
attribute-set Defines a named set of attributes that can be applied to output elements through use-attribute-sets. (Must be a child of the stylesheet element)
text Adds text to the output node tree
value-of Retrieves a string value of a node and writes it to the output tree
decimal-format Specifies the format of numeric characters and symbols when converting to strings. Used with the format-number function only
number Inserts a formatted number into the output node tree. By default the number is the position of the current node among its siblings of the same kind and name; the value attribute supplies an expression instead, and the format attribute controls the number format
fallback Defines alternatives for instructions that the current XSL processor does not support
message Sends a message to the XSLT processor's message output rather than to the output node tree. This element can also optionally stop processing of a stylesheet with the terminate attribute. Mostly used for debugging
processing-instruction Adds a processing instruction to the output node tree
comment Adds a comment to the output node tree
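Several of these elements are usually combined in one stylesheet. The following is a minimal sketch, assuming a hypothetical input whose root element `catalog` contains `book` elements with `title` and `price` children; the parameter name `limit` and the template name `describe` are also invented for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Global parameter with a default value; the caller may override it. -->
  <xsl:param name="limit" select="20"/>

  <xsl:template match="/catalog">
    <!-- Local variable holding the number of book children. -->
    <xsl:variable name="total" select="count(book)"/>
    <xsl:text>Books: </xsl:text>
    <xsl:value-of select="$total"/>
    <xsl:text>&#10;</xsl:text>

    <!-- Iterate in title order rather than document order. -->
    <xsl:for-each select="book">
      <xsl:sort select="title"/>
      <xsl:call-template name="describe">
        <xsl:with-param name="price" select="price"/>
      </xsl:call-template>
    </xsl:for-each>
  </xsl:template>

  <!-- Named template: prints the title and a price band.
       call-template does not change the current node, so "title"
       is still evaluated relative to the current book. -->
  <xsl:template name="describe">
    <xsl:param name="price" select="0"/>
    <xsl:value-of select="title"/>
    <xsl:text>: </xsl:text>
    <xsl:choose>
      <xsl:when test="$price &gt; $limit">expensive</xsl:when>
      <xsl:otherwise>affordable</xsl:otherwise>
    </xsl:choose>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

Using `xsl:text` for every literal keeps whitespace under control, since whitespace-only text in the stylesheet is otherwise stripped.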
XPath Node Axes
XPath node axes facilitate the navigation of the source node tree. They are centered on the current node and radiate out to locate parents, ancestors, children, descendants and siblings.
XPath Node Axes
Axis Description
self The current node.
XPath Location Operator: . - the current node
ancestor Parents and other predecessor nodes (not including the current node)
ancestor-or-self The current node and all predecessor nodes
attribute Attributes of the current node.
XPath Location Operator: @ - attribute identifier
child Children (direct descendants) of the current node.
XPath Location Operator: * - all child element nodes
descendant Children and all nodes nested below them (not including the current node)
descendant-or-self The current node and all nodes nested below it.
XPath Location Operator: // - short for /descendant-or-self::node()/, so it reaches all descendants
following All nodes that come after the current node in document order, excluding its descendants, attribute nodes and namespace nodes
following-sibling All later siblings (children of the same parent node as the current node) in document order. Descendants of those siblings are not included
namespace The namespace nodes of the current node, one for each namespace in scope on it
parent The parent (the immediate predecessor) node of the current node
XPath Location Operator: .. - the parent node
XPath Location Operator: / - the root node
preceding All nodes that come before the current node in document order, excluding its ancestors, attribute nodes and namespace nodes
preceding-sibling All earlier siblings (children of the same parent node as the current node) in document order. Descendants of those siblings are not included
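The axes are easiest to compare when they are all evaluated from the same current node. The sketch below assumes a hypothetical input in which `chapter` elements (with a `title` attribute) contain `section` elements (with an `id` attribute) that in turn contain `para` elements; none of these names come from the tables above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Visit every section in the document, at any depth. -->
  <xsl:template match="/">
    <xsl:apply-templates select="//section"/>
  </xsl:template>

  <!-- Each expression below is evaluated with the section as current node. -->
  <xsl:template match="section">
    <!-- attribute axis, abbreviated as @ -->
    <xsl:value-of select="@id"/>
    <!-- parent axis, restricted to a chapter parent -->
    <xsl:text> in chapter </xsl:text>
    <xsl:value-of select="parent::chapter/@title"/>
    <!-- ancestor axis: every element enclosing this section -->
    <xsl:text>, ancestors: </xsl:text>
    <xsl:value-of select="count(ancestor::*)"/>
    <!-- preceding-sibling axis: earlier sections under the same parent -->
    <xsl:text>, earlier siblings: </xsl:text>
    <xsl:value-of select="count(preceding-sibling::section)"/>
    <!-- following axis: later sections anywhere, excluding descendants -->
    <xsl:text>, later sections: </xsl:text>
    <xsl:value-of select="count(following::section)"/>
    <!-- descendant axis: paragraphs at any depth inside this section -->
    <xsl:text>, paragraphs inside: </xsl:text>
    <xsl:value-of select="count(descendant::para)"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```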
XSL Functions
XSL stylesheets support several functions for each data type:
XSL Functions by Data Type
Function Description
Boolean Functions
boolean() Converts an expression to a Boolean data type value and returns true or false
true() Returns Boolean true
false() Returns Boolean false
not() Reverses a Boolean true or false, e.g. not(true expression)=false; not(false expression)=true
Number Functions
number() Converts an expression to a numeric data type value
round() Rounds a value up or down to the nearest integer, e.g. round(45.49)=45; round(45.50)=46
floor() Rounds a number down to the nearest integer, e.g. floor(45.49)=45; floor(45.50)=45
ceiling() Rounds a number up to the nearest integer, e.g. ceiling(45.49)=46; ceiling(45.50)=46
sum() Sums the numeric values in a node set
count() Counts the nodes in a node-set
String Functions
string() Converts an expression to a string data type value
format-number() Converts a numeric value to a string data type value using the decimal-format element values as a guide (if present in the stylesheet)
concat() Converts two or more expressions into a concatenated string data type
string-length() Counts the characters in a string data type value
contains() Checks for a substring in a string. Returns a Boolean true or false
starts-with() Checks for a substring at the beginning of a string. Returns a Boolean true or false
translate() Replaces individual characters in a string: each character found in the second argument is replaced by the character at the same position in the third argument, or removed if there is none
substring() Retrieves a substring from a specified string, starting at a numeric character position and optionally ending after a specified length (number of characters)
substring-after() Retrieves all characters in a string after the first occurrence of a specified substring
substring-before() Retrieves all characters in a string before the first occurrence of a specified substring
normalize-space() Removes leading and trailing whitespace from a string and collapses each internal run of spaces, tabs, new lines or carriage returns into a single space
Node Set Functions
current() Returns the current node in a single node node-set
position() Returns the position of the current node in a node-set
key() Returns the node-set defined by the key element
name() Returns the name of the selected node
local-name() Returns the name of a node without a prefix, if the prefix exists
namespace-uri() Returns the full URI of a node prefix, if the prefix exists
unparsed-entity-uri() Returns the URI of an unparsed entity declared in the source document, given the entity name
id() Returns a node-set with nodes that match the id value
generate-id() Generates a unique string for a selected node in a node-set. The syntax follows well-formed XML rules (see XML Basics)
lang() Returns a Boolean true or false depending on if the xml:lang attribute for the selected node matches the language identifier provided in an argument
last() Returns the position of the last node in the node-set
document() Builds a node tree from an external XML document when provided with a valid document URI
External Object Functions
system-property() Returns information about the processing environment. Useful when building multi-version and multi-platform stylesheets in conjunction with the fallback element
element-available() Returns a Boolean true or false indicating if an XSLT instruction or extension element is supported by the XSLT processor or not
function-available() Returns a Boolean true or false indicating if a function is supported by the XSLT processor or not
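A few of the number and string functions are shown below on literal values, so the stylesheet can be run against any well-formed input document. This is only a sketch; the sample strings are invented, and the comments give the value each call should yield under XPath 1.0 rules.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Number functions: yields 45, 45 and 46 -->
    <xsl:value-of select="round(45.49)"/>
    <xsl:text>&#10;</xsl:text>
    <xsl:value-of select="floor(45.50)"/>
    <xsl:text>&#10;</xsl:text>
    <xsl:value-of select="ceiling(45.49)"/>
    <xsl:text>&#10;</xsl:text>

    <!-- format-number: yields 1,234.50 -->
    <xsl:value-of select="format-number(1234.5, '#,##0.00')"/>
    <xsl:text>&#10;</xsl:text>

    <!-- translate works character by character: yields 2005/09/01 -->
    <xsl:value-of select="translate('2005-09-01', '-', '/')"/>
    <xsl:text>&#10;</xsl:text>

    <!-- substring-before and substring-after split on a substring,
         not on a position: yields www and sun.com -->
    <xsl:value-of select="substring-before('www.sun.com', '.')"/>
    <xsl:text>&#10;</xsl:text>
    <xsl:value-of select="substring-after('www.sun.com', '.')"/>
    <xsl:text>&#10;</xsl:text>

    <!-- substring counts positions from 1: yields valid -->
    <xsl:value-of select="substring('validator', 1, 5)"/>
    <xsl:text>&#10;</xsl:text>

    <!-- normalize-space trims and collapses: yields web developers -->
    <xsl:value-of select="normalize-space('  web   developers ')"/>
    <xsl:text>&#10;</xsl:text>

    <!-- contains returns a Boolean, written out as true -->
    <xsl:value-of select="contains('Java info', 'Java')"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```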
Basic XSLT Algorithm
XSLT declarations define a set of rules and guidelines that are applied during processing according to a predefined algorithm. Each XSLT processor appears to follow these steps:
1. The XSLT stylesheet content is converted into a tree of nodes (according to the XPath model), the stylesheet tree - information from included or imported other files is also interpreted to construct a complete tree
2. The input XML file is interpreted into a data node-tree (according to the XPath model) - the source tree
3. Whitespace-only text nodes are stripped from the stylesheet tree. Those inside xsl:text elements are not removed
4. Strip the whitespace from the source tree (if xsl:strip-space is specified in the stylesheet)
5. Any templates added to the stylesheet are interpreted and the appropriate actions taken on the output tree. Where no template matches, the default template rules are added to provide the default behavior for all node types:
• Root node and element node: processor continues on and processes all child nodes
• Attribute node and text node: processor copies the string value of the node to the result (output) node-tree
• Comment node and processing instruction: Processor takes no action
6. Process the root node of the source tree
7. Serialize the output tree according to the instructions provided in xsl:output
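The default template rules mentioned in step 5 can be written out as ordinary templates. The sketch below follows the built-in rules described in the XSLT 1.0 Recommendation; a processor behaves as if these were present, so a stylesheet containing only them copies the text of the input and nothing else.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Root node and element nodes: carry on and process all children. -->
  <xsl:template match="*|/">
    <xsl:apply-templates/>
  </xsl:template>

  <!-- Text and attribute nodes: copy the string value to the output. -->
  <xsl:template match="text()|@*">
    <xsl:value-of select="."/>
  </xsl:template>

  <!-- Comments and processing instructions: do nothing. -->
  <xsl:template match="processing-instruction()|comment()"/>

</xsl:stylesheet>
```

Attribute values only reach the output if a template explicitly selects attributes, because `xsl:apply-templates` without a `select` attribute processes child nodes, and attributes are not children.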
Nodes are processed in two steps:
1. The best matching template rule for the node is found
2. The contents of the template rule are instantiated
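When several template rules match the same node, the first of the two steps above is settled by import precedence and then by priority. The following sketch relies on the default priorities from the XSLT 1.0 Recommendation and assumes a hypothetical input containing `host` elements with a `name` attribute.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Default priority -0.5: matches any element, so it is the
       fallback. It simply carries on into the child elements. -->
  <xsl:template match="*">
    <xsl:apply-templates select="*"/>
  </xsl:template>

  <!-- Default priority 0: a plain element name beats the wildcard. -->
  <xsl:template match="host">
    <xsl:text>host </xsl:text>
    <xsl:value-of select="@name"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

  <!-- Default priority 0.5: a pattern with a predicate is more
       specific, so it is chosen for the host named www. -->
  <xsl:template match="host[@name='www']">
    <xsl:text>web server&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

For a `host` named `www` all three rules match and the third is instantiated; for any other `host` the second is used; every other element falls through to the first. An explicit `priority` attribute on `xsl:template` overrides these defaults.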