Updated April 5, 2023
Introduction to XML reserved characters
XML reserved characters are characters that have a special meaning in XML markup and therefore cannot appear literally in ordinary character data or attribute values. XML represents a document as a tree of tagged elements, so the parser interprets these characters as markup rather than as data. To include them in a string inside an XML file, for example when editing a query or embedding SQL code, each one is replaced with a predefined entity reference or a numeric character reference, or the text is wrapped in a CDATA section. Because tags and attributes use these characters as delimiters, they must not appear unescaped inside element content or attribute values.
Syntax
Reserved characters in attribute data are written using their replacement entities. The syntax is shown below:
Using the &quot; entity inside a double-quoted attribute
attributeName="This is &quot;made&quot;"
Using apostrophes (single quotes) as the attribute delimiter
attributeName='She replied "OK"'
To use a greater-than sign in XML, we write its symbolic name, &gt;. The examples section shows how to use an ampersand within content data.
Working with XML can be challenging: an unescaped ampersand or less-than sign in content causes the XML parser to fail or to give incorrect results. In the second syntax example above, the attribute value is delimited with single quotes, so the double quotes inside it need no escaping.
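As a rough illustration of the quoting rules above, the following Python sketch builds attribute strings with the standard library's xml.sax.saxutils module. The attribute name attributeName is taken from the syntax above; the helper names make_attribute and make_double_quoted are hypothetical and exist only for this example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr


def make_attribute(name: str, value: str) -> str:
    """Build name=value and let quoteattr choose the quote delimiter."""
    return f"{name}={quoteattr(value)}"


def make_double_quoted(name: str, value: str) -> str:
    """Always delimit with double quotes, replacing embedded ones with &quot;."""
    escaped = escape(value, {'"': "&quot;"})
    return f'{name}="{escaped}"'


# Embedded double quotes become &quot; inside a double-quoted attribute.
print(make_double_quoted("attributeName", 'This is "made"'))

# quoteattr switches to single quotes when the value contains double quotes.
print(make_attribute("attributeName", 'She replied "OK"'))

# The parser turns the entity back into the original character.
element = ET.fromstring("<e " + make_double_quoted("attributeName", 'This is "made"') + "/>")
print(element.get("attributeName"))
```

Both spellings carry the same attribute value once parsed; the choice of delimiter only changes which quote character needs escaping.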
How to use reserved characters in XML?
Symbols such as the opening angle bracket (<) and the ampersand (&) are reserved for XML markup: every element tag begins with ‘<’, and every entity reference begins with ‘&’. When the XML parser encounters ‘<’, it assumes that a new tag is about to start. Because these meta-characters delimit markup, they are written as entity references when they are meant as data. An entity reference stands for a piece of data in the XML document instead of the literal character, and it can also be written as a numeric character reference. Unescaped reserved characters make a document fail well-formedness checks, although some XML editors substitute the literals automatically. The five predefined entities are listed in the table below. The same characters also need encoding in other contexts, such as URLs and string-handling routines, although each context has its own escaping rules.
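The parser behaviour described above can be checked with a short Python sketch. It uses the standard library's xml.etree.ElementTree parser as a stand-in for any XML parser; the helper name is_well_formed and the note element are made up for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if the standard-library parser accepts xml_text."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


print(is_well_formed("<note>Tom & Jerry</note>"))      # False: raw ampersand
print(is_well_formed("<note>1 < 2</note>"))            # False: raw less-than
print(is_well_formed("<note>Tom &amp; Jerry</note>"))  # True: escaped ampersand
print(is_well_formed("<note>2 > 1</note>"))            # True: raw greater-than is tolerated in content
```

The last case shows that a lone greater-than sign does not break parsing in element content, but escaping it as &gt; keeps documents consistent and avoids the forbidden sequence ]]> in text.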
Sno | Entities | Description |
1 | &lt; | Specifies the less-than sign ‘<’ in character data. This character starts element markup. |
2 | &gt; | Specifies the greater-than sign ‘>’ in character data. This character ends a tag. |
3 | &amp; | Specifies the ampersand ‘&’ in character data. This character starts an entity reference. |
4 | &apos; | Specifies the apostrophe (') in character data. |
5 | &quot; | Specifies the double quotation mark (") in character data. |
These entities express the reserved characters without ambiguity, since the characters themselves are used to delimit tags in XML. To avoid confusing data with tags, the simple solution is to escape the characters so that the parser treats them as data instead of markup. The entities listed above can be used in attribute values and in element text. Because text with many entities is difficult to read, a CDATA section can be used instead. The XML processor resolves each entity reference back to its character when it parses the source, so the references must be typed carefully.
An advantage of XML syntax is that only a few characters are reserved for the language itself and need to be escaped.
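The five predefined entities can be expressed as a small lookup table. The Python sketch below is one possible way to apply that table in a single pass; the names PREDEFINED and escape_reserved are hypothetical, and the standard library's ElementTree parser is used only to confirm that the escaped text reads back unchanged.

```python
import xml.etree.ElementTree as ET

# The five predefined entities, keyed by the character each one stands for.
PREDEFINED = {
    "<": "&lt;",
    ">": "&gt;",
    "&": "&amp;",
    "'": "&apos;",
    '"': "&quot;",
}
_TABLE = str.maketrans(PREDEFINED)


def escape_reserved(text: str) -> str:
    """Replace every reserved character with its entity in one pass."""
    return text.translate(_TABLE)


original = "Tom & Jerry's <\"quoted\"> text"
escaped = escape_reserved(original)
print(escaped)

# Parsing the escaped text gives back the original string.
print(ET.fromstring("<t>" + escaped + "</t>").text == original)  # True

# Numeric character references are an equivalent spelling: &#60; is '<', &#x26; is '&'.
print(ET.fromstring("<t>&#60; &#x26; &lt;</t>").text)  # < & <
```

Translating in a single pass avoids the ordering problem of chained replacements, where an ampersand introduced by an earlier substitution could be escaped a second time.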
Let’s see a sample XML document passed to an XML query function:
select xmlextract("/",
"<x atr='&gt; &amp;'> &gt; &amp; </x>")
In the output, the attribute value is delimited with double quotes instead of apostrophes:
<x atr="&gt; &amp;"> &gt; &amp; </x>
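The same round trip can be approximated outside a database. The sketch below is not the xmlextract function; it assumes Python's standard xml.etree.ElementTree module as a substitute parser and serializer, and it reuses the sample element from the query above.

```python
import xml.etree.ElementTree as ET

source = "<x atr='&gt; &amp;'> &gt; &amp; </x>"
elem = ET.fromstring(source)

# After parsing, the entity references have become the literal characters.
print(repr(elem.get("atr")))  # '> &'
print(repr(elem.text))        # ' > & '

# Serializing escapes the reserved characters again and quotes attributes
# with double quotes, regardless of the delimiter used in the source.
serialized = ET.tostring(elem, encoding="unicode")
print(serialized)
```

The parsed values contain the real characters, while the serialized form is safe to embed in another XML document.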
Examples
In this section, we show how to handle reserved characters in different programming languages by escaping them. Let’s get started.
Example #1
Using a CDATA section for a mathematical expression
res.xml
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title> Reserved Characters in Mathematical functions. </title>
</head>
<body>
<h1><b> This test Shows how to display reserved characters in a text using XML </b></h1>
<div id ="sample1">
<![CDATA[
Let's take an expression of a formula " if T(m,n) is m n < t && m > t "
]]>
</div>
</body>
</html>
Explanation
The code above places a mathematical expression containing <, > and && inside a CDATA section, so the parser treats these characters as literal text rather than as markup. The rendered output displays the expression exactly as it was written.
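To see how a parser treats a CDATA section, here is a small Python sketch using the standard xml.etree.ElementTree module. The div element and its id come from the example above; the helper wrap_cdata is hypothetical and shows one common way to handle text that itself contains the terminator ]]>.

```python
import xml.etree.ElementTree as ET

doc = '<div id="sample1"><![CDATA[ if m < t && m > t ]]></div>'
div = ET.fromstring(doc)

# The reserved characters arrive as plain text; no entity references were needed.
print(repr(div.text))  # ' if m < t && m > t '

# CDATA is only an input spelling: when serialized, the text is escaped with entities.
serialized = ET.tostring(div, encoding="unicode")
print(serialized)


def wrap_cdata(text: str) -> str:
    """Wrap text in CDATA, splitting any ']]>' so it cannot end the section early."""
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"


tricky = "a ]]> b < c"
print(ET.fromstring("<d>" + wrap_cdata(tricky) + "</d>").text == tricky)  # True
```

A CDATA section therefore trades entity references for a single restriction: the sequence ]]> cannot appear inside it without being split across two sections.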
Example #2
XML using C#:
using System;
using System.Text.RegularExpressions;

class Res
{
    // Placeholder for the '<' of genuine markup. U+0001 is not a legal
    // XML character, so it cannot clash with the input text.
    const char Marker = '\u0001';

    static void Main()
    {
        // The stray '<' inside <Compose> makes this document ill-formed.
        string xmlre = @"<?xml version='1.0' encoding='ISO-8859-1'?>
<mail>
<to_mail>Shayen</to_mail>
<from>Janvi</from>
<Subject>Official</Subject>
<Compose>hello~ < 3</Compose>
<Text>This is an official mail for the project!<br /></Text>
</mail>";

        // Protect the XML declaration and comments.
        xmlre = xmlre.Replace("<?", Marker + "?").Replace("<!", Marker + "!");

        // For every closing tag </name>, protect it and its opening tag <name ...>.
        string regE = @"</(\w+)>";
        foreach (Match x in Regex.Matches(xmlre, regE))
        {
            string val1 = x.Groups[1].Value;
            xmlre = Regex.Replace(xmlre, "<" + val1 + "(?=[ >])", Marker + val1);
            xmlre = xmlre.Replace("</" + val1 + ">", Marker + "/" + val1 + ">");
        }

        // Protect self-closing tags such as <br />.
        string regE3 = @"<(\w+\s?/>)";
        xmlre = Regex.Replace(xmlre, regE3, Marker + "$1");

        // Every '<' still present is data: escape it, then restore the markup.
        xmlre = xmlre.Replace("<", "&lt;").Replace(Marker, '<');

        Console.WriteLine(xmlre);
        Console.ReadKey();
    }
}
Explanation
The code above uses regular expressions in C# to separate genuine tags from stray reserved characters and to replace the latter with entity references. The program then prints the resulting XML document to the console.
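For comparison, a simpler heuristic can be sketched in Python. This is a different technique from the marker-based C# code: it assumes that a ‘<’ is genuine markup only when it is followed by a letter, underscore, ‘/’, ‘?’ or ‘!’, and escapes every other ‘<’. The function name escape_stray_lt is hypothetical, and the heuristic will miss stray characters that happen to be followed by a letter (for example a<b).

```python
import re
import xml.etree.ElementTree as ET

# A '<' is treated as markup only if a name character, '/', '?' or '!' follows it.
STRAY_LT = re.compile(r"<(?![A-Za-z_/?!])")


def escape_stray_lt(xml_text: str) -> str:
    """Replace each '<' that does not look like the start of a tag with &lt;."""
    return STRAY_LT.sub("&lt;", xml_text)


broken = "<mail><Compose>hello~ < 3</Compose><Text>Official<br /></Text></mail>"
fixed = escape_stray_lt(broken)
print(fixed)

# The repaired text now parses, and the data reads back with a literal '<'.
mail = ET.fromstring(fixed)
print(mail.find("Compose").text)  # hello~ < 3
```

Repairing ill-formed input this way is a fallback; escaping the data at the point where the XML is generated is more reliable.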
Example #3
Using Java to escape a string
Xmlres.java
package demo;

import java.io.File;
import org.apache.commons.io.FileUtils;
import org.apache.commons.lang.StringEscapeUtils;

public class Xmlres {
    public static void main(String[] args) throws Exception {
        // Read the whole input file into a string (Apache Commons IO).
        String ss1 = FileUtils.readFileToString(new File("fligh.txt"), "UTF-8");
        // Replace <, >, &, ' and " with the predefined XML entities (Apache Commons Lang 2.x).
        String res = StringEscapeUtils.escapeXml(ss1);
        System.out.println(res);
    }
}
fligh.txt
<sample>
Here is a welcome greet "hello" that I'd wish to "celebrate" for my parents
& husband in my native Language: kya. thum? atcha.
</sample>
Explanation
This Java example shows how to escape the reserved characters in a text file using StringEscapeUtils.escapeXml from Apache Commons Lang. The escaped text is printed to the console.
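The same file-based flow can be sketched with only the Python standard library. This is an illustrative equivalent, not the Apache Commons implementation: it assumes xml.sax.saxutils.escape with the two quote entities passed explicitly, writes the sample text to a temporary copy of fligh.txt, and uses a hypothetical helper named escape_file.

```python
import tempfile
from pathlib import Path
from xml.sax.saxutils import escape

SAMPLE = '''<sample>
Here is a welcome greet "hello" that I'd wish to "celebrate" for my parents
& husband in my native Language: kya. thum? atcha.
</sample>
'''


def escape_file(path: Path) -> str:
    """Read a text file and escape the five reserved characters."""
    text = path.read_text(encoding="utf-8")
    # escape() handles &, < and > itself; the quote entities are supplied as extras.
    return escape(text, {'"': "&quot;", "'": "&apos;"})


# Write the sample to a temporary file so the sketch is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "fligh.txt"
    source.write_text(SAMPLE, encoding="utf-8")
    escaped = escape_file(source)

print(escaped)
```

After escaping, the text contains no literal reserved characters and can be placed inside an element or an attribute value.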
Conclusion
This article has covered the reserved characters of XML, which correspond to the five predefined entities. The examples showed how to escape these characters, or wrap them in a CDATA section, so that they are treated as data. Reserved characters can be escaped with library routines in languages such as C# and Java. So, when creating XML from any input source that contains these reserved characters, care should be taken to escape them, because a parser will reject a document in which they appear unescaped. The so-called special characters have a special meaning in XML.