XML, or the Extensible Markup Language to give it its full title, is a system- and hardware-independent language for defining data and its structure within an XML document. An XML document is a Unicode text file that contains data together with markup that defines the structure of the data. Because an XML document is a text file, you can create XML using any plain-text editor, although an editor designed for creating and editing XML will make things easier. XML markup looks similar to HTML in that it consists of tags and attributes added to the text in a file. However, the superficial appearance is where the similarity between XML and HTML ends. XML and HTML are profoundly different in purpose and capability.
Although an XML document can be created, read, and understood by a person, XML is primarily for communicating data from one computer to another. XML documents will therefore more typically be generated and processed by computer programs. An XML document defines the structure of the data it contains so a program that receives it can properly interpret it. Thus XML is a tool for transferring information and its organization between computer programs. XML is a language in which you can define new sets of tags and attributes to suit different kinds of data—indeed to suit any kind of data including your particular data. Because XML is extensible, it is often described as a meta-language—a language for defining new languages, in other words. The first step in using XML to exchange data is to define the language that you intend to use for that purpose in XML.
The Java API for XML Processing (JAXP) provides you with the means for reading, creating, and modifying XML documents from within your Java programs. To understand and use this application program interface (API) you need to be reasonably familiar with two basic topics:
- What an XML document is for and what it consists of
- What a DTD is and how it relates to an XML document
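As a first taste of what JAXP offers, here is a minimal sketch that reads a small document into memory and then modifies it. The class name `JaxpSketch`, the helper methods, and the sample document are all invented for illustration. Only the `javax.xml.parsers` and `org.w3c.dom` types come from the standard Java library.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class JaxpSketch {
    // Hypothetical sample document, invented for this sketch.
    static final String SAMPLE =
        "<?xml version=\"1.0\"?><message><subject>Hello</subject></message>";

    // Reads XML text into an in-memory DOM tree using the JAXP factory classes.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            // Checked exceptions are wrapped only to keep the sketch short.
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Returns the first <subject> node; assumes the document contains one.
    static Node subjectOf(Document doc) {
        return doc.getElementsByTagName("subject").item(0);
    }

    public static void main(String[] args) {
        Document doc = parse(SAMPLE);
        Node subject = subjectOf(doc);
        System.out.println("before: " + subject.getTextContent()); // reading
        subject.setTextContent("Hello again");                     // modifying
        System.out.println("after:  " + subject.getTextContent());
    }
}
```

The sketch uses the DOM style of processing, where the whole document is held in memory as a tree. That is only one of the approaches JAXP supports.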
XML snippet (Gmail application data exchange):
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
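<!-- The root element name "email" is an illustrative choice; a well-formed document needs a single root element -->
<email>
  <from>firstname.lastname@example.org</from>
  <subject>ANDROID online training</subject>
  <date>20th Oct</date>
</email>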
An XML document consists of two parts: a PROLOG and a DOCUMENT BODY.
The prolog provides information necessary for the interpretation of the contents of the document body. It contains two optional components, and since you can omit both, the prolog itself is optional. The two components of the prolog, in the sequence in which they must appear, are as follows:
- An XML declaration that defines the version of XML that applies to the document and may also specify the particular Unicode character encoding used in the document and whether the document is standalone or not. Either the character encoding or the standalone specification can be omitted from the XML declaration, but if they do appear, they must be in the given sequence.
- A document type declaration specifying an external Document Type Definition (DTD) that identifies markup declarations for the elements used in the body of the document, or explicit markup declarations, or both.
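To see how the two prolog components surface in a program, the following sketch parses a small document and prints what the parser recorded about its XML declaration and document type declaration. The class name `PrologSketch`, the `note` element, and the sample text are assumptions made for this example. The accessor methods are part of the standard `org.w3c.dom.Document` interface.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class PrologSketch {
    // Hypothetical document: an XML declaration followed by a document type
    // declaration that carries its markup declarations inline.
    static final String SAMPLE =
        "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>\n"
        + "<!DOCTYPE note [ <!ELEMENT note (#PCDATA)> ]>\n"
        + "<note>Remember the milk</note>";

    // Parses the text from a byte stream so the declared encoding applies.
    static Document parse(String xml) {
        try {
            byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(bytes));
        } catch (Exception e) {
            // Checked exceptions are wrapped only to keep the sketch short.
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse(SAMPLE);
        // Values taken from the XML declaration.
        System.out.println("version:    " + doc.getXmlVersion());
        System.out.println("encoding:   " + doc.getXmlEncoding());
        System.out.println("standalone: " + doc.getXmlStandalone());
        // Name given in the document type declaration.
        System.out.println("doctype:    " + doc.getDoctype().getName());
    }
}
```

If the document has no document type declaration, `getDoctype()` returns `null`, so real code should check for that before calling `getName()`.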
The document body contains the data. It comprises one or more elements where each element is defined by a begin tag and an end tag. The elements in the document body define the structure of the data. There is always a single root element that contains all the other elements. All of the data within the document is contained within the elements in the document body.
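The idea of a single root element that contains everything else can be sketched in a few lines. The example below asks the parser for the root element and then lists the elements directly inside it. The class name `BodySketch`, the method `elementNames`, and the sample document are hypothetical and exist only for this illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class BodySketch {
    // Hypothetical document body: one root element with three child elements.
    static final String SAMPLE =
        "<message><from>a@example.org</from><subject>Hi</subject>"
        + "<date>20th Oct</date></message>";

    // Returns the root element's name followed by the names of its child elements.
    static List<String> elementNames(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Element root = doc.getDocumentElement(); // the single root element
            List<String> names = new ArrayList<>();
            names.add(root.getTagName());
            NodeList children = root.getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                // Skip text and comment nodes; keep only elements.
                if (child.getNodeType() == Node.ELEMENT_NODE) {
                    names.add(child.getNodeName());
                }
            }
            return names;
        } catch (Exception e) {
            // Checked exceptions are wrapped only to keep the sketch short.
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(elementNames(SAMPLE));
    }
}
```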
Processing instructions (PIs) for the document may also appear in the prolog after the XML declaration, within element content, and after the root element. Processing instructions are instructions intended for an application that will process the document in some way. You can include comments that provide explanations or other information for human readers of the XML document as part of the prolog and as part of the document body.
When an XML document is said to be well-formed, it means that it conforms to the rules for writing XML as defined by the XML specification. Essentially, an XML document is well-formed if its prolog and body are consistent with the rules for creating these. In a well-formed document there must be only one root element, and all elements must be properly nested.
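One way to check well-formedness from Java is to hand the text to a parser and see whether it reports a fatal error. The following is a minimal sketch along those lines. The class name `WellFormedSketch` and the method `isWellFormed` are invented for this example, and the approach assumes the default, non-validating JAXP parser.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class WellFormedSketch {
    // Returns true if the parser accepts the text as well-formed XML.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // The parser found a violation of the well-formedness rules.
            return false;
        } catch (IOException | ParserConfigurationException e) {
            // Not a well-formedness problem; wrapped to keep the sketch short.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<a><b></b></a>"));  // properly nested
        System.out.println(isWellFormed("<a></a><b></b>"));  // two root elements
        System.out.println(isWellFormed("<a><b></a></b>"));  // improperly nested
    }
}
```

Note that this checks well-formedness only. It says nothing about whether the document follows the rules in a DTD.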
An XML processor is a software module that is used by an application to read an XML document and gain access to the data and its structure. An XML processor also determines whether an XML document is well-formed or not. Processing instructions are passed through to an application without any checking or analysis by the XML processor. The XML specification describes how an XML processor should behave when reading XML documents, including what information should be made available to an application for various types of document content.
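The way a processor passes processing instructions through to an application can be sketched with the SAX side of JAXP, where the parser calls back into your code as it reads the document. The class name `PiSketch`, the `render-hint` target, and the sample document are made up for this illustration. The callback itself, `processingInstruction(target, data)`, is part of the standard SAX handler interface.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class PiSketch {
    // Hypothetical document with one processing instruction in the prolog.
    static final String SAMPLE =
        "<?xml version=\"1.0\"?>"
        + "<?render-hint mode=\"compact\"?>"
        + "<note>text</note>";

    // Collects every processing instruction as "target | data".
    static List<String> processingInstructions(String xml) {
        List<String> found = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void processingInstruction(String target, String data) {
                // The parser hands over the raw data without interpreting it.
                found.add(target + " | " + data);
            }
        };
        try {
            SAXParserFactory.newInstance()
                    .newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            // Checked exceptions are wrapped only to keep the sketch short.
            throw new IllegalStateException("Could not parse XML", e);
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(processingInstructions(SAMPLE));
    }
}
```

The XML declaration looks like a processing instruction but is not reported as one, so only `render-hint` reaches the handler here.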