kiranreddys.com
Kiran's personal website
Blog   Forums   Sitemap   Social    Phone
 

Storing XML in RDBMS

As XML is the de-facto standard for data exchange, systems need to store large volumes of XML documents and as I discussed earlier RDBMS is the best system to store and manage data.

DBMS provide various storage options, mechanisms and methods to store XML data in the Database.

Storing XML as Non-Native Character Data
In RDBMS you can store entire XML either as a character type (VARCHAR) or as a single value of large object (CLOB). When storing as character type, it can be split into multiple rows. XML stored into CLOBs or VARCHAR columns is not parsed.

Storing XML in non-native XML type gives very high performance because no parsing takes place in the process. Parsing can be done only at retrieval time. Storing XML data in not-native type is simple and easy.

 

Figure 1 Storing XML as non-native character data

Figure
1 Storing XML as non-native character data

Storing in to VARCHAR or CLOBs is easy to implement but DBMS does not know XML data structure and it’s not be parsed and no DOM object is created hence querying and searching of data is not possible or very difficult.

This type of storage is reasonable choice for following requirements
• When DBMS does not support native XML types
• Element search is not required
• Full documents insert and retrieval is more common operation rather than accessing partial data or individual elements of data.
• No updates or deletes required for XML elements
• XML validation is not required
• XML parsing and random access is not required
• When entire Documents need to be stored and retrieved in encrypted format

Storing XML as Non-Native Relational Values
Storing XML as Relational values is removing entire XML format and converting all the values in to relational values and saving them in to normal RDBMS tables.

In this approach it discards all the XML structure element tags and coverts all the elements, text and attributes in to relational values, normally this involves very complex table structure and multiple inserts or updates and involves significant time.

Figure 2 Storing XML as non-native relational values
Figure 2 Storing XML as non-native relational values

With this approach you loose all the XML capabilities and simply uses RDBMS capabilities, depending on the level of complexity and hierarchy of XML elements it might requires large number of relational tables and columns, defining such complex relational table schema could be very complicated and cumbersome; mapping XML elements and attributes in to relational fields is also a very complicated task.

This type of storage is reasonable choice for following requirements
• XML structure is very simple and data is regular and repetitive
• DBMS data retrieval performance is high priority
• Decomposition and reconstruction of XML to/form relational values is simple and straight forward
• Query data with plain SQL

Following example shows simple XML structure that can be easily converted in to relational values

<CLASS CLSID="007" CLSNAME=“Database“>
<STUDENT>
<STDNO>U012345</STDNO>
<FSTNAME>NAVEEN</FSTNAME>
<LASTNAME>REDDY</LASTNAME>
<DOB>1978-01-01</DOB>
</STUDENT>
<STUDENT>
<STDNO>U011119</STDNO>
<FSTNAME>RONCHE</FSTNAME>
<LSTNAME>UEL</LSTNAME>
<DOB>1988-05-05</DOB>
</STUDENT>
</CLASS>

Example 2 Class Students XML Document

The XML in the above example can be easily converted in to relational values and relational values can reconstruct back to XML Document.

Above example requires only two database tables as follows

CLASS

CLSID

CLSNAME

007

Database

009

Networks

STUDENT

STDNO

FSTNAME

LSTNAME

DOB

CLSID

U012345

NAVEEN

REDDY

1978-01-01

007

U011119

RONCHE

UEL

1988-05-05

007

Example 3 Class and Students tables

Once the values are inserted in to relational tables, you can use all the DBMS SQL capabilities; for example you can retrieve all the students’ details that are in the Database class with following simple SQL.

SELECT STDNO, FSTNAME, LSTNAME, DOB, CLSID
FROM CLASS C, STUDENT S
WHERE
C.CLSID = S. CLSID
AND CLSNAME = ‘Database’
Example 4 SQL for Database class students

Results from the above SQL query can re-construct back to XML Document by using any programming language. Many main stream programming languages like JAVA, C++ and C# provide native XML writers and readers.

Storing XML in a Native XML Data Type

Many popular DBMS provides native XML storage with native XML Data types which preserve XML structure.

Figure 3 Storing XML as a native XML type
Figure 3 Storing XML as a native XML type

XML enabled DBMS provides not only XML types but also provides all the XML capabilities such as Xquery, XPath etc.. You can define a table schema and a create table in a simple way; for example you can create table with the following create statement:

CREATE TABLE COURSE (COURSEID INTERGER, COURCE_DETAILS XMLTYPE);

Following example Course details XML document can be directly inserted in to COURSE_DETAILS column and you can easily query individual elements with XQuery/XPath

<COURSE COURSEID="15" COURSENAME=ORACLE“>
STUDENT>
<STDNO>1</STDNO>
FIRSTNAME>IAN</FIRSTNAME>
LASTNAME>PETTIT</LASTNAME>
BIRTHDATE>1978-01-01</BIRTHDATE>
STUDENT>
<STUDENT>
<STDNO>2</STDNO>
FIRSTNAME>MARK</FIRSTNAME>
LASTNAME>RAMPARKASH</LASTNAME>
BIRTHDATE>1988-05-05</BIRTHDATE>
STUDENT>
</COURSE>
Example 5 Course XML Document

Even though DBMS stores XML in its native form, different Database Management Systems store XML documents using different technologies internally, some databases might convert data in to relational and XQuery into SQL (this is basically either storing XML as non-native character objects or storing XML as non-native relational values) but more databases are now using native XML types and its technologies like DOM (Document Object Model).

XML type supported databases that store native XML types follow XML Data model or XQuery Data model. This is completely different with previous non-native XML storage. Stored XML represents hierarchical structure and DBMS parses XML before inserting it to XMLType.

XML enabled native XML database systems uses trees representation of elements as the fundamental storage and processing model. Some DBMS requires serialization to retrieve full XML document but DBMS like Oracle, SQL server full document can be retrieved without serialization.

This type of storage is reasonable choice for following requirements
• High query performance is required
• XML Schema is very complex and mapping is very difficult
• Need to query individual data elements or partial document.
• Changing evolving XML Schema
• Parsing and validation is required
• Require to manipulate data elements in the XML document
• The internal structure of XML documents must be preserved

5 Comments

  1. Schemaless database:

    Hi Kiran,

    I actually don’t agree with “…RDBMS is the best system to store and manage data…” that was valid some years ago, but today, Amazon, Google using the concept of datastore with subject–verb–object and removing the need of another xml layer over RDBMS. The most interesting being the “Poseidon Database” packed with Brainwave’s development and deployment platform is a schemaless database which removes the need of data modeling completely. Brainwave

    -Mac

  2. kiran:

    Hi Mat,
    Thanks for your valuable comment.
    I do not know much about “Poseidon Database” and exact nature of Google and Yahoo data storage technologies. I’ll try to read about it at some point.

    However, my judgement of using RDBMS for XML storage is based on following reasons:

    DBMS are well proven, robust, reliable and scalable systems for storing and managing data. DBMS also provide functionality for security and administration.

    There are many other advantages to using DBMS/RDBMS

    Information Retrieval: DBMS provide the best possible way to retrieve information stored in the database.

    Data independence: DBMS provides an abstraction layer between data store and applications; applications do not need to know internal implementations of DBMS

    Efficient data access: DBMS manages data in an efficient and effective way; it is very difficult to achieve such efficiency with applications.

    Data integrity and security: DBMS provides data integrity and security functionality. DBMS enforces integrity on the data. It provides multi level security features and prevents unauthorised data access.

    Data administration: DBMS provides centralised administration functionality; Database administrators can manage users, user groups, roles and other database settings and functionality.

    Concurrent access: DBMS allows multiple users to access database concurrently and it manages their connections and sessions.

    Failure recovery: DBMS recovers data from any un-expected failures and crashes.

    Because of the reasons given above, there is no reason why enterprises should not use DBMS/RDBMS to store and manage XML data. Many mainstream DBMS vendors support XML storage.

    Thanks
    Kiran

  3. vidhya:

    why xpath is converted to sql if we are mapping xml to rdbms.cant we use sql query to retrieve data from rdbms(which stores xml data)

  4. Shannan:

    Hmm is anyone else encountering problems with the images on this blog loading?

    I’m trying to determine if its a problem on my end or if it’s the blog.
    Any feed-back would be greatly appreciated.

  5. Starla:

    Tremendous things here. I am very glad to peer your article.
    Thank you a lot and I am having a look ahead to touch you.
    Will you kindly drop me a mail?

Leave a comment