Oracle XML Database

By Jeff Wu

Object-oriented design principles are one common way of designing distributed applications.  In this methodology, the system is architected (by way of state diagrams and UML sequence diagrams) primarily by focusing on generating actionable results from computations and processes performed at endpoints, which are designed as procedures or objects (think stubs and skeletons).  This means that data within the system cannot be viewed in its “raw” format, because it is presumed to be an intrinsic part of objects.  The data format itself is fixed within the object (although generics alleviate this somewhat), and the data is accessible only through predefined methods of the object.  Relational databases such as MySQL and SQL Server are similarly rigid: their table definitions cannot accept data in an arbitrary format, because data must first be adapted to meet existing table constraints (unique and primary key), NULL-field requirements, data-length requirements and column-type requirements.  This type of design architecture is known as “application-centric” architecture.

Another design architecture, the “document-centric” methodology, is what I’d like to focus on in this post.  Unlike application-centric systems, the endpoint processes and definitions are secondary in a document-centric system because they are not necessary in order to design the system at a high level.  The focus is now on moving and transforming data within the system.  Data no longer needs to conform to strict structures (i.e. column definitions) and can be saved or loaded in a partial state.  Data can be viewed at any given time in its raw form (or any format) and be processed by any other system or application.

We’ve implemented document-centric design principles for several clients.  One of the implementations involved Oracle’s XML database.  From Oracle’s website, “Oracle XML DB is a feature of the Oracle Database. It provides a high-performance, native XML storage and retrieval technology. It fully absorbs the W3C XML data model into the Oracle Database, and provides new standard access methods for navigating and querying XML.”

In our experience, Oracle’s XML database is an embedded database and not a stand-alone product, so I needed to write a wrapper application that inserted XML documents into the database and retrieved them from it.  The XML documents I wanted to store were wildly different in structure and content, and while they were not good candidates for a traditional relational database, they were perfect for the XML database.  I was able to store the data “as-is” from my source and focus on re-distributing the data to downstream data consumer systems.  The database has an indexing feature which allows you to create indexes on documents based on vertices, edges and nodes, so even with millions of documents, retrieving documents was quick and efficient.  Retrieval was done using XQuery or XPath expressions.  Once that was done, I could focus on the part of the application which needed to deliver select documents to many downstream data consuming systems.
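To make the XPath-based retrieval concrete, here is a minimal sketch in Java.  It does not use the Oracle XML DB API and it is not the wrapper application described above; it only evaluates an XPath expression against an in-memory XML string with the JDK's standard javax.xml.xpath package.  The class name XPathLookup, the sample document and the expression are all made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: plain JDK XPath, not the Oracle XML DB API.
final class XPathLookup {

    /** Returns the text content of every node in the XML that matches the expression. */
    static List<String> select(String xml, String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                    expression,
                    new InputSource(new StringReader(xml)),
                    XPathConstants.NODESET);
            List<String> values = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                values.add(nodes.item(i).getTextContent());
            }
            return values;
        } catch (XPathExpressionException e) {
            // Covers both a malformed expression and an unparsable document.
            throw new IllegalArgumentException("XPath evaluation failed: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample document; real documents would come from the database.
        String xml = "<orders>"
                + "<order status=\"open\"><id>17</id></order>"
                + "<order status=\"closed\"><id>18</id></order>"
                + "</orders>";
        // Select the id of every order whose status attribute is "open".
        System.out.println(select(xml, "/orders/order[@status='open']/id"));
    }
}
```

Inside the database the same kind of expression would be evaluated by the engine itself, with its indexes doing the heavy lifting; the sketch only shows what such a predicate-based selection looks like.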

Since the documents are XML, transforming them so that they were useful to downstream systems was done with XSLTs.  We used many templates within our XSLTs to perform the needed transformations of the source data.  Here is an example of an XSL template which I used to copy the text of each node selected by XPath in the source file into an element in a new XML file.

<xsl:template match="generic.info">
    <xsl:element name = "attribute">
        <xsl:attribute name = "attributeName">the name</xsl:attribute>
        <xsl:for-each select = "path/path/path/x" >
            <xsl:if test = "@type = 'nodeName'" >
                <xsl:value-of select = "text()" />
            </xsl:if>
        </xsl:for-each>
    </xsl:element>
</xsl:template>
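
For readers who want to try the template, the following sketch wraps it in a complete stylesheet and runs it with the JDK's built-in javax.xml.transform API.  The class name TemplateDemo and the sample input document are assumptions made for illustration; the post does not say which XSLT processor or input data was actually used.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical harness around the template shown above.
final class TemplateDemo {

    // The template from the post, wrapped in an xsl:stylesheet element.
    static final String STYLESHEET =
            "<xsl:stylesheet version=\"1.0\""
            + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
            + "<xsl:template match=\"generic.info\">"
            + "<xsl:element name=\"attribute\">"
            + "<xsl:attribute name=\"attributeName\">the name</xsl:attribute>"
            + "<xsl:for-each select=\"path/path/path/x\">"
            + "<xsl:if test=\"@type = 'nodeName'\">"
            + "<xsl:value-of select=\"text()\"/>"
            + "</xsl:if>"
            + "</xsl:for-each>"
            + "</xsl:element>"
            + "</xsl:template>"
            + "</xsl:stylesheet>";

    /** Applies the stylesheet to the given XML and returns the result as a string. */
    static String transform(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            // Leave out the <?xml ...?> header so only the new element is returned.
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(
                    new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        // Made-up input: one x node that passes the @type test and one that does not.
        String xml = "<generic.info><path><path><path>"
                + "<x type=\"nodeName\">hello</x>"
                + "<x type=\"other\">skip</x>"
                + "</path></path></path></generic.info>";
        System.out.println(transform(xml));
    }
}
```

With this input, the result should be a single attribute element carrying attributeName="the name" and the text of the first x node only, since the second x node fails the @type test.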

We were even able to use BeanShell for added dynamic functionality within the XSLTs!

According to their website, “BeanShell is a small, free, embeddable Java source interpreter with object scripting language features, written in Java. BeanShell dynamically executes standard Java syntax and extends it with common scripting conveniences such as loose types, commands, and method closures like those in Perl and JavaScript.” It was extremely useful for performing encryption and hashing within XSLT files without needing to write a lot of code.
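
As an illustration of the kind of logic involved, here is a small hashing helper in plain Java.  It is only a sketch: the post does not show how BeanShell was wired into the XSLTs or which algorithms were used, so the class name HashHelper and the choice of SHA-256 are assumptions, and whether the code runs unchanged under a particular BeanShell version is untested.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper; SHA-256 is an assumed algorithm, not one named in the post.
final class HashHelper {

    /** Returns the SHA-256 digest of the UTF-8 bytes of the text as lowercase hex. */
    static String sha256Hex(String text) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] hash = digest.digest(text.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(hash.length * 2);
            for (byte b : hash) {
                // %02x renders each byte as exactly two hex digits.
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every compliant Java platform must provide SHA-256.
            throw new IllegalStateException("SHA-256 is not available", e);
        }
    }

    public static void main(String[] args) {
        // Standard test vector: prints
        // ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad
        System.out.println(sha256Hex("abc"));
    }
}
```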

Whether an application-centric or a document-centric design methodology should be used in a given application architecture is usually (and should be) dictated by the business requirements.  Oracle’s XML database is one of the many tools Shepherd Interactive draws on when making recommendations for our clients.  If you have any implementation questions, feel free to contact us!
