Monday, 20 February 2012

Two-phase and multi-phase XSLT processing

Two-phase (or multi-phase) XSLT processing is very useful when your data has to go through several transformation steps. Here is an example.

I have a data dictionary file (authors.xml) that stores authors:

<?xml version="1.0" encoding="UTF-8"?>
<authors>
  <author>Allinson, J.</author>
  <author>Feng, F.</author>
  <author>Marsden, E.</author>
</authors>

Every time I get a new XML document, I need to check a particular element (fullName) and update the data dictionary file if that element holds an author who is not listed yet. The XML document looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<iris>
  <creator>
    <fullName>Wyatt, A.</fullName>
  </creator>
  ...
</iris>

The key point of multi-phase processing is to store the result of one phase in a variable and then use that variable as the input XML for the next phase. The example XSLT (run with Saxon 9) is shown below:

<xsl:stylesheet version="2.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
      exclude-result-prefixes="xsl">
  <xsl:strip-space elements="*"/>
  <xsl:output method="xml" indent="yes"/>

  <xsl:variable name="authorsDoc" select="document('authors.xml', /)"/>
 
  <xsl:template match="/">
    <!-- Phase 1 -->
    <xsl:variable name="phase-1-output">
      <authors>
        <!-- If a newly entered author does not exist in the current data dictionary (DD),
             add it to the DD -->
        <xsl:for-each select="/iris/creator/fullName[.!='']">
          <xsl:if test="not(. = $authorsDoc//author)">
            <author><xsl:value-of select="."/></author>
          </xsl:if>
        </xsl:for-each>
       
        <!-- Copy all existing DD authors, only once -->
        <xsl:for-each select="$authorsDoc/authors/author[.!='']">
          <xsl:sort select="."/>
          <author><xsl:value-of select="."/></author>
        </xsl:for-each>
      </authors>
    </xsl:variable>


 
    <!-- Phase 2: remove duplicates -->
    <authors>
      <xsl:for-each select="$phase-1-output/authors/author[.!='' and not(.=preceding-sibling::author)]">
        <xsl:sort select="."/>
        <author><xsl:value-of select="."/></author>
      </xsl:for-each>
    </authors>
   
  </xsl:template>
 
</xsl:stylesheet>
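
The stylesheet above relies on the processor letting a variable's content be used in a path expression, which Saxon 9 allows because it is an XSLT 2.0 processor. A processor that implements only XSLT 1.0 treats such a variable as a result tree fragment, and a path like $phase-1-output/authors/author is an error there. The following is a minimal sketch of the same two phases for that case, assuming the processor supports the EXSLT exsl:node-set() extension function (xsltproc does, for example). It is an untested adaptation of the stylesheet above, not a drop-in replacement.

```xml
<xsl:stylesheet version="1.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
      xmlns:exsl="http://exslt.org/common"
      exclude-result-prefixes="exsl">
  <xsl:strip-space elements="*"/>
  <xsl:output method="xml" indent="yes"/>

  <!-- Data dictionary, resolved relative to the input document -->
  <xsl:variable name="authorsDoc" select="document('authors.xml', /)"/>

  <xsl:template match="/">
    <!-- Phase 1: new authors plus existing DD authors, held as a result tree fragment -->
    <xsl:variable name="phase-1-output">
      <authors>
        <xsl:for-each select="/iris/creator/fullName[. != '']">
          <xsl:if test="not(. = $authorsDoc//author)">
            <author><xsl:value-of select="."/></author>
          </xsl:if>
        </xsl:for-each>
        <xsl:for-each select="$authorsDoc/authors/author[. != '']">
          <author><xsl:value-of select="."/></author>
        </xsl:for-each>
      </authors>
    </xsl:variable>

    <!-- Phase 2: exsl:node-set() (assumed to be available) converts the
         fragment into a node set so it can be queried, de-duplicated and sorted -->
    <authors>
      <xsl:for-each select="exsl:node-set($phase-1-output)/authors/author[not(. = preceding-sibling::author)]">
        <xsl:sort select="."/>
        <author><xsl:value-of select="."/></author>
      </xsl:for-each>
    </authors>
  </xsl:template>

</xsl:stylesheet>
```

The only structural difference from the version above is the exsl:node-set() call at the start of phase 2. The same wrapping would be needed at the start of every further phase.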
