Monday, September 1, 2008

GSoC 2008 ....... Done

You may have noticed that my blog has gone rather silent in the last couple of months. The reason was my Google Summer of Code (GSoC) project. The final deadline was on the 18th of August, so I worked really hard to deliver the goods, and I can finally and proudly state that I completed my project on time.

The objective of my GSoC project was to implement XML Schema type alternatives support for Apache Xerces2/J, the legendary open source XML parser for Java applications. Type alternatives are the W3C XML Schema working group's answer to the conditional type assignment problem. The feature was first introduced in the XML Schema 1.1 Structures specification. Type alternatives allow a type to be assigned to an element dynamically at validation time, based on one or more conditions. Conditions are expressed as XPath 2.0 expressions. Here is an example element declaration which uses XML Schema type alternatives.
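A declaration of this kind would look roughly like the sketch below. Only the element name message, the declared type messageType, the alternative type messageTypeString and its test come from the description that follows. The xs prefix, the second alternative and the name messageTypeBase64 are placeholders assumed purely for illustration.

```xml
<!-- Sketch only: assumes the xs prefix is bound to http://www.w3.org/2001/XMLSchema -->
<xs:element name="message" type="messageType">
  <!-- Chosen when kind is 'string' and code is greater than 1000 -->
  <xs:alternative test="@kind = 'string' and @code > 1000" type="messageTypeString"/>
  <!-- Hypothetical second alternative, added for illustration only -->
  <xs:alternative test="@kind = 'base64'" type="messageTypeBase64"/>
</xs:element>
```

Note that XML Schema 1.1 requires each alternative type to be validly derived from the declared type (or to be xs:error), so messageTypeString would have to be defined as a restriction or extension of messageType.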

As stated in the above code snippet, the declared type of the message element is the complex type called messageType. However, the element declaration also contains a few type alternatives. These type alternatives allow the type of the message element to be determined at validation time. Based on the conditions stated as the test attribute values, the schema validator will assign a type to the message element prior to validating its content. The expressions '@kind' and '@code' refer to two attributes of the message element. The first type alternative will assign the type called messageTypeString if the kind attribute has the value 'string' and the code attribute has a value greater than 1000.
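To make the first alternative concrete, here are two hypothetical instance fragments. The attribute values are made up, and the element content is left out since the type definitions are not shown here.

```xml
<!-- kind is 'string' and code is greater than 1000: the first test is true, so messageTypeString applies -->
<message kind="string" code="1500"/>

<!-- code is not greater than 1000: the first test is false, so the validator moves on -->
<message kind="string" code="42"/>
```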

When the schema validator encounters an element declaration with one or more type alternatives, it will evaluate the test expressions one by one, in document order, until it finds an expression which evaluates to true. When such a matching type alternative is found, the corresponding type will be assigned to the element. If none of the type alternatives match, then a default type will be assigned: the type of a final alternative with no test attribute if the declaration has one, otherwise the declared type.

My project consisted of two main parts. The first part was to implement type alternatives traversal support, so that Xerces2/J can properly traverse an XML Schema document which contains type alternatives and add the corresponding information to the schema grammar. Implementing this was fairly easy, and I managed to complete it prior to the GSoC mid-term evaluations. The second part was to implement type alternatives validation. This was fairly difficult, since I had to develop a bare minimum XPath 2.0 implementation for Xerces2/J. Developing the XPath processor actually took up a significant portion of the entire project.

My work will be fully available in the Xerces2/J code base very soon (even now, the code for the traversal part is in one of the SVN branches). All in all it was a great learning experience, and a great opportunity for me to learn a whole bunch of cool technologies like XML, XML Schema, and XPath 2.0. I also got the opportunity to put some of my knowledge of the theory of computing into action and sharpen my programming skills. I would like to express my heartfelt gratitude to Google, to the Apache community, and very specially to my mentor Khaled Noaman for being a very supportive guide right from the start of my project.