Parsing unconventional Dates
Posted by Mike Haller on Wednesday, February 6, 2008 at 19:38 in Java
Third parties deliver data as XML. In a perfect world, all those files have their XML Schema.
In a bad world, sometimes they don't have one.
In the real world, they often don't have one.
In my world, they just don't have one.
Are you reading arbitrary XML data files?
They don't have a Schema?
Stuff like this:
200812312359 Dec.08
Some values, such as Date, Time or DateTime values, just aren't recognized by the standard parser?
The 'pattern' attribute for simple types does not work as expected?
Well, there is a simple solution to your problem and you're getting it for free:
XmlBeans' XmlFactoryHook.
Let's have a look at the following example:
200812312359
As this blog is observed by my dear colleague, the testdrivenguy,
the next step is to write a test case. I want the parser to recognize my silly date as an XmlDate object:
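import org.apache.xmlbeans.XmlDate;
import org.apache.xmlbeans.XmlObject;
import org.junit.Assert;
import org.junit.Test;

public class XmlBeansDateFormatTest {
    String CUSTOM_DATE = "200812312359 ";

    @Test public void testHook() throws Exception {
        XmlObject parse = XmlObject.Factory.parse(CUSTOM_DATE);
        Assert.assertTrue(parse instanceof XmlDate);
    }
}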
As the test fails, we are now going to add the new stuff:
import org.apache.xmlbeans.XmlFactoryHook; // New

public class XmlBeansDateFormatTest {
    String CUSTOM_DATE = "200812312359 ";

    @Test public void testHook() throws Exception {
        CustomDateHook hook = new CustomDateHook(); // New
        XmlFactoryHook.ThreadContext.setHook(hook); // New
        XmlObject parse = XmlObject.Factory.parse(CUSTOM_DATE);
        Assert.assertTrue(parse instanceof XmlDate);
    }
}
Now, let's create the class CustomDateHook and let it implement the XmlFactoryHook interface. You can implement all methods as delegate methods to the SchemaTypeLoader that is passed in, e.g. like this:
public XmlObject parse(SchemaTypeLoader loader, String content,
        SchemaType type, XmlOptions options) throws XmlException {
    // Plain delegation: hand the unchanged content back to the loader
    return loader.parse(content, type, options);
}
Now, we're going to implement our custom date parser. For demo purposes, I'm adding it only to the parse method we're using in the test case: parse(..,String,..)
/** Transforms string 200812312359 into 31.12.2008 23:59 date object */
public XmlObject parse(SchemaTypeLoader loader, String string,
        SchemaType type, XmlOptions options) throws XmlException {
    // TODO: Check what type it is, so we can transform it accordingly
    // For demo purposes, we simply use the element name.
    if (string.startsWith("")
            && string.endsWith(" ")) {
        try {
            // HH is the 24-hour clock; hh (12-hour) accepts 23 only by leniency
            Date parsed = new SimpleDateFormat("yyyyMMddHHmm")
                    .parse(string.replaceAll("", "").replaceAll(
                            " ", ""));
            XmlDateImpl xmlDateImpl = new XmlDateImpl();
            xmlDateImpl.setDateValue(parsed);
            return xmlDateImpl;
        } catch (ParseException e) {
            // SimpleDateFormat.parse throws a checked ParseException
            throw new XmlException("Unparseable date: " + string, e);
        }
    }
    return loader.parse(string, type, options);
}
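The check-strip-parse logic can also be tried without XmlBeans on the classpath. The following is only a minimal sketch using the plain JDK: the class CompactDateSketch, its method names and the element name "date" are assumptions made up for illustration, so substitute whatever element your data files really use. It shows the 24-hour pattern letters (HH) and a non-lenient format, so that values like month 13 are rejected instead of silently rolled over.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;

// Hypothetical helper for illustration; not part of XmlBeans.
class CompactDateSketch {

    // Assumed element name; replace with the one your files actually use.
    static final String OPEN = "<date>";
    static final String CLOSE = "</date>";

    /** True if the trimmed text starts and ends with the assumed element tags. */
    static boolean isCustomDate(String xml) {
        String t = xml.trim();
        return t.length() >= OPEN.length() + CLOSE.length()
                && t.startsWith(OPEN) && t.endsWith(CLOSE);
    }

    /** Strips the assumed element tags and parses the yyyyMMddHHmm payload. */
    static Date parseCustomDate(String xml) {
        if (!isCustomDate(xml)) {
            throw new IllegalArgumentException("Not a custom date element: " + xml);
        }
        String t = xml.trim();
        String digits = t.substring(OPEN.length(), t.length() - CLOSE.length()).trim();
        // HH = hour of day (0-23); hh would mean the 12-hour clock (1-12).
        SimpleDateFormat format = new SimpleDateFormat("yyyyMMddHHmm");
        format.setLenient(false); // reject out-of-range fields such as month 13
        try {
            return format.parse(digits);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not a yyyyMMddHHmm value: " + digits, e);
        }
    }

    public static void main(String[] args) {
        Calendar cal = Calendar.getInstance();
        cal.setTime(parseCustomDate("<date>200812312359</date> "));
        System.out.println(cal.get(Calendar.HOUR_OF_DAY) + ":" + cal.get(Calendar.MINUTE));
    }
}
```

Inside a real hook, the resulting Date would then be handed to an XmlDateImpl as in the parse method above. The unchecked IllegalArgumentException is just a convenience of this sketch; in the hook you would rather wrap the failure in an XmlException.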
If you run the test case again, you might get a StackOverflowError. That's because newInstance is calling newInstance, which calls newInstance etc. To solve that nasty loop, implement the following code in the newInstance() method of your hook implementation:
public XmlObject newInstance(SchemaTypeLoader loader,
        SchemaType type, XmlOptions options) {
    XmlFactoryHook remember
            = XmlFactoryHook.ThreadContext.getHook();
    // Detach the hook so the loader does not call back into it
    XmlFactoryHook.ThreadContext.setHook(null);
    try {
        return loader.newInstance(type, options);
    } finally {
        // Reattach the hook even if newInstance throws
        XmlFactoryHook.ThreadContext.setHook(remember);
    }
}
That removes our Hook as a hook from the Xml Factory temporarily and later reattaches it. This is required, because otherwise the hook would be called recursively.
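The detach-and-reattach idea is not specific to XmlBeans. Here is a small stand-alone sketch of the same pattern with a plain ThreadLocal; the names HookGuardSketch, Hook, Factory and GuardedHook are invented for this example and only mimic the shape of a thread-local factory hook. They are not the XmlBeans API. The try/finally makes sure the hook is put back even when the inner call throws.

```java
// Stand-in for a thread-local factory hook; all names here are hypothetical.
class HookGuardSketch {

    interface Hook {
        String create(Factory factory);
    }

    // Holds the hook registered for the current thread, or null if none is set.
    static final ThreadLocal<Hook> CURRENT = new ThreadLocal<>();

    static class Factory {
        String newInstance() {
            Hook hook = CURRENT.get();
            // With a hook installed, the factory always delegates to it.
            return hook != null ? hook.create(this) : "plain instance";
        }
    }

    static class GuardedHook implements Hook {
        public String create(Factory factory) {
            Hook remembered = CURRENT.get();
            CURRENT.set(null); // detach, otherwise the factory calls us again
            try {
                return "hooked(" + factory.newInstance() + ")";
            } finally {
                CURRENT.set(remembered); // reattach, even on exceptions
            }
        }
    }

    static String run() {
        CURRENT.set(new GuardedHook());
        try {
            return new Factory().newInstance();
        } finally {
            CURRENT.remove(); // leave the thread clean afterwards
        }
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints "hooked(plain instance)"
    }
}
```

Without the CURRENT.set(null) line, Factory.newInstance() and GuardedHook.create() would keep calling each other until the stack overflows, which is the loop described above.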
Run the test again, et voilà: you will have your XmlDate object.
