CSV parsing .. once again

Posted by Mike Haller on Sunday, December 31. 2006 at 17:11 in Java
Who hasn't dealt with comma-separated values? These damn little .csv files. Okay, there are a few Java libraries that handle CSV files, such as the Ostermiller lib. However, it isn't usable in some environments (e.g. because of a licensing conflict).

"1000";1;;"0";"1000";;9;2000;"1000";"Sonstiges";"Stck";;1,00;1754"
"1001";1;;"0";"1000";;9;2000;"1000";"Sonstiges";"Stck";;1,00;584"
"1002";1;;"0";"1001";;9;2000;"1000";"Sonstiges";"Stck";;1,00;255"


Then, in a new Java project, you are torn over whether to write a small importer that fits just the one .csv file you need to import. But there will always be another one, and yet another one. Oh, and then there are the charset problems. Does the file originate from a legacy DOS application or a legacy Windows application? Or is it totally weird, because CSV is only a de facto standard and you cannot find a real specification anywhere? How do you handle the empty values with no quotes?
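To make the "small importer" temptation concrete, here is a rough sketch of a hand-rolled line splitter for files like the sample above. Everything in it is an assumption for illustration: the class name SimpleCsvLine, the semicolon separator, the rule that a doubled quote inside a quoted field means one literal quote, and the choice to return unquoted empty values as empty strings. It handles neither quoted fields that span several lines nor charset detection.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: splits one separator-delimited line into fields. */
final class SimpleCsvLine {

    /**
     * Splits a line on the given separator, honoring double quotes.
     * A doubled quote ("") inside a quoted field yields one literal quote.
     * Unquoted empty fields (";;") come back as empty strings.
     */
    static List<String> split(String line, char separator) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        current.append('"'); // escaped quote
                        i++;
                    } else {
                        inQuotes = false;    // closing quote
                    }
                } else {
                    current.append(c);       // separators inside quotes are data
                }
            } else if (c == '"') {
                inQuotes = true;             // opening quote
            } else if (c == separator) {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());      // last field has no trailing separator
        return fields;
    }

    public static void main(String[] args) {
        List<String> fields = split("\"1000\";1;;\"0\";\"Sonstiges\";;1,00", ';');
        System.out.println(fields.size() + " fields: " + fields);
    }
}
```

For the charset question, the reader feeding lines into such a splitter would have to be opened with an explicit charset (for example StandardCharsets.ISO_8859_1 for many legacy Windows exports) rather than the platform default. Which charset is right depends on the application that wrote the file.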

At that point, it'd be good to have a Java CSV library within reach.

Apache has the commons-csv library, and hopefully it gets out of the sandbox soon and will be merged with an existing library such as commons-lang or commons-io.

Mork or what?

Posted by Mike Haller on Tuesday, December 26. 2006 at 00:00 in Java
There is a rather funny database format used in some popular Mozilla applications called Mork. The lexical proximity to the word dork is probably not intended, but who knows.

A project has been started whose goal is to provide a Java implementation that can parse Mork database files. The thingy is called jMork - a Java Mork implementation.

It's at a very early stage but can already be used to read Thunderbird's address books.

Why am I talking about Mork? I wanted to import some contacts and had already integrated a billing application based on Paradox. For that, I used an evaluation version of a Paradox JDBC driver, which worked perfectly. I only needed it once, so I guess the evaluation license is OK for such usage.

Now I also want to import my contacts from Thunderbird, and I wondered whether I could use the internal text file format directly. After some Google queries, it became clear that the text format is ... other than expected.

After reading the specification of the Mork format on Mozilla's archive site, I began implementing a Mork parser. Hopefully this will be useful to anybody who ever wants to fiddle with such files.

By the way, the format is also used for the history.dat file of Firefox and looks like this:
// <!-- <mdb:mork:z v="1.4"/> -->
< <(a=c)> // (f=iso-8859-1)
  (B8=Custom3)(B9=Custom4)(BA=Notes)(BB=LastModifiedDate)(BC=RecordKey)
...
  [24(83137)(84138)(^85=)(^86=)(87139)(^88=)(8913A)(8A13A)
    (^8B=)(^8C=)(^8D=)(^8E=0)(^8F=0)(^90=)(^91=)(^92=)(^93=)(^94=)(^95=)
    (^96=)(^97=)(^98=)(^99=)(^9A=)(^9B=)(^9C=)(^9D=)(^9E=)(^9F=)(^A0=)
...
@$${6D{@
<(157=4597c727)>[9:^80(BB157)]
@$$}6D}@
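To give an idea of what picking such a file apart involves, here is a tiny sketch that pulls the alias entries of the form (B8=Custom3) out of a line like the one in the sample. It is not how jMork works. The class name MorkAliasSketch, the method parseAliases, and the assumption that alias keys are plain hex IDs are made up for illustration, and the sketch ignores escape sequences, rows, and the @$${...{@ group markers entirely.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch: extracts "(hexId=value)" alias entries from Mork text. */
final class MorkAliasSketch {

    // "(" + hex ID + "=" + value without parentheses + ")".
    // Cells starting with "^", such as (^85=), deliberately do not match.
    private static final Pattern ALIAS =
            Pattern.compile("\\(([0-9A-Fa-f]+)=([^()]*)\\)");

    /** Returns the aliases in order of appearance, keyed by their ID. */
    static Map<String, String> parseAliases(String text) {
        Map<String, String> aliases = new LinkedHashMap<>();
        Matcher m = ALIAS.matcher(text);
        while (m.find()) {
            aliases.put(m.group(1), m.group(2));
        }
        return aliases;
    }

    public static void main(String[] args) {
        String line = "  (B8=Custom3)(B9=Custom4)(BA=Notes)"
                + "(BB=LastModifiedDate)(BC=RecordKey)";
        // Prints each ID with the column name it stands for.
        parseAliases(line).forEach((id, name) -> System.out.println(id + " -> " + name));
    }
}
```

A regular expression is only enough for this one fragment. A real parser has to track the nesting of dictionaries, tables, rows, and transaction groups.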


Some bloggers hate the author of the format for his "genuine" ideas; others are mature enough not to comment on it at all. However, in modern times, XML would have been chosen anyway.

Oracle sucks once again

Posted by Mike Haller on Sunday, December 24. 2006 at 00:00 in Java
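If you are using iBatis SqlMaps to execute a stored procedure, you should probably not spread the SQL string over several indented lines like this, or the Oracle JDBC driver will answer with the exception shown after the mapping: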

<procedure id="exec" parameterClass="org.example.Param">
 {
      call p_main (
              #param1:NUMERIC#,
              #param2:DATE# )
 }
</procedure>


java.sql.SQLException: Non supported SQL92 token at position: 4:
        at oracle.jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:112)
        at oracle.jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:146)
        at oracle.jdbc.driver.OracleSql.handleToken(OracleSql.java:1165)
        at oracle.jdbc.driver.OracleSql.handleODBC(OracleSql.java:1064)
        at oracle.jdbc.driver.OracleSql.parse(OracleSql.java:984)
        at oracle.jdbc.driver.OracleSql.getSql(OracleSql.java:312)
        at oracle.jdbc.driver.OracleSql.getSqlBytes(OracleSql.java:557)


The solution: remove the tabs and newlines from the SQL in the procedure call, so that the whole call sits on a single line.
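In the SqlMap itself the fix is simply a matter of editing the XML. If the call string is assembled in Java code instead, the same clean-up could be sketched as below. The class name SqlWhitespace and the method collapse are hypothetical. The sketch assumes the statement contains no string literals whose inner whitespace matters, and it takes the post's observation on trust that single spaces are acceptable to the driver where tabs and newlines are not.

```java
/** Hypothetical helper: squeezes all whitespace runs in a call string to one space. */
final class SqlWhitespace {

    /** Replaces every run of tabs, newlines, and spaces with a single space. */
    static String collapse(String sql) {
        return sql.replaceAll("\\s+", " ").trim();
    }

    public static void main(String[] args) {
        String formatted = "{\n"
                + "\t call p_main (\n"
                + "\t\t#param1:NUMERIC#,\n"
                + "\t\t#param2:DATE# )\n"
                + " }";
        // The result is one line without tabs or line breaks.
        System.out.println(collapse(formatted));
    }
}
```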

About

My name is Mike Haller and I'm a software developer and architect at Bosch Software Innovations in Germany. I love programming, playing games and reading books. I like good food, taking photos, and learning about and mentoring in the craftsmanship of commercial software development. Stack Overflow profile for mhaller
