We've just discovered a serious problem with the Apache SOAP 2.3.1
DateSerializer class during deserialization.  (I think this probably
is also a problem in Axis, though I haven't really looked into that.)

DateSerializer.unmarshall instantiates a SimpleDateFormat and calls
its parse method to get a Date object from a string like
"2003-02-21T20:02:54Z".  

The problem is that parsing via SimpleDateFormat is locale-specific.
If the current Java Locale is one where integer signs come to the
right of the digits instead of coming to the left of the digits, the
parse fails.  The separator "-" in the ISO format is interpreted as a
sign for the year in such cases.  I've noodled around in the JDK class
source code a bit, and I think it comes down to calling
NumberFormat.parse to get the year, but letting it be "greedy" and
taking as much as it can reasonably interpret as an integer.
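
To illustrate the "greedy" behavior without depending on which locale
data is installed, here is a minimal sketch.  The class and method
names are made up for illustration, and the pattern "#0;#0-" is only my
stand-in for a locale whose negative numbers carry a trailing sign; it
is not necessarily the exact pattern the JDK uses for ar_EG.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.text.ParsePosition;
import java.util.Locale;

// Hypothetical demo class; not part of Apache SOAP or the JDK.
class TrailingSignDemo {

    // Parses the number at the current position using a format whose
    // negative subpattern puts the minus sign AFTER the digits.
    static Number parseLeadingNumber(String text, ParsePosition pos) {
        DecimalFormat trailingSign =
            new DecimalFormat("#0;#0-", new DecimalFormatSymbols(Locale.US));
        trailingSign.setParseIntegerOnly(true);
        return trailingSign.parse(text, pos);
    }

    public static void main(String[] args) {
        ParsePosition pos = new ParsePosition(0);
        Number year = parseLeadingNumber("2003-02-21", pos);
        // The '-' after "2003" is consumed as a trailing minus sign,
        // so the value is negative and the position is past the dash.
        System.out.println("parsed value: " + year);
        System.out.println("next index:   " + pos.getIndex());
    }
}
```

With such a format the first dash is swallowed as the sign of the year
(I'd expect -2003, with the parse position left at index 5), so the
date pattern's literal "-" then has nothing to match.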

I have not yet discovered any workaround for this other than changing
the deserializer code.  Here is some simpler standalone code that
demonstrates the problem (the example Locale here is Egyptian Arabic,
which uses the trailing integer sign).  This just boils things down to
the essence of the problem and is otherwise equivalent to similar code
in DateSerializer:

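    // Requires java.text.ParseException, java.text.SimpleDateFormat,
    // java.util.Date, java.util.Locale and java.util.TimeZone.
    void testArabicDate()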
    {
        Locale arabic = new Locale("ar", "EG");
        System.out.println("Locale: " + arabic + " " + arabic.getDisplayName());
        Locale.setDefault(arabic);
        String dateString = "2003-02-21T20:02:54.123Z";
        SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSS'Z'");
        sdf.setTimeZone(TimeZone.getTimeZone("GMT"));
        Date date = null;
        try
        {
             date = sdf.parse(dateString);
        }
        catch (ParseException pe)
        {
            System.out.println("Caught ParseException: " + pe);
            System.out.println("      errorOffset was: " + pe.getErrorOffset());
        }
        System.out.println("Date:        " + dateString);
        System.out.println("Parsed date: " + date);
    }

To run the above code and repro the problem, you have to have a locale
definition installed for "ar_EG".  On MSWindows, for example, that's
done via the control panel ("regional settings" or something
similar).  You don't have to set the machine locale to anything in
particular other than what you already have.  If you run the above
code in the US version of the Sun JRE, you probably won't see the
problem.  If you run it in the JRE international version (or "the only
version" for non-MSWindows platforms) or the full Sun JDK, you will
get an exception inside the "try".  (I've actually only tried this
stuff on MSWindows, so I'm conjecturing about other platforms.)
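
If you are not sure whether a given JRE is affected, a quick check
along the following lines may help.  This is only a sketch: the class
and method names are invented, and it assumes that a non-empty negative
suffix on the locale's integer format is a reasonable indicator of the
trailing-sign behavior.

```java
import java.text.DecimalFormat;
import java.text.NumberFormat;
import java.util.Arrays;
import java.util.Locale;

// Hypothetical helper; not part of Apache SOAP.
class LocaleSignCheck {

    // True if the JRE reports number-format data for this locale.
    static boolean hasNumberFormatData(Locale locale) {
        return Arrays.asList(NumberFormat.getAvailableLocales()).contains(locale);
    }

    // True if the locale's integer format writes something after the
    // digits of a negative number (for example a trailing minus sign).
    static boolean hasNegativeSuffix(Locale locale) {
        NumberFormat nf = NumberFormat.getIntegerInstance(locale);
        if (!(nf instanceof DecimalFormat)) {
            return false; // cannot inspect other implementations
        }
        return ((DecimalFormat) nf).getNegativeSuffix().length() > 0;
    }

    public static void main(String[] args) {
        Locale[] candidates = { Locale.forLanguageTag("ar-EG"), Locale.US };
        for (Locale locale : candidates) {
            System.out.println(locale
                + " data available: " + hasNumberFormatData(locale)
                + ", negative suffix: " + hasNegativeSuffix(locale));
        }
    }
}
```

What it prints for ar_EG depends on the JRE and its locale data, so
treat the output as a hint rather than proof either way.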

OK, if you accept that this is a problem (and not my hallucination),
what to do about it?  I think it is in a vague area where you can't
really say that it's a JDK bug (the JDK documents SDF as
locale-specific) or an Apache SOAP bug (though you still get the
exception even if you put 'quotes' around the dashes in the SDF
pattern string).  But why bother using SDF for this parse anyhow?
We're interpreting an ISO format that is completely structured.  In
fact, the only leeway is that we'll accept it with or without the
milliseconds component.  Why not just bust it up by brute force?  Here
is a static method that does just that (which could be dropped into
DateSerializer and called from unmarshall, replacing most of its
current method body):

    static Date isoDateParser(String dateString)
    {
        // Check the length first so the charAt calls below cannot run
        // off the end of a short string.
        int len = dateString.length();
        if (len != 20 && len != 24)
        {
            throw new IllegalArgumentException("Not a valid ISO date/time (length): " + dateString);
        }
        if (dateString.charAt(4)  != '-'
        ||  dateString.charAt(7)  != '-'
        ||  dateString.charAt(10) != 'T'
        ||  dateString.charAt(13) != ':'
        ||  dateString.charAt(16) != ':'
        ||  dateString.charAt(len - 1) != 'Z')
        {
            throw new IllegalArgumentException("Not a valid ISO date/time (delimiters): " + dateString);
        }
        // The 24-character form carries ".SSS" before the final 'Z'.
        if (len == 24 && dateString.charAt(19) != '.')
        {
            throw new IllegalArgumentException("Not a valid ISO date/time (millis dot): " + dateString);
        }

        int year   = Integer.parseInt(dateString.substring(0, 4));
        int month  = Integer.parseInt(dateString.substring(5, 7)) - 1;
        int day    = Integer.parseInt(dateString.substring(8, 10));
        int hour   = Integer.parseInt(dateString.substring(11, 13));
        int minute = Integer.parseInt(dateString.substring(14, 16));
        int second = Integer.parseInt(dateString.substring(17, 19));
        int millis = 0;
        if (dateString.length() == 24)
        {
            millis = Integer.parseInt(dateString.substring(20, 23));
        }

        Calendar cal = Calendar.getInstance(TimeZone.getTimeZone("GMT"));
        cal.set(year, month, day, hour, minute, second);
        cal.set(Calendar.MILLISECOND, millis);

        return cal.getTime();
    }
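
For comparison, a smaller change that might also sidestep the problem
is to keep SimpleDateFormat but hand it an explicit Locale, so that its
number parsing no longer follows the default locale.  This is a sketch
under that assumption, not something taken from the DateSerializer
source; the class name and the choice of Locale.US are mine.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical alternative; not the Apache SOAP implementation.
class FixedLocaleIsoParser {

    static Date parse(String dateString) {
        // Accept the form with or without the milliseconds component.
        String pattern = (dateString.length() == 24)
            ? "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'"
            : "yyyy-MM-dd'T'HH:mm:ss'Z'";
        // Locale.US pins the digit and sign conventions used while
        // parsing, regardless of Locale.getDefault().
        SimpleDateFormat sdf = new SimpleDateFormat(pattern, Locale.US);
        sdf.setTimeZone(TimeZone.getTimeZone("GMT"));
        sdf.setLenient(false);
        try {
            return sdf.parse(dateString);
        } catch (ParseException pe) {
            throw new IllegalArgumentException(
                "Not a valid ISO date/time: " + dateString);
        }
    }

    public static void main(String[] args) {
        Locale.setDefault(Locale.forLanguageTag("ar-EG"));
        System.out.println(parse("2003-02-21T20:02:54.123Z").getTime());
    }
}
```

The brute-force method above has the advantage of not depending on SDF
behavior at all; this variant just changes fewer lines.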

-- 
[EMAIL PROTECTED] (WJCarpenter)    PGP 0x91865119
38 95 1B 69 C9 C6 3D 25    73 46 32 04 69 D6 ED F3
