Some more info:

If the information I found is correct, all PythonInterpreter instances in a
multithreaded Java application share the same sys object. So what I observed
is normal behavior.

But it also means that all STAX jobs share the same sys object. That said,
calling “reload(sys)” while other jobs are running is probably not a good idea.
So I tried to find a way to set the default encoding for all of the STAX
service’s PythonInterpreters. Normally, you could do this by writing a
“sitecustomize” module. But on STAX the PythonInterpreter does not even load
the ‘site’ module, so sitecustomize does not work.

On the other hand, because the ‘site’ module is not loaded,
setdefaultencoding() is not removed from sys. So a STAX job can call it
without having to reload sys first! If all STAX jobs running on the service
can work with the same encoding, I think this is a simple solution.

However, jobs that switch between different encodings without coordination
could produce strange results… So I think it would be good to have a config
parameter for the STAX service’s default encoding and, in turn, to have
setdefaultencoding() removed from sys.

If different jobs need to use different encodings, it would be better to have
a separate sys object for each PythonInterpreter. In an old mailing list
archive I saw a message suggesting that this can be done. Unfortunately, I
could not find out how.

Bodo

------------------------------------------------------------------------------------------


Hi Sharon,

I’ve spent some time finding out what goes wrong when using German characters.

Your second suggestion generally works fine for me. But there is something
that makes me wonder whether it is a bug or not:

After a STAX job changes the default encoding from “ascii” to “latin-1” using
    import sys
    reload(sys)
    sys.setdefaultencoding( "latin-1" )

all jobs started later also show the default encoding “latin-1” from the start!
I wrote a job that only checks the setting but does not set it:
    import sys
    print "Jython Encoding: " + sys.getdefaultencoding()

I would have expected each job to start with its own default encoding set to
“ascii”. Maybe other jobs running at the same time are affected as well? I
haven’t tested that yet.

Maybe this behavior is also the reason why the first method sometimes works …


Using the “reload(sys)” trick, I tested German characters in block, testcase
and process names, and in <log> messages sent to the monitor and the
Job_User_Log. They all work, but print from <script> doesn’t. The reason is a
small bug in the write method of STAXPythonOutput. If a character from the
upper half of “latin-1” is written, write receives a negative integer value
(sign extension!). The following patch fixes this by ignoring the upper bits
of the integer value (as specified in the Java API, see
http://download.oracle.com/javase/6/docs/api/java/io/OutputStream.html):

--- services/stax/service/STAXPythonOutput.java 2011-06-28 19:21:02.000000000 +0200
+++ services/stax/service/STAXPythonOutput.java 2011-06-28 18:13:35.000000000 +0200
@@ -100,7 +100,7 @@ public class STAXPythonOutput extends Ou
      */
     public void write(int b) throws IOException
     {
-        fData.append((char)b);
+        fData.append((char)(b & 0xff));
     }

    /**
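
To illustrate the sign extension the patch works around, here is a minimal
standalone sketch. It does not use any STAX code: the class SignExtensionDemo
and the methods unpatched() and patched() are hypothetical names that only
mirror the two casts from the diff above.

```java
public class SignExtensionDemo {
    // Same cast as the original line in the diff: the int goes straight to char.
    static char unpatched(int b) {
        return (char) b;
    }

    // Same cast as the patched line: only the low 8 bits are kept.
    static char patched(int b) {
        return (char) (b & 0xff);
    }

    public static void main(String[] args) {
        byte raw = (byte) 0xE4;  // Latin-1 code for a-umlaut
        int widened = raw;       // widening byte to int sign-extends the value

        System.out.println(widened);                                 // -28
        System.out.println(Integer.toHexString(unpatched(widened))); // ffe4
        System.out.println(Integer.toHexString(patched(widened)));   // e4
    }
}
```

With the plain cast the character ends up as U+FFE4 instead of U+00E4, which
matches the broken output seen for the upper half of Latin-1.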

BTW: This works for Latin-1 only (and probably for other 8-bit encodings as
well). I could not find a way to make the full Unicode range work. I tried
U+12ca, but the Unicode string in Python just contains a ‘?’ instead.
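
One way such a ‘?’ can arise in Java, though I have not confirmed that this is
what happens inside STAX, is a conversion through an 8-bit charset somewhere
along the way: String.getBytes() replaces characters the charset cannot
represent with ‘?’. The sketch below is standalone; UnmappableCharDemo and
roundTripLatin1() are hypothetical names used only for illustration.

```java
import java.nio.charset.StandardCharsets;

public class UnmappableCharDemo {
    // Encode to Latin-1 bytes and decode again, as a lossy 8-bit path would.
    static String roundTripLatin1(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.ISO_8859_1);
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // U+00E4 exists in Latin-1 and survives the round trip.
        System.out.println(roundTripLatin1("\u00e4").equals("\u00e4")); // true
        // U+12CA does not exist in Latin-1 and is replaced.
        System.out.println(roundTripLatin1("\u12ca"));                  // ?
    }
}
```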

Best regards
Bodo


_______________________________________________
staf-users mailing list
staf-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/staf-users
