Rob,

Apologies for the top-post, but at this point I think (a) you are satisfied you are on the right track and (b) I have become more confused. Given that (a) is much more important than (b), we can just leave it at that.

:)

Feel free to come back for further clarifications or suggestions if you need them.

Good luck,
-chris

On 12/14/20 11:32, Rob Sargent wrote:


Calling save() from the servlet would tie up the request-processing thread 
until the save completes. That's where you get your 18-hour response times, 
which is not very HTTP-friendly.
Certainly don't want to pay for 18 EC2 hours of idle.

So your clients spin up an EC2 instance just to send the request to your 
server? That sounds odd.

Maybe more than you want to hear:  I fill an AWS Queue with job definitions.  
Each job is run on a separate EC2 instance, pulls an id or two from the job 
def/command line, and requests data from the database.  It uses that data to run 
simulations and sends the analysis of the simulations back to the database.  If 
I didn’t spin the work off to the ThreadPoolExec, the “large” version would 
have to wait for many, many records to be saved.  I avoid this.  (I actually 
had to look back to see where the “18 hours” came from...)

The two payloads are impls of a base class. Jackson/ObjectMapper unravels 
them to Type, then Type.save().

Okay, so it looks like Type.save() is what needs to be called in the separate 
thread (well, submitted to a job scheduler; just get it off the request 
processing thread so you can return a 200 response to the client).


Yes, I think I’m covered once I re-establish TPExec.
That’s the thinking behind the question of accessing a ThreadPoolExecutor via 
JNDI.  I know my existing impl does queue jobs (so the load is greater than 
the capacity to handle requests).  I worry that without off-loading, Tomcat 
would just spin up more servlet threads and exhaust resources.  I can lose a 
client, but would rather not lose the server (that loses all clients...)

Agreed: rejecting a single request is preferred over the service coming down -- 
and all its in-flight jobs with it.

So I think you want something like this:

servlet {
   post {
     // Buffer all our input data
     long bufferSize = request.getContentLengthLong();
     if(bufferSize > Integer.MAX_VALUE || bufferSize < 0) {
       bufferSize = 8192; // Reasonable default?
     }
     ByteArrayOutputStream buffer = new ByteArrayOutputStream((int)bufferSize);

     InputStream in = request.getInputStream();
     int count;
     byte[] buf = new byte[8192];
     while(-1 != (count = in.read(buf))) {
         buffer.write(buf, 0, count);
     }

     // All data read: tell the client we are good to go
     Job job = new Job(buffer);
     try {
       sharedExecutor.submit(job); // Fire and forget

       response.setStatus(200); // Ok
     } catch (RejectedExecutionException ree) {
       response.setStatus(503); // Service Unavailable
     }
   }
}
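
For that 503 branch to ever fire, sharedExecutor has to be bounded; a minimal 
sketch of what I mean (pool and queue sizes are illustrative assumptions, and 
everything here comes from java.util.concurrent):

   // Bounded pool + bounded queue: once both are full, submit() throws
   // RejectedExecutionException instead of letting work pile up without limit.
   ThreadPoolExecutor sharedExecutor = new ThreadPoolExecutor(
       2,                              // core threads
       4,                              // max threads
       60, TimeUnit.SECONDS,           // keep-alive for idle non-core threads
       new ArrayBlockingQueue<>(16),   // jobs queued but not yet running
       new ThreadPoolExecutor.AbortPolicy()); // reject rather than block the caller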

This is working:
       protected void doPost(HttpServletRequest req, HttpServletResponse resp)
           /*throws ServletException, IOException*/ {
         lookupHostAndPort();
         Connection conn = null;
         try {
           ObjectMapper jsonMapper =
               JsonMapper.builder().addModule(new JavaTimeModule()).build();
           jsonMapper.setSerializationInclusion(Include.NON_NULL);
           try {
             AbstractPayload payload =
                 jsonMapper.readValue(req.getInputStream(), AbstractPayload.class);
             logger.error("received payload");
             String redoUrl = String.format("jdbc:postgresql://%s:%d/%s",
                 getDbHost(), getDbPort(), getDbName(req));
             Connection copyConn = DriverManager.getConnection(redoUrl,
                 getDbRole(req), getDbRole(req) + getExtension());
So it's here you cannot pool the connections? What about:

    Context ctx = new InitialContext();

    DataSource ds = (DataSource) ctx.lookup("java:comp/env/jdbc/" + getJNDIName(req));

I’ll see if I need this (if I’m never getting a pooled connection).  But JNDI is not 
a good place for the “second investigator’s name (et al)”.

Then you can define your per-user connection pools in JNDI and get the benefit 
of connection-pooling.
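
With that in place, the DriverManager call above becomes a pooled checkout; 
roughly (the per-user <Resource> entries in context.xml are an assumption, not 
something you have yet):

    try (Connection copyConn = ds.getConnection()) {
        // borrowed from that user's pool, handed back on close()
        payload.setConnection(copyConn);
        payload.write();
    }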

             payload.setConnection(copyConn);
             payload.write();

Is the above call the one that takes hours?

The beginning of it, for sure.  The COPY work happens pleasantly quickly but 
does need its own db connection.  Payload says thanks, then goes on to using 
the temp tables filled by COPY to write to the real tables.  This is the slow 
part, as we can be talking about millions of records inserted into/updating a 
table with indexes. (This is done in 1/16ths. Don’t ask how.)
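
(For reference, the COPY leg through the PostgreSQL JDBC driver looks roughly 
like this; the staging-table name and the Reader are made-up placeholders:)

    CopyManager copyApi = copyConn.unwrap(PGConnection.class).getCopyAPI();
    long rows = copyApi.copyIn(
        "COPY staging_payload FROM STDIN WITH (FORMAT csv)",
        csvReader);   // csvReader: a java.io.Reader over the payload rows
    // The fast part ends here; the slow part is merging staging_payload
    // into the real, indexed tables afterwards.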

             // HERE THE CLIENT IS WAITING FOR THE SAVE.  Though there
             // can be a lot of data, COPY is blindingly fast

Maybe the payload.write() is not slow. Maybe? After this you don't do anything 
else...

             resp.setContentType("text/plain");
             resp.setStatus(200);
             resp.getOutputStream().write("SGS_OK".getBytes());
             resp.getOutputStream().flush();
             resp.getOutputStream().close();
           }
             //Client can do squat at this point.
           catch (com.fasterxml.jackson.databind.exc.MismatchedInputException mie) {
             logger.error("transform failed: " + mie.getMessage());
             resp.setContentType("text/plain");
             resp.setStatus(461);
             String emsg = String.format("PAYLOAD NOT SAVED\n%s\n", mie.getMessage());
             resp.getOutputStream().write(emsg.getBytes());
             resp.getOutputStream().flush();
             resp.getOutputStream().close();
           }
         }
         catch (IOException | SQLException ioe) {
         etc }
Obviously, the job needs to know how to execute itself (making it Runnable 
means you can use the various Executors Java provides). Also, you need to 
decide what to do about creating the executor.
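
A bare-bones sketch of that Job, assuming it gets the buffered request body 
and reuses your existing mapper/JDBC code (jsonMapper, openCopyConnection() 
and logger are placeholders, not your real names):

   class Job implements Runnable {
     private final ByteArrayOutputStream body;

     Job(ByteArrayOutputStream body) {
       this.body = body;
     }

     @Override
     public void run() {
       try {
         AbstractPayload payload =
             jsonMapper.readValue(body.toByteArray(), AbstractPayload.class);
         payload.setConnection(openCopyConnection()); // your existing DriverManager/JNDI code
         payload.write();  // the long save now runs off the request thread
       } catch (Exception e) {
         // no client is waiting on a response any more, so just log it
         logger.error("background save failed", e);
       }
     }
   }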

I used the ByteArrayOutputStream above to avoid the complexity of re-scaling 
buffers in example code. If you have huge buffers and you need to convert to 
byte[] at the end, then you are going to need 2x heap space to do it. Yuck. 
Consider implementing the auto-re-sizing byte-array yourself and avoiding 
ByteArrayOutputStream.
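
If you do go that route it doesn't have to be fancy; a minimal sketch of the 
idea (names are made up):

   // Grows like ByteArrayOutputStream but exposes its backing array directly,
   // so you never pay for the extra toByteArray() copy.
   final class GrowableBuffer {
     private byte[] data = new byte[8192];
     private int length = 0;

     void write(byte[] src, int off, int len) {
       if (length + len > data.length) {
         data = java.util.Arrays.copyOf(data, Math.max(data.length * 2, length + len));
       }
       System.arraycopy(src, off, data, length, len);
       length += len;
     }

     byte[] array()  { return data; }    // backing array, no copy
     int    length() { return length; }  // number of valid bytes in array()
   }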

There isn't anything magic about JNDI. You could also put the thread pool 
directly into your servlet:

servlet {
   ThreadPoolExecutor sharedExecutor;
   constructor() {
     sharedExecutor = new ThreadPoolExecutor(...);
   }
   ...
}
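
In a real servlet, init() and destroy() are probably better hooks than the 
constructor, so the pool is shut down cleanly on undeploy; roughly (class name 
and pool sizes are just illustrative):

   public class PayloadServlet extends HttpServlet {
     private ThreadPoolExecutor sharedExecutor;

     @Override
     public void init() {
       // same bounded configuration as sketched earlier
       sharedExecutor = new ThreadPoolExecutor(2, 4, 60, TimeUnit.SECONDS,
           new ArrayBlockingQueue<>(16), new ThreadPoolExecutor.AbortPolicy());
     }

     @Override
     public void destroy() {
       sharedExecutor.shutdown();  // stop taking new jobs; let in-flight saves finish
     }
   }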

Yes, I see now that the single real instance of the servlet can master the 
sharedExecutor.
I have reliable threadpool code at hand.  I don't need to separate the job 
types:  In practice all the big ones are done first: they define the small 
ones.  It's when I'm spectacularly successful and two (2) investigators want to 
use the system ...

Sounds good.

But I am still confused as to what is taking 18 hours. None of the calls above 
look like they should take a long time, given your comments.

I think I've explained the slow part above.  TL;DR: DB writes are expensive.

If you want to put those executors into JNDI, you are welcome to do so, but 
there is no particular reason to. If it's convenient to configure a thread pool 
executor via some JNDI injection something-or-other, feel free to use that.

But ultimately, you are just going to get a reference to the executor and drop 
the job on it.

Next up is SSL.  One of the reasons I must switch from my naked socket impl.

Nah, you can do TLS on a naked socket. But I think using Tomcat embedded (or 
not) will save you the trouble of having to learn a whole lot and write a lot 
of code.

No thanks.
TLS should be fairly easy to get going in Tomcat as long as you already 
understand how to create a key+certificate.
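
If you end up doing it in embedded Tomcat, a rough sketch of wiring the TLS 
connector (keystore path and password are placeholders; standalone Tomcat does 
the same thing with an <SSLHostConfig> element in server.xml):

   Connector https = new Connector("org.apache.coyote.http11.Http11NioProtocol");
   https.setPort(8443);
   https.setScheme("https");
   https.setSecure(true);
   https.setProperty("SSLEnabled", "true");

   SSLHostConfig ssl = new SSLHostConfig();
   SSLHostConfigCertificate cert =
       new SSLHostConfigCertificate(ssl, SSLHostConfigCertificate.Type.RSA);
   cert.setCertificateKeystoreFile("/path/to/keystore.p12");  // placeholder
   cert.setCertificateKeystorePassword("changeit");           // placeholder
   ssl.addCertificate(cert);
   https.addSslHostConfig(ssl);

   tomcat.getService().addConnector(https);  // 'tomcat' = your embedded Tomcat instance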

I've made keys/certs in previous lives (not to say I understand them). I'm 
waiting to hear on whether or not I'll be able to self-sign etc. Talking to AWS 
Monday about all things security/HIPAA.

AWS may tell you that simply using TLS at the load-balancer (which is 
fall-off-a-log easy; they will even auto-renew with an AWS-signed CA) is 
sufficient for your needs. You may not have to configure Tomcat for 
TLS at all.

I will definitely bring this up.  Thanks.
I'm sure I'll be back, but I think I can move forward.  Much appreciated.


Any time.

-chris

Same threat as before ;)
Thanks a ton,

rjs

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org


