Rob,
Apologies for the top-post, but at this point I think (a) you are
satisfied you are on the right track and (b) I have become more
confused. Given that (a) is much more important than (b), we can just
leave it at that.
:)
Feel free to come back for further clarifications or suggestions if you
need them.
Good luck,
-chris
On 12/14/20 11:32, Rob Sargent wrote:
Calling save() from the servlet would tie-up the request-processing thread
until the save completes. That's where you get your 18-hour response times,
which is not very HTTP-friendly.
Certainly don't want to pay for 18 EC2 hours of idle.
So your clients spin-up an EC2 instance just to send the request to your
server? That sounds odd.
Maybe more than you want to hear: I fill an AWS Queue with job definitions.
Each job is run on separate EC2 instance, pulls an id or two from the job
def/command line and requests data from the database. Uses that data to run
simulations and sends the analysis of the simulations back to the database. If
I didn’t spin the work off to the ThreadPoolExec, the “large” version would
have to wait for many, many records to be saved. I avoid this. (I actually
had to look back to see where the “18 hours” came from...)
The two payloads are impls of a base class. Jackson/ObjectMapper unravels
them to Type. Type.save();
Okay, so it looks like Type.save() is what needs to be called in the separate
thread (well, submitted to a job scheduler; just get it off the request
processing thread so you can return 200 response to the client).
Yes, I think I’m covered once I re-establish TPExec.
That’s the thinking behind the question of accessing a ThreadPoolExecutor via
JNDI. I know my existing impl does queue jobs (so the load is greater than
the capacity to handle requests). I worry that without off-loading, Tomcat
would just spin up more servlet threads and exhaust resources. I can lose a
client, but would rather not lose the server (that loses all clients...)
Agreed: rejecting a single request is preferred over the service coming down --
and all its in-flight jobs with it.
So I think you want something like this:
servlet {
  post {
    // Buffer all our input data
    long bufferSize = request.getContentLengthLong();
    if(bufferSize > Integer.MAX_VALUE || bufferSize < 0) {
      bufferSize = 8192; // Reasonable default?
    }
    ByteArrayOutputStream buffer = new ByteArrayOutputStream((int)bufferSize);
    InputStream in = request.getInputStream();
    int count;
    byte[] buf = new byte[8192]; // Scratch space for each read
    while(-1 != (count = in.read(buf))) {
      buffer.write(buf, 0, count);
    }
    // All data read: queue the job, then tell the client we are good to go
    Job job = new Job(buffer);
    try {
      sharedExecutor.submit(job); // Fire and forget
      response.setStatus(200); // Ok
    } catch (RejectedExecutionException ree) {
      response.setStatus(503); // Service Unavailable
    }
  }
}
This is working:
protected void doPost(HttpServletRequest req, HttpServletResponse
resp) /*throws ServletException, IOException*/ {
lookupHostAndPort();
Connection conn = null;
try {
ObjectMapper jsonMapper = JsonMapper.builder().addModule(new
JavaTimeModule()).build();
jsonMapper.setSerializationInclusion(Include.NON_NULL);
try {
AbstractPayload payload =
jsonMapper.readValue(req.getInputStream(), AbstractPayload.class);
logger.error("received payload");
String redoUrl =
String.format("jdbc:postgresql://%s:%d/%s", getDbHost(),
getDbPort(), getDbName(req));
Connection copyConn = DriverManager.getConnection(redoUrl,
getDbRole(req), getDbRole(req)+getExtension());
So it's here you cannot pool the connections? What about:
Context ctx = new InitialContext();
DataSource ds = (DataSource)ctx.lookup("java:/comp/env/jdbc/" +
getJNDIName(req));
I’ll see if I need this (if I’m never getting a pooled connection). But JNDI is not
a good place for the “second investigator’s name (et al)”.
Then you can define your per-user connection pools in JNDI and get the benefit
of connection-pooling.
payload.setConnection(copyConn);
payload.write();
Is the above call the one that takes hours?
The beginning of it for sure. The COPY work happens pleasantly quickly but
does need its own db connection. Payload says thanks, then goes on to using
the temp tables filled by COPY to write to the real tables. This is the slow
part as we can be talking about millions of records into/updating a table with
indexes. (This is done in 1/16ths. Don’t ask how.)
//HERE THE CLIENT IS WAITING FOR THE SAVE. Though there
//can be a lot of data, COPY is blindingly fast
Maybe the payload.write() is not slow. Maybe? After this you don't do anything
else...
resp.setContentType("text/plain");
resp.setStatus(200);
resp.getOutputStream().write("SGS_OK".getBytes());
resp.getOutputStream().flush();
resp.getOutputStream().close();
}
//Client can do squat at this point.
catch
(com.fasterxml.jackson.databind.exc.MismatchedInputException mie) {
logger.error("transform failed: " + mie.getMessage());
resp.setContentType("text/plain");
resp.setStatus(461);
String emsg = String.format("PAYLOAD NOT SAVED\n%s\n", mie.getMessage());
resp.getOutputStream().write(emsg.getBytes());
resp.getOutputStream().flush();
resp.getOutputStream().close();
}
}
catch (IOException | SQLException ioe) {
etc }
Obviously, the job needs to know how to execute itself (making it Runnable
means you can use the various Executors Java provides). Also, you need to
decide what to do about creating the executor.
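As a minimal sketch of a self-executing job and the fire-and-forget hand-off described above: SaveJob and JobSubmitter are hypothetical names, the pool and queue sizes are placeholders, and the body of run() only stands in for the real Type.save() call. It assumes a bounded queue, so that overload surfaces as a RejectedExecutionException which the servlet can map to a 503.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Hypothetical job: owns the buffered request body and knows how to save it. */
class SaveJob implements Runnable {
    private final byte[] body;

    SaveJob(byte[] body) {
        this.body = body;
    }

    @Override
    public void run() {
        // Placeholder for the slow part (parse the payload, then Type.save()).
        System.out.println("saving " + body.length + " bytes");
    }
}

/** Hypothetical wrapper around the shared executor. */
class JobSubmitter {
    private final ThreadPoolExecutor executor;

    JobSubmitter(int threads, int queueCapacity) {
        // Fixed-size pool with a bounded queue. The default rejection handler
        // (AbortPolicy) throws RejectedExecutionException once both are full.
        this.executor = new ThreadPoolExecutor(
                threads, threads, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueCapacity));
    }

    /** Returns the HTTP status the servlet would send: 200 if accepted, 503 if rejected. */
    int submit(Runnable job) {
        try {
            executor.execute(job);
            return 200;
        } catch (RejectedExecutionException ree) {
            return 503;
        }
    }

    void shutdown() {
        executor.shutdown();
    }
}
```

With one thread and a queue of one, a third submission made while the first job is still running comes back as 503. execute() is used here rather than submit() because submit() stores any exception thrown by the job in the returned Future, where it goes unnoticed unless someone inspects it.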
I used the ByteArrayOutputStream above to avoid the complexity of re-scaling
buffers in example code. If you have huge buffers and you need to convert to
byte[] at the end, then you are going to need 2x heap space to do it. Yuck.
Consider implementing the auto-re-sizing byte-array yourself and avoiding
ByteArrayOutputStream.
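A lighter alternative to writing the resizing array by hand, offered only as a sketch: subclass ByteArrayOutputStream and read its protected buf and count fields directly, so the final toByteArray() copy is never made. ExposedByteArrayOutputStream and BufferDemo are hypothetical names. This assumes the consumer (Jackson, say) can read from an InputStream, and it does not avoid the transient copies made each time the internal array grows.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical helper: a ByteArrayOutputStream that can be read back without copying. */
class ExposedByteArrayOutputStream extends ByteArrayOutputStream {
    ExposedByteArrayOutputStream(int initialSize) {
        super(initialSize);
    }

    /** Wraps the internal array directly; toByteArray() would allocate a second copy. */
    synchronized InputStream toInputStream() {
        return new ByteArrayInputStream(buf, 0, count);
    }
}

class BufferDemo {
    /** Drains the input into the buffer, as the servlet sketch does with the request body. */
    static ExposedByteArrayOutputStream readAll(InputStream in, int sizeHint) throws IOException {
        ExposedByteArrayOutputStream out = new ExposedByteArrayOutputStream(sizeHint);
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) {
            out.write(chunk, 0, n);
        }
        return out;
    }
}
```

The job can then hold on to the stream and call toInputStream() when it runs. Nothing should write to the buffer after that point, since the reader and the writer share the same array.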
There isn't anything magic about JNDI. You could also put the thread pool
directly into your servlet:
servlet {
ThreadPoolExecutor sharedExecutor;
constructor() {
sharedExecutor = new ThreadPoolExecutor(...);
}
...
}
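One thing the sketch above leaves open is what happens to the pool when the webapp stops. A rough illustration, with the servlet API left out so it runs on its own: ExecutorLifecycle is a hypothetical holder whose init() and destroy() are meant to be called from the servlet's methods of the same names, and the pool size, queue size and timeout are placeholders.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Hypothetical holder mirroring a servlet's init()/destroy() lifecycle. */
class ExecutorLifecycle {
    private ThreadPoolExecutor sharedExecutor;

    /** Call from the servlet's init() (or its constructor, as in the sketch above). */
    void init() {
        sharedExecutor = new ThreadPoolExecutor(
                2, 2, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<>(100)); // pool and queue sizes are placeholders
    }

    ThreadPoolExecutor executor() {
        return sharedExecutor;
    }

    /** Call from the servlet's destroy(). Returns true if every accepted job finished in time. */
    boolean destroy(long timeout, TimeUnit unit) {
        sharedExecutor.shutdown(); // stop accepting new jobs; queued jobs still run
        try {
            if (sharedExecutor.awaitTermination(timeout, unit)) {
                return true;
            }
            sharedExecutor.shutdownNow(); // interrupt workers, drop jobs that never started
            return false;
        } catch (InterruptedException ie) {
            sharedExecutor.shutdownNow();
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

Without a shutdown step the pool's non-daemon worker threads can outlive the webapp on undeploy. How long to wait for in-flight saves is a judgment call that depends on how long the DB writes really take.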
Yes, I see now that the single real instance of the servlet can master the
sharedExecutor.
I have reliable threadpool code at hand. I don't need to separate the job
types: In practice all the big ones are done first: they define the small
ones. It's when I'm spectacularly successful and two (2) investigators want to
use the system ...
Sounds good.
But I am still confused as to what is taking 18 hours. None of the calls above
look like they should take a long time, given your comments.
I think I've explained the slow part above. TL;DR: DB writes are expensive.
If you want to put those executors into JNDI, you are welcome to do so, but
there is no particular reason to. If it's convenient to configure a thread pool
executor via some JNDI injection something-or-other, feel free to use that.
But ultimately, you are just going to get a reference to the executor and drop
the job on it.
Next up is SSL, one of the reasons I must switch from my naked socket impl.
Nah, you can do TLS on a naked socket. But I think using Tomcat embedded (or
not) will save you the trouble of having to learn a whole lot and write a lot
of code.
No thanks.
TLS should be fairly easy to get going in Tomcat as long as you already
understand how to create a key+certificate.
I've made keys/certs in previous lives (not to say I understand them). I'm
waiting to hear on whether or not I'll be able to self-sign etc. Talking to AWS
Monday on all things security/HIPAA.
AWS may tell you that simply using TLS at the load-balancer (which is
fall-off-a-log easy; they will even auto-renew with an AWS-signed CA)
should be sufficient for your needs. You may not have to configure Tomcat for
TLS at all.
I will definitely bring this up. Thanks.
I'm sure I'll be back, but I think I can move forward. Much appreciated.
Any time.
-chris
Same threat as before ;)
Thanks a ton,
rjs
---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org