Alan,

>>> When I tried to [spawn a new thread on behalf of the remote
>>> user] on older versions of Tomcat (4.x), all output to the
>>> user was suspended until _all_ of the threads completed.
> 
>> This was probably due to mismanagement of the thread; a new
>> thread will run independently of your request handler thread,
>> unless you call "join" on that thread, which will block the
>> current thread until the "joined" thread completes.
> 
> I take it you've actually tried this and it worked for you.
> Which version of Tomcat were you using?

I can't recall what we were using on some of our development servers
(probably 4.1.something), but we were running Weblogic in production. We
used our own simple thread pool and it worked great. Our "batch job", as
it were, was actually opening an FTP connection, pushing a handful of
files through it, and notifying us of its progress during the whole thing.

The previous implementation was synchronous, and (as someone pointed
out) the request can time out if it takes too long. We modified this
process to use a background thread and a page that auto-refreshed every
couple of seconds. It actually made for a relatively good-looking
progress bar.
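A rough sketch of that pattern, with assumed names: the real code ran in a servlet and kept status in the HttpSession, but I've replaced that with a plain shared status object so the example stands alone. The worker thread here is a stand-in for the FTP push.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Shared status object; in the web app this would live in the user's session.
class JobStatus {
    final AtomicInteger percentDone = new AtomicInteger(0);
    boolean isDone() { return percentDone.get() >= 100; }
}

public class BackgroundJobDemo {
    public static void main(String[] args) throws InterruptedException {
        final JobStatus status = new JobStatus();

        // The request handler would start this thread and return immediately,
        // letting the auto-refreshing page poll the status on each reload.
        Thread worker = new Thread(new Runnable() {
            public void run() {
                for (int i = 1; i <= 10; i++) {
                    // Stand-in for pushing one file over FTP.
                    status.percentDone.set(i * 10);
                }
            }
        });
        worker.start();

        // The refresh page just reads the status; here we poll in a loop.
        while (!status.isDone()) {
            Thread.sleep(10);
        }
        System.out.println("progress=" + status.percentDone.get());
    }
}
```

Note that we never call join() on the worker from the request path -- the handler returns right away, which is what keeps output to the user from being blocked.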

> I'm positive that I didn't call join to wait for completion of
> the threads that I started.  Is it possible that earlier versions
> of Tomcat did that beneath the covers, or perhaps that this was
> configurable and our systems administrators (I wasn't
> administering Tomcat myself) forced this to happen?

I think it's unlikely that Tomcat did anything like that; Tomcat
doesn't manage /all/ threads in the system... only the request handling
threads. Perhaps something else was happening. Were you writing to the
response in that thread that you created? That could cause some odd
things to happen...

> I agree that my technique was worse than spawning a thread.  I
> didn't want the overhead of running two JVMs.  But, try as I
> might (and I tried mightily hard), I wasn't able to make it work
> on the installation of Tomcat that I was using.

That's too bad. I don't recall having too much trouble when we did ours.
I argued with the engineers that we couldn't write a decent thread pool
fast enough and ought to get something off-the-open-source-shelf. I lost
the argument and we wrote our own simple one.

> I only allowed one batch job to run at a time, for just the
> reasons you mentioned.  I wasn't worried about a malicious denial
> of service attack since our server was running on an intranet for
> vetted users, and I wasn't concerned about starting arbitrary
> jobs since the users could only pick from my menu of jobs to run.
> But I was worried about a user starting a long running job, not
> seeing results right away, and starting it again and again.

That was probably a good move. We were dealing with several dozen users
on each machine, across 6 production app servers. Our peak load was
about 100 users/box every morning, so we weren't really excited about
/potentially/ starting 100 (or more, in case we mismanaged RELOADs) new
threads to handle these FTP transactions. At least those operations were
relatively infrequent.

>> You are better off creating a thread pool and re-using threads
>> to complete long-running operations.
> 
> Is this just to limit the number of threads that can run?

Yes. In fact, some thread pools are written to act like batch processors
(I think that .NET comes with one such implementation): you create a
Runnable object and basically queue it into the batch processor. The
batch processor then attaches a thread whenever one is available. That
Runnable object could still report status information to the session, etc.

So, if you only had a single thread in your thread pool (a "shallow"
pool, if you will), and high demand for batch jobs to be run, most
clients would sit around refreshing their pages simply saying "waiting",
since the job hadn't started. One lucky job would get done, and then the
rest would follow.
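A minimal sketch of such a "shallow" pool: one worker thread draining a queue of Runnables, so jobs submitted while the worker is busy simply wait their turn. (This uses java.util.concurrent classes that arrived with Java 5; our 2002 version did the same thing by hand with wait/notify.)

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// A "shallow" thread pool: a single worker thread pulls queued jobs
// in FIFO order -- one lucky job gets done, and the rest follow.
public class ShallowPool {
    private final BlockingQueue<Runnable> queue = new LinkedBlockingQueue<Runnable>();

    public ShallowPool() {
        Thread worker = new Thread(new Runnable() {
            public void run() {
                try {
                    while (true) {
                        queue.take().run();   // block until a job is available
                    }
                } catch (InterruptedException e) {
                    // pool shut down
                }
            }
        });
        worker.setDaemon(true);
        worker.start();
    }

    public void submit(Runnable job) {
        queue.add(job);
    }

    public static void main(String[] args) throws InterruptedException {
        ShallowPool pool = new ShallowPool();
        final List<String> log = new ArrayList<String>();
        for (int i = 1; i <= 3; i++) {
            final int n = i;
            pool.submit(new Runnable() {
                public void run() {
                    synchronized (log) { log.add("job" + n); }
                }
            });
        }
        Thread.sleep(200);   // crude wait; real code would track completion
        synchronized (log) { System.out.println(log); }
    }
}
```

Each Runnable could just as easily write its status into the user's session along the way, so the refreshing page has something to show while the job waits in the queue.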

Refreshing pages aren't much of a load, as long as you don't have a ton
of work to do in order to display them. For example, if you keep all the
information in the user's session (and the updated status information in
there, too), then you do very little work to spit out a status page each
time.

> I wouldn't be much worried about the overhead of starting a new
> thread since I already know I'm running a batch job which is
> going to take a long time.  Starting a new thread is in the
> noise, performance wise.

Agreed -- as long as you are managing those resources and not letting
many threads get started. You mentioned that you already protected
against that.

>> If possible, a better solution would be to actually run the
>> batch job /as a batch job/ -- i.e. store the job in a
>> database/file/whatever, and have another process come along and
>> process that request as necessary.
> 
> Another excellent idea.  I thought of it and have used it on
> other projects but, alas, the sys admins hated the idea of
> running another daemon and I didn't want to fight with them.

:)

I can certainly understand pushback on the admin side.
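For what it's worth, the spool-directory variant of that idea is about this simple (hypothetical names; a real system would more likely use a database table, and the daemon would be a separate process polling it):

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

// The web tier drops a job record into a spool directory; a separate
// daemon scans the directory and processes whatever is pending.
public class SpoolDemo {
    // Web-tier side: enqueue a job by writing a record file.
    static void enqueue(Path spool, String jobName, String payload) throws IOException {
        Files.write(spool.resolve(jobName + ".job"), payload.getBytes("UTF-8"));
    }

    // Daemon side: one sweep over the spool directory.
    static int processPending(Path spool) throws IOException {
        int processed = 0;
        try (DirectoryStream<Path> jobs = Files.newDirectoryStream(spool, "*.job")) {
            for (Path job : jobs) {
                // Stand-in for the real work (e.g. the FTP push).
                Files.delete(job);    // mark the job done by removing its record
                processed++;
            }
        }
        return processed;
    }

    public static void main(String[] args) throws IOException {
        Path spool = Files.createTempDirectory("spool");
        enqueue(spool, "report-1", "run nightly report");
        enqueue(spool, "report-2", "run weekly report");
        System.out.println("processed=" + processPending(spool));
    }
}
```

The nice property is that the web tier never runs the job at all -- it just records the request -- so a crashed or redeployed app server can't orphan a half-finished batch job. The cost, of course, is that second daemon the sys admins have to run.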

>> This is all academic, since I think the original question was
>> something along the lines of "Does Tomcat actually work like an
>> app server should?  with threads and everything? Does it?!".
> 
> Well, the original poster may not have learned anything from
> this, but I certainly did.

Well, I'm glad you did.

I wrote the code described above sometime in 2002 while working on a
project as a consultant for ... a large domain name registrar and
digital certificate concern. I just checked, and it appears that they
are still using it. The page has gotten a face-lift, but it looks like
all the moving parts are still there. Apparently, they are still happy
with the mechanism and its performance.

I hope your project is equally successful.

-chris

