Alan,

>>> When I tried to [spawn a new thread on behalf of the remote
>>> user] on older versions of Tomcat (4.x), all output to the
>>> user was suspended until _all_ of the threads completed.
>
>> This was probably due to mismanagement of the thread; a new
>> thread will run independently of your request handler thread,
>> unless you call "join" on that thread, which will block the
>> current thread until the "joined" thread completes.
>
> I take it you've actually tried this and it worked for you.
> Which version of Tomcat were you using?
I can't recall what we were using on some of our development servers
(probably 4.1.something), but we were running Weblogic in production. We
used our own simple thread pool and it worked great. Our "batch job", as
it were, was actually opening an FTP connection, pushing a handful of
files through it, and notifying us of its progress during the whole
thing. The previous implementation was synchronous, and (as someone
pointed out) the request can time out if it takes too long. We modified
this process to use a background thread and an auto-refreshing page that
refreshed every couple of seconds. It actually made for a relatively
good-looking progress bar.

> I'm positive that I didn't call join to wait for completion of
> the threads that I started. Is it possible that earlier versions
> of Tomcat did that beneath the covers, or perhaps that this was
> configurable and our systems administrators (I wasn't
> administering Tomcat myself) forced this to happen?

I think it's unlikely that Tomcat did anything like that; Tomcat doesn't
manage /all/ threads in the system... only the request handling threads.
Perhaps something else was happening. Were you writing to the response
in that thread that you created? That could cause some odd things to
happen...

> I agree that my technique was worse than spawning a thread. I
> didn't want the overhead of running two JVMs. But, try as I
> might (and I tried mightily hard), I wasn't able to make it work
> on the installation of Tomcat that I was using.

That's too bad. I don't recall having too much trouble when we did ours.
I argued with the engineers that we couldn't write a decent thread pool
fast enough and ought to get something off-the-open-source-shelf. I lost
the argument and we wrote our own simple one.

> I only allowed one batch job to run at a time, for just the
> reasons you mentioned. I wasn't worried about a malicious denial
> of service attack since our server was running on an intranet for
> vetted users, and I wasn't concerned about starting arbitrary
> jobs since the users could only pick from my menu of jobs to run.
> But I was worried about a user starting a long running job, not
> seeing results right away, and starting it again and again.

That was probably a good move. We were dealing with several dozen users
on each machine, across 6 production app servers. Our peak load was
about 100 users/box every morning, so we weren't really excited about
/potentially/ starting 100 (or more, in case we mismanaged RELOADs) new
threads to handle these FTP transactions. At least those operations were
relatively infrequent.

>> You are better off creating a thread pool and re-using threads
>> to complete long-running operations.
>
> Is this just to limit the number of threads that can run?

Yes. In fact, some thread pools are written to act like batch processors
(I think that .NET comes with one such implementation): you create a
Runnable object and basically queue it into the batch processor. The
batch processor then attaches a thread whenever one is available. That
Runnable object could still report status information to the session,
etc.

So, if you only had a single thread in your thread pool (a "shallow"
pool, if you will), and high demand for batch jobs to be run, most
clients would sit around refreshing their pages simply saying "waiting",
since the job hadn't started. One lucky job would get done, and then the
rest would follow.

Refreshing pages aren't too much load, as long as you don't have a ton
of work to do in order to display them. For example, if you keep all the
information in the user's session (and the updated status information in
there, too), then you do very little work to spit out a status page each
time. A sketch of both ideas follows.
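For what it's worth, here's roughly what that batch-processor pattern
looks like using java.util.concurrent (which didn't exist back when we
hand-rolled ours). This is only a sketch, not the code described above;
BatchProcessor and JobStatus are made-up names for illustration:

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    // A minimal "batch processor": jobs queue up and a single worker
    // thread (a "shallow" pool) runs them one at a time. Queued jobs
    // sit in "waiting" until the thread becomes available.
    public class BatchProcessor {
        private final ExecutorService pool = Executors.newFixedThreadPool(1);

        public void submit(final JobStatus status, final Runnable work) {
            status.setState("waiting");        // visible to the status page
            pool.execute(new Runnable() {
                public void run() {
                    status.setState("running");
                    try {
                        work.run();            // e.g. the FTP transfer
                        status.setState("done");
                    } catch (RuntimeException e) {
                        status.setState("failed: " + e.getMessage());
                    }
                }
            });
        }

        public void shutdown() { pool.shutdown(); }
    }

    // Mutable status holder; stash a reference to it in the user's
    // HttpSession so the refresh page can read it cheaply.
    class JobStatus {
        private volatile String state = "new";
        public void setState(String s) { state = s; }
        public String getState()       { return state; }
    }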
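And the auto-refreshing status page might look something like this
(again just a sketch, using the servlet API of that era and the
hypothetical JobStatus holder from above, stored under a made-up
session key):

    import java.io.IOException;
    import java.io.PrintWriter;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    // Everything the page needs is already in the session, so each
    // refresh does almost no work. The meta tag makes the browser
    // re-request the page every couple of seconds.
    public class JobStatusServlet extends HttpServlet {
        protected void doGet(HttpServletRequest req, HttpServletResponse resp)
                throws IOException {
            JobStatus status =
                (JobStatus) req.getSession().getAttribute("jobStatus");
            resp.setContentType("text/html");
            PrintWriter out = resp.getWriter();
            out.println("<html><head>");
            out.println("<meta http-equiv=\"refresh\" content=\"2\">");
            out.println("</head><body>");
            out.println("Job status: "
                + (status == null ? "no job submitted" : status.getState()));
            out.println("</body></html>");
        }
    }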
> I wouldn't be much worried about the overhead of starting a new
> thread since I already know I'm running a batch job which is
> going to take a long time. Starting a new thread is in the
> noise, performance wise.

Agreed -- as long as you are managing those resources and not letting
many threads get started. You mentioned that you already protected
against that.

>> If possible, a better solution would be to actually run the
>> batch job /as a batch job/ -- i.e. store the job in a
>> database/file/whatever, and have another process come along and
>> process that request as necessary.
>
> Another excellent idea. I thought of it and have used it on
> other projects but, alas, the sys admins hated the idea of
> running another daemon and I didn't want to fight with them. :)

I can certainly understand pushback on the admin side.

>> This is all academic, since I think the original question was
>> something along the lines of "Does Tomcat actually work like an
>> app server should? with threads and everything? Does it?!".
>
> Well, the original poster may not have learned anything from
> this, but I certainly did.

Well, I'm glad you did.

I wrote the code described above sometime in 2002 while working on a
project as a consultant for ... a large domain name registrar and
digital certificate concern. I just checked, and it appears that they
are still using it. The page has gotten a face-lift, but it looks like
all the moving parts are still there. Apparently, they are still happy
with the mechanism and its performance.

I hope your project is equally successful.

-chris