On Sun, Apr 14, 2013 at 6:51 PM, Christopher Schultz < ch...@christopherschultz.net> wrote:
> Howard,
>
> On 4/11/13 10:38 PM, Howard W. Smith, Jr. wrote:
> > On Thu, Apr 4, 2013 at 2:32 PM, Christopher Schultz <
> > ch...@christopherschultz.net> wrote:
> >
> >> Your heap settings should be tailored to your environment and
> >> usage scenarios.
> >
> > Interesting. I suppose 'your environment' means memory available,
> > operating system, hardware. Usage scenarios? Hmmm... please clarify
> > with a brief example, thanks. :)
>
> Here's an example: Let's say that your webapp doesn't use HttpSessions
> and does no caching. You need to be able to handle 100 simultaneous
> connections that do small fetches/inserts from/into a relational
> database. Your pages are fairly simple and don't have any kind of
> heavyweight app framework taking up a whole bunch of memory to do
> simple things.

Thanks, Chris, for the example. This is definitely not my app. I definitely rely on user HttpSessions, and I do JPA-level caching (statement cache and query-results cache). My pages are PrimeFaces (xhtml, html, jQuery), with MyFaces/OpenWebBeans helping with speed/performance. Right now, the app handles only a 'few' simultaneous connections/users that do small and large fetches/inserts from/into the relational database. :) Hopefully, one day, my app will support 100+ simultaneous connections/users.

> For this situation, you can probably get away with a 64MiB heap. If
> your webapp uses more than 64MiB, there is probably some kind of
> problem. If you only need a 64MiB heap, then you can probably run on
> fairly modest hardware: there's no need to lease that 128GiB server
> your vendor is trying to talk you into.

Understood, thanks. I have Xms/Xmx = 1024m, and I rarely see used memory get over 400 or 500m. The production server has 32GB RAM.

> On the other hand, maybe you have aggressive caching of data that
> benefits from having a large amount of heap space.
> Or maybe you need to support 1000 simultaneous connections and need to
> do XSLT processing of multi-megabyte XML documents, and your XSLTs
> don't allow stream-processing of the XML document (oops).

Interesting.

> Or maybe you have to keep a large amount of data in users' HttpSession
> objects (maybe a few dozen MiB) and you need to support a few thousand
> simultaneous users (not connections). 10k users each with a 5MiB
> session = ~48GiB.

Sometimes I do keep a large amount of data in user HttpSession objects, but, still being a somewhat junior Java/JSF developer and listening to you all on the Tomcat list and to other senior Java/JSF developers, I want to move some of my logic and caching of data from SessionScoped beans to RequestScoped beans. It's interesting that you say '10k users each with a 5MiB session = 48GiB'; I never thought about calculating a size estimate per user; maybe I should do that when I am done with all of my optimizing of the app. I've been in optimize mode for the last 5 to 8 months (slowly but surely: Mojarra to MyFaces, JSF managed beans to CDI managed beans in preparation for JSF 2.2 and/or Java EE 7, GlassFish to Tomcat/TomEE, and other things), after/while listening to you all about JVM tuning, preventing/debugging/resolving memory leaks, etc.

> There is no such thing as a "good recommendation" for heap size unless
> the person making the recommendation really understands your use
> case(s).

Understood/agreed.

> I generally have these two suggestions that I've found to be
> universally reasonable:
>
> 1. Make -Xms = -Xmx to eliminate heap thrashing: the JVM is going to
> eat up that large heap space at some point if you have sized things
> correctly, so you may as well not make the memory manager have to work
> any harder than necessary.

I'm doing this, as I've seen it recommended quite often on this list and others (TomEE, OpenWebBeans, OpenEJB). "If you have sized things correctly"?
"Size things correctly" = set -Xms and -Xmx appropriately to meet your system/software requirements?

> 2. Run with the lowest heap space that is reasonable for your
> environment. I like doing this because it actually helps you diagnose
> things more easily when they go wrong: a small heap yields a smaller
> heap-dump file, is GC'd more frequently and therefore contains fewer
> long-lived dead objects, and will cause an OOME sooner if you have
> some kind of leak. Of course, nobody wants to experience an OOME, but
> you also don't want to watch a 50GiB heap fill up 800 bytes at a time
> due to a small leak.

Agreed, and this is definitely/really nice to know. Listening to you all here on the Tomcat list is why I lowered Xms/Xmx from 4096 to 1024MB. Listening to you now, and since I hardly ever see the heap rise above 500 or 600m, I could lower Xms/Xmx from 1024 to maybe 800/900m. But remember, I shutdown-deploy-start TomEE/Tomcat quite often, almost daily, so I'm really not giving it a chance to show whether an OOME will occur, even when set to 1024m. I have been listening closely and reading (a little) about long-lived and short-lived objects, and whenever I take a heap dump and analyze the 'classes', I see char[], byte[], Object[], etc. listed first, with the most instances and the largest size. I do like to look through the list to see if any of my own code/classes are ever close to the top, which tells me whether my code is being GC'ed well or not. I really don't like what I see... I see these char[], byte[], Object[], MyFaces classes, and container and EclipseLink classes at the top of the list...
It makes me wonder: am I not using them correctly? Are there leaks in these classes that I did not write/develop? Am I not giving them enough time to be GC'ed? Am I doing what I can to make my code/classes (as much as possible) short-lived? Are my SessionScoped beans instantiating so many List<>, String, etc. that I should be setting them to 'null' to help GC a bit? Many questions go through my mind about all of this. :)

> I've had people tell me that I should run with the biggest heap I "can
> afford", meaning both financially (because you have to buy a bunch of
> memory) and reasonably within the constraints of the OS (it's not
> reasonable to run a 9.9GiB heap with 10GiB of physical RAM, for
> instance). The reasoning is twofold:

Well, I probably was guilty of this, somewhat; my app was running quite well (and very fast) on a Windows Server 2003 32-bit server (Client JVM) with 4GB RAM, until I got deadlocks because of the initial version of the @Schedule job (which I wrote) that downloads customer requests from email and inserts them into the database. That is the reason I requested faster hardware, and I have been pleased with the faster/newer server and the extra RAM, even though I don't need 32GB (right now; but definitely no complaints, and no need to 'return' the new server... smiling).

> 1. If you have leaks, they will take a lot more time to blow up.

Yeah, I have recognized this. :(

> (Obviously, this is the opposite of my recommendation, but it's worth
> mentioning as it's a sound argument. I just disagree with the
> conclusion.) If you watch the heap-usage profile over time, you can
> see it going up and up, and instead of getting an OOME, you can
> predict when it will happen and bounce the server at your convenience.

Hahaha, I can see myself doing this... I often monitor how Tomcat/TomEE is running via Java VisualVM (JMX connection).
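To make "watch the heap-usage profile over time" concrete, here is a minimal plain-JDK sketch (my own illustration, not anything from Chris's mail) that reads the same numbers VisualVM shows, via MemoryMXBean:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

public class HeapWatch {
    // Snapshot of the heap right now; 'max' reflects -Xmx, and
    // 'committed' reflects what the JVM has actually reserved
    // (equal to max when -Xms == -Xmx, per suggestion #1 above).
    static MemoryUsage snapshot() {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
    }

    public static void main(String[] args) {
        MemoryUsage heap = snapshot();
        System.out.println("heap used " + heap.getUsed() / (1024 * 1024)
                + " MiB, committed " + heap.getCommitted() / (1024 * 1024)
                + " MiB, max " + heap.getMax() / (1024 * 1024) + " MiB");
        // In a webapp, log this from a background thread every few
        // minutes; a used-heap floor that keeps creeping upward between
        // full GCs is the slow-leak profile described above.
    }
}
```

Logged periodically, this gives the same "going up and up" curve without needing a JMX connection attached.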
And I monitor the log files a lot: looking for exceptions, checking performance by date/time, and seeing how the JMS/ActiveMQ jobs are completing (sometimes I do that).

> 2. Since the cost of a GC is related to the number of live objects
> during a collection and not the size of the heap (though obviously a
> smaller heap can fit fewer live objects!), having a huge heap means
> that GCs will occur less frequently and so your total GC throughput
> will (at least theoretically) be higher.

Understood, but even with my app at Xms/Xmx = 1024mb, GC seems to occur 'while' people are logged in and working, and eventually there is no GC activity; yet when I take a heap dump, I still see long-lived objects (still resident in memory)... for whatever reason. :(

> A counter-argument to the second #2 above is that short-lived objects
> will be collected quickly and long-lived objects will quickly be
> promoted to older generations, so after a short period of runtime,
> your GCs should get to the point where they are very cheap regardless
> of heap size.

Honestly, I am quite pleased with GC against my app, but my frequent shutdown/deploy/restart probably doesn't give GC a chance to show its true worth. :)

> > heap settings tailored to 'my' environment and usage...
> > hmmm, not many users hitting the app; the app is not used 'all day
> > long'; the app has @Schedule tasks that connect to an email acct,
> > download customer email requests, and insert the customer requests
> > into the database (Martin recommended closing resources; some time
> > ago I had to refactor all of that code, and I now close the
> > connection to the email acct and open the connection when the
> > @Schedule tasks are executed); I am using JMS via TomEE/ActiveMQ to
> > perform some tasks asynchronously (TomEE committers told me that
> > use of @Asynchronous would be better, with less overhead);
> > honestly, I do open 2 or 3 JMS resources/queues in an
> > @ApplicationScoped @PostConstruct (if I'm not mistaken) and close
> > those resources in the @ApplicationScoped @PreDestroy. Why? I think
> > I read on the ActiveMQ site/documentation that they recommend this
> > as better for performance than open/close-on-demand.

> IMO, batch processes like the one you describe are better done by
> specialty schedulers like cron on *NIX and the Task Scheduler on
> Windows.

Sounds good to me, but the email server is 'gmail', and I am happily using a JavaMail IMAP connection to grab data from gmail, grab the body of each email, and insert the data (from the email) into the database. Honestly, I don't see the need for Windows Task Scheduler to do this for me. I will provide more justification, probably, in the next paragraph below.

> That may not be more convenient for you, so I can only tell you my
> opinion: webapps should be focused on the web-based portion(s) of
> your service offering.

+1 I agree 100%; even still, keep reading below, please.
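As an aside, the polling pattern described above (connect every couple of minutes, download, insert, close) can be sketched in plain JDK terms with a ScheduledExecutorService. This is only an illustration of the scheduling shape, not the container's @Schedule machinery, and fetchAndStore() is a hypothetical placeholder for the real IMAP/database work:

```java
import java.time.Instant;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class MailPoller {
    // Hypothetical placeholder for the real work: open the IMAP
    // connection, read new messages, insert them into the database,
    // then close the connection (as recommended earlier in the thread).
    static String fetchAndStore() {
        String line = "polled mail at " + Instant.now();
        System.out.println(line);
        return line;
    }

    public static void main(String[] args) throws InterruptedException {
        ScheduledExecutorService scheduler =
                Executors.newSingleThreadScheduledExecutor();
        // scheduleWithFixedDelay waits for one run to finish before
        // timing the next, so a slow mail fetch can never overlap
        // itself: one simple guard against the deadlock scenario
        // described above.
        scheduler.scheduleWithFixedDelay(
                MailPoller::fetchAndStore, 0, 2, TimeUnit.MINUTES);
        Thread.sleep(200);   // let the first run fire, then stop
        scheduler.shutdownNow();
    }
}
```

The fixed-delay (rather than fixed-rate) choice is the interesting part: runs queue behind each other instead of piling up.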
> Divorcing your batch-style processing from your webapp will (at least
> theoretically) make your webapp more stable and your batch-stuff more
> flexible.

I would love to divorce the scheduled jobs from the web app, but the web app is dependent on the database, and I would have to develop a way for the database to be shared between the web app and a scheduled-job app/server. Right now, that is a bit beyond me. I've researched it a bit, but I'm not there yet. :(

> (Changed your mind about schedule?

Nope... hahaha :) Keep reading below; I'll tell you why I'm laughing...

> No problem, just modify crontab or whatever instead of bouncing your
> webapp.) You can also easily move that batch-stuff to another server
> if it starts to get heavy -- needs lots of CPU, memory, etc. -- and
> you don't have to set up a full-blown application-server environment
> with Tomcat (TomEE), etc.

Well, this is the point... when the production server is running and busy serving requests to endusers (or myself, when testing), TomEE only uses 1% of CPU, and I already told you that with all that the web app is doing, 1% of CPU and Xms/Xmx = 1024MB... the app is clearly and evidently 'not' stressing out the server/hardware... at all! :)

> > Almost forgot... as I mentioned in another thread, as the enduser
> > changes data, I have an implementation that keeps google calendar
> > in sync with the database, which involves JMS/ActiveMQ/MDB and
> > many/multiple requests to the google calendar API.

> Do you do that in a pipelined way (e.g. queuing the updates) or do
> you do everything synchronously while the (web) user waits?

Months ago, the google calendar update was synchronous and the enduser had to wait, but I am now using JMS/ActiveMQ (MDBs) to update google calendar asynchronously.
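The shape of that change (queue the update, return to the user, let a background consumer talk to Google Calendar) can be sketched with a plain BlockingQueue. This is an illustration of the pattern only, not the JMS/ActiveMQ API, and all names here are hypothetical:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class CalendarSync {
    // The queue plays the role of the JMS queue; the worker thread
    // plays the role of the MDB.
    static final BlockingQueue<String> UPDATES = new LinkedBlockingQueue<>();

    // Called from the web request thread: returns immediately, so the
    // user never waits on the slow Google Calendar call.
    static void enqueue(String update) {
        UPDATES.add(update);
    }

    // The slow work; a real implementation would call the Google
    // Calendar API here (hypothetical, not shown).
    static String sync(String update) {
        return "synced: " + update;
    }

    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> {
            try {
                while (true) {
                    // take() blocks until an update arrives
                    System.out.println(sync(UPDATES.take()));
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();  // clean shutdown
            }
        });
        worker.setDaemon(true);
        worker.start();

        enqueue("appointment #123 moved to 3pm");
        Thread.sleep(200);  // give the worker a moment to drain the queue
    }
}
```

What JMS/ActiveMQ adds on top of this sketch is persistence and redelivery: if the server dies mid-sync, the queued update is not lost, which an in-memory queue cannot promise.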
One of the TomEE committers tells me that @Asynchronous methods have less overhead, and that may be true, but I guess I am just overly pleased with the performance and the working-as-designed-ness of MDBs/JMS/ActiveMQ asynchronously updating google calendar. I really have 'no' complaints. For most of my obstacles/issues, I have done some refactoring to overcome them, and sometimes the refactoring has taken multiple attempts over days, weeks, months. :)

Hmmm, I thought I had more to share here... oh yeah, I also started using MDBs/JMS/ActiveMQ to asynchronously insert the data that is downloaded via the @Schedule task/job. From what I see, the multi-table insert takes anywhere between 1 and 6 (or 1 and 9) seconds. Potential deadlock? Honestly, it felt like a deadlock when those customer requests came in, but @Schedule runs every 2 minutes, and how many customer requests will come in at 'one' time when @Schedule actually connects to the email server and downloads emails?

> There are advantages to both strategies. Obviously this is all
> off-topic for the thread. If you're interested in a separate
> discussion, please open a new thread.

My apologies, but this thread seems to be done.

> > hmmm, more about usage, I have the following:
> >
> > <Resource id="jdbc/...." type="javax.sql.DataSource">
> >   JdbcDriver org.apache.derby.jdbc.EmbeddedDriver
> >   JdbcUrl jdbc:derby:....;create=true
> >   UserName ....
> >   Password ....
> >   JtaManaged true
> >   jmxEnabled true
> >   InitialSize 2
> >   MaxActive 80
> >   MaxIdle 20
> >   MaxWait 10000
> >   minIdle 10
> >   suspectTimeout 60
> >   removeAbandoned true
> >   removeAbandonedTimeout 180
> >   timeBetweenEvictionRunsMillis 30000
> >   jdbcInterceptors=StatementCache(max=128)
> > </Resource>

> That seems like a lot of db resources to me (again: it all depends
> upon your use cases). We have a user-load of roughly 20-100 users,
> mostly in the US and mostly during "normal" hours (between 06:00 and
> 23:00 EDT), the webapp is *very* RDBMS-heavy, and we have
> maxActive/maxIdle = 20/10.
> Only under the most extreme circumstances do we ever have maxWait
> (10000ms) problems -- usually when we have a bunch of users all
> performing the same very-complex query simultaneously. (We're working
> on that one. ;)

Interesting, and thanks for sharing... and might I add, now you're talking! :) Currently, these are my settings in the tomee.xml file (see below). I had the default settings first, then I decreased some of the numbers, and then decided to bump them back up to what you see below. With the following, honestly, my endusers and I are happy (and no complaints).

<Resource id="jdbc/mcmsJta" type="javax.sql.DataSource">
  JdbcDriver org.apache.derby.jdbc.EmbeddedDriver
  JdbcUrl jdbc:derby:D:/javadb/mcms;create=true
  UserName .......
  Password .......
  JtaManaged true
  jmxEnabled true
  InitialSize 2
  MaxActive 80
  MaxIdle 20
  MaxWait 10000
  minIdle 10
  suspectTimeout 60
  removeAbandoned true
  removeAbandonedTimeout 180
  timeBetweenEvictionRunsMillis 30000
  jdbcInterceptors=StatementCache(max=128)
</Resource>

My web app was developed to replace a dBase IV DBMS MS-DOS app that I developed in 1994/1995 as a senior-year project in college. My family used the (6-table, all-tables-not-normalized) MS-DOS dBase IV app ever since 1995, before I started my career as a software engineer. Soooo, my web app is very RDBMS-heavy, too. :)

> > I do occasionally see the sawtooth-looking graph, and eventually, I
> > see the graph even out (non-sawtooth-looking graph).

> Well, every request requires objects to be created and ultimately
> discarded (e.g. java.lang.String objects to represent your request
> parameters, etc.), so you should never see the sawtooth go away
> completely. Depending upon the parameters of your graph, you might
> not really be able to see it.

I don't see the sawtooth when no one is logged in to the app and GC has finished doing its thing (to the short-lived objects, I guess).
:)

> > Remember, I do restart TomEE quite often, especially when I have
> > software updates to deploy to/on the server. It's been a while
> > since I let it run a few days without stop-deploy-start. I am
> > looking forward to letting it be and running without touching it,
> > so I can see if it results in an OOME or reaches its peak.

> You could always run a load-test against it in a test bed. JMeter is
> your friend.

Hahaha, +1, agreed! Will have to do that. Thanks for the recommendation. :)

> -chris