tag:blogger.com,1999:blog-8623074010562846957.post3283772130972730545..comments
2023-09-01T03:38:08.236-04:00
Comments on Changing Bits: Lucene with Zing, Part 2
Michael McCandless (http://www.blogger.com/profile/04277432937861334672, noreply@blogger.com)

tag:blogger.com,1999:blog-8623074010562846957.post-5140260216634252254
2013-02-05T12:51:37.435-05:00

Hi Aaron,
It was 1.6.0_32 ... and I was really surprised by the G1 results! But it's good news that you are getting good results with it. I wonder if RAMDir is somehow a particularly bad / stressful test for it ...

Michael McCandless (https://www.blogger.com/profile/04277432937861334672)

tag:blogger.com,1999:blog-8623074010562846957.post-5956876857788444832
2013-02-05T11:59:30.804-05:00

What version of the JDK were you using to get the horrible G1 results (130s hangs) in this and the previous test? I'm curious, as I've been getting fairly good results on large-ish heaps (20-60G) with JDK 7 u4 and later for heavy-garbage-producing applications... The folks over at hotspot-gc-use@openjdk.java.net have also been very helpful/responsive when attempting to work out some kinks with it.

Aaron (https://www.blogger.com/profile/01700077097582846898)

tag:blogger.com,1999:blog-8623074010562846957.post-8440035468133080328
2013-01-04T10:50:30.732-05:00

-XX:+PrintGCApplicationStoppedTime reports a number that is different from what the GC logs (and jstat, I believe) report for GC operation lengths. The difference is "subtle" but can end up being huge:
Regular HotSpot GC log entries report the amount of time that GC spent doing whatever work is being reported. During pauses, this time is measured from the point in time when all threads are stopped at a safepoint, to the time that threads were allowed to run again.
-XX:+PrintGCApplicationStoppedTime reports the length of time from the point where the first thread was asked to reach a safepoint, to the time that all threads had been allowed to run again.
The difference is the amount of time it takes threads to reach a safepoint.
GC work does not start until they do, but threads that have already reached the safepoint are already paused. This time-to-safepoint is usually very short, but can sometimes be amazingly long (as in an extra 1/2 second in cases I've seen in the wild, and much longer in lab tests).
Regular HotSpot logs seem to not consider this time-to-safepoint gap as "GC pause time", because GC has not actually started... I guess one could argue that they are pauses that do not include any GC work.
Anyway, I ALWAYS ask for -XX:+PrintGCApplicationStoppedTime; it's a much more reliable indicator of stoppage time.
Or, if you don't want to simply believe what your log files are telling you, you can use something that measures things instead, like the actual application response time. It won't lie. jHiccup was built for exactly this purpose...

Gil Tene (https://www.blogger.com/profile/10732691137498021997)

tag:blogger.com,1999:blog-8623074010562846957.post-3155466534026736636
2012-11-27T15:05:06.999-05:00

Yes, it's not transparent. I have a blog post that describes how you set it up.

It's here:

http://andrigoss.blogspot.com/2008/02/jvm-performance-tuning.html

This is really old, but the part about configuring the OS for large page memory is still correct.
One little tidbit that I would add is that you don't have to do the calculation I show for /etc/security/limits.conf; you can just use the word unlimited, which makes the setup a little less complicated.

Andrig T Miller (https://www.blogger.com/profile/05386153547711039401)

tag:blogger.com,1999:blog-8623074010562846957.post-1709463311551260395
2012-11-27T14:11:27.726-05:00

I don't know when Oracle's Java defaults changed, but when I run "java -XX:+PrintCommandLineFlags" (which shows you what "defaults" were picked on startup) with -XX:+UseConcMarkSweepGC, the output includes "-XX:+UseParNewGC". Maybe it's a dynamic decision based on core count / amount of RAM, etc.?

Ahhhh, I didn't know about static large pages, to avoid the compaction/defrag cost! That's awesome... I need to read up on this. It sounds like it's not "transparent", i.e., you must tell the JVM to use large pages (-XX:+UseLargePages). http://www.oracle.com/technetwork/java/javase/tech/largememory-jsp-137182.html has interesting details...

Michael McCandless (https://www.blogger.com/profile/04277432937861334672)

tag:blogger.com,1999:blog-8623074010562846957.post-6607671535682602399
2012-11-27T13:13:49.878-05:00

I don't believe that CMS uses parallel GC for the new generation by default, unless the defaults have changed.

In terms of transparent huge pages, in my testing, they don't work that well compared to configuring them statically. So, turning transparent huge pages off was probably the right thing. However, it is not the same as defining them statically.
With transparent huge pages, there is a daemon process that has to scan memory and dynamically combine 4k pages into 2MB pages (for Intel and AMD systems the large page size is usually 2MB).

If you define them statically, then you don't have this daemon process trying to do it on the fly while your workload is running. It's much more efficient. For the latest Intel and AMD hardware you can actually go to 1GB pages. With static large page memory (HugeTLB in Linux parlance), I think you might see substantial improvements.

Andrig T Miller (https://www.blogger.com/profile/05386153547711039401)

tag:blogger.com,1999:blog-8623074010562846957.post-9177705897321387560
2012-11-27T12:33:24.416-05:00

Hi Andrig,
I'm pretty sure CMS defaults to parallel GC for the new generation.

In the first set of tests I did try -XX:+UseParallelOldGC and it had horribly long pauses.

I turned off transparent huge pages (I think that's the same as large page memory?) for these tests: with it on I was seeing longish (~ a few seconds) pauses with both Zing and CMS ... I suspect it was due to huge page compaction/defrag, but I'm not certain.

Michael McCandless (https://www.blogger.com/profile/04277432937861334672)

tag:blogger.com,1999:blog-8623074010562846957.post-5386814275214154922
2012-11-27T10:04:59.167-05:00

Why are you not setting the GC to use parallel GC for the new generation?

It would also be interesting to use -XX:+UseParallelOldGC to see how that works.

Finally, are you using large page memory on these tests?

If your hardware is new enough, you can use 1GB large pages (the trick is you have to set the PermGen to at least 1GB, because it has to be at least one memory page). Large page memory helps with GC overhead quite a bit.

Using large page memory and ParallelGC with the CMS collector, and large page memory with the throughput collector (UseParallelOldGC) would be an interesting comparison to Zing that seems to be missing here.

Andrig T Miller (https://www.blogger.com/profile/05386153547711039401)

tag:blogger.com,1999:blog-8623074010562846957.post-8059301979842218050
2012-11-20T06:25:58.349-05:00

Thanks mindas, jstat -gcutil is very useful too!

Michael McCandless (https://www.blogger.com/profile/04277432937861334672)

tag:blogger.com,1999:blog-8623074010562846957.post-4279152818339406763
2012-11-20T05:30:55.648-05:00

Interesting post!
Btw, you can monitor GC stoppage times and region promotions with "jstat -gcutil pid 5s" on Oracle JDK. This should be a bit more convenient than -XX:+PrintGCApplicationStoppedTime.

mindas (https://www.blogger.com/profile/08826637355577644350)
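
As an in-process counterpart to the jstat approach mentioned above, here is a minimal Java sketch that polls the JVM's standard GarbageCollectorMXBean counters. The class name GcTotals, the one-second interval, and the three-sample loop are illustrative assumptions, not something taken from the post or the comments. These are collector-reported figures, so the time-to-safepoint caveat raised earlier in the thread likely applies to them as well; they are not a direct measure of how long the application was stopped.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Hypothetical helper (not from the post): sums the GC counters the JVM exposes via JMX. */
final class GcTotals {

    /** Returns {total collection count, total collection time in milliseconds} across all collectors. */
    static long[] sample() {
        long count = 0;
        long timeMillis = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Both getters return -1 when a collector does not expose the value; treat that as 0.
            count += Math.max(0, gc.getCollectionCount());
            timeMillis += Math.max(0, gc.getCollectionTime());
        }
        return new long[] {count, timeMillis};
    }

    public static void main(String[] args) throws InterruptedException {
        // Assumed polling setup: three samples, one second apart.
        for (int i = 0; i < 3; i++) {
            long[] totals = sample();
            System.out.println("collections=" + totals[0] + " gcTimeMs=" + totals[1]);
            if (i < 2) {
                Thread.sleep(1000);
            }
        }
    }
}
```

The totals are cumulative since JVM start, so the difference between two consecutive samples gives the GC activity within that interval.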