Oh, sorry about that: the 128 is set in our configuration file.
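
For anyone who wants to run the check Zhenya asked about in the quoted thread, here is a minimal Java sketch that prints what Runtime.getRuntime().availableProcessors() returns on a node. The max(8, cores) rule reflects my understanding of how Ignite sizes most of its pools when nothing is configured; treat that rule, and the ThreadPoolCheck / defaultPoolSize names, as assumptions for illustration rather than Ignite API.

```java
// Hypothetical helper for illustration only; not part of Ignite.
class ThreadPoolCheck {
    // Assumed default sizing rule: at least 8 threads, otherwise one per core.
    static int defaultPoolSize(int availableProcessors) {
        return Math.max(8, availableProcessors);
    }

    public static void main(String[] args) {
        // The value the JVM reports; containers and CPU limits can lower it.
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println("availableProcessors() = " + cores);
        System.out.println("assumed default pool size = " + defaultPoolSize(cores));
    }
}
```

If the second number comes out far below 128, the 128 can only have come from explicit configuration rather than from a default.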
On 2021/09/29 15:47:27, Stephen Darlington <stephen.darling...@gridgain.com> wrote:
> Correct me if I’m wrong, but I think they set the size of the threadpool to
> 128 in their configuration file.
>
> > On 29 Sep 2021, at 16:33, Zhenya Stanilovsky <arzamas...@mail.ru> wrote:
> >
> > OK, I still can't understand what the source of the 128 value is.
> > Can you check what Runtime.getRuntime().availableProcessors() returns on
> > your side?
> >
> >
> > Hi Naveen,
> >
> > My first change was to the JVM parameters. At first the problem seemed to be
> > resolved, but changing the JVM parameters only delayed it. Before that,
> > heap problems occurred 14-16 hours after the start; with the JVM
> > changes it took up to 36 hours.
> >
> > While keeping the JVM changes I updated the thread pool configuration, and the
> > heap problem was solved. We can now see a sawtooth pattern in heap usage.
> >
> > before: https://ibb.co/mqx4kYy
> > after: https://ibb.co/y8B0hzS
> >
> > On 2021/09/29 13:37:53, Naveen Kumar <naveen.band...@gmail.com> wrote:
> > > Good to hear from you. I have had the same issue for quite a long time and
> > > am still looking for a fix.
> > >
> > > What do you think exactly resolved the heap starvation issue: the
> > > GC-related configuration or the thread pool configuration?
> > > The default thread pool size is the number of cores of the server; if this is
> > > true, we don't need to specify any config for all these thread pools.
> > >
> > > Thanks
> > > Naveen
> > >
> > > On Wed, Sep 29, 2021 at 2:35 PM Ibrahim Altun <ibrahim.al...@segmentify.com> wrote:
> > >
> > > > After many configuration changes and optimizations, I think I've solved
> > > > the heap problem.
> > > >
> > > > Here are the changes that I applied to the system.
> > > >
> > > > JVM changes ->
> > > > https://medium.com/@hoan.nguyen.it/how-did-g1gc-tuning-flags-affect-our-back-end-web-app-c121d38dfe56
> > > > helped a lot.
> > > >
> > > > The nodes are running on 12-core, 64 GB servers. I've added the following
> > > > JVM parameters:
> > > >
> > > > -XX:ParallelGCThreads=6
> > > > -XX:ConcGCThreads=2
> > > > -XX:MaxGCPauseMillis=200
> > > > -XX:InitiatingHeapOccupancyPercent=40
> > > >
> > > > In the Ignite configuration I've changed all thread pool sizes, which were
> > > > much larger than these:
> > > > <property name="systemThreadPoolSize" value="12"/>
> > > > <property name="publicThreadPoolSize" value="12"/>
> > > > <property name="queryThreadPoolSize" value="12"/>
> > > > <property name="serviceThreadPoolSize" value="12"/>
> > > > <property name="stripedPoolSize" value="12"/>
> > > > <property name="dataStreamerThreadPoolSize" value="12"/>
> > > > <property name="rebalanceThreadPoolSize" value="12"/>
> > > >
> > > > Here is the 16-hour GC report:
> > > >
> > > > https://gceasy.io/diamondgc-report.jsp?p=c2hhcmVkLzIwMjEvMDkvMjkvLS1nYy5sb2cuMC5jdXJyZW50LS04LTU4LTMx&channel=WEB
> > > >
> > > > On 2021/09/27 17:11:21, Ilya Korol <llivezk...@gmail.com> wrote:
> > > > > Actually Query interface doesn't define close() method, but QueryCursor
> > > > > does.
> > > > > In your snippets you're using the try-with-resources construction for SELECT
> > > > > queries, which is good, but when you run a MERGE INTO query you
> > > > > also get a QueryCursor as the result of
> > > > >
> > > > > igniteCacheService.getCache(ID, IgniteCacheType.LABEL).query(insertQuery);
> > > > >
> > > > > so maybe these QueryCursor objects still hold some resources/memory.
> > > > > The Javadoc for QueryCursor states that you should always close cursors.
> > > > >
> > > > > To simplify cursor closing there is a cursor.getAll() method that will
> > > > > do this for you under the hood.
> > > > >
> > > > > On 2021/09/13 06:17:21, Ibrahim Altun <i...@segmentify.com> wrote:
> > > > > > Hi Ilya,
> > > > > >
> > > > > > Since this is a production environment I could not risk taking a heap
> > > > > > dump for now, but I will try to convince my superiors to get one and
> > > > > > analyze it.
> > > > > >
> > > > > > Queries are heavily used in our system, but aren't they AutoCloseable
> > > > > > objects?
> > > > > > Do we have to close them anyway?
> > > > > >
> > > > > > Here are some usage examples from our system.
> > > > > >
> > > > > > The insert query is like this:
> > > > > > MERGE INTO "ProductLabel" ("productId", "label", "language") VALUES (?, ?, ?)
> > > > > > igniteCacheService.getCache(ID, IgniteCacheType.LABEL).query(insertQuery);
> > > > > >
> > > > > > Another usage example; the sqlFieldsQuery is like this:
> > > > > > String sql = "SELECT _val FROM \"UserRecord\" WHERE \"email\" IN (?)";
> > > > > > SqlFieldsQuery sqlFieldsQuery = new SqlFieldsQuery(sql);
> > > > > > sqlFieldsQuery.setLazy(true);
> > > > > > sqlFieldsQuery.setArgs(emails.toArray());
> > > > > >
> > > > > > try (QueryCursor<List<?>> ignored = igniteCacheService.getCache(ID, IgniteCacheType.USER).query(sqlFieldsQuery)) {...}
> > > > > >
> > > > > > On 2021/09/12 20:28:09, Shishkov Ilya <sh...@gmail.com> wrote:
> > > > > > > Hi, Ibrahim!
> > > > > > > Have you analyzed the heap dump of the server node JVMs?
> > > > > > > In case your application executes queries, are their cursors closed?
> > > > > > >
> > > > > > > On Fri, 10 Sep 2021 at 11:54, Ibrahim Altun <ib...@segmentify.com> wrote:
> > > > > > > > Igniters, any comment on this issue? We are facing huge GC problems in our
> > > > > > > > production environment, please advise.
> > > > > > > >
> > > > > > > > On 2021/09/07 14:11:09, Ibrahim Altun <ib...@segmentify.com> wrote:
> > > > > > > > > Hi,
> > > > > > > > >
> > > > > > > > > 400-600K reads/writes/updates in total
> > > > > > > > > 12 cores
> > > > > > > > > 64GB RAM
> > > > > > > > > no iowait
> > > > > > > > > 10 nodes
> > > > > > > > >
> > > > > > > > > On 2021/09/07 12:51:28, Piotr Jagielski <pj...@touk.pl> wrote:
> > > > > > > > > > Hi,
> > > > > > > > > > Can you provide some information on how you use the cluster? How many
> > > > > > > > > > reads/writes/updates per second? Also the CPU / RAM spec of the cluster nodes?
> > > > > > > > > >
> > > > > > > > > > We observed full GC / CPU load / OOM killer when loading a large amount of
> > > > > > > > > > data (15 mln records, data streamer + allowOverwrite=true). We've seen
> > > > > > > > > > 200-400k updates per sec on JMX metrics, but load up to 10 on nodes, iowait
> > > > > > > > > > up to 30%. Our cluster is 3 x 4CPU, 16GB RAM (already upgrading to 8CPU, 32GB
> > > > > > > > > > RAM).
> > > > > > > > > > Ignite 2.10
> > > > > > > > > >
> > > > > > > > > > Regards,
> > > > > > > > > > Piotr
> > > > > > > > > >
> > > > > > > > > > On 2021/09/02 08:36:07, Ibrahim Altun <ib...@segmentify.com> wrote:
> > > > > > > > > > > After upgrading from version 2.7.1 to version 2.10.0, our Ignite nodes are
> > > > > > > > > > > facing huge full GC operations 24-36 hours after node start.
> > > > > > > > > > >
> > > > > > > > > > > We tried increasing the heap size, but no luck. Here is the start
> > > > > > > > > > > configuration for the nodes:
> > > > > > > > > > >
> > > > > > > > > > > JVM_OPTS="$JVM_OPTS -Xms12g -Xmx12g -server
> > > > > > > > > > > -javaagent:/etc/prometheus/jmx_prometheus_javaagent-0.14.0.jar=8090:/etc/prometheus/jmx.yml
> > > > > > > > > > > -Dcom.sun.management.jmxremote
> > > > > > > > > > > -Dcom.sun.management.jmxremote.authenticate=false
> > > > > > > > > > > -Dcom.sun.management.jmxremote.port=49165
> > > > > > > > > > > -Dcom.sun.management.jmxremote.host=localhost
> > > > > > > > > > > -XX:MaxMetaspaceSize=256m -XX:MaxDirectMemorySize=1g
> > > > > > > > > > > -DIGNITE_SKIP_CONFIGURATION_CONSISTENCY_CHECK=true
> > > > > > > > > > > -DIGNITE_WAL_MMAP=true -DIGNITE_BPLUS_TREE_LOCK_RETRIES=100000
> > > > > > > > > > > -Djava.net.preferIPv4Stack=true"
> > > > > > > > > > >
> > > > > > > > > > > JVM_OPTS="$JVM_OPTS -XX:+AlwaysPreTouch -XX:+UseG1GC
> > > > > > > > > > > -XX:+ScavengeBeforeFullGC -XX:+DisableExplicitGC
> > > > > > > > > > > -XX:+UseStringDeduplication -Xloggc:/var/log/apache-ignite/gc.log
> > > > > > > > > > > -XX:+PrintGCDetails -XX:+PrintGCDateStamps
> > > > > > > > > > > -XX:+PrintTenuringDistribution -XX:+PrintGCCause
> > > > > > > > > > > -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10
> > > > > > > > > > > -XX:GCLogFileSize=100M"
> > > > > > > > > > >
> > > > > > > > > > > Here is the 80-hour GC analysis report:
> > > > > > > > > > >
> > > > > > > > > > > https://gceasy.io/my-gc-report.jsp?p=c2hhcmVkLzIwMjEvMDgvMzEvLS1nYy5sb2cuMC5jdXJyZW50LnppcC0tNS01MS0yOQ==&channel=WEB
> > > > > > > > > > >
> > > > > > > > > > > Do we need more heap, or is there a bug that we need to be aware of?
> > > > > > > > > > >
> > > > > > > > > > > Here is the node configuration:
> > > > > > > > > > >
> > > > > > > > > > > <?xml version="1.0" encoding="UTF-8"?>
> > > > > > > > > > > <beans xmlns="http://www.springframework.org/schema/beans"
> > > > > > > > > > >        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
> > > > > > > > > > >        xsi:schemaLocation="
> > > > > > > > > > >         http://www.springframework.org/schema/beans
> > > > > > > > > > >         http://www.springframework.org/schema/beans/spring-beans.xsd">
> > > > > > > > > > >   <bean id="ignite.cfg"
> > > > > > > > > > >         class="org.apache.ignite.configuration.IgniteConfiguration">
> > > > > > > > > > >     <property name="gridLogger">
> > > > > > > > > > >       <bean class="org.apache.ignite.logger.log4j2.Log4J2Logger">
> > > > > > > > > > >         <constructor-arg type="java.lang.String"
> > > > > > > > > > >                          value="/etc/apache-ignite/ignite-log4j2.xml"/>
> > > > > > > > > > >       </bean>
> > > > > > > > > > >     </property>
> > > > > > > > > > >     <property name="communicationSpi">
> > > > > > > > > > >       <bean class="org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi">
> > > > > > > > > > >         <property name="usePairedConnections" value="true"/>
> > > > > > > > > > >       </bean>
> > > > > > > > > > >     </property>
> > > > > > > > > > >     <property name="failureDetectionTimeout" value="60000"/>
> > > > > > > > > > >     <property name="systemThreadPoolSize" value="128"/>
> > > > > > > > > > >     <property name="publicThreadPoolSize" value="128"/>
> > > > > > > > > > >     <property name="queryThreadPoolSize" value="128"/>
> > > > > > > > > > >     <property name="serviceThreadPoolSize" value="128"/>
> > > > > > > > > > >     <property name="stripedPoolSize" value="128"/>
> > > > > > > > > > >     <property name="dataStreamerThreadPoolSize" value="4"/>
> > > > > > > > > > >     <property name="rebalanceThreadPoolSize" value="16"/>
> > > > > > > > > > >
> > > > > > > > > > >     <!-- Explicitly enable peer class loading. -->
> > > > > > > > > > >     <property name="peerClassLoadingEnabled" value="true"/>
> > > > > > > > > > >
> > > > > > > > > > >     <!-- Enable deploymentSpi,
> > > > > > > > > > >          /usr/share/apache-ignite/libs/segmentify directory will be checked
> > > > > > > > > > >          every 5 seconds for changed files-->
> > > > > > > > > > >     <property name="deploymentSpi">
> > > > > > > > > > >       <bean class="org.apache.ignite.spi.deployment.uri.UriDeploymentSpi">
> > > > > > > > > > >         <property name="temporaryDirectoryPath"
> > > > > > > > > > >                   value="/tmp/temp_ignite_libs"/>
> > > > > > > > > > >         <property name="uriList">
> > > > > > > > > > >           <list>
> > > > > > > > > > >             <value>file://freq=5000@localhost/usr/share/apache-ignite/libs/segmentify/</value>
> > > > > > > > > > >           </list>
> > > > > > > > > > >         </property>
> > > > > > > > > > >       </bean>
> > > > > > > > > > >     </property>
> > > > > > > > > > >
> > > > > > > > > > >     <property name="cacheConfiguration">
> > > > > > > > > > >       <list>
> > > > > > > > > > >         <!-- Partitioned cache example configuration (Atomic mode). -->
> > > > > > > > > > >         <bean class="org.apache.ignite.configuration.CacheConfiguration">
> > > > > > > > > > >           <property name="name" value="default"/>
> > > > > > > > > > >           <property name="atomicityMode" value="ATOMIC"/>
> > > > > > > > > > >           <property name="backups" value="1"/>
> > > > > > > > > > >         </bean>
> > > > > > > > > > >       </list>
> > > > > > > > > > >     </property>
> > > > > > > > > > >
> > > > > > > > > > >     <!-- Explicitly configure TCP discovery SPI to provide list of
> > > > > > > > > > >          initial nodes. -->
> > > > > > > > > > >     <property name="discoverySpi">
> > > > > > > > > > >       <bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
> > > > > > > > > > >         <property name="networkTimeout" value="60000"/>
> > > > > > > > > > >         <property name="ipFinder">
> > > > > > > > > > >           <bean class="org.apache.ignite.spi.discovery.tcp.ipfinder.vm.TcpDiscoveryVmIpFinder">
> > > > > > > > > > >             <property name="addresses">
> > > > > > > > > > >               <list>
> > > > > > > > > > >                 <!-- THERE ARE 10 NODES -->
> > > > > > > > > > >               </list>
> > > > > > > > > > >             </property>
> > > > > > > > > > >           </bean>
> > > > > > > > > > >         </property>
> > > > > > > > > > >       </bean>
> > > > > > > > > > >     </property>
> > > > > > > > > > >
> > > > > > > > > > >     <!-- Enabling Apache Ignite native persistence. -->
> > > > > > > > > > >     <property name="dataStorageConfiguration">
> > > > > > > > > > >       <bean class="org.apache.ignite.configuration.DataStorageConfiguration">
> > > > > > > > > > >         <property name="defaultDataRegionConfiguration">
> > > > > > > > > > >           <bean class="org.apache.ignite.configuration.DataRegionConfiguration">
> > > > > > > > > > >             <property name="persistenceEnabled" value="true"/>
> > > > > > > > > > >             <property name="checkpointPageBufferSize"
> > > > > > > > > > >                       value="#{ 2L * 1024 * 1024 * 1024}"/>
> > > > > > > > > > >             <property name="maxSize" value="#{ 40L * 1024 * 1024 * 1024 }"/>
> > > > > > > > > > >           </bean>
> > > > > > > > > > >         </property>
> > > > > > > > > > >         <property name="storagePath" value="/srv/ignite/persist"/>
> > > > > > > > > > >         <property name="walPath" value="/srv/ignite/wal"/>
> > > > > > > > > > >         <property name="walArchivePath" value="/srv/ignite/wal"/>
> > > > > > > > > > >         <property name="walMode" value="LOG_ONLY"/>
> > > > > > > > > > >         <property name="walSegmentSize" value="#{ 256L * 1024 * 1024 }"/>
> > > > > > > > > > >         <property name="walFlushFrequency" value="5000"/>
> > > > > > > > > > >         <property name="maxWalArchiveSize" value="#{ 512L * 1024 * 1024 }"/>
> > > > > > > > > > >         <property name="writeThrottlingEnabled" value="true"/>
> > > > > > > > > > >         <property name="checkpointFrequency" value="300000"/>
> > > > > > > > > > >         <property name="checkpointWriteOrder" value="SEQUENTIAL"/>
> > > > > > > > > > >       </bean>
> > > > > > > > > > >     </property>
> > > > > > > > > > >   </bean>
> > > > > > > > > > > </beans>
> > > > > > > > > > >
> > > > > > > > > > > --
> > > > > > > > > > > İbrahim Halil Altun
> > > > > > > > > > > Senior Software Engineer
> > > > > > > > > > > +90 536 3327510 • segmentify.com <https://www.segmentify.com/>
> > > > > > > > > > > UK • Germany • Turkey
> > > > > > > > > > > <https://www.segmentify.com/ecommerce-growth-show>
> > > > > > > > > > > <https://www.g2.com/products/segmentify/reviews>
> > >
> > > --
> > > Thanks & Regards,
> > > Naveen Bandaru