OK, I still can't understand where the value 128 comes from.
Can you check what Runtime.getRuntime().availableProcessors() returns on
your side?
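
For reference, here is a minimal Java sketch of that check. The class and method names are made up for illustration. The max(8, cores) rule is only my recollection of how Ignite derives the default size of most of its thread pools; it is an assumption, so please verify it against the IgniteConfiguration javadoc for your version.

```java
public class CpuCheck {
    /** Number of processors the JVM reports; container or affinity limits can change it. */
    static int availableCpus() {
        return Runtime.getRuntime().availableProcessors();
    }

    /**
     * ASSUMPTION (not from this thread): what I believe Ignite uses as the
     * default size for most pools, i.e. max(8, availableProcessors).
     */
    static int assumedDefaultPoolSize(int cpus) {
        return Math.max(8, cpus);
    }

    public static void main(String[] args) {
        int cpus = availableCpus();
        System.out.println("availableProcessors = " + cpus);
        System.out.println("assumed default pool size = " + assumedDefaultPoolSize(cpus));
    }
}
```

On a 12-core server I would expect the first line to report 12, unless the JVM is restricted by a container or CPU affinity settings.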
>
>>
>>>Hi Naveen,
>>>
>>>My first change was to the JVM parameters. At first that seemed to
>>>resolve the issue, but it only delayed the problem: before the change, the
>>>heap problems occurred 14-16 hours after start; with the JVM
>>>changes it took up to 36 hours.
>>>
>>>While keeping the JVM changes, I updated the thread pool configuration and
>>>the heap problem was solved. We can now see a
>>>sawtooth pattern in heap usage.
>>>
>>>before: https://ibb.co/mqx4kYy
>>>after: https://ibb.co/y8B0hzS
>>>
>>>
>>>On 2021/09/29 13:37:53, Naveen Kumar < naveen.band...@gmail.com > wrote:
>>>> Good to hear from you. I have had the same issue for quite a long time and
>>>> am still looking for a fix.
>>>>
>>>> What do you think exactly resolved the heap starvation issue: the
>>>> GC-related configuration or the thread pool configuration?
>>>> If the default thread pool size is the number of cores on the server,
>>>> then we don't need to specify any config for all these thread pools.
>>>>
>>>> Thanks
>>>> Naveen
>>>>
>>>>
>>>>
>>>> On Wed, Sep 29, 2021 at 2:35 PM Ibrahim Altun <
>>>> ibrahim.al...@segmentify.com >
>>>> wrote:
>>>>
>>>> > After many configuration changes and optimizations, I think I've solved
>>>> > the heap problem.
>>>> >
>>>> > Here are the changes that I applied to the system.
>>>> > JVM changes (this article helped a lot):
>>>> >
>>>> > https://medium.com/@hoan.nguyen.it/how-did-g1gc-tuning-flags-affect-our-back-end-web-app-c121d38dfe56
>>>> >
>>>> > The nodes are running on servers with 12 cores and 64 GB of memory.
>>>> > I've added the following JVM parameters:
>>>> >
>>>> > -XX:ParallelGCThreads=6
>>>> > -XX:ConcGCThreads=2
>>>> > -XX:MaxGCPauseMillis=200
>>>> > -XX:InitiatingHeapOccupancyPercent=40
>>>> >
>>>> > In the Ignite configuration I've changed all thread pool sizes, which
>>>> > were much larger than these:
>>>> > <property name="systemThreadPoolSize" value="12"/>
>>>> > <property name="publicThreadPoolSize" value="12"/>
>>>> > <property name="queryThreadPoolSize" value="12"/>
>>>> > <property name="serviceThreadPoolSize" value="12"/>
>>>> > <property name="stripedPoolSize" value="12"/>
>>>> > <property name="dataStreamerThreadPoolSize" value="12"/>
>>>> > <property name="rebalanceThreadPoolSize" value="12"/>
>>>> >
>>>> > Here is the GC report covering 16 hours:
>>>> >
>>>> >
>>>> > https://gceasy.io/diamondgc-report.jsp?p=c2hhcmVkLzIwMjEvMDkvMjkvLS1nYy5sb2cuMC5jdXJyZW50LS04LTU4LTMx&channel=WEB
>>>> >
>>>> >
>>>> >
>>>> > On 2021/09/27 17:11:21, Ilya Korol < llivezk...@gmail.com > wrote:
>>>> > > Actually, the Query interface doesn't define a close() method, but
>>>> > > QueryCursor does.
>>>> > > In your snippets you're using the try-with-resources construct for SELECT
>>>> > > queries, which is good, but when you run a MERGE INTO query you also
>>>> > > get a QueryCursor as the result of
>>>> > >
>>>> > > igniteCacheService.getCache(ID, IgniteCacheType.LABEL).query(insertQuery);
>>>> > >
>>>> > > so maybe these QueryCursor objects still hold some resources/memory.
>>>> > > The Javadoc for QueryCursor states that you should always close cursors.
>>>> > >
>>>> > > To simplify cursor closing, there is a cursor.getAll() method that
>>>> > > closes the cursor for you under the hood.
>>>> > >
>>>> > >
>>>> > > On 2021/09/13 06:17:21, Ibrahim Altun < i...@segmentify.com > wrote:
>>>> > > > Hi Ilya,
>>>> > > >
>>>> > > > Since this is a production environment, I could not risk taking a heap
>>>> > > > dump for now, but I will try to convince my superiors to get one and
>>>> > > > analyze it.
>>>> > > >
>>>> > > > Queries are heavily used in our system, but aren't they AutoCloseable
>>>> > > > objects? Do we have to close them anyway?
>>>> > > >
>>>> > > > Here are some usage examples from our system.
>>>> > > > -- The insert query looks like this: MERGE INTO "ProductLabel" ("productId", "label", "language") VALUES (?, ?, ?)
>>>> > > > igniteCacheService.getCache(ID, IgniteCacheType.LABEL).query(insertQuery);
>>>> > > >
>>>> > > > Another usage example:
>>>> > > > -- sqlFieldsQuery looks like this:
>>>> > > > String sql = "SELECT _val FROM \"UserRecord\" WHERE \"email\" IN (?)";
>>>> > > > SqlFieldsQuery sqlFieldsQuery = new SqlFieldsQuery(sql);
>>>> > > > sqlFieldsQuery.setLazy(true);
>>>> > > > sqlFieldsQuery.setArgs(emails.toArray());
>>>> > > >
>>>> > > > try (QueryCursor<List<?>> ignored = igniteCacheService.getCache(ID, IgniteCacheType.USER).query(sqlFieldsQuery)) {...}
>>>> > > >
>>>> > > >
>>>> > > >
>>>> > > > On 2021/09/12 20:28:09, Shishkov Ilya < sh...@gmail.com > wrote: >
>>>> > > > > Hi, Ibrahim!
>>>> > > > > Have you analyzed the heap dump of the server node JVMs?
>>>> > > > > If your application executes queries, are their cursors
>>>> > > > > closed?
>>>> > > > > >
>>>> > > > > Fri, 10 Sep 2021 at 11:54, Ibrahim Altun < ib...@segmentify.com >:
>>>> > > > > >
>>>> > > > > > Igniters, any comments on this issue? We are facing huge GC
>>>> > > > > > problems in our production environment. Please advise.
>>>> > > > > >>
>>>> > > > > > On 2021/09/07 14:11:09, Ibrahim Altun < ib...@segmentify.com >>
>>>> > > > > > wrote:>
>>>> > > > > > > Hi,>
>>>> > > > > > >>
>>>> > > > > > > 400-600K reads/writes/updates in total
>>>> > > > > > > 12 cores
>>>> > > > > > > 64 GB RAM
>>>> > > > > > > no iowait
>>>> > > > > > > 10 nodes
>>>> > > > > > >>
>>>> > > > > > > On 2021/09/07 12:51:28, Piotr Jagielski < pj...@touk.pl >
>>>> > > > > > > wrote:>
>>>> > > > > > > > Hi,>
>>>> > > > > > > > Can you provide some information on how you use the cluster?
>>>> > > > > > > > How many reads/writes/updates per second? Also, what is the
>>>> > > > > > > > CPU/RAM spec of the cluster nodes?
>>>> > > > > > > >>
>>>> > > > > > > > We observed full GC / CPU load / OOM killer when loading a large
>>>> > > > > > > > amount of data (15 million records, data streamer +
>>>> > > > > > > > allowOverwrite=true). We've seen 200-400k updates per second in
>>>> > > > > > > > JMX metrics, but load up to 10 on the nodes and iowait up to 30%.
>>>> > > > > > > > Our cluster is 3 x 4 CPU, 16 GB RAM (already upgrading to 8 CPU,
>>>> > > > > > > > 32 GB RAM). Ignite 2.10
>>>> > > > > > > >>
>>>> > > > > > > > Regards,>
>>>> > > > > > > > Piotr>
>>>> > > > > > > >>
>>>> > > > > > > > On 2021/09/02 08:36:07, Ibrahim Altun < ib...@segmentify.com > wrote:
>>>> > > > > > > > > After upgrading from version 2.7.1 to version 2.10.0, our Ignite
>>>> > > > > > > > > nodes are facing huge full GC operations 24-36 hours after node
>>>> > > > > > > > > start.
>>>> > > > > > > > >>
>>>> > > > > > > > > We tried increasing the heap size, but no luck. Here is the
>>>> > > > > > > > > start configuration for the nodes:
>>>> > > > > > > > >>
>>>> > > > > > > > > JVM_OPTS="$JVM_OPTS -Xms12g -Xmx12g -server
>>>> > > > > > > > > -javaagent:/etc/prometheus/jmx_prometheus_javaagent-0.14.0.jar=8090:/etc/prometheus/jmx.yml
>>>> > > > > > > > > -Dcom.sun.management.jmxremote
>>>> > > > > > > > > -Dcom.sun.management.jmxremote.authenticate=false
>>>> > > > > > > > > -Dcom.sun.management.jmxremote.port=49165
>>>> > > > > > > > > -Dcom.sun.management.jmxremote.host=localhost
>>>> > > > > > > > > -XX:MaxMetaspaceSize=256m -XX:MaxDirectMemorySize=1g
>>>> > > > > > > > > -DIGNITE_SKIP_CONFIGURATION_CONSISTENCY_CHECK=true
>>>> > > > > > > > > -DIGNITE_WAL_MMAP=true -DIGNITE_BPLUS_TREE_LOCK_RETRIES=100000
>>>> > > > > > > > > -Djava.net.preferIPv4Stack=true"
>>>> > > > > > > > >>
>>>> > > > > > > > > JVM_OPTS="$JVM_OPTS -XX:+AlwaysPreTouch -XX:+UseG1GC
>>>> > > > > > > > > -XX:+ScavengeBeforeFullGC -XX:+DisableExplicitGC
>>>> > > > > > > > > -XX:+UseStringDeduplication -Xloggc:/var/log/apache-ignite/gc.log
>>>> > > > > > > > > -XX:+PrintGCDetails -XX:+PrintGCDateStamps
>>>> > > > > > > > > -XX:+PrintTenuringDistribution -XX:+PrintGCCause
>>>> > > > > > > > > -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10
>>>> > > > > > > > > -XX:GCLogFileSize=100M"
>>>> > > > > > > > >>
>>>> > > > > > > > > Here is the GC analysis report covering 80 hours:
>>>> > > > > > > > >
>>>> > > > > > > > > https://gceasy.io/my-gc-report.jsp?p=c2hhcmVkLzIwMjEvMDgvMzEvLS1nYy5sb2cuMC5jdXJyZW50LnppcC0tNS01MS0yOQ==&channel=WEB
>>>> > > > > > > > >>
>>>> > > > > > > > > Do we need a larger heap, or is there a bug that we need to
>>>> > > > > > > > > be aware of?
>>>> > > > > > > > >>
>>>> > > > > > > > > Here is the node configuration:
>>>> > > > > > > > >>
>>>> > > > > > > > > <?xml version="1.0" encoding="UTF-8"?>
>>>> > > > > > > > > <beans xmlns="http://www.springframework.org/schema/beans"
>>>> > > > > > > > >        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>>>> > > > > > > > >        xsi:schemaLocation="
>>>> > > > > > > > >        http://www.springframework.org/schema/beans
>>>> > > > > > > > >        http://www.springframework.org/schema/beans/spring-beans.xsd">
>>>> > > > > > > > >   <bean id="ignite.cfg" class="org.apache.ignite.configuration.IgniteConfiguration">
>>>> > > > > > > > >     <property name="gridLogger">
>>>> > > > > > > > >       <bean class="org.apache.ignite.logger.log4j2.Log4J2Logger">
>>>> > > > > > > > >         <constructor-arg type="java.lang.String" value="/etc/apache-ignite/ignite-log4j2.xml"/>
>>>> > > > > > > > >       </bean>
>>>> > > > > > > > >     </property>
>>>> > > > > > > > >     <property name="communicationSpi">
>>>> > > > > > > > >       <bean class="org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi">
>>>> > > > > > > > >         <property name="usePairedConnections" value="true"/>
>>>> > > > > > > > >       </bean>
>>>> > > > > > > > >     </property>
>>>> > > > > > > > >     <property name="failureDetectionTimeout" value="60000"/>
>>>> > > > > > > > >     <property name="systemThreadPoolSize" value="128"/>
>>>> > > > > > > > >     <property name="publicThreadPoolSize" value="128"/>
>>>> > > > > > > > >     <property name="queryThreadPoolSize" value="128"/>
>>>> > > > > > > > >     <property name="serviceThreadPoolSize" value="128"/>
>>>> > > > > > > > >     <property name="stripedPoolSize" value="128"/>
>>>> > > > > > > > >     <property name="dataStreamerThreadPoolSize" value="4"/>
>>>> > > > > > > > >     <property name="rebalanceThreadPoolSize" value="16"/>
>>>> > > > > > > > >
>>>> > > > > > > > >     <!-- Explicitly enable peer class loading. -->
>>>> > > > > > > > >     <property name="peerClassLoadingEnabled" value="true"/>
>>>> > > > > > > > >
>>>> > > > > > > > >     <!-- Enable deploymentSpi,
>>>> > > > > > > > >          /usr/share/apache-ignite/libs/segmentify directory will be checked
>>>> > > > > > > > >          every 5 seconds for changed files-->
>>>> > > > > > > > >     <property name="deploymentSpi">
>>>> > > > > > > > >       <bean class="org.apache.ignite.spi.deployment.uri.UriDeploymentSpi">
>>>> > > > > > > > >         <property name="temporaryDirectoryPath" value="/tmp/temp_ignite_libs"/>
>>>> > > > > > > > >         <property name="uriList">
>>>> > > > > > > > >           <list>
>>>> > > > > > > > >             <value>file://freq=5000@localhost/usr/share/apache-ignite/libs/segmentify/</value>
>>>> > > > > > > > >           </list>
>>>> > > > > > > > >         </property>
>>>> > > > > > > > >       </bean>
>>>> > > > > > > > >     </property>
>>>> > > > > > > > >
>>>> > > > > > > > >     <property name="cacheConfiguration">
>>>> > > > > > > > >       <list>
>>>> > > > > > > > >         <!-- Partitioned cache example configuration (Atomic mode). -->
>>>> > > > > > > > >         <bean class="org.apache.ignite.configuration.CacheConfiguration">
>>>> > > > > > > > >           <property name="name" value="default"/>
>>>> > > > > > > > >           <property name="atomicityMode" value="ATOMIC"/>
>>>> > > > > > > > >           <property name="backups" value="1"/>
>>>> > > > > > > > >         </bean>
>>>> > > > > > > > >       </list>
>>>> > > > > > > > >     </property>
>>>> > > > > > > > >
>>>> > > > > > > > >     <!-- Explicitly configure TCP discovery SPI to provide list of
>>>> > > > > > > > >          initial nodes. -->
>>>> > > > > > > > >     <property name="discoverySpi">
>>>> > > > > > > > >       <bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
>>>> > > > > > > > >         <property name="networkTimeout" value="60000"/>
>>>> > > > > > > > >         <property name="ipFinder">
>>>> > > > > > > > >           <bean class="org.apache.ignite.spi.discovery.tcp.ipfinder.vm.TcpDiscoveryVmIpFinder">
>>>> > > > > > > > >             <property name="addresses">
>>>> > > > > > > > >               <list>
>>>> > > > > > > > >                 <!-- THERE ARE 10 NODES -->
>>>> > > > > > > > >               </list>
>>>> > > > > > > > >             </property>
>>>> > > > > > > > >           </bean>
>>>> > > > > > > > >         </property>
>>>> > > > > > > > >       </bean>
>>>> > > > > > > > >     </property>
>>>> > > > > > > > >
>>>> > > > > > > > >     <!-- Enabling Apache Ignite native persistence. -->
>>>> > > > > > > > >     <property name="dataStorageConfiguration">
>>>> > > > > > > > >       <bean class="org.apache.ignite.configuration.DataStorageConfiguration">
>>>> > > > > > > > >         <property name="defaultDataRegionConfiguration">
>>>> > > > > > > > >           <bean class="org.apache.ignite.configuration.DataRegionConfiguration">
>>>> > > > > > > > >             <property name="persistenceEnabled" value="true"/>
>>>> > > > > > > > >             <property name="checkpointPageBufferSize" value="#{ 2L * 1024 * 1024 * 1024}"/>
>>>> > > > > > > > >             <property name="maxSize" value="#{ 40L * 1024 * 1024 * 1024 }"/>
>>>> > > > > > > > >           </bean>
>>>> > > > > > > > >         </property>
>>>> > > > > > > > >         <property name="storagePath" value="/srv/ignite/persist"/>
>>>> > > > > > > > >         <property name="walPath" value="/srv/ignite/wal"/>
>>>> > > > > > > > >         <property name="walArchivePath" value="/srv/ignite/wal"/>
>>>> > > > > > > > >         <property name="walMode" value="LOG_ONLY"/>
>>>> > > > > > > > >         <property name="walSegmentSize" value="#{ 256L * 1024 * 1024 }"/>
>>>> > > > > > > > >         <property name="walFlushFrequency" value="5000"/>
>>>> > > > > > > > >         <property name="maxWalArchiveSize" value="#{ 512L * 1024 * 1024 }"/>
>>>> > > > > > > > >         <property name="writeThrottlingEnabled" value="true"/>
>>>> > > > > > > > >         <property name="checkpointFrequency" value="300000"/>
>>>> > > > > > > > >         <property name="checkpointWriteOrder" value="SEQUENTIAL"/>
>>>> > > > > > > > >       </bean>
>>>> > > > > > > > >     </property>
>>>> > > > > > > > >   </bean>
>>>> > > > > > > > >>
>>>> > > > > > > > >>
>>>> > > > > > > > > --
>>>> > > > > > > > > İbrahim Halil Altun
>>>> > > > > > > > > Senior Software Engineer
>>>> > > > > > > > > +90 536 3327510 • segmentify.com
>>>> > > > > > > > > UK • Germany • Turkey
>>>> > > > > > > > >>
>>>> > > > > > > >>
>>>> > > > > > >>
>>>> > > > > >>
>>>> > > > > >
>>>> > > >
>>>> > >
>>>> >
>>>>
>>>>
>>>> --
>>>> Thanks & Regards,
>>>> Naveen Bandaru
>>>>
>>
>>
>>
>>