Hello!

The Direct IO module is experimental and should not be used unless you have first tested its performance in your specific use case.

Regards,
--
Ilya Kasnacheev

Mon, May 18, 2020 at 16:47, 38797715 wrote:

> Hi,
>
> With direct IO disabled, startup is twice as fast, and the same holds
> for some of our other tests. I find that direct IO has a large impact
> on read performance.
>
> On 2020/5/14 at 5:16 AM, Evgenii Zhuravlev wrote:
>
> Can you share full logs from all nodes?
>
> Tue, May 12, 2020 at 18:24, 38797715 wrote:
>
>> Hi Evgenii,
>>
>> The storage used is not SSD.
>>
>> We will use different versions of Ignite for further testing, such as
>> Ignite 2.8.
>>
>> Ignite is configured as follows:
>>
>> <?xml version="1.0" encoding="UTF-8"?>
>> <beans xmlns="http://www.springframework.org/schema/beans"
>>        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>>        xsi:schemaLocation="
>>            http://www.springframework.org/schema/beans
>>            http://www.springframework.org/schema/beans/spring-beans.xsd">
>>   <bean id="ignite.cfg" class="org.apache.ignite.configuration.IgniteConfiguration">
>>     <property name="peerClassLoadingEnabled" value="true"/>
>>     <property name="consistentId" value="20"/>
>>     <property name="failureDetectionTimeout" value="120000"/>
>>     <property name="workDirectory" value="/appdata/ignite"/>
>>     <property name="rebalanceBatchSize" value="#{2 * 1024 * 1024}"/>
>>     <property name="rebalanceThrottle" value="100"/>
>>     <property name="rebalanceThreadPoolSize" value="4"/>
>>     <property name="gridLogger">
>>       <bean class="org.apache.ignite.logger.log4j2.Log4J2Logger">
>>         <constructor-arg type="java.lang.String" value="config/ignite-log4j2.xml"/>
>>       </bean>
>>     </property>
>>     <property name="cacheConfiguration">
>>       <list>
>>         <bean id="partitioned-cache-template" abstract="true" class="org.apache.ignite.configuration.CacheConfiguration">
>>           <property name="name" value="cache-partitioned*"/>
>>           <property name="cacheMode" value="PARTITIONED"/>
>>           <property name="backups" value="1"/>
>>           <property name="queryParallelism" value="16"/>
>>           <property name="partitionLossPolicy" value="READ_ONLY_SAFE"/>
>>         </bean>
>>         <bean id="replicated-cache-template" abstract="true" class="org.apache.ignite.configuration.CacheConfiguration">
>>           <property name="name" value="cache-replicated*"/>
>>           <property name="cacheMode" value="REPLICATED"/>
>>           <property name="partitionLossPolicy" value="READ_ONLY_SAFE"/>
>>         </bean>
>>       </list>
>>     </property>
>>     <!-- Enabling Apache Ignite Persistent Store. -->
>>     <property name="dataStorageConfiguration">
>>       <bean class="org.apache.ignite.configuration.DataStorageConfiguration">
>>         <property name="defaultDataRegionConfiguration">
>>           <bean class="org.apache.ignite.configuration.DataRegionConfiguration">
>>             <property name="persistenceEnabled" value="true"/>
>>             <property name="maxSize" value="#{200L * 1024 * 1024 * 1024}"/>
>>           </bean>
>>         </property>
>>       </bean>
>>     </property>
>>   </bean>
>> </beans>
>>
>> On 2020/5/13 at 4:45 AM, Evgenii Zhuravlev wrote:
>>
>> Hi,
>>
>> Can you share full logs and configuration? What disk do you use?
>>
>> Evgenii
>>
>> Tue, May 12, 2020 at 06:49, 38797715 wrote:
>>
>>> Among them:
>>>
>>> CO_CO_NEW: ~48 minutes (partitioned, backup=1, 33M)
>>>
>>> Ignite sys cache: ~27 minutes
>>>
>>> PLM_ITEM: ~3 minutes (replicated, 1.9K)
>>>
>>>
>>> On 2020/5/12 at 9:08 PM, 38797715 wrote:
>>>
>>> Hi community,
>>>
>>> We have 5 servers, each with 16 cores, 256 GB of memory, and 200 GB
>>> of off-heap memory.
>>>
>>> We have 7 tables to test, with data volumes of 31.8M, 495.2M, 552.3M,
>>> 33M, 873.3K, 28M, and 1.9K respectively. The 1.9K table is replicated;
>>> the others are partitioned (backup = 1).
>>>
>>> VM args: -server -Xms20g -Xmx20g -XX:+AlwaysPreTouch -XX:+UseG1GC
>>> -XX:+ScavengeBeforeFullGC -XX:+DisableExplicitGC -XX:+PrintGCDetails
>>> -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -XX:+UseGCLogFileRotation
>>> -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=100M
>>> -Xloggc:/data/gc/logs/gclog.txt -Djava.net.preferIPv4Stack=true
>>> -XX:MaxDirectMemorySize=256M -XX:+PrintAdaptiveSizePolicy
>>>
>>> Today, one of the servers was restarted (kill, then start ignite.sh)
>>> for some reason, but the node took 1.5 hours to start, which was much
>>> longer than expected.
>>>
>>> Analyzing the log, we found the following:
>>>
>>> [2020-05-12T17:00:05,138][INFO ][main][GridCacheDatabaseSharedManager] Found last checkpoint marker [cpId=7a0564f2-43e5-400b-9439-746fc68a6ccb, pos=FileWALPointer [idx=10511, fileOff=51348888, len=61193]]
>>> [2020-05-12T17:00:05,151][INFO ][main][GridCacheDatabaseSharedManager] Binary memory state restored at node startup [restoredPtr=FileWALPointer [idx=10511, fileOff=51410110, len=0]]
>>> [2020-05-12T17:00:05,152][INFO ][main][FileWriteAheadLogManager] Resuming logging to WAL segment [file=/appdata/ignite/db/wal/24/0000000000000001.wal, offset=51410110, ver=2]
>>> [2020-05-12T17:00:06,448][INFO ][main][PageMemoryImpl] Started page memory [memoryAllocated=200.0 GiB, pages=50821088, tableSize=3.9 GiB, checkpointBuffer=2.0 GiB]
>>> [2020-05-12T17:02:08,528][INFO ][main][GridCacheProcessor] Started cache in recovery mode [name=CO_CO_NEW, id=-189779360, dataRegionName=default, mode=PARTITIONED, atomicity=ATOMIC, backups=1, mvcc=false]
>>> [2020-05-12T17:50:44,341][INFO ][main][GridCacheProcessor] Started cache in recovery mode [name=CO_CO_LINE, id=-1588248812, dataRegionName=default, mode=PARTITIONED, atomicity=ATOMIC, backups=1, mvcc=false]
>>> [2020-05-12T17:50:44,366][INFO ][main][GridCacheProcessor] Started cache in recovery mode [name=ignite-sys-cache, id=-2100569601, dataRegionName=sysMemPlc, mode=REPLICATED, atomicity=TRANSACTIONAL, backups=2147483647, mvcc=false]
>>> [2020-05-12T18:17:57,071][INFO ][main][GridCacheProcessor] Started cache in recovery mode [name=CO_CO_LINE_NEW, id=1742991829, dataRegionName=default, mode=PARTITIONED, atomicity=ATOMIC, backups=1, mvcc=false]
>>> [2020-05-12T18:19:54,910][INFO ][main][GridCacheProcessor] Started cache in recovery mode [name=PI_COM_DAY, id=-1904194728, dataRegionName=default, mode=PARTITIONED, atomicity=ATOMIC, backups=1, mvcc=false]
>>> [2020-05-12T18:19:54,949][INFO ][main][GridCacheProcessor] Started cache in recovery mode [name=PLM_ITEM, id=-1283854143, dataRegionName=default, mode=REPLICATED, atomicity=ATOMIC, backups=2147483647, mvcc=false]
>>> [2020-05-12T18:22:53,662][INFO ][main][GridCacheProcessor] Started cache in recovery mode [name=CO_CO, id=64322847, dataRegionName=default, mode=PARTITIONED, atomicity=ATOMIC, backups=1, mvcc=false]
>>> [2020-05-12T18:22:54,876][INFO ][main][GridCacheProcessor] Started cache in recovery mode [name=CO_CUST, id=1684722246, dataRegionName=default, mode=PARTITIONED, atomicity=ATOMIC, backups=1, mvcc=false]
>>> [2020-05-12T18:22:54,892][INFO ][main][GridCacheDatabaseSharedManager] Binary recovery performed in 4970233 ms.
>>>
>>> Of this, binary recovery took 4970 seconds.
>>>
>>> Our questions are:
>>>
>>> 1. Why is the start time so long?
>>>
>>> 2. In the current state of Ignite, will the restart time keep growing
>>> as the data volume on a single node grows?
>>>
>>> 3. Do you have any suggestions for optimizing the restart time?
>>>
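
The per-cache figures quoted in the thread (~48 minutes, ~27 minutes) can be reproduced from the log timestamps. Below is a minimal Python sketch, not part of the original discussion. It assumes one log entry per line in the timestamp format shown in the excerpt, and it follows the poster's reading that the gap after a "Started cache in recovery mode" entry is time spent on that cache, which is an interpretation rather than something the log states. The function name `recovery_gaps` and the abbreviated sample lines are made up for illustration.

```python
import re
from datetime import datetime

# Timestamp prefix as it appears in the log excerpt, e.g. [2020-05-12T17:00:05,138]
STAMP = re.compile(r"^\[(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2},\d{3})\]")
# Cache name inside a "Started cache in recovery mode [name=..., ...]" entry
CACHE = re.compile(r"Started cache in recovery mode \[name=([^,\]]+)")


def recovery_gaps(lines):
    """Return (cache name, seconds until the next timestamped entry) pairs.

    Assumption: the gap after a "Started cache" entry is attributed to that
    cache. The last entry has no successor and is therefore not reported.
    """
    events = []
    for line in lines:
        stamp = STAMP.match(line)
        if not stamp:
            continue  # skip lines without a leading timestamp
        ts = datetime.strptime(stamp.group(1), "%Y-%m-%dT%H:%M:%S,%f")
        cache = CACHE.search(line)
        events.append((ts, cache.group(1) if cache else None))

    gaps = []
    for (ts, name), (next_ts, _) in zip(events, events[1:]):
        if name is not None:
            gaps.append((name, (next_ts - ts).total_seconds()))
    return gaps


# Abbreviated copies of four entries from the excerpt (trailing fields dropped).
sample = [
    "[2020-05-12T17:02:08,528][INFO ][main][GridCacheProcessor] Started cache in recovery mode [name=CO_CO_NEW, id=-189779360]",
    "[2020-05-12T17:50:44,341][INFO ][main][GridCacheProcessor] Started cache in recovery mode [name=CO_CO_LINE, id=-1588248812]",
    "[2020-05-12T17:50:44,366][INFO ][main][GridCacheProcessor] Started cache in recovery mode [name=ignite-sys-cache, id=-2100569601]",
    "[2020-05-12T18:17:57,071][INFO ][main][GridCacheProcessor] Started cache in recovery mode [name=CO_CO_LINE_NEW, id=1742991829]",
]

for name, seconds in recovery_gaps(sample):
    print(f"{name}: {seconds / 60:.1f} min")
```

On these four entries the sketch reports about 48.6 minutes after CO_CO_NEW and about 27.2 minutes after ignite-sys-cache, in line with the figures given in the follow-up message. Running it over the full log from each node would show whether the same caches dominate everywhere.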
