Re: Exception on CacheEntryProcessor invoke (2.10.0)

2021-05-25 Thread ihalilaltun
Hi, here is the debug log: ignite.zip. In the meantime I'll try to simplify the use case as you suggested. - İbrahim Halil Altun Senior Software Engineer @ Segmentify -- Sent from: http://apache-ignite-users.70518.x6.nabbl

Re: Exception on CacheEntryProcessor invoke (2.10.0)

2021-05-24 Thread ihalilaltun
Hi, I ran more detailed tests over the weekend and I can say with certainty that the problem is not related to the migrated data. With a new cluster setup and no data we still get the error. What I have in mind is this: with the new version there may be a new configuration parameter that has

Re: Exception on CacheEntryProcessor invoke (2.10.0)

2021-05-21 Thread ihalilaltun
Hi, the case can be reproduced only by upgrading from 2.7.6 to 2.10.0 with existing data. Can you run that kind of reproduction step? - İbrahim Halil Altun Senior Software Engineer @ Segmentify -- Sent from: http://apache-ignite-users.70518.x6.nabble.com/

Re: Exception on CacheEntryProcessor invoke (2.10.0)

2021-05-21 Thread ihalilaltun
Hi Ilya, the exact same applications run on the system; there is no way that the class is missing. By the way, I have run some more tests: with a new, clean Ignite cluster setup we did not get the errors and the system runs smoothly. The only difference here is the upgrade process, there should be a problem w

Re: Exception on CacheEntryProcessor invoke (2.10.0)

2021-05-21 Thread ihalilaltun
Hi, sorry, but I cannot share such a project; company policy restricts it. I tried to reproduce it with new code but had no luck (due to different environments and the existing data structure). The idea was to call CacheEntryProcessors within an ExecutorService, but I could not get the error. Let me give more

Exception on CacheEntryProcessor invoke (2.10.0)

2021-05-20 Thread ihalilaltun
Hi igniters, recently we upgraded from 2.7.6 to 2.10.0 and some of our CacheEntryProcessors started to throw the following errors on cache.invoke(...) calls. Caused by: java.lang.ClassNotFoundException: com.segmentify.lotr.frodo.cacheentryprocessor.RockScoreUpdateProcessor at java.net.URLCla

Re: CacheEntryProcessor ClassNotFoundException after 2.7.6 -> 2.10.0 Upgrade

2021-05-18 Thread ihalilaltun
What we expect here is that the related CacheEntryProcessors, or any other class, should be redeployed in SHARED mode and do the task they are meant to do. - İbrahim Halil Altun Senior Software Engineer @ Segmentify -- Sent from: http://apache-ignite-users.70518.x6.nabble.com/

CacheEntryProcessor ClassNotFoundException after 2.7.6 -> 2.10.0 Upgrade

2021-05-17 Thread ihalilaltun
Hi igniters, recently we upgraded from 2.7.6 to 2.10.0 and some of our CacheEntryProcessors started to act weird. We have the following CacheEntryProcessor: RockScoreUpdateProcessor.java. When the processor i

ignite 2.10 index type change

2021-05-07 Thread ihalilaltun
Hello igniters, We recently upgraded from Ignite 2.7.6 to 2.10. On cluster start we saw that all indexes were rebuilt, which went very well -> no data loss :) After the upgrade we ran some tests and encountered the following problem: we have the following parameter in one of our objects @Q

Re: [ANNOUNCE] Apache Ignite 2.9.1 Released

2021-01-04 Thread ihalilaltun
Hi Yaroslav, We are at v2.7.6 and want to upgrade to the latest version. Do you have any guidance for this kind of upgrade? Can we upgrade to the latest version directly without any problem, or should we upgrade version by version? 2.7.6 -> 2.8 -> 2.8.1 -> 2.9 then 2.9.1 thanks

Critical Workers Health Check on client side

2021-01-04 Thread ihalilaltun
Hi there, I am curious whether we can somehow manage the *Critical Workers Health Check* on the client side. What I need to do is catch the critical workers health check results on the client side; can this be done by implementing a custom StopNodeOrHaltFailureHandler on the client side? We are on Ignite v2.7.6

node down after Caught unhandled exception in NIO worker thread (restart the node) log

2019-11-25 Thread ihalilaltun
Hi Igniters, We had a strange node-down incident after getting the following log (we've been using Ignite in production for almost 1 year and we're getting this error for the first time) [2019-11-22T21:19:54,222][INFO ][grid-nio-worker-tcp-comm-3-#203][TcpCommunicationSpi] Established outgoing commun

Re: excessive timeouts and load on new cache creations

2019-11-25 Thread ihalilaltun
Hi Anton, We have faced the same bug on non-byte-array types as well. Here is the POJO we use: UserMailInfo.java. I've already read the topic you shared, thanks :) - İbrahim Halil Altun Senior Software Engineer @ S

Re: excessive timeouts and load on new cache creations

2019-11-22 Thread ihalilaltun
Hi Pavel, Thanks for your reply and suggestions, but currently we cannot use cache groups. As you know, there is a known bug for it -> https://issues.apache.org/jira/browse/IGNITE-11953 - İbrahim Halil Altun Senior Software Engineer @ Segmentify -- Sent from: http://apache-ignite-users.70518.x

Re: excessive timeouts and load on new cache creations

2019-11-21 Thread ihalilaltun
Hi Anton, Timeouts can be found in the logs that I shared; [query-#13207879][GridMapQueryExecutor] Failed to execute local query. org.apache.ignite.cache.query.QueryCancelledException: The query was cancelled while executing. Huge loads on server nodes are monitored via the Zabbix agent;

excessive timeouts and load on new cache creations

2019-11-21 Thread ihalilaltun
Hi Igniters, Every time a new cache is created dynamically we get an excessive number of timeouts and a huge load on grid nodes. Current grid-node metrics are as follows; [2019-11-21T14:03:16,079][INFO ][grid-timeout-worker-#199][IgniteKernal] Metrics for local node (to disable set 'metricsLogFrequ

Re: excessive timeouts and load on NODE_JOINED and NODE_LEFT events

2019-11-13 Thread ihalilaltun
Hi Maksim, Thanks, I think it will. Regards - İbrahim Halil Altun Senior Software Engineer @ Segmentify -- Sent from: http://apache-ignite-users.70518.x6.nabble.com/

Re: excessive timeouts and load on NODE_JOINED and NODE_LEFT events

2019-11-12 Thread ihalilaltun
Hi, Timeouts always start when the NODE_JOINED event has been fired; I am not sure whether this event causes PME to take place or not. As I said before, this is a live system and we cannot stop Ignite operations while PME is running :( I'll try to change the log level to DEBUG; if I can do that, I'll share

Re: excessive timeouts and load on NODE_JOINED and NODE_LEFT events

2019-11-11 Thread ihalilaltun
Hi Ilya, Can we restrict PME operations for client nodes explicitly? As far as I know, PME does not occur when client nodes connect. This is a production environment and, as you may expect, we have many clients joining and leaving the grid under heavy traffic. Any suggestions except inc

excessive timeouts and load on NODE_JOINED and NODE_LEFT events

2019-11-11 Thread ihalilaltun
Hi igniters, Every time a client node connects to or disconnects from the grid we get an excessive number of timeouts and a huge load on grid nodes. Current grid-node metrics are as follows; Metrics for local node (to disable set 'metricsLogFrequency' to 0) ^-- Node [id=cbdf5b45, uptime=40 days, 08

RE: Ignite node failure after network issue

2019-10-31 Thread ihalilaltun
Hi Alex, Thanks for the response. We've made some optimizations to thread pool sizes and reorganized the classpaths. I'll write again if we face the problem again. Regards - İbrahim Halil Altun Senior Software Engineer @ Segmentify -- Sent from: http://apache-ignite-users.70518.x6.nabble.com/

RE: Ignite node failure after network issue

2019-10-28 Thread ihalilaltun
Hi Alex, I removed the IP addresses for security reasons; that's why it seems non-standard. I'll try to adjust all thread-pool sizes. I am not sure if we need them or not, since the configurations were made by our previous software architect. I'll look further on the serializatio

Re: Unresponsive cluster after "Checkpoint read lock acquisition has been timed out" Error

2019-10-25 Thread ihalilaltun
Hi Ilya, It is almost impossible for us to get thread dumps; since this is a production environment, we cannot use a profiler :( Our biggest objects range from 2 to 4 kilobytes. We are planning to shrink the sizes, but the timing for this is not decided yet. Regards. - İbrahim Halil Altun Senior Softwa

Ignite node failure after network issue

2019-10-25 Thread ihalilaltun
Hi Igniters, We had a network glitch last night and one node halted itself. Both client and node logs are attached; can someone have a look and tell me the exact problem here? Archive.zip We are on version 2.7.6. Our clien

Re: Unresponsive cluster after "Checkpoint read lock acquisition has been timed out" Error

2019-10-18 Thread ihalilaltun
Hi Ilya, Sorry for the late response. We don't use the lock mechanism in our environment. We have a lot of put and get operations; as far as I remember, these operations do not hold locks. In addition to these operations, in many update/put operations we use CacheEntryProcessor which also does not h

Re: Starvation in striped pool

2019-10-17 Thread ihalilaltun
Hi Ilya, From time to time, we have faced exactly the same problem. Are there any best practices for handling network issues? What I mean is, if there are any network issues between clients and servers we want the cluster to stay alive. As for the clients, they can be disconnected from servers. R

Re: Cluster went down after "Unable to await partitions release latch within timeout" WARN

2019-10-11 Thread ihalilaltun
Hi Pavel, Thank you for the detailed explanation. We are discussing the hotfix with management, but I think the decision will be negative :( I think we'll have to wait for the 2.8 release, which seems to be scheduled for January 17, 2020. I hope we'll have this issue fixed by then. Regards. - İbrahim Halil Altun Sen

Re: Unresponsive cluster after "Checkpoint read lock acquisition has been timed out" Error

2019-10-11 Thread ihalilaltun
Hi Ilya, Yes, we have persistence enabled. The OS is not swapping out Ignite memory, since we have more than enough resources on the server. The disks used for persistence are SSDs with 96MB/s read and write speed. Is there any easy way to check if we are r

Re: Cluster went down after "Unable to await partitions release latch within timeout" WARN

2019-10-11 Thread ihalilaltun
Hi Pavel, Here are the logs from the node with localId:3561ac09-6752-4e2e-8279-d975c268d045 ignite-2019-10-06.gz Cache creation is done with Java code on our side; we use the getOrCreateCache method. Here is the piece of c

Cluster went down after "Unable to await partitions release latch within timeout" WARN

2019-10-09 Thread ihalilaltun
Hi There Igniters, We saw very strange cluster behaviour while creating new caches on the fly. Just after the caches are created we start to get the following warnings from all cluster nodes, including the coordinator node; [2019-09-27T15:00:17,727][WARN ][exchange-worker-#219][GridDhtPartitionsExchangeFuture]

Unresponsive cluster after "Checkpoint read lock acquisition has been timed out" Error

2019-10-09 Thread ihalilaltun
Hi There, We had an unresponsive cluster today after the following error; [2019-10-09T07:08:13,623][ERROR][sys-stripe-94-#95][GridCacheDatabaseSharedManager] Checkpoint read lock acquisition has been timed out. org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$

Re: Too many file descriptors on ignite node

2019-09-30 Thread ihalilaltun
Hi Denis, Our problem is not related to configuration parameters. We already limited the archive size to 8, but some nodes do not release files from the filesystem. When we look at the archive directory we only see 8 files, but when we look at the file descriptors on the server, we get thousands of wal fil

Too many file descriptors on ignite node

2019-09-23 Thread ihalilaltun
Hi Igniters, it is me again :) We are seeing weird behavior on some cluster nodes. The cluster uses native persistence with MMAP disabled. Some clusters have too many WAL files even though they are already deleted, but for some reason they are still persisted on the disk. I do not have any logs on clust

Re: [ANNOUNCE] Apache Ignite 2.7.6 Released

2019-09-23 Thread ihalilaltun
We have had some issues with native persistence; in fact, I reported this issue :) https://issues.apache.org/jira/browse/IGNITE-12127 We are hoping to do the upgrade before this week ends. Cheers - İbrahim Halil Altun Senior Software Engineer @ Segmentify -- Sent from: http://apache-ig

Re: Node failure with "Failed to write buffer." error

2019-09-02 Thread ihalilaltun
I am sorry, but it has been a long time since we changed the configuration and we do not have any logs or traces :( Is there an estimated date for the 2.7.6 release? - İbrahim Halil Altun Senior Software Engineer @ Segmentify -- Sent from: http://apache-ignite-users.70518.x6.nabble.com/

Re: Node failure with "Failed to write buffer." error

2019-09-02 Thread ihalilaltun
Hi mmuzaf, Sorry for the late response. When we enabled mmap we had some IO issues; that's why we disabled it. If there is such a bug, as you said, we can re-enable mmap. - İbrahim Halil Altun Senior Software Engineer @ Segmentify -- Sent from: http://apache-ignite-users.70518.x6.nabble.com/

Re: Node failure with "Failed to write buffer." error

2019-08-23 Thread ihalilaltun
Hi Mmuzaf, IGNITE_WAL_MMAP is false in our environment. Here is the configuration; <beans xmlns="http://www.springframework.org/schema/beans" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.spring

Re: Node failure with "Failed to write buffer." error

2019-08-23 Thread ihalilaltun
Hi Dmagda, Here are all the log files that I could get from the server; ignite.zip gc.zip gc-logs-continnued

Node failure with "Failed to write buffer." error

2019-08-22 Thread ihalilaltun
Hi folks, We have been experiencing node failures with the error "Failed to write buffer." recently. Any ideas or optimizations to avoid the error and the node failure? Thanks... [2019-08-22T01:20:55,916][ERROR][wal-write-worker%null-#221][] Critical system error detected. Will be handled accordin

Re: Sudden node failure on Ignite v2.7.5

2019-07-18 Thread ihalilaltun
Hi Ivan Thanks for the reply. I've checked the jira issue and it says it will be released in v2.8, when do you think v2.8 will be released? - İbrahim Halil Altun Senior Software Engineer @ Segmentify -- Sent from: http://apache-ignite-users.70518.x6.nabble.com/

Re: CacheEntryProcessor causes cluster node to stop

2019-07-17 Thread ihalilaltun
Hi Vladimir, here are the logs from the other node: ignite-3.zip - İbrahim Halil Altun Senior Software Engineer @ Segmentify -- Sent from: http://apache-ignite-users.70518.x6.nabble.com/

Re: Random CorruptedTreeException from Apache Ignite

2019-07-17 Thread ihalilaltun
Hi Maxim, we are facing the exact same problem :( Is it OK/safe to remove cacheGroupName from the code for caches that have already been created on the nodes? If so, when we start our applications from the updated code, will we still have access to the same caches, or will new caches be created on the nodes? - İb

CacheEntryProcessor causes cluster node to stop

2019-07-17 Thread ihalilaltun
Hi Igniters, Although the class was deployed in SHARED or CONTINUOUS mode, the node got an exception and halted itself. Log attached; ignite.zip *Ignite version*: 2.7.5 *Cluster size*: 16 *Client size*: 22 *Cluster OS version*: Centos

NegativeArraySizeException on cluster rebalance

2019-07-17 Thread ihalilaltun
Hi Igniters, We are getting a NegativeArraySizeException during rebalancing of one of our cacheGroups. I am adding the details that I could get from the logs; *Ignite version*: 2.7.5 *Cluster size*: 16 *Client size*: 22 *Cluster OS version*: Centos 7 *Cluster Kernel version*: 4.4.185-1.el7.elrepo.x

Re: Sudden node failure on Ignite v2.7.5

2019-07-15 Thread ihalilaltun
Hi Pavel, Thanks for your reply. Since we use the whole system in a production environment, we cannot apply the second solution. Do you have an estimated time for the first solution/fix? Thanks. - İbrahim Halil Altun Senior Software Engineer @ Segmentify -- Sent from: http://apache-ignite-use

Sudden node failure on Ignite v2.7.5

2019-07-13 Thread ihalilaltun
Hi Igniters, Recently (11.07.2019), we upgraded our Ignite version from 2.7.0 to 2.7.5. About 11 hours later, one of our nodes killed itself without any notification. I am adding the details that I could get from the server and the topology we use; *Ignite version*: 2.7.5 *Cluster size*: 1