[jira] [Created] (IGNITE-12873) Host name resolution takes place even if all nodes are configured via IP addresses

2020-04-08 Thread Kirill Tkalenko (Jira)
Kirill Tkalenko created IGNITE-12873:


 Summary: Host name resolution takes place even if all nodes are 
configured via IP addresses
 Key: IGNITE-12873
 URL: https://issues.apache.org/jira/browse/IGNITE-12873
 Project: Ignite
  Issue Type: Improvement
Reporter: Kirill Tkalenko
Assignee: Kirill Tkalenko
 Fix For: 2.9


We encountered a problem: when the connection to DNS was lost, transaction 
processing hung. This happened because IgniteUtils.toSocketAddresses resolves 
the host name even if all nodes are configured via IP addresses.
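The thread dump below shows where the lookup blocks inside 
IgniteUtils.toSocketAddresses. A minimal sketch of the idea (not the actual 
patch; the parsing helper is purely illustrative) is to detect a literal IPv4 
string and build the address without touching the name service:

{code:java}
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;

public class NoDnsSocketAddress {
    /** Builds a socket address, skipping name-service lookup for literal IPv4 strings. */
    public static InetSocketAddress toSocketAddress(String host, int port) throws UnknownHostException {
        byte[] literal = parseIpv4(host);

        if (literal != null)
            // getByAddress() never queries DNS, it only wraps the raw bytes.
            return new InetSocketAddress(InetAddress.getByAddress(host, literal), port);

        // Host names still go through the regular (potentially blocking) resolution.
        return new InetSocketAddress(host, port);
    }

    /** @return Four address bytes, or {@code null} if the string is not a literal IPv4 address. */
    private static byte[] parseIpv4(String host) {
        String[] parts = host.split("\\.");

        if (parts.length != 4)
            return null;

        byte[] addr = new byte[4];

        for (int i = 0; i < 4; i++) {
            try {
                int octet = Integer.parseInt(parts[i]);

                if (octet < 0 || octet > 255)
                    return null;

                addr[i] = (byte)octet;
            }
            catch (NumberFormatException ignored) {
                return null;
            }
        }

        return addr;
    }
}
{code}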

{code:java}
Thread [name=""utility-#432631%GRID%GridNodeName%"", id=992176, state=RUNNABLE, 
blockCnt=1, waitCnt=16]
 at java.net.Inet4AddressImpl.lookupAllHostAddr(Native Method)
 at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:929)
 at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1324)
 at java.net.InetAddress.getAllByName0(InetAddress.java:1277)
 at java.net.InetAddress.getAllByName(InetAddress.java:1193)
 at java.net.InetAddress.getAllByName(InetAddress.java:1127)
 at java.net.InetAddress.getByName(InetAddress.java:1077)
 at java.net.InetSocketAddress.<init>(InetSocketAddress.java:220)
 at o.a.i.i.util.IgniteUtils.toSocketAddresses(IgniteUtils.java:8982)
 at 
o.a.i.spi.communication.tcp.TcpCommunicationSpi.nodeAddresses(TcpCommunicationSpi.java:3228)
 at 
o.a.i.spi.communication.tcp.TcpCommunicationSpi.nodeAddresses(TcpCommunicationSpi.java:3200)
 at 
o.a.i.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:3291)
 at 
o.a.i.spi.communication.tcp.TcpCommunicationSpi.createNioClient(TcpCommunicationSpi.java:3027)
 at 
o.a.i.spi.communication.tcp.TcpCommunicationSpi.reserveClient(TcpCommunicationSpi.java:2907)
 at 
o.a.i.spi.communication.tcp.TcpCommunicationSpi.sendMessage0(TcpCommunicationSpi.java:2750)
 at 
o.a.i.spi.communication.tcp.TcpCommunicationSpi.sendMessage(TcpCommunicationSpi.java:2709)
 at o.a.i.i.managers.communication.GridIoManager.send(GridIoManager.java:1643)
 at 
o.a.i.i.managers.communication.GridIoManager.sendOrderedMessage(GridIoManager.java:1863)
 at 
o.a.i.i.processors.continuous.GridContinuousProcessor.sendWithRetries(GridContinuousProcessor.java:1873)
 at 
o.a.i.i.processors.continuous.GridContinuousProcessor.sendWithRetries(GridContinuousProcessor.java:1844)
 at 
o.a.i.i.processors.continuous.GridContinuousProcessor.sendWithRetries(GridContinuousProcessor.java:1826)
 at 
o.a.i.i.processors.continuous.GridContinuousProcessor.sendNotification(GridContinuousProcessor.java:1244)
 at 
o.a.i.i.processors.continuous.GridContinuousProcessor.addNotification(GridContinuousProcessor.java:1181)
 at 
o.a.i.i.processors.cache.query.continuous.CacheContinuousQueryHandler.onEntryUpdate(CacheContinuousQueryHandler.java:882)
 at 
o.a.i.i.processors.cache.query.continuous.CacheContinuousQueryHandler.access$600(CacheContinuousQueryHandler.java:85)
 at 
o.a.i.i.processors.cache.query.continuous.CacheContinuousQueryHandler$2.onEntryUpdated(CacheContinuousQueryHandler.java:429)
 at 
o.a.i.i.processors.cache.query.continuous.CacheContinuousQueryManager.onEntryUpdated(CacheContinuousQueryManager.java:400)
 at 
o.a.i.i.processors.cache.GridCacheMapEntry.innerSet(GridCacheMapEntry.java:)
 at 
o.a.i.i.processors.cache.transactions.IgniteTxLocalAdapter.userCommit(IgniteTxLocalAdapter.java:659)
 at 
o.a.i.i.processors.cache.distributed.dht.GridDhtTxLocalAdapter.localFinish(GridDhtTxLocalAdapter.java:796)
 at 
o.a.i.i.processors.cache.distributed.dht.GridDhtTxLocal.localFinish(GridDhtTxLocal.java:603)
 at 
o.a.i.i.processors.cache.distributed.dht.GridDhtTxLocal.finishTx(GridDhtTxLocal.java:475)
 at 
o.a.i.i.processors.cache.distributed.dht.GridDhtTxLocal.commitDhtLocalAsync(GridDhtTxLocal.java:532)
 at 
o.a.i.i.processors.cache.transactions.IgniteTxHandler.finishDhtLocal(IgniteTxHandler.java:1017)
 at 
o.a.i.i.processors.cache.transactions.IgniteTxHandler.finish(IgniteTxHandler.java:896)
 at 
o.a.i.i.processors.cache.transactions.IgniteTxHandler.processNearTxFinishRequest(IgniteTxHandler.java:852)
 at 
o.a.i.i.processors.cache.transactions.IgniteTxHandler.access$200(IgniteTxHandler.java:102)
 at 
o.a.i.i.processors.cache.transactions.IgniteTxHandler$3.apply(IgniteTxHandler.java:196)
 at 
o.a.i.i.processors.cache.transactions.IgniteTxHandler$3.apply(IgniteTxHandler.java:194)
 at 
o.a.i.i.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1077)
 at 
o.a.i.i.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:587)
 at 
o.a.i.i.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:386)
 at 
o.a.i.i.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:312)
 at 
o.a.i.i.processors.cache.GridCacheIoManager.access$100(GridCacheIoManager.java:102)
 at 
o.a.i.i.processors.cache.GridCacheIoManager$1.onMessage(Grid

Active nodes aliveness WatchDog

2020-04-08 Thread Anton Vinogradov
Igniters,
Do we have a feature that checks node aliveness on a regular basis?

Scenario:
Precondition
  The cluster has no load, but some node's JVM has crashed.

Actual (current) behavior
  The user performs an operation (e.g. a cache put) related to this node (via
another node) and waits for some timeout before learning that it is dead.
  The cluster starts the switch to relocate primary partitions to alive
nodes.
  Only now is the user able to retry the operation.

Desired behavior
  Some WatchDog checks node aliveness on a regular basis.
  Once a failure is detected, the cluster starts the switch.
  Later, the user performs an operation on an already fixed cluster and
waits for nothing.

It would be good news if the "Desired" case were already the actual behavior.
Can somebody point to the feature that performs this check?


Re: Active nodes aliveness WatchDog

2020-04-08 Thread Stephen Darlington
This is one of the functions of the DiscoverySPI. Nodes check on their 
neighbours and notify the remaining nodes if one disappears. When the topology 
changes, it triggers a rebalance, which relocates primary partitions to live 
nodes. This is entirely transparent to clients.

It gets more complex… like there’s the partition loss policy and rebalancing 
doesn’t always happen (configurable, persistence, etc)… but broadly it does as 
you expect.

Regards,
Stephen

> On 8 Apr 2020, at 08:40, Anton Vinogradov  wrote:
> 
> Igniters,
> Do we have some feature allows to check nodes aliveness on a regular basis?
> 
> Scenario:
> Precondition
>  The cluster has no load but some node's JVM crashed.
> 
> Expected actual
>  The user performs an operation (eg. cache put) related to this node (via
> another node) and waits for some timeout to gain it's dead.
>  The cluster starts the switch to relocate primary partitions to alive
> nodes.
>  Now user able to retry the operation.
> 
> Desired
>  Some WatchDog checks nodes aliveness on a regular basis.
>  Once a failure detected, the cluster starts the switch.
>  Later, the user performs an operation on an already fixed cluster and
> waits for nothing.
> 
> It would be good news if the "Desired" case is already Actual.
> Can somebody point to the feature that performs this check?




Re: Active nodes aliveness WatchDog

2020-04-08 Thread Anton Vinogradov
Stephen,

> Nodes check on their neighbours and notify the remaining nodes if one
disappears.
Could you explain how this works in detail?
How can I set/change check frequency?

On Wed, Apr 8, 2020 at 11:13 AM Stephen Darlington <
stephen.darling...@gridgain.com> wrote:

> This is one of the functions of the DiscoverySPI. Nodes check on their
> neighbours and notify the remaining nodes if one disappears. When the
> topology changes, it triggers a rebalance, which relocates primary
> partitions to live nodes. This is entirely transparent to clients.
>
> It gets more complex… like there’s the partition loss policy and
> rebalancing doesn’t always happen (configurable, persistence, etc)… but
> broadly it does as you expect.
>
> Regards,
> Stephen
>
> > On 8 Apr 2020, at 08:40, Anton Vinogradov  wrote:
> >
> > Igniters,
> > Do we have some feature allows to check nodes aliveness on a regular
> basis?
> >
> > Scenario:
> > Precondition
> >  The cluster has no load but some node's JVM crashed.
> >
> > Expected actual
> >  The user performs an operation (eg. cache put) related to this node (via
> > another node) and waits for some timeout to gain it's dead.
> >  The cluster starts the switch to relocate primary partitions to alive
> > nodes.
> >  Now user able to retry the operation.
> >
> > Desired
> >  Some WatchDog checks nodes aliveness on a regular basis.
> >  Once a failure detected, the cluster starts the switch.
> >  Later, the user performs an operation on an already fixed cluster and
> > waits for nothing.
> >
> > It would be good news if the "Desired" case is already Actual.
> > Can somebody point to the feature that performs this check?
>
>
>


Re: Active nodes aliveness WatchDog

2020-04-08 Thread Stephen Darlington
The configuration parameters that I’m aware of are here:

https://ignite.apache.org/releases/latest/javadoc/org/apache/ignite/spi/discovery/tcp/TcpDiscoverySpi.html

Other people would be better placed to discuss the internals.
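For illustration only, the knob most users end up tuning is the global failure
detection timeout (5 seconds below is just an example; the documented default
for server nodes is 10 seconds):

IgniteConfiguration cfg = new IgniteConfiguration();

// Upper bound on how long a node may stay unresponsive before it is dropped
// from the topology.
cfg.setFailureDetectionTimeout(5_000);

// Individual TcpDiscoverySpi timeouts (socketTimeout, ackTimeout, ...) can be
// set explicitly instead, in which case failureDetectionTimeout is ignored
// for them.
Ignite ignite = Ignition.start(cfg);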

Regards.
Stephen

> On 8 Apr 2020, at 09:32, Anton Vinogradov  wrote:
> 
> Stephen,
> 
>> Nodes check on their neighbours and notify the remaining nodes if one
> disappears.
> Could you explain how this works in detail?
> How can I set/change check frequency?
> 
> On Wed, Apr 8, 2020 at 11:13 AM Stephen Darlington <
> stephen.darling...@gridgain.com> wrote:
> 
>> This is one of the functions of the DiscoverySPI. Nodes check on their
>> neighbours and notify the remaining nodes if one disappears. When the
>> topology changes, it triggers a rebalance, which relocates primary
>> partitions to live nodes. This is entirely transparent to clients.
>> 
>> It gets more complex… like there’s the partition loss policy and
>> rebalancing doesn’t always happen (configurable, persistence, etc)… but
>> broadly it does as you expect.
>> 
>> Regards,
>> Stephen
>> 
>>> On 8 Apr 2020, at 08:40, Anton Vinogradov  wrote:
>>> 
>>> Igniters,
>>> Do we have some feature allows to check nodes aliveness on a regular
>> basis?
>>> 
>>> Scenario:
>>> Precondition
>>> The cluster has no load but some node's JVM crashed.
>>> 
>>> Expected actual
>>> The user performs an operation (eg. cache put) related to this node (via
>>> another node) and waits for some timeout to gain it's dead.
>>> The cluster starts the switch to relocate primary partitions to alive
>>> nodes.
>>> Now user able to retry the operation.
>>> 
>>> Desired
>>> Some WatchDog checks nodes aliveness on a regular basis.
>>> Once a failure detected, the cluster starts the switch.
>>> Later, the user performs an operation on an already fixed cluster and
>>> waits for nothing.
>>> 
>>> It would be good news if the "Desired" case is already Actual.
>>> Can somebody point to the feature that performs this check?
>> 
>> 
>> 




Re: Active nodes aliveness WatchDog

2020-04-08 Thread Anton Vinogradov
It seems you're talking about Failure Detection (Timeouts).
Will it detect a node failure on an idle cluster?

On Wed, Apr 8, 2020 at 11:52 AM Stephen Darlington <
stephen.darling...@gridgain.com> wrote:

> The configuration parameters that I’m aware of are here:
>
>
> https://ignite.apache.org/releases/latest/javadoc/org/apache/ignite/spi/discovery/tcp/TcpDiscoverySpi.html
>
> Other people would be better placed to discuss the internals.
>
> Regards.
> Stephen
>
> > On 8 Apr 2020, at 09:32, Anton Vinogradov  wrote:
> >
> > Stephen,
> >
> >> Nodes check on their neighbours and notify the remaining nodes if one
> > disappears.
> > Could you explain how this works in detail?
> > How can I set/change check frequency?
> >
> > On Wed, Apr 8, 2020 at 11:13 AM Stephen Darlington <
> > stephen.darling...@gridgain.com> wrote:
> >
> >> This is one of the functions of the DiscoverySPI. Nodes check on their
> >> neighbours and notify the remaining nodes if one disappears. When the
> >> topology changes, it triggers a rebalance, which relocates primary
> >> partitions to live nodes. This is entirely transparent to clients.
> >>
> >> It gets more complex… like there’s the partition loss policy and
> >> rebalancing doesn’t always happen (configurable, persistence, etc)… but
> >> broadly it does as you expect.
> >>
> >> Regards,
> >> Stephen
> >>
> >>> On 8 Apr 2020, at 08:40, Anton Vinogradov  wrote:
> >>>
> >>> Igniters,
> >>> Do we have some feature allows to check nodes aliveness on a regular
> >> basis?
> >>>
> >>> Scenario:
> >>> Precondition
> >>> The cluster has no load but some node's JVM crashed.
> >>>
> >>> Expected actual
> >>> The user performs an operation (eg. cache put) related to this node
> (via
> >>> another node) and waits for some timeout to gain it's dead.
> >>> The cluster starts the switch to relocate primary partitions to alive
> >>> nodes.
> >>> Now user able to retry the operation.
> >>>
> >>> Desired
> >>> Some WatchDog checks nodes aliveness on a regular basis.
> >>> Once a failure detected, the cluster starts the switch.
> >>> Later, the user performs an operation on an already fixed cluster and
> >>> waits for nothing.
> >>>
> >>> It would be good news if the "Desired" case is already Actual.
> >>> Can somebody point to the feature that performs this check?
> >>
> >>
> >>
>
>
>


[jira] [Created] (IGNITE-12874) Possible NPE in GridDiscoveryManager#cacheGroupAffinityNode

2020-04-08 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-12874:
--

 Summary: Possible NPE in 
GridDiscoveryManager#cacheGroupAffinityNode
 Key: IGNITE-12874
 URL: https://issues.apache.org/jira/browse/IGNITE-12874
 Project: Ignite
  Issue Type: Bug
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov


If "grpId" is invalid then method will throw NPE instead of returning false.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: Active nodes aliveness WatchDog

2020-04-08 Thread Stephen Darlington
Yes. Nodes are always chatting to each other even if there are no requests 
coming in.

Here’s the status message: 
https://github.com/apache/ignite/blob/e9b3c4cebaecbeec9fa51bd6ec32a879fb89948a/modules/core/src/main/java/org/apache/ignite/spi/discovery/tcp/messages/TcpDiscoveryStatusCheckMessage.java
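On the application side, node failures detected this way surface through the
public events API; a rough sketch, assuming EVT_NODE_FAILED / EVT_NODE_LEFT are
enabled via IgniteConfiguration#setIncludeEventTypes:

// Listen locally for nodes that failed or left the topology.
IgnitePredicate<Event> lsnr = evt -> {
    DiscoveryEvent discoEvt = (DiscoveryEvent)evt;

    System.out.println("Node left/failed: " + discoEvt.eventNode().id());

    return true; // Keep listening.
};

ignite.events().localListen(lsnr, EventType.EVT_NODE_FAILED, EventType.EVT_NODE_LEFT);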

Regards,
Stephen

> On 8 Apr 2020, at 10:04, Anton Vinogradov  wrote:
> 
> It seems you're talking about Failure Detection (Timeouts).
> Will it detect node failure on still cluster?
> 
> On Wed, Apr 8, 2020 at 11:52 AM Stephen Darlington <
> stephen.darling...@gridgain.com> wrote:
> 
>> The configuration parameters that I’m aware of are here:
>> 
>> 
>> https://ignite.apache.org/releases/latest/javadoc/org/apache/ignite/spi/discovery/tcp/TcpDiscoverySpi.html
>> 
>> Other people would be better placed to discuss the internals.
>> 
>> Regards.
>> Stephen
>> 
>>> On 8 Apr 2020, at 09:32, Anton Vinogradov  wrote:
>>> 
>>> Stephen,
>>> 
 Nodes check on their neighbours and notify the remaining nodes if one
>>> disappears.
>>> Could you explain how this works in detail?
>>> How can I set/change check frequency?
>>> 
>>> On Wed, Apr 8, 2020 at 11:13 AM Stephen Darlington <
>>> stephen.darling...@gridgain.com> wrote:
>>> 
 This is one of the functions of the DiscoverySPI. Nodes check on their
 neighbours and notify the remaining nodes if one disappears. When the
 topology changes, it triggers a rebalance, which relocates primary
 partitions to live nodes. This is entirely transparent to clients.
 
 It gets more complex… like there’s the partition loss policy and
 rebalancing doesn’t always happen (configurable, persistence, etc)… but
 broadly it does as you expect.
 
 Regards,
 Stephen
 
> On 8 Apr 2020, at 08:40, Anton Vinogradov  wrote:
> 
> Igniters,
> Do we have some feature allows to check nodes aliveness on a regular
 basis?
> 
> Scenario:
> Precondition
> The cluster has no load but some node's JVM crashed.
> 
> Expected actual
> The user performs an operation (eg. cache put) related to this node
>> (via
> another node) and waits for some timeout to gain it's dead.
> The cluster starts the switch to relocate primary partitions to alive
> nodes.
> Now user able to retry the operation.
> 
> Desired
> Some WatchDog checks nodes aliveness on a regular basis.
> Once a failure detected, the cluster starts the switch.
> Later, the user performs an operation on an already fixed cluster and
> waits for nothing.
> 
> It would be good news if the "Desired" case is already Actual.
> Can somebody point to the feature that performs this check?
 
 
 
>> 
>> 
>> 




Re: [DISCUSSION] Hot cache backup

2020-04-08 Thread Surkov
That's cool. 
I'm waiting for this thing.



--
Sent from: http://apache-ignite-developers.2346864.n4.nabble.com/


[jira] [Created] (IGNITE-12875) Implement "EVT_CLUSTER_STATE_CHANGE_STARTED" event

2020-04-08 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-12875:
--

 Summary: Implement "EVT_CLUSTER_STATE_CHANGE_STARTED" event
 Key: IGNITE-12875
 URL: https://issues.apache.org/jira/browse/IGNITE-12875
 Project: Ignite
  Issue Type: Improvement
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: Active nodes aliveness WatchDog

2020-04-08 Thread Vladimir Steshin

Hi everyone.

I think we should check the behavior of failure detection with tests, or find 
them if they are already written. I’ll research this question and raise a ticket 
if a reproducer appears.




On 08.04.2020 12:19, Stephen Darlington wrote:

Yes. Nodes are always chatting to each another even if there are no requests 
coming In.

Here’s the status message: 
https://github.com/apache/ignite/blob/e9b3c4cebaecbeec9fa51bd6ec32a879fb89948a/modules/core/src/main/java/org/apache/ignite/spi/discovery/tcp/messages/TcpDiscoveryStatusCheckMessage.java

Regards,
Stephen


On 8 Apr 2020, at 10:04, Anton Vinogradov  wrote:

It seems you're talking about Failure Detection (Timeouts).
Will it detect node failure on still cluster?

On Wed, Apr 8, 2020 at 11:52 AM Stephen Darlington <
stephen.darling...@gridgain.com> wrote:


The configuration parameters that I’m aware of are here:


https://ignite.apache.org/releases/latest/javadoc/org/apache/ignite/spi/discovery/tcp/TcpDiscoverySpi.html

Other people would be better placed to discuss the internals.

Regards.
Stephen


On 8 Apr 2020, at 09:32, Anton Vinogradov  wrote:

Stephen,


Nodes check on their neighbours and notify the remaining nodes if one

disappears.
Could you explain how this works in detail?
How can I set/change check frequency?

On Wed, Apr 8, 2020 at 11:13 AM Stephen Darlington <
stephen.darling...@gridgain.com> wrote:


This is one of the functions of the DiscoverySPI. Nodes check on their
neighbours and notify the remaining nodes if one disappears. When the
topology changes, it triggers a rebalance, which relocates primary
partitions to live nodes. This is entirely transparent to clients.

It gets more complex… like there’s the partition loss policy and
rebalancing doesn’t always happen (configurable, persistence, etc)… but
broadly it does as you expect.

Regards,
Stephen


On 8 Apr 2020, at 08:40, Anton Vinogradov  wrote:

Igniters,
Do we have some feature allows to check nodes aliveness on a regular

basis?

Scenario:
Precondition
The cluster has no load but some node's JVM crashed.

Expected actual
The user performs an operation (eg. cache put) related to this node

(via

another node) and waits for some timeout to gain it's dead.
The cluster starts the switch to relocate primary partitions to alive
nodes.
Now user able to retry the operation.

Desired
Some WatchDog checks nodes aliveness on a regular basis.
Once a failure detected, the cluster starts the switch.
Later, the user performs an operation on an already fixed cluster and
waits for nothing.

It would be good news if the "Desired" case is already Actual.
Can somebody point to the feature that performs this check?










Re: Out of memory with eviction failure on persisted cache

2020-04-08 Thread Raymond Wilson
Evgenii,

Have you had a chance to look into the reproducer?

Thanks,
Raymond.

On Fri, Mar 6, 2020 at 2:51 PM Raymond Wilson 
wrote:

> Evgenii,
>
> I have created a reproducer that triggers the error with the buffer size
> set to 64Mb. The program.cs/csproj and log for the run that triggered the
> error are attached.
>
> Thanks,
> Raymond.
>
>
>
> On Fri, Mar 6, 2020 at 1:08 PM Raymond Wilson 
> wrote:
>
>> The reproducer is my development system, which is hard to share.
>>
>> I have increased the size of the buffer to 256Mb, and it copes with the
>> example data load, though I have not tried larger data sets.
>>
>> From an analytical perspective, is this an error that is possible or
>> expected to occur when using a cache with a persistent data region defined?
>>
>> I'll see if I can make a small reproducer.
>>
>> On Fri, Mar 6, 2020 at 11:34 AM Evgenii Zhuravlev <
>> e.zhuravlev...@gmail.com> wrote:
>>
>>> Hi Raymond,
>>>
>>> I tried to reproduce it, but without success. Can you share the
>>> reproducer?
>>>
>>> Also, have you tried to load much more data with 256mb data region? I
>>> think it should work without issues.
>>>
>>> Thanks,
>>> Evgenii
>>>
>>> ср, 4 мар. 2020 г. в 16:14, Raymond Wilson :
>>>
 Hi Evgenii,

 I am individually Put()ing the elements using PutIfAbsent(). Each
 element can range 2kb-35Kb in size.

 Actually, the process that writes the data does not write the data
 directly to the cache, it uses a compute function to send the payload to
 the process that is doing the reading. The compute function applies
 validation logic and uses PutIfAbsent() to write the data into the cache.

 Sorry for the confusion.

 Raymond.


 On Thu, Mar 5, 2020 at 1:09 PM Evgenii Zhuravlev <
 e.zhuravlev...@gmail.com> wrote:

> Hi,
>
> How are you loading the data? Do you use putAll or DataStreamer?
>
> Evgenii
>
> ср, 4 мар. 2020 г. в 15:37, Raymond Wilson  >:
>
>> To add some further detail:
>>
>> There are two processes interacting with the cache. One process is
>> writing
>> data into the cache, while the second process is extracting data from
>> the
>> cache using a continuous query. The process that is the reader of the
>> data
>> is throwing the exception.
>>
>> Increasing the cache size further to 256 Mb resolves the problem for
>> this
>> data set, however we have data sets more than 100 times this size
>> which we
>> will be processing.
>>
>> Thanks,
>> Raymond.
>>
>>
>> On Thu, Mar 5, 2020 at 12:10 PM Raymond Wilson <
>> raymond_wil...@trimble.com>
>> wrote:
>>
>> > I've been having a sporadic issue with the Ignite 2.7.5 JVM halting
>> due to
>> > out of memory error related to a cache with persistence enabled
>> >
>> > I just upgraded to the C#.Net, Ignite 2.7.6 client to pick up
>> support for
>> > C# affinity functions and now have this issue appearing regularly
>> while
>> > adding around 400Mb of data into the cache which is configured to
>> have
>> > 128Mb of memory (this was 64Mb but I increased it to see if the
>> failure
>> > would resolve.
>> >
>> > The error I get is:
>> >
>> > 2020-03-05 11:58:57,568 [542] ERR [MutableCacheComputeServer] JVM
>> will be
>> > halted immediately due to the failure: [failureCtx=FailureContext
>> > [type=CRITICAL_ERROR, err=class
>> o.a.i.i.mem.IgniteOutOfMemoryException:
>> > Failed to find a page for eviction [segmentCapacity=1700,
>> loaded=676,
>> > maxDirtyPages=507, dirtyPages=675, cpPages=0, pinnedInSegment=2,
>> > failedToPrepare=675]
>> > Out of memory in data region [name=TAGFileBufferQueue,
>> initSize=128.0 MiB,
>> > maxSize=128.0 MiB, persistenceEnabled=true] Try the following:
>> >   ^-- Increase maximum off-heap memory size
>> > (DataRegionConfiguration.maxSize)
>> >   ^-- Enable Ignite persistence
>> > (DataRegionConfiguration.persistenceEnabled)
>> >   ^-- Enable eviction or expiration policies]]
>> >
>> > I'm not running an eviction policy as I thought this was not
>> required for
>> > caches with persistence enabled.
>> >
>> > I'm surprised by this behaviour as I expected the persistence
>> mechanism to
>> > handle it. The error relating to failure to find a page for eviction
>> > suggest the persistence mechanism has fallen behind. If this is the
>> case,
>> > this seems like an unfriendly failure mode.
>> >
>> > Thanks,
>> > Raymond.
>> >
>> >
>> >
>>
>
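For reference, the knobs named in the quoted error ("Increase maximum off-heap
memory size (DataRegionConfiguration.maxSize)" and so on) map onto the data
region configuration roughly as follows; a Java sketch with illustrative sizes,
while the actual reproducer is C#:

DataRegionConfiguration region = new DataRegionConfiguration()
    .setName("TAGFileBufferQueue")
    .setPersistenceEnabled(true)
    // maxSize from the error message: 128 MiB in the failing run, raised to
    // 256 MiB as a workaround.
    .setMaxSize(256L * 1024 * 1024)
    // A larger checkpoint page buffer gives the checkpointer more headroom
    // under a heavy write stream (value is only an example).
    .setCheckpointPageBufferSize(256L * 1024 * 1024);

IgniteConfiguration cfg = new IgniteConfiguration()
    .setDataStorageConfiguration(new DataStorageConfiguration()
        .setDataRegionConfigurations(region));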


Re: Active nodes aliveness WatchDog

2020-04-08 Thread Anton Vinogradov
Stephen,
Thanks for the hint.

Vladimir,
Great idea! Let me know if any help needed.

On Wed, Apr 8, 2020 at 2:19 PM Vladimir Steshin  wrote:

> Hi everyone.
>
> I think we should check behavior of failure detection with tests or find
> them if already written. I’ll research this question and rise a ticket
> if a reproducer appears.
>
>
>
> 08.04.2020 12:19, Stephen Darlington пишет:
> > Yes. Nodes are always chatting to each another even if there are no
> requests coming In.
> >
> > Here’s the status message:
> https://github.com/apache/ignite/blob/e9b3c4cebaecbeec9fa51bd6ec32a879fb89948a/modules/core/src/main/java/org/apache/ignite/spi/discovery/tcp/messages/TcpDiscoveryStatusCheckMessage.java
> >
> > Regards,
> > Stephen
> >
> >> On 8 Apr 2020, at 10:04, Anton Vinogradov  wrote:
> >>
> >> It seems you're talking about Failure Detection (Timeouts).
> >> Will it detect node failure on still cluster?
> >>
> >> On Wed, Apr 8, 2020 at 11:52 AM Stephen Darlington <
> >> stephen.darling...@gridgain.com> wrote:
> >>
> >>> The configuration parameters that I’m aware of are here:
> >>>
> >>>
> >>>
> https://ignite.apache.org/releases/latest/javadoc/org/apache/ignite/spi/discovery/tcp/TcpDiscoverySpi.html
> >>>
> >>> Other people would be better placed to discuss the internals.
> >>>
> >>> Regards.
> >>> Stephen
> >>>
>  On 8 Apr 2020, at 09:32, Anton Vinogradov  wrote:
> 
>  Stephen,
> 
> > Nodes check on their neighbours and notify the remaining nodes if one
>  disappears.
>  Could you explain how this works in detail?
>  How can I set/change check frequency?
> 
>  On Wed, Apr 8, 2020 at 11:13 AM Stephen Darlington <
>  stephen.darling...@gridgain.com> wrote:
> 
> > This is one of the functions of the DiscoverySPI. Nodes check on
> their
> > neighbours and notify the remaining nodes if one disappears. When the
> > topology changes, it triggers a rebalance, which relocates primary
> > partitions to live nodes. This is entirely transparent to clients.
> >
> > It gets more complex… like there’s the partition loss policy and
> > rebalancing doesn’t always happen (configurable, persistence, etc)…
> but
> > broadly it does as you expect.
> >
> > Regards,
> > Stephen
> >
> >> On 8 Apr 2020, at 08:40, Anton Vinogradov  wrote:
> >>
> >> Igniters,
> >> Do we have some feature allows to check nodes aliveness on a regular
> > basis?
> >> Scenario:
> >> Precondition
> >> The cluster has no load but some node's JVM crashed.
> >>
> >> Expected actual
> >> The user performs an operation (eg. cache put) related to this node
> >>> (via
> >> another node) and waits for some timeout to gain it's dead.
> >> The cluster starts the switch to relocate primary partitions to
> alive
> >> nodes.
> >> Now user able to retry the operation.
> >>
> >> Desired
> >> Some WatchDog checks nodes aliveness on a regular basis.
> >> Once a failure detected, the cluster starts the switch.
> >> Later, the user performs an operation on an already fixed cluster
> and
> >> waits for nothing.
> >>
> >> It would be good news if the "Desired" case is already Actual.
> >> Can somebody point to the feature that performs this check?
> >
> >
> >>>
> >>>
> >
>


[jira] [Created] (IGNITE-12876) Test to cover deadlock fix between checkpoint, entry update and ttl-cleanup threads

2020-04-08 Thread Sergey Chugunov (Jira)
Sergey Chugunov created IGNITE-12876:


 Summary: Test to cover deadlock fix between checkpoint, entry 
update and ttl-cleanup threads
 Key: IGNITE-12876
 URL: https://issues.apache.org/jira/browse/IGNITE-12876
 Project: Ignite
  Issue Type: Test
Reporter: Sergey Chugunov
Assignee: Sergey Chugunov
 Fix For: 2.8.1


The IGNITE-12594 ticket fixed a deadlock between several threads that was 
reproducible with low probability in unrelated tests.

To improve test coverage of the fix, a new test dedicated to the deadlock 
situation is needed.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (IGNITE-12877) "restorePartitionStates" always logs all meta pages into WAL

2020-04-08 Thread Ivan Bessonov (Jira)
Ivan Bessonov created IGNITE-12877:
--

 Summary: "restorePartitionStates" always logs all meta pages into 
WAL
 Key: IGNITE-12877
 URL: https://issues.apache.org/jira/browse/IGNITE-12877
 Project: Ignite
  Issue Type: Bug
Reporter: Ivan Bessonov
Assignee: Ivan Bessonov


{noformat}
2020-01-31T21:09:27,203 [INFO 
][main][org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager]
 - Finished applying WAL changes [updatesApplied=11897, time=183531 ms] 
2020-01-31T21:09:27,203 [INFO 
][main][org.apache.ignite.internal.processors.cache.GridCacheProcessor] - 
Restoring partition state for local groups. 2020-01-31T21:17:49,692 [INFO 
][main][org.apache.ignite.internal.processors.cache.GridCacheProcessor] - 
Finished restoring partition state for local groups [groupsProcessed=32, 
partitionsProcessed=9310, time=502498ms] {noformat}
The main issue is that 
org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager#updateState
unconditionally returns true: "stateId" is almost never equal to "-1".

UPDATE: that wasn’t the only problem; please look at the fix itself for more 
details.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (IGNITE-12878) Improve logging for async writing of Binary Metadata

2020-04-08 Thread Sergey Chugunov (Jira)
Sergey Chugunov created IGNITE-12878:


 Summary: Improve logging for async writing of Binary Metadata
 Key: IGNITE-12878
 URL: https://issues.apache.org/jira/browse/IGNITE-12878
 Project: Ignite
  Issue Type: Task
Reporter: Sergey Chugunov
Assignee: Sergey Chugunov
 Fix For: 2.8.1


A new implementation of writing binary metadata outside of the discovery thread 
was introduced in IGNITE-12099, but sufficient debug logging was missing.

To provide enough information for debugging, we need to add the necessary 
logging.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSSION] Hot cache backup

2020-04-08 Thread Andrey Dolmatov
Hi, Maxim!
It is a very useful feature, great job!

But could you explain some aspects to me?

   - Does the snapshot contain only primary data, backup partitions, or both?
   - Could I create a snapshot on an m-node cluster and apply it to an n-node
   cluster (n<>m)?
   - Does a data node need extra space on the persistent store to create a
   snapshot? Or, from another point of view, would the size of the temporary
   file be equal to the size of all data on the cluster node?
   - What is the resulting snapshot: a single file or a collection of files
   (one for every data node)?

I apologize for my questions, but I am really interested in such a feature.


Tue, 7 Apr 2020 at 22:10, Maxim Muzafarov :

> Igniters,
>
>
> I'd like to back to the discussion of a snapshot operation for Apache
> Ignite for persistence cache groups and I propose my changes below. I
> have prepared everything so that the discussion is as meaningful and
> specific as much as possible:
>
> - IEP-43: Cluster snapshot [1]
> - The Jira task IGNITE-11073 [2]
> - PR with described changes, Patch Available [4]
>
> Changes are ready for review.
>
>
> Here are a few implementation details and my thoughts:
>
> 1. Snapshot restore assumed to be manual at the first step. The
> process will be described on our documentation pages, but it is
> possible to start node right from the snapshot directory since the
> directory structure is preserved (check
> `testConsistentClusterSnapshotUnderLoad` in the PR). We also have some
> options here about how the restore process must look like:
> - fully manual snapshot restore (will be documented)
> - ansible or shell scripts for restore
> - Java API for restore (I doubt we should go this way).
>
> 3. The snapshot `create` procedure creates a snapshot of all
> persistent caches available on the cluster (see limitations [1]).
>
> 2. The snapshot `create` procedure is available through Java API and
> JMX (control.sh may be implemented further).
>
> Java API:
> IgniteFuture fut = ignite.snapshot()
> .createSnapshot(name);
>
> JMX:
> SnapshotMXBean mxBean = getMBean(ignite.name());
> mxBean.createSnapshot(name);
>
> 3. The Distribute Process [3] is used to perform a cluster-wide
> snapshot procedure, so we've avoided a lot of boilerplate code here.
>
> 4. The design document [1] contains also an internal API for creating
> a consistent local snapshot of requested cache groups and transfer it
> to another node using the FileTransmission protocol [6]. This is one
> of the parts of IEP-28 [5] for cluster rebalancing via partition files
> and an important part for understanding the whole design.
>
> Java API:
> public IgniteInternalFuture createRemoteSnapshot(
> UUID rmtNodeId,
> Map> parts,
> BiConsumer partConsumer);
>
>
> Please, share your thoughts and take a loot at my changes [4].
>
>
> [1]
> https://cwiki.apache.org/confluence/display/IGNITE/IEP-43%3A+Cluster+snapshots
> [2] https://issues.apache.org/jira/browse/IGNITE-11073
> [3]
> https://github.com/apache/ignite/blob/master/modules/core/src/main/java/org/apache/ignite/internal/util/distributed/DistributedProcess.java#L49
> [4] https://github.com/apache/ignite/pull/7607
> [5]
> https://cwiki.apache.org/confluence/display/IGNITE/IEP-28%3A+Cluster+peer-2-peer+balancing#IEP-28:Clusterpeer-2-peerbalancing-Filetransferbetweennodes
> [6]
> https://github.com/apache/ignite/blob/master/modules/core/src/main/java/org/apache/ignite/internal/managers/communication/TransmissionHandler.java#L42
>
>
> On Thu, 28 Feb 2019 at 14:43, Dmitriy Pavlov  wrote:
> >
> > Hi Maxim,
> >
> > I agree with Denis and I have just one concern here.
> >
> > Apache Ignite has quite a long story (started even before Apache), and
> now
> > it has a way too huge number of features. Some of these features
> > - are developed and well known by community members,
> > - some of them were contributed a long time ago and nobody develops it,
> > - and, actually, in some rare cases, nobody in the community knows how it
> > works and how to change it.
> >
> > Such features may attract users, but a bug in it may ruin impression
> about
> > the product. Even worse, nobody can help to solve it, and only user
> himself
> > or herself may be encouraged to contribute a fix.
> >
> > And my concern here, such a big feature should have a number of
> interested
> > contributors, who can support it in case if others lost interest. I will
> be
> > happy if 3-5 members will come and say, yes, I will do a review/I will
> help
> > with further changes.
> >
> > Just to be clear, I'm not against it, and I'll never cast -1 for it, but
> it
> > would be more comfortable to develop this feature with understanding that
> > this work will not be useless.
> >
> > Sincerely,
> > Dmitriy Pavlov
> >
> > ср, 27 февр. 2019 г. в 23:36, Denis Magda :
> >
> > > Maxim,
> > >
> > > GridGain has this exact feature available for Ignite native persistence
> > > deployments. It's not as easy as it might have been seen from the
> > > enablement perspective. Took us many y

[jira] [Created] (IGNITE-12879) Change DiscoveryHook methods naming.

2020-04-08 Thread PetrovMikhail (Jira)
PetrovMikhail created IGNITE-12879:
--

 Summary: Change DiscoveryHook methods naming.
 Key: IGNITE-12879
 URL: https://issues.apache.org/jira/browse/IGNITE-12879
 Project: Ignite
  Issue Type: Improvement
Reporter: PetrovMikhail
Assignee: PetrovMikhail


The DiscoveryHook class method names need to be changed to the following:

{code:java}
public void beforeDiscovery(DiscoverySpiCustomMessage msg)

public void afterDiscovery(DiscoverySpiCustomMessage msg)
{code}

It helps to clarify the purpose of the methods.




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSSION] Hot cache backup

2020-04-08 Thread Maxim Muzafarov
Andrey,


Thanks for your questions. I've also clarified some details on the
IEP-43 [1] page accordingly.

> Does snapshot contain only primary data or backup partitions or both?

A snapshot contains a full copy of the persistent data on each local
node. This means all primary partitions, backup partitions, and the SQL index
file available on the local node are copied to the snapshot.

> Could I create snapshot from m-node cluster and apply it to n-node cluster 
> (n<>m)?

Currently, the restore procedure is fully manual, but it is possible
to restore on a different topology in general. There are a few options
here:
- m == n, the easiest and fastest way
- m < n, the cluster will start and rebalancing will happen (see
testClusterSnapshotWithRebalancing in the PR). If some SQL indexes exist,
it may take quite a long time to complete.
- m > n, the hardest case. For instance, if backups > 1 you can start
a cluster and remove nodes one by one from the baseline. I think this case
should be covered by additional recovery scripts which will be
developed further.

> - Should data node has extra space on persistent store to create snapshot? 
> Or, from another point of view, woild size of temporary file be equal to size 
> of all data on cluster node?

If a cluster has no load, you will only need enough free space to store the
snapshot, which is almost equal to the size of the node's `db` directory.

If a cluster is under load, it needs some extra space to store
intermediate snapshot results. The amount of such space depends on how
fast cache partition files are copied to the snapshot directory (if the disks
are slow). The maximum size of the temporary file per partition
is equal to the size of the corresponding partition file, so in the worst
case you need 3x extra disk space. But according to my measurements,
assuming an SSD is used and the size of each partition is 300 MB, it will
require no more than 1-3% extra space for a cluster under high load.

- What resulted snapshot is, single file or collection of files (one
for every data node)?

Check the example of the snapshot directory structure on the IEP-43
page [1]; this is how a completed snapshot will look.

[1] 
https://cwiki.apache.org/confluence/display/IGNITE/IEP-43%3A+Cluster+snapshots#IEP-43:Clustersnapshots-Restoresnapshot(manually)
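To make the usage concrete, a short sketch based on the API quoted earlier in
this thread (the Void result type of the future and the activation step are my
assumptions; names may still change before the merge):

// A persistent cluster starts inactive, so activate it before taking a snapshot.
Ignite ignite = Ignition.start(new IgniteConfiguration()
    .setDataStorageConfiguration(new DataStorageConfiguration()
        .setDefaultDataRegionConfiguration(new DataRegionConfiguration()
            .setPersistenceEnabled(true))));

ignite.cluster().active(true);

// Proposed snapshot API: creates a cluster-wide snapshot of all persistent caches.
IgniteFuture<Void> fut = ignite.snapshot().createSnapshot("snapshot_2020_04_08");

fut.get(); // Blocks until every node has written its local snapshot part.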

On Wed, 8 Apr 2020 at 17:18, Andrey Dolmatov  wrote:
>
> Hi, Maxim!
> It is very useful feature, great job!
>
> But could you explain me some aspects?
>
>- Does snapshot contain only primary data or backup partitions or both?
>- Could I create snapshot from m-node cluster and apply it to n-node
>cluster (n<>m)?
>- Should data node has extra space on persistent store to create
>snapshot? Or, from another point of view, woild size of temporary file be
>equal to size of all data on cluster node?
>- What resulted snapshot is, single file or collection of files (one for
>every data node)?
>
> I apologize for my questions, but i really interested in such feature.
>
>
> вт, 7 апр. 2020 г. в 22:10, Maxim Muzafarov :
>
> > Igniters,
> >
> >
> > I'd like to back to the discussion of a snapshot operation for Apache
> > Ignite for persistence cache groups and I propose my changes below. I
> > have prepared everything so that the discussion is as meaningful and
> > specific as much as possible:
> >
> > - IEP-43: Cluster snapshot [1]
> > - The Jira task IGNITE-11073 [2]
> > - PR with described changes, Patch Available [4]
> >
> > Changes are ready for review.
> >
> >
> > Here are a few implementation details and my thoughts:
> >
> > 1. Snapshot restore assumed to be manual at the first step. The
> > process will be described on our documentation pages, but it is
> > possible to start node right from the snapshot directory since the
> > directory structure is preserved (check
> > `testConsistentClusterSnapshotUnderLoad` in the PR). We also have some
> > options here about how the restore process must look like:
> > - fully manual snapshot restore (will be documented)
> > - ansible or shell scripts for restore
> > - Java API for restore (I doubt we should go this way).
> >
> > 3. The snapshot `create` procedure creates a snapshot of all
> > persistent caches available on the cluster (see limitations [1]).
> >
> > 2. The snapshot `create` procedure is available through Java API and
> > JMX (control.sh may be implemented further).
> >
> > Java API:
> > IgniteFuture fut = ignite.snapshot()
> > .createSnapshot(name);
> >
> > JMX:
> > SnapshotMXBean mxBean = getMBean(ignite.name());
> > mxBean.createSnapshot(name);
> >
> > 3. The Distribute Process [3] is used to perform a cluster-wide
> > snapshot procedure, so we've avoided a lot of boilerplate code here.
> >
> > 4. The design document [1] contains also an internal API for creating
> > a consistent local snapshot of requested cache groups and transfer it
> > to another node using the FileTransmission protocol [6]. This is one
> > of the parts of IEP-28 [5] for cluster rebalancing via partition files
> > and an important part for u

Re: Apache Ignite 2.8.1 RELEASE [Time, Scope, Manager]

2020-04-08 Thread Вячеслав Коптилин
Folks,

I'd like to add ticket IGNITE-12805 "NullPointerException on node restart
when 3rd party persistence and Ignite native persistence are used" to
ignite-2.8.1 scope.

[1]  https://issues.apache.org/jira/browse/IGNITE-12805

Thanks,
S.

Tue, 7 Apr 2020 at 19:57, Ilya Kasnacheev :

> Hello!
>
> Done!
>
> Regards,
> --
> Ilya Kasnacheev
>
>
> вт, 7 апр. 2020 г. в 12:31, Sergey :
>
> > Hi,
> >
> > I'm proposing to add
> > https://issues.apache.org/jira/browse/IGNITE-12549  (fix iterators/scan
> > queries for replicated caches)
> > to 2.8.1.
> >
> > Best regards,
> > Sergey Kosarev.
> >
> >
> > вс, 5 апр. 2020 г. в 01:22, Saikat Maitra :
> >
> > > Hi,
> > >
> > > I observed that we already have release 2.8.1 branch
> > > https://github.com/apache/ignite/tree/ignite-2.8.1
> > >
> > > In that case we should be ok to merge these 2 open PRs in master to
> make
> > it
> > > available for 2.9.0 release.
> > >
> > > https://github.com/apache/ignite/pull/7240
> > > https://github.com/apache/ignite/pull/7227
> > >
> > > Can you please review and confirm?
> > >
> > > Regards,
> > > Saikat
> > >
> > > On Fri, Mar 20, 2020 at 8:19 AM Maxim Muzafarov 
> > wrote:
> > >
> > > > Igniters,
> > > >
> > > > I support Nikolay Izhikov as the release manager of 2.8.1 Apache
> > > > Ignite release. Since no one else of committers, PMCs expressed a
> > > > desire to lead this release I think we can close this question and
> > > > focus on the release scope and dates.
> > > >
> > > >
> > > > Ivan,
> > > >
> > > > You helped me configuring TC.Bot that time, can you please help again
> > > > and set `ignite-2.8.1` branch for guard under TC.Bot [1]? We should
> > > > start collecting TC statistics for the release branch as early as
> > > > possible.
> > > >
> > > >
> > > > [1] https://mtcga.gridgain.com/guard.html
> > > >
> > > > On Fri, 20 Mar 2020 at 14:48, Taras Ledkov 
> > wrote:
> > > > >
> > > > > Hi,
> > > > >
> > > > > I propose to add the issue [1] related to SQL query execution to
> this
> > > > scope.
> > > > >
> > > > > We had omitted this case and Ignite 2.8 contains serious SQL issue:
> > > > > cursor of a local query is not thread-safe.
> > > > > It is root cause of several SQL issue, e.g. JDBC thin client cannot
> > > > > execute query  from replicated cache,
> > > > > PME may hang after execute such queries from JDBC thin, etc.
> > > > >
> > > > > [1]. https://issues.apache.org/jira/browse/IGNITE-12800
> > > > >
> > > > > On 19.03.2020 17:52, Denis Magda wrote:
> > > > > > Igniters,
> > > > > >
> > > > > > As long as 2.8.1 is inevitable and we already keep adding
> critical
> > > > issues
> > > > > > to the working queue, let's settle on the release time frames and
> > > > decide
> > > > > > who will be a release manager. This is the time proposed by Maxim
> > > and,
> > > > > > personally, I concur with such a schedule:
> > > > > >
> > > > > > - Scope Freeze: April 15, 2020
> > > > > > - Code Freeze: April 22, 2020
> > > > > > - Voting Date: April 27, 2020
> > > > > > - Release Date: May 1, 2020
> > > > > >
> > > > > > Do we agree on this time? Is there anybody who ready to drive the
> > > > release
> > > > > > as a release manager?
> > > > > >
> > > > > > -
> > > > > > Denis
> > > > > >
> > > > > >
> > > > > > On Thu, Mar 19, 2020 at 5:50 AM Sergey Antonov <
> > > > antonovserge...@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > >> Folks,
> > > > > >>
> > > > > >> I'd like to add ticket IGNITE-12774 Transaction hangs after too
> > many
> > > > open
> > > > > >> files NIO exception [1] to ignite-2.8.1 scope.
> > > > > >>
> > > > > >> [1] https://issues.apache.org/jira/browse/IGNITE-12774
> > > > > >>
> > > > > >> ср, 18 мар. 2020 г. в 16:53, Maxim Muzafarov  >:
> > > > > >>
> > > > > >>> Folks,
> > > > > >>>
> > > > > >>> Can we add ignite-2.8.1 [2] branch under TC.Bot protection [1]?
> > > > > >>>
> > > > > >>>
> > > > > >>> [1] https://mtcga.gridgain.com/guard.html
> > > > > >>> [2] https://github.com/apache/ignite/tree/ignite-2.8.1
> > > > > >>>
> > > > > >>> On Mon, 16 Mar 2020 at 16:32, Alexey Goncharuk
> > > > > >>>  wrote:
> > > > >  Folks,
> > > > > 
> > > > >  I've walked through all the commits to master since 2.8 branch
> > was
> > > > cut
> > > > > >>> and
> > > > >  filtered some tickets that in my opinion are worth including
> to
> > > > 2.8.1
> > > > >  release below (note that they are ready end the effort of
> > > including
> > > > > >> them
> > > > > >>> to
> > > > >  the release should be low as long as there are no implicit
> > > > dependencies
> > > > >  between tickets). Please share your opinion on whether we
> should
> > > > > >> include
> > > > >  them to the 2.8.1.
> > > > > 
> > > > >  IGNITE-12717 SQL: index creation refactoring
> > > > >  IGNITE-12590 MERGE INTO query is failing on Ignite client node
> > > > >  IGNITE-12671 Update of partition's states can stuck when
> > rebalance
> > > > >  completed during excha

Re: Ignite Website is Moved to Git

2020-04-08 Thread Denis Magda
Alright folks, we've approached the end of the migration process, and here
are updated instructions for those who will push changes:
https://cwiki.apache.org/confluence/display/IGNITE/Website+Development

-
Denis


On Mon, Apr 6, 2020 at 5:05 PM Denis Magda  wrote:

> A minor correction, that's the valid address, some of you got 404 by
> opening the link shared before:
> https://github.com/apache/ignite-website/
> 
> -
> Denis
>
>
> On Mon, Apr 6, 2020 at 2:48 PM Denis Magda  wrote:
>
>> Just a heads-up for you that the website is now hosted in Git and serves
>> the content from the "master" branch:
>> https://github.com/apache/ignite-website/blob/master
>>
>> I'll update our "Website Development Instructions" in the next couple of
>> days and send a note here.
>>
>> -
>> Denis
>>
>


Remove "This operating system has been tested less rigorously" diagnostic

2020-04-08 Thread Pavel Tupitsyn
Igniters,

Let's remove "This operating system has been tested less rigorously"
diagnostic [1] [2].
It does not make sense:
* All Linux and macOS versions are considered the same
* Windows versions are differentiated
* Windows 10 and all Windows Servers are considered badly tested

None of that is correct. We barely test on macOS. We don't test all the
different Linux distros, old kernels, and so on.

It is hardly possible to make this diagnostic useful. Let's remove it.
Any objections?

[1]
https://github.com/apache/ignite/blob/master/modules/core/src/main/java/org/apache/ignite/internal/GridDiagnostic.java#L94
[2]
https://github.com/apache/ignite/blob/master/modules/core/src/main/java/org/apache/ignite/internal/util/IgniteUtils.java#L6864


[jira] [Created] (IGNITE-12880) IllegalArgumentException on activation of LogExporterSpi

2020-04-08 Thread Denis A. Magda (Jira)
Denis A. Magda created IGNITE-12880:
---

 Summary: IllegalArgumentException on activation of LogExporterSpi
 Key: IGNITE-12880
 URL: https://issues.apache.org/jira/browse/IGNITE-12880
 Project: Ignite
  Issue Type: Bug
Affects Versions: 2.8
Reporter: Denis A. Magda


I tried to enable `LogExporterSpi` and am getting the error below. Running on 
jdk1.8.0_77.jdk and macOS Catalina. See the source file attached.
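The setup was essentially just registering the exporter, roughly as below (the
explicit period is my assumption about the relevant setter inherited from
PushMetricsExporterAdapter; the attached ServerNodeStartup is not reproduced
here). The stack trace follows.

{code:java}
LogExporterSpi logExporter = new LogExporterSpi();

// The IllegalArgumentException below comes from scheduleWithFixedDelay(),
// which suggests the export period it receives is not positive.
logExporter.setPeriod(1_000); // Export metrics every second.

IgniteConfiguration cfg = new IgniteConfiguration()
    .setMetricExporterSpi(logExporter);

Ignite ignite = Ignition.start(cfg);
{code}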

{noformat}
Apr 08, 2020 1:07:00 PM org.apache.ignite.logger.java.JavaLogger error
SEVERE: Got exception while starting (will rollback startup routine).
java.lang.IllegalArgumentException
at 
java.util.concurrent.ScheduledThreadPoolExecutor.scheduleWithFixedDelay(ScheduledThreadPoolExecutor.java:589)
at 
org.apache.ignite.internal.processors.metric.PushMetricsExporterAdapter.spiStart(PushMetricsExporterAdapter.java:56)
at 
org.apache.ignite.internal.managers.GridManagerAdapter.startSpi(GridManagerAdapter.java:297)
at 
org.apache.ignite.internal.processors.metric.GridMetricManager.start(GridMetricManager.java:277)
at 
org.apache.ignite.internal.IgniteKernal.startManager(IgniteKernal.java:1960)
at org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:1171)
at 
org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start0(IgnitionEx.java:2038)
at 
org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start(IgnitionEx.java:1703)
at org.apache.ignite.internal.IgnitionEx.start0(IgnitionEx.java:1117)
at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:637)
at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:563)
at org.apache.ignite.Ignition.start(Ignition.java:321)
at ServerNodeStartup.main(ServerNodeStartup.java:43)

Apr 08, 2020 1:07:00 PM org.apache.ignite.logger.java.JavaLogger error
SEVERE: Failed to stop component (ignoring): GridManagerAdapter [enabled=true, 
name=o.a.i.i.processors.metric.GridMetricManager]
java.lang.NullPointerException
at 
org.apache.ignite.internal.processors.metric.PushMetricsExporterAdapter.spiStop(PushMetricsExporterAdapter.java:71)
at 
org.apache.ignite.internal.managers.GridManagerAdapter.stopSpi(GridManagerAdapter.java:330)
at 
org.apache.ignite.internal.processors.metric.GridMetricManager.stop(GridMetricManager.java:314)
at org.apache.ignite.internal.IgniteKernal.stop0(IgniteKernal.java:2627)
at org.apache.ignite.internal.IgniteKernal.stop(IgniteKernal.java:2499)
at org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:1395)
at 
org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start0(IgnitionEx.java:2038)
at 
org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start(IgnitionEx.java:1703)
at org.apache.ignite.internal.IgnitionEx.start0(IgnitionEx.java:1117)
at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:637)
at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:563)
at org.apache.ignite.Ignition.start(Ignition.java:321)
at ServerNodeStartup.main(ServerNodeStartup.java:43)

[13:07:00] Ignite node stopped wih ERRORS [uptime=00:00:01.303]
Exception in thread "main" class org.apache.ignite.IgniteException: null
at 
org.apache.ignite.internal.util.IgniteUtils.convertException(IgniteUtils.java:1067)
at org.apache.ignite.Ignition.start(Ignition.java:324)
at ServerNodeStartup.main(ServerNodeStartup.java:43)
Caused by: class org.apache.ignite.IgniteCheckedException: null
at org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:1402)
at 
org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start0(IgnitionEx.java:2038)
at 
org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start(IgnitionEx.java:1703)
at org.apache.ignite.internal.IgnitionEx.start0(IgnitionEx.java:1117)
at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:637)
at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:563)
at org.apache.ignite.Ignition.start(Ignition.java:321)
... 1 more
Caused by: java.lang.IllegalArgumentException
at 
java.util.concurrent.ScheduledThreadPoolExecutor.scheduleWithFixedDelay(ScheduledThreadPoolExecutor.java:589)
at 
org.apache.ignite.internal.processors.metric.PushMetricsExporterAdapter.spiStart(PushMetricsExporterAdapter.java:56)
at 
org.apache.ignite.internal.managers.GridManagerAdapter.startSpi(GridManagerAdapter.java:297)
at 
org.apache.ignite.internal.processors.metric.GridMetricManager.start(GridMetricManager.java:277)
at 
org.apache.ignite.internal.IgniteKernal.startManager(IgniteKernal.java:1960)
at org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:1171)
... 7 more

Process finished with 

New Monitoring System: items to complete before GA

2020-04-08 Thread Denis Magda
Igniters,

There was a private discussion between Nikolay Izhikov, Andrey Gura, Alex
Goncharuk and me some time ago where we attempted to outline a list of tasks
to complete before announcing the feature as GA (removing the
experimental flag from the new APIs and deprecating the legacy ones).
Folks, with your permission, let me share the list first, and then we can
talk through each item in detail separately:

1. Add the ability to enable/disable a subset of the metrics for collection:
https://issues.apache.org/jira/browse/IGNITE-11927
Am I right that the task will let us perform the desired configuration of
MetricExporterSpi and its specific implementations? Frankly, I couldn't figure
out how to do that with the "setExportFilter(Predicate filter);" method --
probably, that's just a matter of documentation.
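For context, this is the kind of filtering I was trying to set up (the
registry-name check and the "cache." prefix are assumptions about how the
predicate is meant to be used):

JmxMetricExporterSpi jmxSpi = new JmxMetricExporterSpi();

// Export only cache-related metric registries; everything else is skipped.
jmxSpi.setExportFilter(reg -> reg.name().startsWith("cache."));

IgniteConfiguration cfg = new IgniteConfiguration()
    .setMetricExporterSpi(jmxSpi);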

2. Extension of IgniteSpiContext in order to allow Ignite developers to
implement their own exporters (with no usage of the internal API).

3. Adding/extending/removing public metrics related to the API
facade. Andrey, please elaborate on this item if needed.

Also, I had a chance to experience the API while wearing the hat of an
application developer preparing for the Ignite 2.8 webinar. Please
take my feedback below into account. Overall, I was impressed with the way
the new system is designed and operates. All I had to do was
register the exporters and connect my favorite tool to the cluster. Kudos to
Nikolay, Andrey and everyone else involved. So, here is my list, and I am ready
to create JIRA tickets wherever relevant:

4. After registering an exporter with a node, only a subset of the metrics gets
updated. For instance, the "GetTime_X" set of metrics belonging to a specific
cache gets updated all the time, while CacheGets/Puts/Hits require me to
call "setStatisticsEnabled(true)" on the cache configuration. This is extremely
confusing. Is there any way we can remove this ambiguity? I guess the same
applies to the memory and persistence metrics - there are old ones that might
require turning on the "metricsEnabled" flag for data regions.

5. The system view that should track compute tasks (views.tasks) is always
empty, while I can see a number of tasks running via the compute.jobs registry
or in the "views.views" MXBean. Again, this is confusing from the application
developer's standpoint.

6. Ignite MBeans are listed under "org.apache.{node_id}" but should be
placed under "org.apache.ignite.{node_id}". I guess this is left as-is for
now to preserve backward compatibility.

7. You see hundreds of metrics once you connect through JMX, and it's hard to
sort out the valuable ones. The documentation can do a great job of outlining
the right ones, but we can also add descriptions to the MXBeans explaining what
each metric is used for.

8. Failed to use LogExporterSpi:
https://issues.apache.org/jira/browse/IGNITE-12880

-
Denis


Re: [DISCUSSION] Hot cache backup

2020-04-08 Thread Andrey Dolmatov
I would like to understand your solution more deeply. I hope my questions
are interesting not only to me:

   - What about primary/backup node data consistency? I found that [1]
   Cassandra uses eventually consistent backups, so some backup data could be
   missing from a snapshot. If I apply a snapshot, would Ignite detect this and
   rebalance data to the backup nodes?
   - I can't quite picture how persistence rebalancing works, but according
   to [2] it uses the WAL. A snapshot doesn't contain WAL data, correct? Did
   you analyze alternative snapshot solutions based on the WAL?

[1]
https://docs.datastax.com/en/cassandra-oss/3.0/cassandra/operations/opsAboutSnapshots.html
[2]
https://cwiki.apache.org/confluence/display/IGNITE/Persistent+Store+Architecture#PersistentStoreArchitecture-Rebalancing

Wed, 8 Apr 2020 at 18:22, Maxim Muzafarov :

> Andrey,
>
>
> Thanks for your questions, I've also clarified some details on the
> IEP-43 [1] page according to them.
>
> > Does snapshot contain only primary data or backup partitions or both?
>
> A snapshot contains a full copy of the persistence data on each local
> node. This means all primary and backup partitions, as well as the SQL
> index file, available on the local node are copied to the snapshot.
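>
> For illustration only, here is a minimal sketch of how a snapshot could be
> triggered through the public facade proposed in IEP-43 and implemented in
> the PR (the method names and the config path below are assumptions and may
> still change during review):
>
> import org.apache.ignite.Ignite;
> import org.apache.ignite.Ignition;
>
> public class SnapshotExample {
>     public static void main(String[] args) {
>         // The Spring config path is just a placeholder for this sketch.
>         try (Ignite ignite = Ignition.start("ignite-config.xml")) {
>             // Triggers a cluster-wide snapshot and waits for it to complete.
>             ignite.snapshot().createSnapshot("snapshot_2020_04_08").get();
>         }
>     }
> }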
>
> > Could I create a snapshot from an m-node cluster and apply it to an
> > n-node cluster (n <> m)?
>
> Currently, the restore procedure is fully manual, but in general it is
> possible to restore on a different topology. There are a few options
> here:
> - m == n, the easiest and fastest way
> - m < n, the cluster will start and rebalancing will happen (see
> testClusterSnapshotWithRebalancing in the PR). If some SQL indexes exist,
> it may take quite a long time to complete.
> - m > n, the hardest case. For instance, if backups > 1 you can start
> the cluster and remove nodes one by one from the baseline. I think this
> case should be covered by additional recovery scripts which will be
> developed later.
>
> > - Should a data node have extra space on the persistent store to create
> > a snapshot? Or, from another point of view, would the size of the
> > temporary file be equal to the size of all data on the cluster node?
>
> If the cluster has no load, you will only need free space to store the
> snapshot, which is almost equal to the size of the node's `db` directory.
>
> If the cluster is under load, it needs some extra space to store
> intermediate snapshot results. The amount of such space depends on how
> fast the cache partition files are copied to the snapshot directory (i.e.
> on how slow the disks are). The maximum size of the temporary file for
> each partition is equal to the size of the corresponding partition file,
> so in the worst case you need 3x the disk size. But according to my
> measurements, assuming an SSD is used and each partition is about 300 MB,
> it will require no more than 1-3% of extra space for a cluster under
> high load.
>
> - What is the resulting snapshot, a single file or a collection of files
> (one for every data node)?
>
> Check the example of the snapshot directory structure on the IEP-43
> page [1]; this is how a completed snapshot will look.
>
> [1]
> https://cwiki.apache.org/confluence/display/IGNITE/IEP-43%3A+Cluster+snapshots#IEP-43:Clustersnapshots-Restoresnapshot(manually)
>
> On Wed, 8 Apr 2020 at 17:18, Andrey Dolmatov  wrote:
> >
> > Hi, Maxim!
> > It is a very useful feature, great job!
> >
> > But could you explain some aspects to me?
> >
> >    - Does a snapshot contain only primary data, backup partitions, or both?
> >    - Could I create a snapshot from an m-node cluster and apply it to an
> >    n-node cluster (n <> m)?
> >    - Should a data node have extra space on the persistent store to create
> >    a snapshot? Or, from another point of view, would the size of the
> >    temporary file be equal to the size of all data on the cluster node?
> >    - What is the resulting snapshot, a single file or a collection of
> >    files (one for every data node)?
> >
> > I apologize for my questions, but I am really interested in this feature.
> >
> >
> > Tue, 7 Apr 2020 at 22:10, Maxim Muzafarov :
> >
> > > Igniters,
> > >
> > >
> > > I'd like to get back to the discussion of a snapshot operation for
> > > Apache Ignite persistent cache groups, and I propose my changes below.
> > > I have prepared everything so that the discussion is as meaningful and
> > > specific as possible:
> > >
> > > - IEP-43: Cluster snapshot [1]
> > > - The Jira task IGNITE-11073 [2]
> > > - PR with described changes, Patch Available [4]
> > >
> > > Changes are ready for review.
> > >
> > >
> > > Here are a few implementation details and my thoughts:
> > >
> > > 1. Snapshot restore is assumed to be manual at the first step. The
> > > process will be described on our documentation pages, but it is
> > > possible to start a node right from the snapshot directory since the
> > > directory structure is preserved (check
> > > `testConsistentClusterSnapshotUnderLoad` in the PR). We also have some
> > > options here for how the restore process could look:
> > > - fully manual snapshot restore (will be documented)
> > > - ansible

Re: Out of memory with eviction failure on persisted cache

2020-04-08 Thread Evgenii Zhuravlev
Raymond,

I've seen this behaviour before; it occurs on massive data loading into a
cluster with a small data region. It's not reproducible with normally sized
data regions, which I think is the reason why this issue has not been
fixed yet.
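
Just to illustrate what "normal size" means here, below is a rough sketch of
a persistent data region sized well above the 64-128 MB used in the
reproducer (the numbers and the region name are only an example and should
be tuned to the actual data set):

    import org.apache.ignite.Ignition;
    import org.apache.ignite.configuration.DataRegionConfiguration;
    import org.apache.ignite.configuration.DataStorageConfiguration;
    import org.apache.ignite.configuration.IgniteConfiguration;

    public class DataRegionSizingExample {
        public static void main(String[] args) {
            // Persistent region with a 2 GB cap instead of 64-128 MB.
            DataRegionConfiguration regionCfg = new DataRegionConfiguration()
                .setName("TAGFileBufferQueue")
                .setPersistenceEnabled(true)
                .setInitialSize(512L * 1024 * 1024)
                .setMaxSize(2L * 1024 * 1024 * 1024);

            DataStorageConfiguration storageCfg = new DataStorageConfiguration()
                .setDataRegionConfigurations(regionCfg);

            IgniteConfiguration cfg = new IgniteConfiguration()
                .setDataStorageConfiguration(storageCfg);

            Ignition.start(cfg);
        }
    }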

Best Regards,
Evgenii

Wed, 8 Apr 2020 at 04:23, Raymond Wilson :

> Evgenii,
>
> Have you had a chance to look into the reproducer?
>
> Thanks,
> Raymond.
>
> On Fri, Mar 6, 2020 at 2:51 PM Raymond Wilson 
> wrote:
>
>> Evgenii,
>>
>> I have created a reproducer that triggers the error with the buffer size
>> set to 64Mb. The program.cs/csproj and log for the run that triggered the
>> error are attached.
>>
>> Thanks,
>> Raymond.
>>
>>
>>
>> On Fri, Mar 6, 2020 at 1:08 PM Raymond Wilson 
>> wrote:
>>
>>> The reproducer is my development system, which is hard to share.
>>>
>>> I have increased the size of the buffer to 256Mb, and it copes with the
>>> example data load, though I have not tried larger data sets.
>>>
>>> From an analytical perspective, is this an error that is possible or
>>> expected to occur when using a cache with a persistent data region defined?
>>>
>>> I'll see if I can make a small reproducer.
>>>
>>> On Fri, Mar 6, 2020 at 11:34 AM Evgenii Zhuravlev <
>>> e.zhuravlev...@gmail.com> wrote:
>>>
 Hi Raymond,

 I tried to reproduce it, but without success. Can you share the
 reproducer?

 Also, have you tried to load much more data with a 256 MB data region? I
 think it should work without issues.

 Thanks,
 Evgenii

 Wed, 4 Mar 2020 at 16:14, Raymond Wilson >>> >:

> Hi Evgenii,
>
> I am putting the elements individually using PutIfAbsent(). Each
> element can range from 2 KB to 35 KB in size.
>
> Actually, the process that writes the data does not write it directly to
> the cache; it uses a compute function to send the payload to the process
> that is doing the reading. The compute function applies validation logic
> and uses PutIfAbsent() to write the data into the cache.
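>
> In rough Java terms (our actual code is C#, and the cache name, key and
> validation below are purely illustrative), the flow looks like this:
>
> import org.apache.ignite.Ignite;
> import org.apache.ignite.IgniteCache;
> import org.apache.ignite.Ignition;
> import org.apache.ignite.lang.IgniteRunnable;
> import org.apache.ignite.resources.IgniteInstanceResource;
>
> public class PayloadWriter {
>     /** Closure executed remotely: validates the payload and writes it. */
>     static class WritePayload implements IgniteRunnable {
>         @IgniteInstanceResource
>         private transient Ignite ignite;
>
>         private final String key;
>         private final byte[] payload;
>
>         WritePayload(String key, byte[] payload) {
>             this.key = key;
>             this.payload = payload;
>         }
>
>         @Override public void run() {
>             if (payload == null || payload.length == 0)
>                 return; // validation logic lives here
>
>             IgniteCache<String, byte[]> cache = ignite.cache("TAGFileBuffer");
>             cache.putIfAbsent(key, payload);
>         }
>     }
>
>     public static void main(String[] args) {
>         try (Ignite ignite = Ignition.start()) {
>             // The writer process sends the payload via a compute call.
>             ignite.compute().run(new WritePayload("tag-file-1", new byte[] {1, 2, 3}));
>         }
>     }
> }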
>
> Sorry for the confusion.
>
> Raymond.
>
>
> On Thu, Mar 5, 2020 at 1:09 PM Evgenii Zhuravlev <
> e.zhuravlev...@gmail.com> wrote:
>
>> Hi,
>>
>> How are you loading the data? Do you use putAll or DataStreamer?
>>
>> Evgenii
>>
>> Wed, 4 Mar 2020 at 15:37, Raymond Wilson <
>> raymond_wil...@trimble.com>:
>>
>>> To add some further detail:
>>>
>>> There are two processes interacting with the cache. One process writes
>>> data into the cache, while the second process extracts data from the
>>> cache using a continuous query. The process that reads the data is the
>>> one throwing the exception.
>>>
>>> Increasing the cache size further to 256 MB resolves the problem for
>>> this data set; however, we have data sets more than 100 times this size
>>> which we will be processing.
>>>
>>> Thanks,
>>> Raymond.
>>>
>>>
>>> On Thu, Mar 5, 2020 at 12:10 PM Raymond Wilson <
>>> raymond_wil...@trimble.com>
>>> wrote:
>>>
>>> > I've been having a sporadic issue with the Ignite 2.7.5 JVM halting
>>> > due to an out of memory error related to a cache with persistence
>>> > enabled.
>>> >
>>> > I just upgraded to the C#.Net Ignite 2.7.6 client to pick up support
>>> > for C# affinity functions and now have this issue appearing regularly
>>> > while adding around 400 MB of data into the cache, which is configured
>>> > to have 128 MB of memory (this was 64 MB, but I increased it to see if
>>> > the failure would resolve).
>>> >
>>> > The error I get is:
>>> >
>>> > 2020-03-05 11:58:57,568 [542] ERR [MutableCacheComputeServer] JVM
>>> will be
>>> > halted immediately due to the failure: [failureCtx=FailureContext
>>> > [type=CRITICAL_ERROR, err=class
>>> o.a.i.i.mem.IgniteOutOfMemoryException:
>>> > Failed to find a page for eviction [segmentCapacity=1700,
>>> loaded=676,
>>> > maxDirtyPages=507, dirtyPages=675, cpPages=0, pinnedInSegment=2,
>>> > failedToPrepare=675]
>>> > Out of memory in data region [name=TAGFileBufferQueue,
>>> initSize=128.0 MiB,
>>> > maxSize=128.0 MiB, persistenceEnabled=true] Try the following:
>>> >   ^-- Increase maximum off-heap memory size
>>> > (DataRegionConfiguration.maxSize)
>>> >   ^-- Enable Ignite persistence
>>> > (DataRegionConfiguration.persistenceEnabled)
>>> >   ^-- Enable eviction or expiration policies]]
>>> >
>>> > I'm not running an eviction policy, as I thought this was not required
>>> > for caches with persistence enabled.
>>> >
>>> > I'm surprised by this behaviour, as I expected the persistence
>>> > mechanism to handle it. The error relating to failure to find a page
>>> > for eviction
>

Apache Ignite 2.8.1 RELEASE [Time, Scope, Manager]

2020-04-08 Thread Zhenya Stanilovsky


This ticket [1] is very important too, in my opinion. Anton Kalashnikov, what
do you think?

[1] https://issues.apache.org/jira/browse/IGNITE-12801
>Folks,
>
>I'd like to add ticket IGNITE-12805 "NullPointerException on node restart
>when 3rd party persistence and Ignite native persistence are used" to
>ignite-2.8.1 scope.
>
>[1]  https://issues.apache.org/jira/browse/IGNITE-12805
>
>Thanks,
>S.
>
>Tue, 7 Apr 2020 at 19:57, Ilya Kasnacheev < ilya.kasnach...@gmail.com >:
>
>> Hello!
>>
>> Done!
>>
>> Regards,
>> --
>> Ilya Kasnacheev
>>
>>
>> Tue, 7 Apr 2020 at 12:31, Sergey < macrerg...@gmail.com >:
>>
>> > Hi,
>> >
>> > I'm proposing to add
>> >  https://issues.apache.org/jira/browse/IGNITE-12549 (fix iterators/scan
>> > queries for replicated caches)
>> > to 2.8.1.
>> >
>> > Best regards,
>> > Sergey Kosarev.
>> >
>> >
>> > Sun, 5 Apr 2020 at 01:22, Saikat Maitra < saikat.mai...@gmail.com >:
>> >
>> > > Hi,
>> > >
>> > > I observed that we already have release 2.8.1 branch
>> > >  https://github.com/apache/ignite/tree/ignite-2.8.1
>> > >
>> > > In that case, we should be OK to merge these 2 open PRs into master
>> > > to make them available for the 2.9.0 release.
>> > >
>> > >  https://github.com/apache/ignite/pull/7240
>> > >  https://github.com/apache/ignite/pull/7227
>> > >
>> > > Can you please review and confirm?
>> > >
>> > > Regards,
>> > > Saikat
>> > >
>> > > On Fri, Mar 20, 2020 at 8:19 AM Maxim Muzafarov < mmu...@apache.org >
>> > wrote:
>> > >
>> > > > Igniters,
>> > > >
>> > > > I support Nikolay Izhikov as the release manager of the Apache
>> > > > Ignite 2.8.1 release. Since no other committers or PMC members
>> > > > expressed a desire to lead this release, I think we can close this
>> > > > question and focus on the release scope and dates.
>> > > >
>> > > >
>> > > > Ivan,
>> > > >
>> > > > You helped me configure TC.Bot last time; can you please help again
>> > > > and put the `ignite-2.8.1` branch under TC.Bot guard [1]? We should
>> > > > start collecting TC statistics for the release branch as early as
>> > > > possible.
>> > > >
>> > > >
>> > > > [1]  https://mtcga.gridgain.com/guard.html
>> > > >
>> > > > On Fri, 20 Mar 2020 at 14:48, Taras Ledkov < tled...@gridgain.com >
>> > wrote:
>> > > > >
>> > > > > Hi,
>> > > > >
>> > > > > I propose to add the issue [1], related to SQL query execution,
>> > > > > to this scope.
>> > > > >
>> > > > > We had omitted this case, and Ignite 2.8 contains a serious SQL
>> > > > > issue: the cursor of a local query is not thread-safe.
>> > > > > It is the root cause of several SQL issues, e.g. the JDBC thin
>> > > > > client cannot execute a query against a replicated cache, PME may
>> > > > > hang after executing such queries from JDBC thin, etc.
>> > > > >
>> > > > > [1].  https://issues.apache.org/jira/browse/IGNITE-12800
>> > > > >
>> > > > > On 19.03.2020 17:52, Denis Magda wrote:
>> > > > > > Igniters,
>> > > > > >
>> > > > > > Since 2.8.1 is inevitable and we already keep adding critical
>> > > > > > issues to the working queue, let's settle on the release time
>> > > > > > frame and decide who will be the release manager. This is the
>> > > > > > timeline proposed by Maxim and, personally, I concur with such
>> > > > > > a schedule:
>> > > > > >
>> > > > > > - Scope Freeze: April 15, 2020
>> > > > > > - Code Freeze: April 22, 2020
>> > > > > > - Voting Date: April 27, 2020
>> > > > > > - Release Date: May 1, 2020
>> > > > > >
>> > > > > > Do we agree on this timeline? Is there anybody ready to drive
>> > > > > > the release as the release manager?
>> > > > > >
>> > > > > > -
>> > > > > > Denis
>> > > > > >
>> > > > > >
>> > > > > > On Thu, Mar 19, 2020 at 5:50 AM Sergey Antonov <
>> > > >  antonovserge...@gmail.com >
>> > > > > > wrote:
>> > > > > >
>> > > > > >> Folks,
>> > > > > >>
>> > > > > >> I'd like to add ticket IGNITE-12774 "Transaction hangs after
>> > > > > >> too many open files NIO exception" [1] to the ignite-2.8.1
>> > > > > >> scope.
>> > > > > >>
>> > > > > >> [1]  https://issues.apache.org/jira/browse/IGNITE-12774
>> > > > > >>
>> > > > > >>> Wed, 18 Mar 2020 at 16:53, Maxim Muzafarov < mmu...@apache.org >:
>> > > > > >>
>> > > > > >>> Folks,
>> > > > > >>>
>> > > > > >>> Can we add the ignite-2.8.1 branch [2] under TC.Bot
>> > > > > >>> protection [1]?
>> > > > > >>>
>> > > > > >>>
>> > > > > >>> [1]  https://mtcga.gridgain.com/guard.html
>> > > > > >>> [2]  https://github.com/apache/ignite/tree/ignite-2.8.1
>> > > > > >>>
>> > > > > >>> On Mon, 16 Mar 2020 at 16:32, Alexey Goncharuk
>> > > > > >>> < alexey.goncha...@gmail.com > wrote:
>> > > > >  Folks,
>> > > > > 
>> > > > >  I've walked through all the commits to master since the 2.8
>> > > > >  branch was cut and filtered some tickets that in my opinion are
>> > > > >  worth including in the 2.8.1 release below (note that they are
>> > > > >  ready and the effort of including them to
>> >