Re: Thin client: compute support
> I can't see any usage of request id in query cursors

You are right, cursor id is a separate thing.
Anyway, my point stands.

> client sends long term tasks to nodes and wants to do it with load
> balancing

I still don't get it. Can you please provide an equivalent use case with the
existing "thick" client?

On Mon, Nov 25, 2019 at 11:59 PM Alex Plehanov wrote:

> > And it is fine to use request ID to identify compute tasks (as we do
> > with query cursors).
> I can't see any usage of request id in query cursors. We send a query
> request and get a cursor id in response. After that, we only use the
> cursor id (to get next pages and to close the resource). Did I miss
> something?
>
> > Looks like I'm missing something - how is topology change relevant to
> > executing compute tasks from client?
> It's not relevant directly. But there are some cases where it will be
> helpful. For example, if a client sends long-running tasks to nodes and
> wants to do it with load balancing, it will detect a topology change only
> after some time in the future, with the first response, so load balancing
> will not work.
> Perhaps we can add an optional "topology version" field to the
> OP_COMPUTE_EXECUTE_TASK request to solve this problem.
>
> On Mon, Nov 25, 2019 at 22:42, Pavel Tupitsyn wrote:
>
> > Alex,
> >
> > > we will mix entities from different layers (transport layer and
> > > request body)
> > I would not call our message header (which includes the id) "transport
> > layer".
> > TCP is our transport layer. And it is fine to use request ID to
> > identify compute tasks (as we do with query cursors).
> >
> > > we still can't be sure that the task is successfully started on a
> > > server
> > The request to start the task will fail and we'll get a response
> > indicating that right away
> >
> > > we won't ever know about topology change
> > Looks like I'm missing something - how is topology change relevant to
> > executing compute tasks from client?
> >
> > On Mon, Nov 25, 2019 at 10:17 PM Alex Plehanov wrote:
> >
> > > Pavel, in this case, we will mix entities from different layers
> > > (transport layer and request body), which is not very good. We can
> > > achieve the same behavior with a task id generated on the client
> > > side, but there will be no inter-layer data intersection, and I
> > > think it will be easier to implement on both the client and the
> > > server side. But we still can't be sure that the task is
> > > successfully started on a server. We won't ever know about a
> > > topology change, because the "topology changed" flag will be sent
> > > from server to client only with the response when the task is
> > > completed. Do we accept that?
> > >
> > > On Mon, Nov 25, 2019 at 19:07, Pavel Tupitsyn wrote:
> > >
> > > > Alex,
> > > >
> > > > I have a simpler idea. We already do request id handling in the
> > > > protocol, so:
> > > > - Client sends a normal request to execute compute task. Request
> > > >   ID is generated as usual.
> > > > - As soon as task is completed, a response is received.
> > > >
> > > > As for cancellation - client can send a new request (with new
> > > > request ID) and (in the body) pass the request ID from above
> > > > as a task identifier. As a result, there are two responses:
> > > > - Cancellation response
> > > > - Task response (with proper cancelled status)
> > > >
> > > > That's it, no need to modify the core of the protocol. One
> > > > request - one response.
> > > >
> > > > On Mon, Nov 25, 2019 at 6:20 PM Alex Plehanov
> > > > <plehanov.a...@gmail.com> wrote:
> > > >
> > > > > Pavel, we need to inform the client when the task is
> > > > > completed, we need the ability to cancel the task. I see
> > > > > several ways to implement this:
> > > > >
> > > > > 1. Client sends a request to the server to start a task,
> > > > > server returns a task id in response. Server notifies client
> > > > > when task is completed with a new request (from server to
> > > > > client). Client can cancel the task by sending a new request
> > > > > with operation type "cancel" and task id. In this case, we
> > > > > should implement 2-way requests.
> > > > > 2. Client generates a unique task id and sends a request to
> > > > > the server to start a task, server doesn't reply immediately
> > > > > but waits until the task is completed. Client can cancel the
> > > > > task by sending a new request with operation type "cancel"
> > > > > and task id. In this case, we should decouple request and
> > > > > response on the server side (currently the response is sent
> > > > > right after the request was processed). Also, we can't be
> > > > > sure that the task is successfully started on a server.
> > > > > 3. Client sends a request to the server to start a task,
> > > > > server returns an id in response. Client periodically asks
> > > > > the server about task status. Client can cancel the task by
> > > > > sending a new request with operation type "cancel" and task
> > > > > id. This c
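
For reference, the request-ID-based flow discussed in this thread (a start request that is answered only when the task finishes, plus a separate cancel request that names the start request in its body) might look roughly like the sketch below. This is a minimal in-memory model under stated assumptions, not Ignite code: the class name, method names, and status strings ("OK", "NOT_FOUND", "COMPLETED", "CANCELLED") are all hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical in-memory model of the "one request - one response" scheme; not Ignite code. */
class ComputeProtocolSketch {
    /** A response as the client would see it: the request it answers and a status string. */
    static final class Response {
        final long requestId;
        final String status;

        Response(long requestId, String status) {
            this.requestId = requestId;
            this.status = status;
        }
    }

    /** Responses "sent" to the client, in the order they were produced. */
    final List<Response> sent = new ArrayList<>();

    /** Tasks started but not yet answered, keyed by the id of the start request. */
    private final Map<Long, String> running = new ConcurrentHashMap<>();

    /** Start request: no response is produced until the task completes or is cancelled. */
    void executeTask(long requestId, String taskName) {
        running.put(requestId, taskName);
    }

    /** Task finished normally: the deferred response to the start request goes out. */
    void completeTask(long requestId) {
        if (running.remove(requestId) != null)
            sent.add(new Response(requestId, "COMPLETED"));
    }

    /** Cancel request: has its own request id and carries the start request id in its body. */
    void cancelTask(long cancelRequestId, long startRequestId) {
        boolean wasRunning = running.remove(startRequestId) != null;

        // First response: answers the cancel request itself.
        sent.add(new Response(cancelRequestId, wasRunning ? "OK" : "NOT_FOUND"));

        // Second response: the deferred answer to the original start request.
        if (wasRunning)
            sent.add(new Response(startRequestId, "CANCELLED"));
    }
}
```

In this model every request still gets exactly one response; the only change to the server is that the response to the start request is deferred. Swapping the map key for a client-generated task id would give the alternative argued for elsewhere in the thread without changing the rest of the sketch.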
Re: Thin client: compute support
> Anyway, my point stands.

I can't agree. Why don't you want to use a task id for this? After all, we
don't cancel the request (the request is already processed), we cancel the
task. So it's more convenient to use a task id here.

> Can you please provide equivalent use case with existing "thick" client?

For example:
The cluster consists of one server node.
The client uses some cluster group filtering (for example, the forServers()
cluster group).
The client starts to periodically send (for example, one per minute)
long-running (for example, one hour long) tasks to the cluster.
Meanwhile, several server nodes join the cluster.

In the case of a thick client: all server nodes will be used, and tasks
will be load balanced.
In the case of a thin client: only one server node will be used, and the
client will detect the topology change after an hour.

On Tue, Nov 26, 2019 at 11:50, Pavel Tupitsyn wrote:

> > I can't see any usage of request id in query cursors
> You are right, cursor id is a separate thing.
> Anyway, my point stands.
>
> > client sends long term tasks to nodes and wants to do it with load
> > balancing
> I still don't get it. Can you please provide equivalent use case with
> existing "thick" client?
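
The thick-versus-thin scenario from this message can be put in numbers with a small simulation. It is only a sketch under assumptions that are not in the thread: three extra servers join at minute 5, the client balances round-robin over the nodes it currently knows about, and a thin client learns about new nodes only when the first task response arrives. The class and method names are hypothetical.

```java
/** Hypothetical simulation of the one-task-per-minute, one-hour-task scenario; not Ignite code. */
class TopologyLagSketch {
    /**
     * Counts how many submitted tasks land on each node.
     *
     * @param seesTopologyImmediately true models a thick client; false models a thin client
     *        that learns about new nodes only from the first task response.
     */
    static int[] distribute(boolean seesTopologyImmediately) {
        final int taskMinutes = 60;    // each task runs for one hour
        final int submitMinutes = 60;  // one task per minute, minutes 0..59
        final int joinMinute = 5;      // assumption: three more servers join at minute 5
        final int nodesAfterJoin = 4;  // assumption: cluster grows from 1 to 4 servers

        int[] perNode = new int[nodesAfterJoin];
        int knownNodes = 1;

        for (int minute = 0; minute < submitMinutes; minute++) {
            boolean joined = minute >= joinMinute;
            // The first response cannot arrive before the first task has finished.
            boolean firstResponseArrived = minute >= taskMinutes;

            if (joined && (seesTopologyImmediately || firstResponseArrived))
                knownNodes = nodesAfterJoin;

            // Round-robin over the nodes the client currently knows about.
            perNode[minute % knownNodes]++;
        }

        return perNode;
    }
}
```

With these numbers the thin client sends all 60 tasks to the single node it knows, while the thick client spreads them over all four. An optional "topology version" field on the request, as suggested earlier in the thread, would be one way to close that gap.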
Re: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)
Hello!

I have a hunch that we are trying to build Apache Solr (or Solr Cloud) into
Apache Ignite. I think that's a lot of effort that is not very justified.

I don't think we should try to implement sorting in Apache Ignite, because
it is a lot of work, and a lot of code in our code base which we don't
really want.

Regards,
--
Ilya Kasnacheev

On Fri, Nov 22, 2019 at 20:59, Yuriy Shuliga wrote:

> Dear Igniters,
>
> The first part of TextQuery improvement - a result limit - was developed
> and merged.
> Now we have to develop the most important functionality here - proper
> sorting of the Lucene index response and correct reduction of the results
> for distributed queries.
>
> *There are two Lucene based aspects*
>
> 1. When no sorting fields are used, the documents in the response are
> still ordered by relevance.
> Actually this is the ScoreDoc.score value.
> In order to reduce the distributed results correctly, the score should be
> passed with the response.
>
> 2. When sorting by conventional fields, Lucene should have these
> fields properly indexed, and the
> corresponding Sort object should be applied to Lucene's search call.
> In order to mark those fields, a new annotation like '@SortField' may be
> introduced.
>
> *Reducing on Ignite*
>
> The obvious point of distributed response reduction is the class
> GridCacheDistributedQueryFuture.
> However, @Ivan Pavlukhin mentioned a class with similar functionality:
> ReduceIndexSorted
> What I see here is that it is tangled with H2-related classes
> (org.h2.result.Row) and might not be unified with TextQuery reduction.
>
> I still need support here.
>
> Overall, the goal of this letter is to initiate discussion on TextQuery
> Sorting implementation and come closer to ticket creation.
>
> BR,
> Yuriy Shuliha
>
> On Tue, Oct 22, 2019 at 13:31, Andrey Mashenkov wrote:
>
> > Hi Dmitry, Yuriy.
> >
> > I've found GridCacheQueryFutureAdapter has a newly added AtomicInteger
> > 'total' field and a 'limit' field as primitive int.
> >
> > Both fields are used inside a synchronized block only.
> > So, we can make both private and downgrade AtomicInteger to primitive
> > int.
> >
> > Most likely, these fields can be replaced with one field.
> >
> > On Mon, Oct 21, 2019 at 10:01 PM Dmitriy Pavlov wrote:
> >
> > > Hi Andrey,
> > >
> > > I've checked this ticket's comments, and there is a TC Bot visa
> > > (with no blockers).
> > >
> > > Do you have any concerns related to this patch?
> > >
> > > Sincerely,
> > > Dmitriy Pavlov
> > >
> > > On Thu, Oct 17, 2019 at 16:43, Yuriy Shuliga wrote:
> > >
> > >> Andrey,
> > >>
> > >> Per your request, I created ticket
> > >> https://issues.apache.org/jira/browse/IGNITE-12291 linked to
> > >> https://issues.apache.org/jira/projects/IGNITE/issues/IGNITE-12189
> > >>
> > >> Could you please proceed with the PR merge?
> > >>
> > >> BR,
> > >> Yuriy Shuliha
> > >>
> > >> On Wed, Oct 9, 2019 at 12:52, Andrey Mashenkov wrote:
> > >>
> > >> > Hi Yuri,
> > >> >
> > >> > To get access to TC Bot you should register as a TeamCity user
> > >> > [1], if you didn't do this already.
> > >> > Then you will be able to authorize on the Ignite TC Bot page with
> > >> > the same credentials.
> > >> >
> > >> > [1] https://ci.ignite.apache.org/registerUser.html
> > >> >
> > >> > On Fri, Oct 4, 2019 at 3:10 PM Yuriy Shuliga wrote:
> > >> >
> > >> >> Andrew,
> > >> >>
> > >> >> I have corrected the PR according to your notes. Please review.
> > >> >> What will be the next steps in order to merge it in?
> > >> >>
> > >> >> Y.
> > >> >>
> > >> >> On Thu, Oct 3, 2019 at 17:47, Andrey Mashenkov
> > >> >> <andrey.mashen...@gmail.com> wrote:
> > >> >>
> > >> >> > Yuri,
> > >> >> >
> > >> >> > I'm done with the review.
> > >> >> > No crime found, just a trivial compatibility bug.
> > >> >> >
> > >> >> > On Thu, Oct 3, 2019 at 3:54 PM Yuriy Shuliga wrote:
> > >> >> >
> > >> >> > > Denis,
> > >> >> > >
> > >> >> > > Thank you for your attention to this.
> > >> >> > > As for now, the
> > >> >> > > https://issues.apache.org/jira/browse/IGNITE-12189 ticket
> > >> >> > > is still pending review.
> > >> >> > > Do we have a chance to move it forward somehow?
> > >> >> > >
> > >> >> > > BR,
> > >> >> > > Yuriy Shuliha
> > >> >> > >
> > >> >> > > On Mon, Sep 30, 2019 at 23:35, Denis Magda wrote:
> > >> >> > >
> > >> >> > > > Yuriy,
> > >> >> > > >
> > >> >> > > > I've seen you opening a pull-request with the first
> > >> >> > > > changes:
> > >> >> > > > https://issues.apache.org/jira/browse/IGNITE-12189
> > >> >> > > >
> > >> >> > > > Alex Scherbakov and Ivan, are you the right guys to do
> > >> >> > > > the review?
> > >> >> > > >
> > >> >> > > > -
> > >> >> > > > Denis
> > >> >> > > >
> > >> >> > > > On Fri, Sep 27, 2019 at 8:48 AM Павлухин Иван
> > >> >> > > > <vololo...@gmail.com> wrote:
> > >> >> > > >
> > >> >> > > > > Yuriy,
> > >> >> > > > >
> > >> >> > > > > Thank you for providing details! Quite interesting.
> > >> >> > > > >
> > >> >> > > > > Yes, we already have support of distributed limit and
> > >> >> > > > > merging
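
Point 1 of the sorting proposal in this thread (pass the score along with each response so the reducer can order results) amounts to a k-way merge of per-node result lists that are each already sorted by descending score, cut off at the query limit. The sketch below shows that merge in isolation. It is not the GridCacheDistributedQueryFuture or ReduceIndexSorted code: the ScoredEntry type and the reduce method are hypothetical, and the sketch assumes scores computed on different nodes are directly comparable, which may not hold when each node's Lucene index has its own term statistics.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical reducer for per-node text query results; not Ignite code. */
class ScoreReduceSketch {
    /** A cache key together with the relevance score its node reported (assumed shape). */
    static final class ScoredEntry {
        final String key;
        final float score;

        ScoredEntry(String key, float score) {
            this.key = key;
            this.score = score;
        }
    }

    /**
     * Merges per-node lists, each already sorted by descending score,
     * into one list of at most {@code limit} entries in descending score order.
     */
    static List<ScoredEntry> reduce(List<List<ScoredEntry>> perNode, int limit) {
        // Each heap element is a cursor {nodeIndex, position}; the cursor pointing
        // at the highest-scored unread entry is polled first.
        PriorityQueue<int[]> heads = new PriorityQueue<>((a, b) -> Float.compare(
            perNode.get(b[0]).get(b[1]).score,
            perNode.get(a[0]).get(a[1]).score));

        for (int n = 0; n < perNode.size(); n++) {
            if (!perNode.get(n).isEmpty())
                heads.add(new int[] {n, 0});
        }

        List<ScoredEntry> out = new ArrayList<>();

        while (out.size() < limit && !heads.isEmpty()) {
            int[] cursor = heads.poll();
            List<ScoredEntry> page = perNode.get(cursor[0]);

            out.add(page.get(cursor[1]));

            // Advance this node's cursor if it has more entries.
            if (cursor[1] + 1 < page.size())
                heads.add(new int[] {cursor[0], cursor[1] + 1});
        }

        return out;
    }
}
```

Sorting by conventional fields (point 2 of the proposal) would need the same merge with a comparator over the sort field values instead of the score, so those values would also have to travel with each response.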
Re[2]: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)
Ilya Kasnacheev, what is the problem with having Solr functionality in
Ignite?

Thanks!

> Tuesday, November 26, 2019, 13:50 +03:00 from Ilya Kasnacheev:
>
> Hello!
>
> I have a hunch that we are trying to build Apache Solr (or Solr Cloud)
> into Apache Ignite. I think that's a lot of effort that is not very
> justified.
>
> I don't think we should try to implement sorting in Apache Ignite,
> because it is a lot of work, and a lot of code in our code base which we
> don't really want.
>
> Regards,
> --
> Ilya Kasnacheev
Re: Re[2]: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)
Hello!

The problem here is that Solr is a multi-year effort by a lot of people. We
can't match that.

Maybe we could integrate with Solr/Solr Cloud instead, by feeding our cache
information into their storage for indexing and relying on their own
mechanisms for distributed IR sorting?

Regards,
--
Ilya Kasnacheev

On Tue, Nov 26, 2019 at 13:59, Zhenya Stanilovsky wrote:

> Ilya Kasnacheev, what is the problem with having Solr functionality in
> Ignite?
>
> Thanks!
Re[4]: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)
OK, let's forget Solr and go the ASF way: if Yuriy proves this
functionality is helpful and submits a PR for it, why not?

Isn't that so?

> Tuesday, November 26, 2019, 14:06 +03:00 from Ilya Kasnacheev:
>
> Hello!
>
> The problem here is that Solr is a multi-year effort by a lot of people.
> We can't match that.
>
> Maybe we could integrate with Solr/Solr Cloud instead, by feeding our
> cache information into their storage for indexing and relying on their
> own mechanisms for distributed IR sorting?
>
> Regards,
> --
> Ilya Kasnacheev
Re: Re[4]: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)
Hello!

ASF way should probably start with an IEP :)

Regards,
--
Ilya Kasnacheev


Tue, 26 Nov 2019 at 14:12, Zhenya Stanilovsky:

> OK, let's forget Solr and go the ASF way. If Yuriy proves this
> functionality is helpful and submits a PR, why not?
>
> Isn't it?
>
>> Tuesday, 26 November 2019, 14:06 +03:00, from Ilya Kasnacheev <ilya.kasnach...@gmail.com>:
>>
>> Hello!
>>
>> The problem here is that Solr is a multi-year effort by a lot of people. We
>> can't match that.
>>
>> Maybe we could integrate with Solr/Solr Cloud instead, by feeding our cache
>> information into their storage for indexing and relying on their own
>> mechanisms for distributed IR sorting?
>>
>> Regards,
>> --
>> Ilya Kasnacheev
>>
>> Tue, 26 Nov 2019 at 13:59, Zhenya Stanilovsky <arzamas...@mail.ru.invalid>:
>>
>>> Ilya Kasnacheev, what is the problem in Solr with Ignite functionality?
>>>
>>> Thanks!
>>>
>>>> Tuesday, 26 November 2019, 13:50 +03:00, from Ilya Kasnacheev <ilya.kasnach...@gmail.com>:
>>>>
>>>> Hello!
>>>>
>>>> I have a hunch that we are trying to build Apache Solr (or Solr Cloud) into
>>>> Apache Ignite. I think that's a lot of effort that is not very justified.
>>>>
>>>> I don't think we should try to implement sorting in Apache Ignite, because
>>>> it is a lot of work, and a lot of code in our code base which we don't
>>>> really want.
>>>>
>>>> Regards,
>>>> --
>>>> Ilya Kasnacheev
>>>>
>>>> Fri, 22 Nov 2019 at 20:59, Yuriy Shuliga <shul...@gmail.com>:
>>>>
>>>>> Dear Igniters,
>>>>>
>>>>> The first part of the TextQuery improvement - a result limit - was developed
>>>>> and merged.
>>>>> Now we have to develop the most important functionality here: proper sorting
>>>>> of Lucene index responses and correct reduction of them for distributed
>>>>> queries.
>>>>>
>>>>> *There are two Lucene-based aspects*
>>>>>
>>>>> 1. When no sorting fields are used, the documents in the response are still
>>>>> ordered by relevance.
>>>>> Actually this is the ScoreDoc.score value.
>>>>> In order to reduce the distributed results correctly, the score should be
>>>>> passed with the response.
>>>>>
>>>>> 2. When sorting by conventional fields, Lucene should have these
>>>>> fields properly indexed, and the
>>>>> corresponding Sort object should be applied to Lucene's search call.
>>>>> In order to mark those fields, a new annotation like '@SortField' may be
>>>>> introduced.
>>>>>
>>>>> *Reducing on Ignite*
>>>>>
>>>>> The obvious point of distributed response reduction is the class
>>>>> GridCacheDistributedQueryFuture.
>>>>> However, @Ivan Pavlukhin mentioned a class with similar functionality:
>>>>> ReduceIndexSorted.
>>>>> From what I see, it is tangled with H2-related classes
>>>>> (org.h2.result.Row) and might not be unified with TextQuery reduction.
>>>>>
>>>>> I still need support here.
>>>>>
>>>>> Overall, the goal of this letter is to initiate a discussion on the TextQuery
>>>>> sorting implementation and come closer to ticket creation.
>>>>>
>>>>> BR,
>>>>> Yuriy Shuliha
>>>>>
>>>>> Tue, 22 Oct 2019 at 13:31, Andrey Mashenkov <andrey.mashen...@gmail.com> wrote:
>>>>>
>>>>>> Hi Dmitry, Yuriy.
>>>>>>
>>>>>> I've found that GridCacheQueryFutureAdapter has a newly added AtomicInteger
>>>>>> 'total' field and a 'limit' field as a primitive int.
>>>>>>
>>>>>> Both fields are used inside a synchronized block only.
>>>>>> So, we can make both private and downgrade AtomicInteger to a primitive
>>>>>> int.
>>>>>>
>>>>>> Most likely, these fields can be replaced with one field.
>>>>>>
>>>>>> On Mon, Oct 21, 2019 at 10:01 PM Dmitriy Pavlov <dpav...@apache.org> wrote:
>>>>>>
>>>>>>> Hi Andrey,
>>>>>>>
>>>>>>> I've checked this ticket's comments, and there is a TC Bot visa (with no
>>>>>>> blockers).
>>>>>>>
>>>>>>> Do you have any concerns related to this patch?
>>>>>>>
>>>>>>> Sincerely,
>>>>>>> Dmitriy Pavlov
>>>>>>>
>>>>>>> Thu, 17 Oct 2019 at 16:43, Yuriy Shuliga <shul...@gmail.com>:
>>>>>>>
>>>>>>>> Andrey,
>>>>>>>>
>>>>>>>> Per your request, I created ticket
>>>>>>>> https://issues.apache.org/jira/browse/IGNITE-12291 linked to
>>>>>>>> https://issues.apache.org/jira/projects/IGNITE/issues/IGNITE-12189
>>>>>>>>
>>>>>>>> Could you please proceed with the PR merge?
>>>>>>>>
>>>>>>>> BR,
>>>>>>>> Yuriy Shuliha
>>>>>>>>
>>>>>>>> Wed, 9 Oct 2019 at 12:52, Andrey Mashenkov <andrey.mashen...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi Yuri,
>>>>>>>>>
>>>>>>>>> To get access to TC Bot you should register as a TeamCity user [1], if you
>>>>>>>>> didn't do this already.
>>>>>>>>> Then you will be able to authorize on Ignite TC Bo
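Editorial sketch of the first Lucene aspect in Yuriy's quoted message: if every node returns its local hits already ordered by descending relevance and the score travels with each entry, the reducing side can do a k-way merge and stop at the limit. This is only an illustration under assumptions. `ScoreMergeSketch`, `ScoredEntry` and `mergeByScore` are hypothetical names, not existing Ignite or Lucene classes, and the float field merely stands in for the ScoreDoc.score value mentioned in the thread. Scores computed against separate per-node indexes depend on local term statistics, so comparing them across nodes is itself an approximation.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

class ScoreMergeSketch {
    /** Hypothetical result entry: a cache key plus the relevance score sent by the node. */
    static final class ScoredEntry {
        final String key;
        final float score;

        ScoredEntry(String key, float score) {
            this.key = key;
            this.score = score;
        }
    }

    /** Cursor over one node's response: the current head plus the remaining entries. */
    private static final class Cursor {
        ScoredEntry head;
        final Iterator<ScoredEntry> rest;

        Cursor(Iterator<ScoredEntry> it) {
            rest = it;
            head = it.next();
        }
    }

    /**
     * K-way merge of per-node lists, each already sorted by descending score.
     * Returns at most 'limit' entries in global descending score order.
     */
    static List<ScoredEntry> mergeByScore(List<List<ScoredEntry>> perNode, int limit) {
        // Max-heap on the head score of each node's cursor.
        PriorityQueue<Cursor> heap =
            new PriorityQueue<>((a, b) -> Float.compare(b.head.score, a.head.score));

        for (List<ScoredEntry> list : perNode)
            if (!list.isEmpty())
                heap.add(new Cursor(list.iterator()));

        List<ScoredEntry> out = new ArrayList<>();

        while (out.size() < limit && !heap.isEmpty()) {
            Cursor c = heap.poll();
            out.add(c.head);

            // Re-insert the cursor if that node still has entries left.
            if (c.rest.hasNext()) {
                c.head = c.rest.next();
                heap.add(c);
            }
        }

        return out;
    }
}
```

The heap holds one cursor per node, so the reducer never needs more than the current head of each response in order to emit the next global result. That is why the score has to be shipped with every entry.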
Re: Thin client: compute support
> After all, we don't cancel request
We do cancel a request to perform a task. We may and should use this to
cancel any other request in the future.

> Client uses some cluster group filtration (for example forServers() cluster group)
Please see above - Aleksandr Shapkin described how we store filtered cluster
groups on the client. We don't store node IDs, we store the actual filters. So
every new request will apply those filters on the server side, using the most
recent set of nodes.

    var myGrp = cluster.forServers().forAttribute("foo"); // This does not issue any server requests, just builds an object with filters on client

    while (true)
        myGrp.compute().executeTask("bar"); // Every request includes filters, and filters are applied on the server side

On Tue, Nov 26, 2019 at 1:42 PM Alex Plehanov wrote:

> > Anyway, my point stands.
> I can't agree. Why don't you want to use task id for this? After all, we
> don't cancel the request (the request is already processed), we cancel the
> task. So it's more convenient to use task id here.
>
> > Can you please provide equivalent use case with existing "thick" client?
> For example:
> Cluster consists of one server node.
> Client uses some cluster group filtration (for example forServers() cluster
> group).
> Client starts to send periodically (for example 1 per minute) long-term
> (for example 1 hour long) tasks to the cluster.
> Meanwhile, several server nodes joined the cluster.
>
> In case of thick client: All server nodes will be used, tasks will be load
> balanced.
> In case of thin client: Only one server node will be used, client will
> detect topology change after an hour.
>
>
> Tue, 26 Nov 2019 at 11:50, Pavel Tupitsyn:
>
> > > I can't see any usage of request id in query cursors
> > You are right, cursor id is a separate thing.
> > Anyway, my point stands.
> >
> > > client sends long term tasks to nodes and wants to do it with load
> > > balancing
> > I still don't get it.
Can you please provide equivalent use case with > > existing "thick" client? > > > > > > On Mon, Nov 25, 2019 at 11:59 PM Alex Plehanov > > wrote: > > > > > > And it is fine to use request ID to identify compute tasks (as we do > > with > > > query cursors). > > > I can't see any usage of request id in query cursors. We send query > > request > > > and get cursor id in response. After that, we only use cursor id (to > get > > > next pages and to close the resource). Did I miss something? > > > > > > > Looks like I'm missing something - how is topology change relevant to > > > executing compute tasks from client? > > > It's not relevant directly. But there are some cases where it will be > > > helpful. For example, if client sends long term tasks to nodes and > wants > > to > > > do it with load balancing it will detect topology change only after > some > > > time in the future with the first response, so load balancing will no > > work. > > > Perhaps we can add optional "topology version" field to the > > > OP_COMPUTE_EXECUTE_TASK request to solve this problem. > > > > > > > > > пн, 25 нояб. 2019 г. в 22:42, Pavel Tupitsyn : > > > > > > > Alex, > > > > > > > > > we will mix entities from different layers (transport layer and > > request > > > > body) > > > > I would not call our message header (which includes the id) > "transport > > > > layer". > > > > TCP is our transport layer. And it is fine to use request ID to > > identify > > > > compute tasks (as we do with query cursors). > > > > > > > > > we still can't be sure that the task is successfully started on a > > > server > > > > The request to start the task will fail and we'll get a response > > > indicating > > > > that right away > > > > > > > > > we won't ever know about topology change > > > > Looks like I'm missing something - how is topology change relevant to > > > > executing compute tasks from client? 
> > > > > > > > On Mon, Nov 25, 2019 at 10:17 PM Alex Plehanov < > > plehanov.a...@gmail.com> > > > > wrote: > > > > > > > > > Pavel, in this case, we will mix entities from different layers > > > > (transport > > > > > layer and request body), it's not very good. The same behavior we > can > > > > > achieve with generated on client-side task id, but there will be no > > > > > inter-layer data intersection and I think it will be easier to > > > implement > > > > on > > > > > both client and server-side. But we still can't be sure that the > task > > > is > > > > > successfully started on a server. We won't ever know about topology > > > > change, > > > > > because topology changed flag will be sent from server to client > only > > > > with > > > > > a response when the task will be completed. Are we accept that? > > > > > > > > > > пн, 25 нояб. 2019 г. в 19:07, Pavel Tupitsyn >: > > > > > > > > > > > Alex, > > > > > > > > > > > > I have a simpler idea. We already do request id handling in the > > > > protocol, > > > > > > so: > > > > > > - Client sends a normal request to execute compute task. Request > ID > > > is > > > > > > generated a
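Editorial sketch of the filter-based cluster groups Pavel describes in this message: the client keeps only filters, never node IDs, and the server evaluates them against whatever the topology is when a request arrives. It is a simplified model under assumptions. `FilteredGroupSketch`, `Node`, `ClusterGroup` and `resolve` are hypothetical names and are not the Ignite thin client API. `forAttribute` here only checks that the attribute is present.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

class FilteredGroupSketch {
    /** Hypothetical node descriptor. */
    static final class Node {
        final String id;
        final boolean client;
        final Map<String, String> attrs;

        Node(String id, boolean client, Map<String, String> attrs) {
            this.id = id;
            this.client = client;
            this.attrs = attrs;
        }
    }

    /** Client-side group: an immutable list of filters, never a list of node IDs. */
    static final class ClusterGroup {
        private final List<Predicate<Node>> filters;

        ClusterGroup() {
            this(Collections.emptyList());
        }

        private ClusterGroup(List<Predicate<Node>> filters) {
            this.filters = filters;
        }

        ClusterGroup forServers() {
            return with(n -> !n.client);
        }

        ClusterGroup forAttribute(String name) {
            return with(n -> n.attrs.containsKey(name));
        }

        /** Returns a new group with one more filter; the receiver is left untouched. */
        private ClusterGroup with(Predicate<Node> p) {
            List<Predicate<Node>> copy = new ArrayList<>(filters);
            copy.add(p);
            return new ClusterGroup(copy);
        }

        boolean matches(Node n) {
            for (Predicate<Node> f : filters)
                if (!f.test(n))
                    return false;
            return true;
        }
    }

    /** Server side: evaluates the filters against the topology as it is right now. */
    static List<String> resolve(List<Node> topology, ClusterGroup grp) {
        List<String> ids = new ArrayList<>();
        for (Node n : topology)
            if (grp.matches(n))
                ids.add(n.id);
        return ids;
    }

    public static void main(String[] args) {
        // Building the group issues no server request.
        ClusterGroup grp = new ClusterGroup().forServers().forAttribute("foo");

        List<Node> topology = new ArrayList<>();
        topology.add(new Node("s1", false, Map.of("foo", "1")));
        System.out.println(resolve(topology, grp)); // [s1]

        topology.add(new Node("s2", false, Map.of("foo", "2"))); // a server joins later
        topology.add(new Node("c1", true, Map.of("foo", "3")));  // client nodes are filtered out
        System.out.println(resolve(topology, grp)); // [s1, s2]
    }
}
```

Because the same `grp` object resolves to a different node set after the join, a client built this way does not need to learn about the topology change before its next request is balanced across the new servers.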
[jira] [Created] (IGNITE-12395) Client nodes fail on SPARC: No session found at TcpCommunicationSpi.createNioSession
Ilya Kasnacheev created IGNITE-12395:

Summary: Client nodes fail on SPARC: No session found at TcpCommunicationSpi.createNioSession
Key: IGNITE-12395
URL: https://issues.apache.org/jira/browse/IGNITE-12395
Project: Ignite
Issue Type: Bug
Components: general
Reporter: Ilya Kasnacheev
Assignee: Ilya Kasnacheev

This happens when running client nodes from tests with startClient(), since it does not do optimize() and this causes socket binding issues on SPARC.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
Re: Re[4]: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)
Folks,

IEP is an Ignite-specific thing. In fact, I suppose that we are
already doing it the ASF way by having this dev-list discussion =)

As for me, implementing the "limit" feature for text queries is not big
enough to warrant an IEP. But we might need to create one for the next features.

Tue, 26 Nov 2019 at 15:06, Ilya Kasnacheev:

> Hello!
>
> ASF way should probably start with an IEP :)
>
> Regards,
> --
> Ilya Kasnacheev
>
>
> Tue, 26 Nov 2019 at 14:12, Zhenya Stanilovsky:
>
>> OK, let's forget Solr and go the ASF way. If Yuriy proves this
>> functionality is helpful and submits a PR, why not?
>>
>> Isn't it?
>>
>>> Tuesday, 26 November 2019, 14:06 +03:00, from Ilya Kasnacheev <ilya.kasnach...@gmail.com>:
>>>
>>> Hello!
>>>
>>> The problem here is that Solr is a multi-year effort by a lot of people. We
>>> can't match that.
>>>
>>> Maybe we could integrate with Solr/Solr Cloud instead, by feeding our cache
>>> information into their storage for indexing and relying on their own
>>> mechanisms for distributed IR sorting?
>>>
>>> Regards,
>>> --
>>> Ilya Kasnacheev
>>>
>>> Tue, 26 Nov 2019 at 13:59, Zhenya Stanilovsky <arzamas...@mail.ru.invalid>:
>>>
>>>> Ilya Kasnacheev, what is the problem in Solr with Ignite functionality?
>>>>
>>>> Thanks!
>>>>
>>>>> Tuesday, 26 November 2019, 13:50 +03:00, from Ilya Kasnacheev <ilya.kasnach...@gmail.com>:
>>>>>
>>>>> Hello!
>>>>>
>>>>> I have a hunch that we are trying to build Apache Solr (or Solr Cloud) into
>>>>> Apache Ignite. I think that's a lot of effort that is not very justified.
>>>>>
>>>>> I don't think we should try to implement sorting in Apache Ignite, because
>>>>> it is a lot of work, and a lot of code in our code base which we don't
>>>>> really want.
>>>>>
>>>>> Regards,
>>>>> --
>>>>> Ilya Kasnacheev
>>>>>
>>>>> пт, 22 нояб. 2019 г.
в 20:59, Yuriy Shuliga < shul...@gmail.com >: > > >> > > > >> >> Dear Igniters, > > >> >> > > >> >> The first part of TextQuery improvement - a result limit - was > > developed > > >> >> and merged. > > >> >> Now we have to develop most important functionality here - proper > > >> sorting > > >> >> of Lucene index response and correct reducing of them for distributed > > >> >> queries. > > >> >> > > >> >> *There are two Lucene based aspects* > > >> >> > > >> >> 1. In case of using no sorting fields, the documents in response are > > >> still > > >> >> ordered by relevance. > > >> >> Actually this is ScoreDoc.score value. > > >> >> In order to reduce the distributed results correctly, the score > > should > > >> be > > >> >> passed with response. > > >> >> > > >> >> 2. When sorting by conventional fields, then Lucene should have these > > >> >> fields properly indexed and > > >> >> corresponding Sort object should be applied to Lucene's search call. > > >> >> In order to mark those fields a new annotation like '@SortField' may > > be > > >> >> introduced. > > >> >> > > >> >> *Reducing on Ignite * > > >> >> > > >> >> The obvious point of distributed response reduction is class > > >> >> GridCacheDistributedQueryFuture. > > >> >> Though, @Ivan Pavlukhin mentioned class with similar functionality: > > >> >> ReduceIndexSorted > > >> >> What I see here, that it is tangled with H2 related classes ( > > >> >> org.h2.result.Row) and might not be unified with TextQuery reduction. > > >> >> > > >> >> Still need a support here. > > >> >> > > >> >> Overall, the goal of this letter is to initiate discussion on > > TextQuery > > >> >> Sorting implementation and come closer to ticket creation. > > >> >> > > >> >> BR, > > >> >> Yuriy Shuliha > > >> >> > > >> >> вт, 22 жовт. 2019 о 13:31 Andrey Mashenkov < > > andrey.mashen...@gmail.com > > >> > > > >> >> пише: > > >> >> > > >> >> > Hi Dmitry, Yuriy. 
> > >> >> > > > >> >> > I've found GridCacheQueryFutureAdapter has newly added > > AtomicInteger > > >> >> > 'total' field and 'limit; field as primitive int. > > >> >> > > > >> >> > Both fields are used inside synchronized block only. > > >> >> > So, we can make both private and downgrade AtomicInteger to > > primitive > > >> >> int. > > >> >> > > > >> >> > Most likely, these fields can be replaced with one field. > > >> >> > > > >> >> > > > >> >> > > > >> >> > On Mon, Oct 21, 2019 at 10:01 PM Dmitriy Pavlov < > > dpav...@apache.org > > >> > > > >> >> > wrote: > > >> >> > > > >> >> > > Hi Andrey, > > >> >> > > > > >> >> > > I've checked this ticket comments, and there is a TC Bot visa > > (with > > >> no > > >> >> > > blockers). > > >> >> > > > > >> >> > > Do you have any concerns related to this patch? > > >> >> > > > > >> >> > > Sincerely, > > >> >> > > Dmitriy Pavlov > > >> >> > > > > >> >> > > чт, 17 окт. 2019 г. в 16:43, Yuriy Shuliga < shul...@gmail.com > > >: > > >> >> > > > > >> >> > >> Andrey, > > >> >> > >> > > >> >> > >> Per you request, I created ticket > > >> >> > >> https://issues.apache.or
[jira] [Created] (IGNITE-12396) [ML] Random Forest generates NaN for a part of models on small datasets
Alexey Zinoviev created IGNITE-12396:

Summary: [ML] Random Forest generates NaN for a part of models on small datasets
Key: IGNITE-12396
URL: https://issues.apache.org/jira/browse/IGNITE-12396
Project: Ignite
Issue Type: Bug
Components: ml
Affects Versions: 3.0
Reporter: Alexey Zinoviev
Assignee: Alexey Zinoviev
Fix For: 3.0

    @Override public Double predict(Vector features) {
        double[] predictions = new double[models.size()];

        for (int i = 0; i < models.size(); i++)
            predictions[i] = models.get(i).predict(features);

        return predictionsAggregator.apply(predictions);
    }

predictionsAggregator gets predictions from a lot of models, and part of them return null and it could be aggregated.

First of all, handle this in the Aggregator (using a threshold for the amount of broken models before aggregation).

Also, RandomForest trees should return Double.NaN - it should fail or throw a message after the training.

I've tested with 100 or 1000 rows and it fails; it doesn't fail on 10 000 rows.

RF generates a few models with one LEAF node with an empty val (Double.NaN by default).

--
This message was sent by Atlassian Jira (v8.3.4#803005)
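Editorial sketch of the handling proposed in the ticket (skip broken models in the aggregator, but refuse to aggregate once too many of them are broken). This is an assumption-laden illustration, not the Ignite ML code: `NanTolerantAggregatorSketch`, `meanIgnoringNaN` and `maxBrokenShare` are hypothetical names, and a plain mean stands in for whatever predictionsAggregator actually computes.

```java
class NanTolerantAggregatorSketch {
    /**
     * Averages predictions, skipping the NaN values produced by broken models.
     *
     * @param predictions    one prediction per model, NaN for a broken model
     * @param maxBrokenShare largest tolerated share of NaN predictions, in [0, 1]
     * @throws IllegalStateException if too many (or all) models returned NaN
     */
    static double meanIgnoringNaN(double[] predictions, double maxBrokenShare) {
        if (predictions.length == 0)
            throw new IllegalArgumentException("No predictions to aggregate");

        int broken = 0;
        double sum = 0.0;

        for (double p : predictions) {
            if (Double.isNaN(p))
                broken++;
            else
                sum += p;
        }

        double brokenShare = (double) broken / predictions.length;

        // Fail loudly instead of silently returning NaN or a mean over too few models.
        if (broken == predictions.length || brokenShare > maxBrokenShare)
            throw new IllegalStateException(
                broken + " of " + predictions.length + " models returned NaN");

        return sum / (predictions.length - broken);
    }
}
```

For example, `meanIgnoringNaN(new double[] {1.0, Double.NaN, 3.0}, 0.5)` averages the two valid predictions, while the same input with a threshold of 0.2 throws, since one third of the models are broken.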
[jira] [Created] (IGNITE-12397) Apache spark - org.apache.ignite.IgniteIllegalStateException
Shenson Joseph created IGNITE-12397:
---

Summary: Apache spark - org.apache.ignite.IgniteIllegalStateException
Key: IGNITE-12397
URL: https://issues.apache.org/jira/browse/IGNITE-12397
Project: Ignite
Issue Type: Bug
Components: spark
Affects Versions: 2.7
Reporter: Shenson Joseph

Running a Spark application with Apache Ignite 2.7.0 throws the following exception. This causes Spark applications to die. Do we have a workaround for this problem?

9/11/26 16:38:50 INFO CoarseGrainedExecutorBackend: Got assigned task 156810
9/11/26 16:38:50 INFO CoarseGrainedExecutorBackend: Got assigned task 156810
19/11/26 16:38:50 INFO Executor: Running task 0.2 in stage 1564.0 (TID 156810)
19/11/26 16:38:50 ERROR Executor: Exception in task 0.2 in stage 1564.0 (TID 156810)
class org.apache.ignite.IgniteIllegalStateException: Ignite instance with provided name doesn't exist. Did you call Ignition.start(..) to start an Ignite instance? [name=shared-grid]
    at org.apache.ignite.internal.IgnitionEx.grid(IgnitionEx.java:1390)
    at org.apache.ignite.Ignition.ignite(Ignition.java:531)
    at org.apache.ignite.spark.impl.package$.ignite(package.scala:86)
    at org.apache.ignite.spark.impl.IgniteRelationProvider$$anonfun$configProvider$1$2.apply(IgniteRelationProvider.scala:226)
    at org.apache.ignite.spark.impl.IgniteRelationProvider$$anonfun$configProvider$1$2.apply(IgniteRelationProvider.scala:223)
    at org.apache.ignite.spark.Once.apply(IgniteContext.scala:224)
    at org.apache.ignite.spark.IgniteContext.ignite(IgniteContext.scala:145)
    at org.apache.ignite.spark.impl.IgniteSQLDataFrameRDD.compute(IgniteSQLDataFrameRDD.scala:65)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:55)
    at org.apache.spark.scheduler.Task.run(Task.scala:123)
    at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408)
    at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
19/11/26 16:38:50 INFO CoarseGrainedExecutorBackend: Got assigned task 156811
19/11/26 16:38:50 INFO Executor: Running task 0.3 in stage 1562.0 (TID 156811)
19/11/26 16:38:50 ERROR Executor: Exception in task 0.3 in stage 1562.0 (TID 156811)
class org.apache.ignite.IgniteIllegalStateException: Ignite instance with provided name doesn't exist.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
RE: Re: Thin client: compute support
Alex,

>Filtering logic for OP_CLUSTER_GROUP_GET_NODE_IDS looks overcomplicated. Do
>we need server-side filtering at all? Wouldn't it be better to send basic
>info (ids, order, flags) for all nodes (there is relatively small amount of
>data) and extended info (attributes) for selected list of nodes? In this
>case, we can do basic node filtration on client-side (forClients(),
>forServers(), forNodeIds(), forOthers(), etc).

I think it's OK to have server-side filtering. This allows us to have a
single endpoint for all clients and thus ensures that they will all get the
same consistent list of nodes in return, regardless of their internal
implementations.
The only protocol change here compared to GetNodes() is an optional
filter object that in most cases is represented by a list of key-value
attribute pairs.

>Do you use standard ClusterNode serialization? There are also metrics
>serialized with ClusterNode, do we need it on thin client? There are other
>interfaces exist to show metrics, I think it's redundant to export metrics
>to thin clients too.

Along with the node ids, we could pass a flag indicating whether we are
interested in the detailed node representation, say with metrics, or only in
a basic format. This flag should be disabled by default.
We could implement a GetNodeMetrics(nodeId) method later on if we decide to.

From: Pavel Tupitsyn
Sent: Tuesday, November 26, 2019 5:44 PM
To: dev
Subject: Re: Thin client: compute support

> After all, we don't cancel request
We do cancel a request to perform a task. We may and should use this to
cancel any other request in the future.

> Client uses some cluster group filtration (for example forServers() cluster group)
Please see above - Aleksandr Shapkin described how we store filtered cluster
groups on the client. We don't store node IDs, we store the actual filters. So
every new request will apply those filters on the server side, using the most
recent set of nodes.
var myGrp = cluster.forServers().forAttribute("foo"); // This does not issue any server requests, just builds an object with filters on client while (true) myGrp.compute().executeTask("bar"); // Every request includes filters, and filters are applied on the server side On Tue, Nov 26, 2019 at 1:42 PM Alex Plehanov wrote: > > Anyway, my point stands. > I can't agree. Why you don't want to use task id for this? After all, we > don't cancel request (request is already processed), we cancel the task. So > it's more convenient to use task id here. > > > Can you please provide equivalent use case with existing "thick" client? > For example: > Cluster consists of one server node. > Client uses some cluster group filtration (for example forServers() cluster > group). > Client starts to send periodically (for example 1 per minute) long-term > (for example 1 hour long) tasks to the cluster. > Meanwhile, several server nodes joined the cluster. > > In case of thick client: All server nodes will be used, tasks will be load > balanced. > In case of thin client: Only one server node will be used, client will > detect topology change after an hour. > > > вт, 26 нояб. 2019 г. в 11:50, Pavel Tupitsyn : > > > > I can't see any usage of request id in query cursors > > You are right, cursor id is a separate thing. > > Anyway, my point stands. > > > > > client sends long term tasks to nodes and wants to do it with load > > balancing > > I still don't get it. Can you please provide equivalent use case with > > existing "thick" client? > > > > > > On Mon, Nov 25, 2019 at 11:59 PM Alex Plehanov > > wrote: > > > > > > And it is fine to use request ID to identify compute tasks (as we do > > with > > > query cursors). > > > I can't see any usage of request id in query cursors. We send query > > request > > > and get cursor id in response. After that, we only use cursor id (to > get > > > next pages and to close the resource). Did I miss something? 
> > > > > > > Looks like I'm missing something - how is topology change relevant to > > > executing compute tasks from client? > > > It's not relevant directly. But there are some cases where it will be > > > helpful. For example, if client sends long term tasks to nodes and > wants > > to > > > do it with load balancing it will detect topology change only after > some > > > time in the future with the first response, so load balancing will no > > work. > > > Perhaps we can add optional "topology version" field to the > > > OP_COMPUTE_EXECUTE_TASK request to solve this problem. > > > > > > > > > пн, 25 нояб. 2019 г. в 22:42, Pavel Tupitsyn : > > > > > > > Alex, > > > > > > > > > we will mix entities from different layers (transport layer and > > request > > > > body) > > > > I would not call our message header (which includes the id) > "transport > > > > layer". > > > > TCP is our transport layer. And it is fine to use request ID to > > identify > > > > compu
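Editorial sketch of the endpoint Aleksandr outlines at the top of this message: one server-side handler that takes an optional filter of key-value attribute pairs plus a flag, off by default, asking for the detailed node representation. Everything named here (`NodeListEndpointSketch`, `NodeInfo`, `GetNodesRequest`, `handle`) is hypothetical and is not part of the thin client protocol. The sketch only models the behaviour described in the text.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

class NodeListEndpointSketch {
    /** Hypothetical node representation returned to a thin client. */
    static final class NodeInfo {
        final String id;
        final Map<String, String> attrs;
        final Map<String, Double> metrics; // null in the basic representation

        NodeInfo(String id, Map<String, String> attrs, Map<String, Double> metrics) {
            this.id = id;
            this.attrs = attrs;
            this.metrics = metrics;
        }
    }

    /** Hypothetical request: optional attribute filter plus a "detailed" flag. */
    static final class GetNodesRequest {
        final Map<String, String> attrFilter;
        final boolean detailed;

        /** Defaults: no filtering, basic representation. */
        GetNodesRequest() {
            this(Collections.emptyMap(), false);
        }

        GetNodesRequest(Map<String, String> attrFilter, boolean detailed) {
            this.attrFilter = attrFilter;
            this.detailed = detailed;
        }
    }

    /** Server-side handler: the same filtering for every client implementation. */
    static List<NodeInfo> handle(List<NodeInfo> topology, GetNodesRequest req) {
        List<NodeInfo> res = new ArrayList<>();

        for (NodeInfo n : topology) {
            // A node matches when it carries every requested key-value pair.
            if (n.attrs.entrySet().containsAll(req.attrFilter.entrySet()))
                res.add(req.detailed ? n : new NodeInfo(n.id, n.attrs, null));
        }

        return res;
    }
}
```

Keeping the metrics out of the default response mirrors the suggestion that the flag be disabled by default. A client that wants them sets `detailed` explicitly.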
[MTCGA]: new failures in builds [4792987] needs to be handled
Hi Igniters,

I've detected some new issue on TeamCity to be handled. You are more than welcomed to help.

*New Critical Failure in master Platform C++ (Linux)*
https://ci.ignite.apache.org/buildConfiguration/IgniteTests24Java8_PlatformCLinux?branch=%3Cdefault%3E
No changes in the build

- Here's a reminder of what contributors were agreed to do https://cwiki.apache.org/confluence/display/IGNITE/How+to+Contribute
- Should you have any questions please contact dev@ignite.apache.org

Best Regards,
Apache Ignite TeamCity Bot
https://github.com/apache/ignite-teamcity-bot
Notification generated at 21:29:59 26-11-2019
Re: [DISCUSS] Pub/Sub Streamer Implementation
Hi all!

If someone could give me read access to the Ignite Extensions job on
TeamCity, it would greatly help me (username gkatzioura).
I suppose the builds get triggered automatically on a pull request.

Kind regards,
Emmanouil

*Emmanouil Gkatziouras*
https://egkatzioura.com/ |
https://www.linkedin.com/in/gkatziourasemmanouil/
https://github.com/gkatzioura


On Sun, 24 Nov 2019 at 21:11, Emmanouil Gkatziouras wrote:

> Hi Saikat!
>
> I just rebased with the flink branch. Unfortunately I do not have read
> access to the TeamCity link you provided and thus cannot evaluate the tests
> for my pull request.
> I guess it has to do with ignite extensions being new.
> My username is gkatzioura and my email is the one I am using for this message.
>
> Thank you for your time.
>
>
> *Emmanouil Gkatziouras*
> https://egkatzioura.com/ |
> https://www.linkedin.com/in/gkatziourasemmanouil/
> https://github.com/gkatzioura
>
>
> On Sun, 24 Nov 2019 at 18:03, Saikat Maitra wrote:
>
>> Hi Emmanouil,
>>
>> The latest build on TeamCity has passed on the Flink pull request.
>>
>> https://ci.ignite.apache.org/viewLog.html?buildId=4788928&buildTypeId=IgniteExtensions_Build&tab=buildResultsDiv&branch_IgniteExtensions=pull%2F1%2Fhead
>>
>> You should be able to take the changes I made in my PR and run the build on
>> your pull request.
>>
>> Please let me know if you have any questions.
>>
>> Regards,
>> Saikat
>>
>> On Sun, Nov 24, 2019 at 10:44 AM Saikat Maitra wrote:
>>
>>> Hi Emmanouil,
>>>
>>> I have fixed the Flink module test suites and have set up a new project in
>>> TeamCity.
>>>
>>> https://ci.ignite.apache.org/project.html?projectId=IgniteExtensions&tab=projectOverview
>>>
>>> I am looking into setting up the build.
>>>
>>> Regards,
>>> Saikat
>>>
>>> On Sat, Nov 23, 2019 at 9:52 AM Saikat Maitra wrote:
>>>
>>>> Hi Emmanouil,
>>>>
>>>> Thank you for your email.
Yes, the plan is once the Flink PR is merged >> it >> >> will provide the Licence, parent POMs etc and you can rebase from >> master >> >> branch and apply the changes on top of it. >> >> >> >> I am getting some issues with test failures in local with >> >> GridTestProperties as the test.properties is not present in this new >> >> project but should be available from dependencies. >> >> >> >> Once I address this issue, I will go ahead and merge the changes and >> then >> >> we can take it from there. >> >> >> >> Regards, >> >> Saikat >> >> >> >> On Fri, Nov 22, 2019 at 5:23 PM Emmanouil Gkatziouras < >> >> gkatzio...@gmail.com> wrote: >> >> >> >>> Hi all, >> >>> >> >>> I made my first pull request [1]. Since this project is brand new (no >> >>> parent poms, licensing), I reckoned it was best to use Saikat's >> branch on >> >>> Flink. >> >>> I suppose the Flink branch will be merged first. If not please give me >> >>> guidelines on how I should proceed next. >> >>> >> >>> Kind regards >> >>> Emmanouil >> >>> >> >>> [1] https://github.com/apache/ignite-extensions/pull/2 >> >>> >> >>> *Emmanouil Gkatziouras* >> >>> https://egkatzioura.com/ | >> >>> https://www.linkedin.com/in/gkatziourasemmanouil/ >> >>> https://github.com/gkatzioura >> >>> >> >>> >> >>> On Fri, 22 Nov 2019 at 20:55, Denis Magda wrote: >> >>> >> >>> > Awesome, ping us whenever you're ready! >> >>> > >> >>> > - >> >>> > Denis >> >>> > >> >>> > >> >>> > On Fri, Nov 22, 2019 at 12:52 PM Emmanouil Gkatziouras < >> >>> > gkatzio...@gmail.com> >> >>> > wrote: >> >>> > >> >>> > > Hi all! >> >>> > > >> >>> > > I am sorry for being late on that. I was trying to refactor the >> test >> >>> in >> >>> > > order not to be in need of any external tools or spinning up a >> >>> server. >> >>> > > I did forked the new repo and indeed my changes there, so a pull >> >>> request >> >>> > is >> >>> > > a matter of time! 
>> >>> > > >> >>> > > Kind regards >> >>> > > >> >>> > > *Emmanouil Gkatziouras* >> >>> > > https://egkatzioura.com/ | >> >>> > > https://www.linkedin.com/in/gkatziourasemmanouil/ >> >>> > > https://github.com/gkatzioura >> >>> > > >> >>> > > >> >>> > > On Fri, 22 Nov 2019 at 20:45, Denis Magda >> wrote: >> >>> > > >> >>> > > > Hi Emmanouil, >> >>> > > > >> >>> > > > Do you have any questions or need any support from the >> community? >> >>> > > > >> >>> > > > - >> >>> > > > Denis >> >>> > > > >> >>> > > > >> >>> > > > On Sun, Nov 10, 2019 at 3:07 PM Saikat Maitra < >> >>> saikat.mai...@gmail.com >> >>> > > >> >>> > > > wrote: >> >>> > > > >> >>> > > >> Hi Emmanouil, >> >>> > > >> >> >>> > > >> Can you please take a looks at dev utils, if this is something >> >>> you are >> >>> > > >> looking for >> >>> > > >> https://github.com/apache/ignite/tree/master/modules/dev-utils >> ? >> >>> > > >> >> >>> > > >> IMO, if you can release Pub/Sub server in maven and then use >> it as >> >>> > > >> dependency, that would be great. >> >>> > > >> >> >>> > > >> Regards, >> >>> > > >> Saikat >> >>> > > >> >> >>> > > >> On Sun, Nov 10, 2019 at 5:00 PM Saikat Maitra < >> >>> > saikat.mai...@gmail.com> >>
Re: Re[4]: Text queries/indexes (GridLuceneIndex, @QueryTextFiled)
I don't see anything wrong if Yuriy is willing to carry on and keep
enhancing our full-text search support, which lacks basic capabilities. The
basics should be available. If anybody needs an advanced feature, they can
introduce Solr or Elasticsearch into the final architecture of the app.

Folks, who of us can help Yuriy with the questions asked? Most likely the SQL
experts are the best candidates here.

-
Denis


On Tue, Nov 26, 2019 at 8:52 AM Ivan Pavlukhin wrote:

> Folks,
>
> IEP is an Ignite-specific thing. In fact, I suppose that we are
> already doing it the ASF way by having this dev-list discussion =)
>
> As for me, implementing the "limit" feature for text queries is not big
> enough to warrant an IEP. But we might need to create one for the next features.
>
> Tue, 26 Nov 2019 at 15:06, Ilya Kasnacheev:
>
>> Hello!
>>
>> ASF way should probably start with an IEP :)
>>
>> Regards,
>> --
>> Ilya Kasnacheev
>>
>>
>> Tue, 26 Nov 2019 at 14:12, Zhenya Stanilovsky:
>>
>>> OK, let's forget Solr and go the ASF way. If Yuriy proves this
>>> functionality is helpful and submits a PR, why not?
>>>
>>> Isn't it?
>>>
>>>> Tuesday, 26 November 2019, 14:06 +03:00, from Ilya Kasnacheev <ilya.kasnach...@gmail.com>:
>>>>
>>>> Hello!
>>>>
>>>> The problem here is that Solr is a multi-year effort by a lot of people.
>>>> We can't match that.
>>>>
>>>> Maybe we could integrate with Solr/Solr Cloud instead, by feeding our
>>>> cache information into their storage for indexing and relying on their own
>>>> mechanisms for distributed IR sorting?
>>>>
>>>> Regards,
>>>> --
>>>> Ilya Kasnacheev
>>>>
>>>>
>>>> Tue, 26 Nov 2019 at 13:59, Zhenya Stanilovsky <arzamas...@mail.ru.invalid>:
>>>>
>>>>> Ilya Kasnacheev, what is the problem in Solr with Ignite functionality?
>>>>>
>>>>> Thanks!
> > > >> > > > >> >Вторник, 26 ноября 2019, 13:50 +03:00 от Ilya Kasnacheev < > > > >> ilya.kasnach...@gmail.com >: > > > >> > > > > >> >Hello! > > > >> > > > > >> >I have a hunch that we are trying to build Apache Solr (or Solr > Cloud) > > > >> into > > > >> >Apache Ignite. I think that's a lot of effort that is not very > > > justified. > > > >> > > > > >> >I don't think we should try to implement sorting in Apache Ignite, > > > because > > > >> >it is a lot of work, and a lot of code in our code base which we > don't > > > >> >really want. > > > >> > > > > >> >Regards, > > > >> >-- > > > >> >Ilya Kasnacheev > > > >> > > > > >> > > > > >> >пт, 22 нояб. 2019 г. в 20:59, Yuriy Shuliga < shul...@gmail.com > >: > > > >> > > > > >> >> Dear Igniters, > > > >> >> > > > >> >> The first part of TextQuery improvement - a result limit - was > > > developed > > > >> >> and merged. > > > >> >> Now we have to develop most important functionality here - proper > > > >> sorting > > > >> >> of Lucene index response and correct reducing of them for > distributed > > > >> >> queries. > > > >> >> > > > >> >> *There are two Lucene based aspects* > > > >> >> > > > >> >> 1. In case of using no sorting fields, the documents in response > are > > > >> still > > > >> >> ordered by relevance. > > > >> >> Actually this is ScoreDoc.score value. > > > >> >> In order to reduce the distributed results correctly, the score > > > should > > > >> be > > > >> >> passed with response. > > > >> >> > > > >> >> 2. When sorting by conventional fields, then Lucene should have > these > > > >> >> fields properly indexed and > > > >> >> corresponding Sort object should be applied to Lucene's search > call. > > > >> >> In order to mark those fields a new annotation like '@SortField' > may > > > be > > > >> >> introduced. > > > >> >> > > > >> >> *Reducing on Ignite * > > > >> >> > > > >> >> The obvious point of distributed response reduction is class > > > >> >> GridCacheDistributedQueryFuture. 
> > > >> >> Though, @Ivan Pavlukhin mentioned a class with similar functionality:
> > > >> >> ReduceIndexSorted.
> > > >> >> What I see here is that it is tangled with H2-related classes
> > > >> >> (org.h2.result.Row) and might not be unified with TextQuery reduction.
> > > >> >>
> > > >> >> I still need support here.
> > > >> >>
> > > >> >> Overall, the goal of this letter is to initiate discussion on the
> > > >> >> TextQuery sorting implementation and come closer to ticket creation.
> > > >> >>
> > > >> >> BR,
> > > >> >> Yuriy Shuliha
> > > >> >>
> > > >> >> Tue, 22 Oct 2019 at 13:31 Andrey Mashenkov <andrey.mashen...@gmail.com>
> > > >> >> wrote:
> > > >> >>
> > > >> >> > Hi Dmitry, Yuriy.
> > > >> >> >
> > > >> >> > I've found that GridCacheQueryFutureAdapter has a newly added
> > > >> >> > AtomicInteger 'total' field and a 'limit' field as a primitive int.
> > > >> >> >
> > > >> >> > Both fields are used inside a synchronized block only.
> > > >> >> > So, we can make both private and downgrade the AtomicInteger to a
> > > >> >> > primitive int.
> > > >> >> >
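
To make the first Lucene aspect in Yuriy's message more concrete (each node returns hits ordered by relevance, and the score travels with the response so that the reducer can merge correctly), here is a minimal sketch in Java. It is only an illustration under stated assumptions: the names `ScoreMergeSketch`, `ScoredEntry` and `mergeByScore` are hypothetical and are not Ignite or Lucene APIs, each node's page is assumed to arrive already sorted by descending score, and a non-positive limit is taken to mean "no limit".

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical sketch; none of these names exist in Apache Ignite or Lucene. */
final class ScoreMergeSketch {
    /** One hit as a node might report it: a key plus its relevance score (cf. ScoreDoc.score). */
    static final class ScoredEntry {
        final String key;
        final float score;

        ScoredEntry(String key, float score) {
            this.key = key;
            this.score = score;
        }
    }

    /** Read position inside one node's page; the page is assumed sorted by descending score. */
    private static final class Cursor {
        final List<ScoredEntry> page;
        int pos;

        Cursor(List<ScoredEntry> page) {
            this.page = page;
        }

        ScoredEntry head() {
            return page.get(pos);
        }
    }

    /** K-way merge of per-node pages into one list ordered by descending score, cut at limit. */
    static List<ScoredEntry> mergeByScore(List<List<ScoredEntry>> perNode, int limit) {
        // Max-heap keyed by the score of each cursor's current head entry.
        PriorityQueue<Cursor> heap = new PriorityQueue<>(
            Comparator.comparingDouble((Cursor c) -> c.head().score).reversed());

        for (List<ScoredEntry> page : perNode) {
            if (!page.isEmpty())
                heap.add(new Cursor(page));
        }

        List<ScoredEntry> merged = new ArrayList<>();

        while (!heap.isEmpty() && (limit <= 0 || merged.size() < limit)) {
            Cursor c = heap.poll();   // cursor whose head has the highest score
            merged.add(c.head());
            c.pos++;                  // advance only while the cursor is outside the heap
            if (c.pos < c.page.size())
                heap.add(c);
        }

        return merged;
    }
}
```

One caveat for the discussion: scores computed against separate per-node Lucene indexes may not be directly comparable, because relevance depends on per-index term statistics, so a merge like this is a starting point and not a complete answer.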
Re: [DISCUSS] Pub/Sub Streamer Implementation
Hi Emmanouil,

I have added you as a contributor in the Ignite Extensions project. Can you
please check and confirm whether you are able to see the project and execute
a build on your pull request?

Regards,
Saikat

On Tue, Nov 26, 2019 at 12:49 PM Emmanouil Gkatziouras wrote:

> Hi all!
>
> If someone could give me read access to the Ignite Extensions job on
> TeamCity, it would greatly help me (username gkatzioura).
> I suppose the builds get triggered automatically on a pull request.
>
> Kind regards,
> Emmanouil
>
> *Emmanouil Gkatziouras*
> https://egkatzioura.com/ |
> https://www.linkedin.com/in/gkatziourasemmanouil/
> https://github.com/gkatzioura
>
>
> On Sun, 24 Nov 2019 at 21:11, Emmanouil Gkatziouras wrote:
>
> > Hi Saikat!
> >
> > I just rebased onto the Flink branch. Unfortunately, I do not have read
> > access to the TeamCity link you provided and thus cannot evaluate the
> > tests for my pull request.
> > I guess it has to do with Ignite Extensions being new.
> > My username is gkatzioura and my email is the one I am using for this
> > email.
> >
> > Thank you for your time.
> >
> >
> > *Emmanouil Gkatziouras*
> > https://egkatzioura.com/ |
> > https://www.linkedin.com/in/gkatziourasemmanouil/
> > https://github.com/gkatzioura
> >
> >
> > On Sun, 24 Nov 2019 at 18:03, Saikat Maitra wrote:
> >
> >> Hi Emmanouil,
> >>
> >> The latest build on TeamCity has passed on the Flink pull request.
> >>
> >> https://ci.ignite.apache.org/viewLog.html?buildId=4788928&buildTypeId=IgniteExtensions_Build&tab=buildResultsDiv&branch_IgniteExtensions=pull%2F1%2Fhead
> >>
> >> You should be able to take the changes I made in my PR and run the build
> >> on your pull request.
> >>
> >> Please let me know if you have any questions.
> >>
> >> Regards,
> >> Saikat
> >>
> >> On Sun, Nov 24, 2019 at 10:44 AM Saikat Maitra wrote:
> >>
> >> > Hi Emmanouil,
> >> >
> >> > I have fixed the Flink module test suites and have set up a new project
> >> > in TeamCity.
> >> >
> >> > https://ci.ignite.apache.org/project.html?projectId=IgniteExtensions&tab=projectOverview
> >> >
> >> > I am looking into setting up the build.
> >> >
> >> > Regards,
> >> > Saikat
> >> >
> >> > On Sat, Nov 23, 2019 at 9:52 AM Saikat Maitra <saikat.mai...@gmail.com>
> >> > wrote:
> >> >
> >> >> Hi Emmanouil,
> >> >>
> >> >> Thank you for your email. Yes, the plan is once the Flink PR is merged
> >> >> it will provide the Licence, parent POMs etc and you can rebase from
> >> >> master branch and apply the changes on top of it.
> >> >>
> >> >> I am getting some issues with test failures in local with
> >> >> GridTestProperties as the test.properties is not present in this new
> >> >> project but should be available from dependencies.
> >> >>
> >> >> Once I address this issue, I will go ahead and merge the changes and
> >> >> then we can take it from there.
> >> >>
> >> >> Regards,
> >> >> Saikat
> >> >>
> >> >> On Fri, Nov 22, 2019 at 5:23 PM Emmanouil Gkatziouras <
> >> >> gkatzio...@gmail.com> wrote:
> >> >>
> >> >>> Hi all,
> >> >>>
> >> >>> I made my first pull request [1]. Since this project is brand new (no
> >> >>> parent poms, licensing), I reckoned it was best to use Saikat's
> >> >>> branch on Flink.
> >> >>> I suppose the Flink branch will be merged first. If not please give me
> >> >>> guidelines on how I should proceed next.
> >> >>>
> >> >>> Kind regards
> >> >>> Emmanouil
> >> >>>
> >> >>> [1] https://github.com/apache/ignite-extensions/pull/2
> >> >>>
> >> >>> *Emmanouil Gkatziouras*
> >> >>> https://egkatzioura.com/ |
> >> >>> https://www.linkedin.com/in/gkatziourasemmanouil/
> >> >>> https://github.com/gkatzioura
> >> >>>
> >> >>>
> >> >>> On Fri, 22 Nov 2019 at 20:55, Denis Magda wrote:
> >> >>>
> >> >>> > Awesome, ping us whenever you're ready!
> >> >>> >
> >> >>> > -
> >> >>> > Denis
> >> >>> >
> >> >>> >
> >> >>> > On Fri, Nov 22, 2019 at 12:52 PM Emmanouil Gkatziouras <
> >> >>> > gkatzio...@gmail.com>
> >> >>> > wrote:
> >> >>> >
> >> >>> > > Hi all!
> >> >>> > >
> >> >>> > > I am sorry for being late on that. I was trying to refactor the
> >> >>> > > test so that it does not need any external tools or a running
> >> >>> > > server.
> >> >>> > > I did fork the new repo and have my changes there, so a pull
> >> >>> > > request is a matter of time!
> >> >>> > >
> >> >>> > > Kind regards
> >> >>> > >
> >> >>> > > *Emmanouil Gkatziouras*
> >> >>> > > https://egkatzioura.com/ |
> >> >>> > > https://www.linkedin.com/in/gkatziourasemmanouil/
> >> >>> > > https://github.com/gkatzioura
> >> >>> > >
> >> >>> > >
> >> >>> > > On Fri, 22 Nov 2019 at 20:45, Denis Magda wrote:
> >> >>> > >
> >> >>> > > > Hi Emmanouil,
> >> >>> > > >
> >> >>> > > > Do you have any questions or need any support from the
> >> >>> > > > community?
> >> >>> > > >
> >> >>> > > > -
> >> >>> > > > Denis
> >> >>> > > >
> >> >>> > > >
> >> >>> > > > On Sun, Nov 10, 2019 at 3:07 PM Saikat Maitra < s
[DISCUSS] dependencies and release process for Ignite Extensions
Hello,

I wanted to connect and discuss the release process for ignite-extensions.

As of today, because all our integrations were released together, they could
be built against the latest snapshot; for example, the current build depends
on 2.8.0-SNAPSHOT.

If we are making ignite-extensions a separate project with a different
release cycle, then it makes sense for its dependencies on core modules to be
based on released artifacts; for example, the dependency for ignite-core
would be 2.7.6.

Please review and share your thoughts.

PR https://github.com/apache/ignite-extensions/pull/1

Regards
Saikat