Re: Question about discover sortByRelevancy

Aleix Pol Sat, 11 Apr 2020 11:43:45 -0700

On Fri, Apr 10, 2020 at 8:59 AM S. Champailler <schampail...@skynet.be> wrote:
>
> Thx for your answer. But I'm afraid I don't fully understand it.
>
> > We do get sorted feeds that at
> > the moment we're merging naively.
>
> Ok, that's the way I understand the problem as well.
>
> > One thing to check first of all is to make sure that the exact
> > resource you're fetching is already being provided first at its own
> > feed.
>
> Let's have an example. Imagine we have 3 feeds : A,B,C.
> In each of these, we have 3 resources : a1, a2, a3; b1,b2,b3; c1,c2,c3. They 
> have names and descriptions (separated by a dash; a star indicates when the 
> titanium word is present):
>
> a1* = TitaddOns - An extension to Titanium
> a2* = The titanium engine - lorem ipsum ...
> a3 = plasma - lorem ipsum ...
> b1 = konqueror - lorem ipsum ...
> b2 = mozilla - lorem ipsum ...
> b3 = dillo - lorem ipsum ...
> c1* = titanium - lorem ipsum ...
> c2* = titanium italics - not to confuse with titanium
> c3* = titanium black - lorem ipsum ...
>
> The search string in Discover is : "titanium". I assume it's a single word.
>
> I assume the content providers will return : a1, c1, c2, a2, c3 (in no 
> particular order; for example a content provider might not sort, may take 
> long to answer, etc). Basically, what happened in this step is just a 
> filtering, no sorting. I assume that filtering is based on the presence of 
> the search term in the name of the resource, or in its description. The 
> filtering is done at the provider level because if it wasn't, then we'd have
> to ask each content provider for its complete list of resources, which could
> potentially be a lot.
>
> Now that the filtering has occurred, we can sort by relevancy. That sort is
> done entirely in Discover because "relevancy" makes sense only in the UI of
> Discover.
>
> I define a relevancy score :
> - 1 = The search term is the only term in the name
> - 2 = The search term is present in the name
> - 3 = The search term is present in the description
>
> With that in mind, I sort by score (most relevant first, i.e. score 1 before
> score 3) and, if scores are equal, I sort alphabetically on names :
>
> c1* = titanium - lorem ipsum ...
> a2* = The titanium engine - lorem ipsum ...
> a1* = TitaddOns - An extension to Titanium
> c3* = titanium black - lorem ipsum ...
> c2* = titanium italics - not to confuse with titanium
>
> All the information needed to do that sort is available in Discover itself.
> Now, it's perfectly clear that scores don't answer all the questions. Having
> said that, I come back to your comment :
>
> > One thing to check first of all is to make sure that the exact
> > resource you're fetching is already being provided first at its own
> > feed.
>
> I don't understand :
>
> - Do you mean that, when the relevancy ordering doesn't discriminate enough,
> we should preserve the order of the content providers ? c1, a2, a1, c2, c3 ?
>
> - Do you mean that we should leave the search results grouped by feed and
> just sort the feeds separately : c1,c2,c3,a2,a1 ?
>
> or even something else ?
>
> Again, my point is that Discover does the sort, regardless of how the
> content providers order their stuff...
> In that case, I'd look in the ResourcesProxyModel of Discover and order
> things there (with something like the lessThan method).
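
For reference, the scoring scheme quoted above could be sketched roughly as follows. This is only an illustration in Python, not Discover code: the Resource class, relevancy_score and sort_by_relevancy are hypothetical names, and matching is assumed to be a case-insensitive substring test.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    # Hypothetical stand-in for a Discover resource; not the real class.
    ident: str
    name: str
    description: str


def relevancy_score(resource, term):
    """Return 1, 2 or 3 as in the quoted proposal, or None if nothing matches."""
    term = term.lower()
    name = resource.name.lower()
    if name == term:
        return 1  # the search term is the only term in the name
    if term in name:
        return 2  # the search term is present in the name
    if term in resource.description.lower():
        return 3  # the search term is present only in the description
    return None


def sort_by_relevancy(resources, term):
    """Sort matches by score (1 first), then alphabetically by name."""
    scored = [(relevancy_score(r, term), r) for r in resources]
    matches = [(score, r) for score, r in scored if score is not None]
    matches.sort(key=lambda pair: (pair[0], pair[1].name.lower()))
    return [r for _, r in matches]


# The five filtered resources from the example, in arrival order.
filtered = [
    Resource("a1", "TitaddOns", "An extension to Titanium"),
    Resource("c1", "titanium", "lorem ipsum"),
    Resource("c2", "titanium italics", "not to confuse with titanium"),
    Resource("a2", "The titanium engine", "lorem ipsum"),
    Resource("c3", "titanium black", "lorem ipsum"),
]

ordered_ids = [r.ident for r in sort_by_relevancy(filtered, "titanium")]
print(ordered_ids)  # ['c1', 'a2', 'c3', 'c2', 'a1']
```

With a literal reading of the three scores, a1 sorts last, since "titanium" appears only in its description.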


We don't know how the servers sort things, but the server is where the
most information is available, so we should trust its ordering. That is
part of why this is not trivial to implement.
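
One possible way to trust each server's ordering while still producing a single list is to interleave the feeds by rank, so the relative order inside every feed is preserved. The sketch below is an assumption for illustration only (interleave_by_rank is a hypothetical helper, and this is not what Discover does today):

```python
from itertools import zip_longest

_MISSING = object()  # sentinel for feeds that ran out of items


def interleave_by_rank(feeds):
    """Merge feeds round-robin, keeping each feed's own order intact."""
    merged = []
    for same_rank in zip_longest(*feeds, fillvalue=_MISSING):
        merged.extend(item for item in same_rank if item is not _MISSING)
    return merged


# Each feed is listed in the order its server returned it.
feed_a = ["a1", "a2"]
feed_c = ["c1", "c2", "c3"]

merged = interleave_by_rank([feed_a, feed_c])
print(merged)  # ['a1', 'c1', 'a2', 'c2', 'c3']
```

The client never compares items across feeds here; it only decides how to alternate between them, which is the part that would still need a policy.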

Also you need to take into account that some feeds (e.g. KNS) are
paginated and will need to be triggered to fetch further at some
point.

So, for example, if we're sorting alphabetically descending, the sort
needs to be at least coordinated between server and client. Otherwise
we'll get A-D and sort it as D-A, but when we fetch the next page we'll
get resources that belong at the top of the list.
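
The effect can be reproduced with a small simulation. Everything here is made up for illustration: SERVER_ITEMS, PAGE_SIZE, fetch_page and client_view are hypothetical, and the "server" is just a list returned in ascending order, four items per page.

```python
# Hypothetical server catalogue, returned in ascending order.
SERVER_ITEMS = list("ABCDEFGH")
PAGE_SIZE = 4


def fetch_page(page):
    """Return one page of results in the server's (ascending) order."""
    start = page * PAGE_SIZE
    return SERVER_ITEMS[start:start + PAGE_SIZE]


def client_view(pages_fetched):
    """What the client shows if it re-sorts everything descending locally."""
    shown = []
    for page in range(pages_fetched):
        shown.extend(fetch_page(page))
    return sorted(shown, reverse=True)


after_first_page = client_view(1)
after_second_page = client_view(2)

print(after_first_page)   # ['D', 'C', 'B', 'A']
print(after_second_page)  # ['H', 'G', 'F', 'E', 'D', 'C', 'B', 'A']
```

After the second page arrives, every newly fetched item lands above what the user was already looking at, which is why the sort order has to be requested from the server rather than applied only on the client.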

Aleix
