If we are talking about adding new things to achieve this, would it be possible to add the nextCursorMark data as an HTTP response header? That would leave the CSV content unchanged, so it would be backwards compatible.
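
To make the idea concrete, here is a rough sketch of how a client might page through CSV results if such a header existed. The header name `X-Solr-Next-Cursor-Mark` is made up for illustration (as far as I know Solr does not send anything like it today), and the helper names and the URL in the usage note are placeholders too. The `q`, `sort`, `rows`, `wt` and `cursorMark` parameters are the existing ones.

```python
import urllib.parse
import urllib.request

# HYPOTHETICAL header name, used only to illustrate the proposal.
NEXT_CURSOR_HEADER = "X-Solr-Next-Cursor-Mark"


def fetch_csv_page(base_url, params):
    """Fetch one page and return (csv_text, response_headers)."""
    url = base_url + "?" + urllib.parse.urlencode(params)
    with urllib.request.urlopen(url) as resp:
        # resp.headers supports case-insensitive .get() lookups.
        return resp.read().decode("utf-8"), resp.headers


def iter_csv_pages(fetch, base_url, query, sort, rows=100):
    """Yield CSV pages, following the cursor carried in the proposed header.

    `fetch` is any callable with the same shape as fetch_csv_page, so the
    paging loop can be exercised without a running Solr.
    `sort` must include the uniqueKey field, as cursorMark requires.
    """
    cursor = "*"  # "*" starts a new cursor
    while True:
        params = {
            "q": query,
            "sort": sort,
            "rows": rows,
            "wt": "csv",
            "cursorMark": cursor,
        }
        body, headers = fetch(base_url, params)
        # The final page may contain only the CSV header row.
        yield body
        next_cursor = headers.get(NEXT_CURSOR_HEADER)
        # Stop if the header is missing, or if the cursor did not advance
        # (the same end-of-results rule as with the JSON writer).
        if not next_cursor or next_cursor == cursor:
            break
        cursor = next_cursor
```

Usage would be something like `for page in iter_csv_pages(fetch_csv_page, "http://localhost:8983/solr/mycollection/select", "*:*", "id asc"): ...`, where the collection name and sort field are just examples. The point is that each page needs only one request, so the window between the JSON and CSV queries in my 2-query workaround goes away.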

> It should be easy, ha?

:-) It should be possible but if there is an already well tested CSV output already in Solr I'd like to use it.

Thanks,
James

On Tue, 10 Jun 2025 at 11:48, Mikhail Khludnev <m...@apache.org> wrote:

> It seems cursorMark is not supported in CSV format, and it could be
> developed after discussing a particular format.
> Another approach is to develop a result transformer, which prints cursor
> mark at every CSV row as a field value, I'm not 100% sure it may work.
>
> Btw, couldn't you just export docs to json lines via
> https://solr.apache.org/guide/solr/latest/deployment-guide/solr-control-script-reference.html#exporting-documents-to-a-file
> and then transform it to csv.
> It should be easy, ha?
>
>
> On Tue, Jun 10, 2025 at 12:01 PM James Baster <
> james.bas...@opendataservices.coop> wrote:
>
> > Hello Rahul,
> >
> > If I do that all I get back is the CSV. There is no "nextCursorMark" data
> > available. If I want to get the "nextCursorMark" data I seem to have to use
> > JSON output. This makes it impossible to combine the 2 features and get
> > more than 1 page of information.
> >
> > (I did think of a slightly better workaround right after posting, but am
> > curious if there is any way to combine these 2 features I've just missed?)
> >
> > Thanks,
> > James
> >
> >
> >
> > On Wed, 4 Jun 2025 at 17:41, Rahul Goswami <rahul196...@gmail.com> wrote:
> >
> > > Can you please explain why the 2 calls? Are you not able to get the result
> > > the first time with wt=csv and cursorMark=* ?
> > >
> > > Rahul
> > >
> > >
> > > On Wed, Jun 4, 2025 at 10:45 AM James Baster <
> > > james.bas...@opendataservices.coop> wrote:
> > >
> > > > I know that when paging through a big set of results, using cursorMark is
> > > > better than using start/rows pagination because cursorMark works better
> > > > when data may be inserted/updated/deleted during pagination and it can have
> > > > better performance.
> > > >
> > > > https://solr.apache.org/guide/solr/latest/query-guide/pagination-of-results.html
> > > >
> > > > I know that there are Response Writers, so that if I want to get my results
> > > > in CSV I can, just by changing the wt parameter.
> > > >
> > > > https://solr.apache.org/guide/solr/latest/query-guide/response-writers.html
> > > >
> > > > So my question is, what if I want to combine them? Get a bunch of CSV's
> > > > nicely paginated with cursorMark?
> > > >
> > > > I can't see any options to do this - are there any?
> > > >
> > > > Are there any good workarounds?
> > > >
> > > > I could just page with start/row and accept the problems with that. However
> > > > if a row is inserted/deleted/moved above my current position, my data will
> > > > shift by 1 and that's not great.
> > > >
> > > > I could use cursorMark with 2 queries per page, like:
> > > >
> > > > * set cursorMark to last known cursorMark or "*" if it's the start.
> > > > * call API once with JSON response writer. Note the value of
> > > >   nextCursorMark.
> > > > * call API a second time with CSV response writer. Save my CSV result
> > > >   somewhere.
> > > > * maybe pause a second to avoid rate limiting.
> > > > * If nextCursorMark is different from last cursorMark there are more
> > > >   results so loop over again.
> > > >
> > > > With this system, if a row is inserted/deleted/moved above my current
> > > > position, my data will not shift - great. However if a row is
> > > > inserted/deleted/moved in my current page between the 2 queries, I may miss
> > > > a row or double count a row.
> > > >
> > > > Any better options?
> > > >
> > > > Thank you in advance,
> > > > James
> > > >
> > >
> >
>
>
> --
> Sincerely yours
> Mikhail Khludnev