Ideas for building a system that parses medical research publications/articles

2021-06-05 Thread Achilleas Mantzios

Hello

I am imagining a system that can parse papers from various sources 
(web, files, etc.) and in various formats (text, PDF, etc.) and can store 
metadata for each paper: some kind of global ID if applicable, authors, 
areas of research, whether the paper is "new", "highlighted" or 
"historical", its type (e.g. case reports, clinical trials), symptoms 
(e.g. tics, GI pain, psychological changes, anxiety), and other key 
attributes (dynamic, I guess). It must also be full text searchable, etc.


I am at the very beginning of this, and it is being done on a fully 
volunteer basis.


Lots of questions: is there any scientific/scholarly analysis software 
already available? If so, and if it is really good and open source, then 
this will influence the rest of the decisions. Otherwise, I'll have to 
form a team that can write one, in which case I'll have to decide on the 
DB, language, etc. I have worked with pgsql for 20 years, so it is the 
natural choice for any kind of data; I am just asking for the sake of 
completeness.


All ideas welcome.
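
To make the metadata described above a bit more concrete, here is a minimal sketch of what one record could look like. It is only an illustration under assumptions: the class name, field names and the `to_row` helper are all hypothetical, and the idea of serializing the dynamic attributes to JSON (for something like a jsonb column) is one option among several, not a decision.

```python
import json
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Hypothetical: the three states mentioned in the post.
ALLOWED_STATUS = {"new", "highlighted", "historical"}


@dataclass
class PaperRecord:
    """Hypothetical metadata record for one paper (metadata only, no full text)."""
    title: str
    source_url: str
    global_id: Optional[str] = None       # some kind of global ID, if applicable
    authors: List[str] = field(default_factory=list)
    research_areas: List[str] = field(default_factory=list)
    status: str = "new"
    paper_type: Optional[str] = None      # e.g. "Case report", "Clinical trial"
    symptoms: List[str] = field(default_factory=list)
    extra: Dict[str, object] = field(default_factory=dict)  # dynamic attributes

    def __post_init__(self) -> None:
        if self.status not in ALLOWED_STATUS:
            raise ValueError(f"unknown status: {self.status!r}")

    def to_row(self) -> Dict[str, object]:
        """Flatten into column values; the dynamic part becomes a JSON string."""
        return {
            "title": self.title,
            "source_url": self.source_url,
            "global_id": self.global_id,
            "authors": list(self.authors),
            "research_areas": list(self.research_areas),
            "status": self.status,
            "paper_type": self.paper_type,
            "symptoms": list(self.symptoms),
            "extra": json.dumps(self.extra, sort_keys=True),
        }
```

Keeping the always-present attributes as plain fields and pushing everything else into one free-form mapping is just one way to handle the "dynamic" part.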





Re: Ideas for building a system that parses medical research publications/articles

2021-06-05 Thread Laura Smith


‐‐‐ Original Message ‐‐‐
On Saturday, 5 June 2021 10:49, Achilleas Mantzios wrote:

> Hello
>
> I am imagining a system that can parse papers from various sources
> (web/files/etc) and in various formats (text, pdf, etc) and can store
> metadata for this paper ,some kind of global ID if applicable, authors,
> areas of research, whether the paper is "new", "highlighted",
> "historical", type (e.g. Case reports, Clinical trials), symptoms (e.g.
> tics, GI pain, psychological changes, anxiety, ), and other key
> attributes (I guess dynamic), it must be full text searchable, etc.
>
> I am at the very beginning in this and it is done on a fully volunteer
> basis.
>
> Lots of questions : is there any scientific/scholar analysis software
> already available? If yes and is really good and open source , then this
> will influence the rest of decisions. Otherwise , I'll have to form a
> team that can write one, in this case I'll have to decide DB, language,
> etc. I work 20 years with pgsql so it is the natural choice for any kind
> of data, I just ask this for the sake of completeness.
>
> All ideas welcome.

Hello Achilleas

Not wishing to be discouraging, but you have very ambitious goals for what 
sounds like a one-person project ?

You are effectively looking at competing with platforms such as Elsevier 
Scopus/SciVal, which are market leaders in the area for good reason (i.e. it 
takes a lot of manpower to write algorithms, manage metadata etc., and the only 
way to consistently maintain that manpower is to employ people, lots of them). 
There are also things like Google Scholar around the place.

I think before starting on the technical side of Postgres etc., the honest 
truth is that you need to do more planning, both in terms of implementation and 
long-term sustainability.

For example, before we even get to metadata, you talk of various sources and 
formats.  Have you considered licensing issues ?  Have you considered how to 
keep the dataset clean ? (If you are thinking you can just scrape the web, then 
you'll be in for a surprise).

Laura




Re: Ideas for building a system that parses medical research publications/articles

2021-06-05 Thread Achilleas Mantzios

On 5/6/21 1:52 PM, Laura Smith wrote:

> Hello Achilleas
>
> Not wishing to be discouraging, but you have very ambitious goals for what
> sounds like a one-person project ?
>
> You are effectively looking at competing with platforms such as Elsevier
> Scopus/Scival which are market-leaders in the area for good reason (i.e. it
> takes a lot of manpower to write algorithms, manage metadata etc., and the only
> way to consistently maintain that manpower is to employ people, lots of them).
> There are also things like Google Scholar around the place.
>
> I think before starting on the technical side of Postgres etc., the honest
> truth is that you need to do more planning, both in terms of implementation and
> long-term sustainability.
>
> For example, before we even get to metadata, you talk of various sources and
> formats.  Have you considered licensing issues ?  Have you considered how to
> keep the dataset clean ? (If you are thinking you can just scrape the web, then
> you'll be in for a surprise).

All I have so far are some very vague descriptions from people on either 
the advocacy side or the medical side.

I have no idea about the legal status of those documents; as you know, 
some are covered by the artistic license (a few in PubMed), some are not.

I am not a lawyer. The data are not to be stored locally AFAIK, so only 
metadata will be kept locally and can be reset, refreshed, amended, etc.

Parsing will be equivalent to a one-off human reading of the article on 
the web. There is a lawyer handling all of that. In the whole network of 
people interested in this endeavor, I am the only guy with DB/software 
knowledge, which is why I volunteered.

I know it's a huge amount of work, but you are missing a point. Nobody 
wishes to compete with anyone. This is about a project, a parent-advocacy 
non-profit that *ONLY* aims to save the sick children (or maybe also 
very young adults) of a certain spectrum. So the goal is to make the 
right tools for researchers, clinicians and parents. This market is too 
small to even consider making any money out of it, but the research is 
still very expensive and the progress slower than optimal.

> Laura





Re: Ideas for building a system that parses medical research publications/articles

2021-06-05 Thread Karsten Hilbert
> I am imagining a system that can parse papers from various sources
> (web/files/etc) and in various formats (text, pdf, etc) and can store
> metadata for this paper ,some kind of global ID if applicable, authors,
> areas of research, whether the paper is "new", "highlighted",
> "historical", type

Those three categories won't help much. I'm sure though you had
something specific in mind with them ?

Karsten




Re: Ideas for building a system that parses medical research publications/articles

2021-06-05 Thread Laura Smith




‐‐‐ Original Message ‐‐‐
On Saturday, 5 June 2021 12:14, Achilleas Mantzios wrote:


>
> I know its a huge work, but you are missing a point. Nobody wishes to
> compete with anyone. This is a about a project, a parent-advocacy
> non-profit that ONLY aims to save the sick children (or maybe also
> very young adults) of a certain spectrum . So the goal is to make the
> right tools for researchers, clinicians and parents. This market is too
> small to even consider making any money out of it, but the research is
> still very expensive and the progress slower than optimum.


Unfortunately I'm not "missing a point", your final paragraph summarises your 
position.

You have been taken in by the very charitable goal of saving sick children.

Unfortunately your head has been disconnected from your heart.

If we put the charitable purpose to one side and take a purely objective view 
at what you want to do, my original statement still stands, i.e. the certainty 
that you are grossly underestimating the technical and practical complexities 
of what you want to achieve.




Re: Ideas for building a system that parses medical research publications/articles

2021-06-05 Thread Vijaykumar Jain
http://tika.apache.org/

To get started with collecting document metadata, it looks like this tool
can help.
Postgres does support fuzzy text search, so I think dumping the metadata and
abstracts into PostgreSQL and then using trigram matching (pg_trgm), full
text search (tsearch) and the like should work well for a POC.
This being a pg mailing list :) what are your expectations for the type of
data, the growth of the data, and the kind of queries you will run?
If you need to store papers in multiple languages, will PostgreSQL be able
to handle that?
Ideally the documents themselves would be stored somewhere like object
storage, with only a link kept in the DB for when someone asks to read the
whole paper.
A long time ago I read this:
https://www.citusdata.com/blog/2017/04/20/analyzing-postgresql-email-archives/

So if this could work, your POC should too :) with postgresql.
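
As a rough illustration of the trigram idea mentioned above, the following sketch mimics in plain Python roughly what pg_trgm's similarity() computes: each word is lowercased, padded with two leading spaces and one trailing space, split into three-character pieces, and two strings are compared by the share of trigrams they have in common. The function names are made up for this example; in a real POC you would use the extension inside PostgreSQL rather than this code.

```python
import re
from typing import Set


def trigrams(text: str) -> Set[str]:
    """Return the set of trigrams of `text`, padded per word like pg_trgm."""
    grams: Set[str] = set()
    # Words are runs of letters/digits; everything else is a separator.
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        padded = "  " + word + " "
        for i in range(len(padded) - 2):
            grams.add(padded[i:i + 3])
    return grams


def similarity(a: str, b: str) -> float:
    """Shared trigrams divided by all distinct trigrams of both strings."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


if __name__ == "__main__":
    # A misspelled symptom still scores well above an unrelated word.
    print(similarity("anxiety", "anxeity"))
    print(similarity("anxiety", "tics"))
```

The point of the sketch is only to show why misspelled or partial search terms can still match: a typo changes a few trigrams, not all of them.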


On Sat, 5 Jun 2021 at 5:14 PM Laura Smith <
n5d9xq3ti233xiyif...@protonmail.ch> wrote:

>
>
>
> Sent with ProtonMail Secure Email.
>
> ‐‐‐ Original Message ‐‐‐
> On Saturday, 5 June 2021 12:14, Achilleas Mantzios <
> ach...@matrix.gatewaynet.com> wrote:
>
>
> >
> > I know its a huge work, but you are missing a point. Nobody wishes to
> > compete with anyone. This is a about a project, a parent-advocacy
> > non-profit that ONLY aims to save the sick children (or maybe also
> > very young adults) of a certain spectrum . So the goal is to make the
> > right tools for researchers, clinicians and parents. This market is too
> > small to even consider making any money out of it, but the research is
> > still very expensive and the progress slower than optimum.
>
>
> Unfortunately I'm not "missing a point", your final paragraph summarises
> your position.
>
> You have been taken in by the very charitable goal of saving sick children.
>
> Unfortunately your head has been disconnected from your heart.
>
> If we put the charitable purpose to one side and take a purely objective
> view at what you want to do, my original statement still stands, i.e. the
> certainty that you are grossly underestimating the technical and practical
> complexities of what you want to achieve.
>
>
> --
Thanks,
Vijay
Mumbai, India


Re: Ideas for building a system that parses medical research publications/articles

2021-06-05 Thread Adrian Klaver

A quick search found this:

https://solutionsreview.com/data-management/the-best-open-source-data-catalog-tools-to-consider/

Might be a good starting point on what is already out there.

There is also this:

The Directory of Open Access Journals
https://doaj.org/

It seems to be a service, not downloadable software.









--
Adrian Klaver
adrian.kla...@aklaver.com




Re: Ideas for building a system that parses medical research publications/articles

2021-06-05 Thread Achilleas Mantzios



On 5/6/21 6:34 PM, Adrian Klaver wrote:

> A quick search found this:
>
> https://solutionsreview.com/data-management/the-best-open-source-data-catalog-tools-to-consider/
>
> Might be a good starting point on what is already out there.

This is interesting, so the keywords are "Data Catalog" ?

> There is also this:
>
> The Directory of Open Access Journals
> https://doaj.org/

This seems very very poor. Just try a search there and then repeat it in 
PMC (PubMed Central).

> It seems to be a service, not downloadable software.














Re: Ideas for building a system that parses medical research publications/articles

2021-06-05 Thread Achilleas Mantzios


On 5/6/21 4:45 PM, Vijaykumar Jain wrote:

> http://tika.apache.org/

I checked. It behaves better with downloaded PDFs than with PDFs given 
as URLs; in the second case the metadata are poor.

It does not work with NIH articles (but this is a general problem, not 
Tika's).
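
For anyone wanting to repeat the "download first, then parse" approach, here is a minimal sketch. It assumes a Tika server is already running locally on its default port 9998 and uses its /meta endpoint with a JSON Accept header; the function names and the server address are assumptions made for the example, not something taken from this thread.

```python
import json
import urllib.request

TIKA_SERVER = "http://localhost:9998"  # assumption: a local tika-server instance


def build_meta_request(pdf_bytes: bytes, server: str = TIKA_SERVER) -> urllib.request.Request:
    """Build a PUT request to the Tika server /meta endpoint, asking for JSON."""
    return urllib.request.Request(
        url=f"{server}/meta",
        data=pdf_bytes,
        method="PUT",
        headers={"Accept": "application/json", "Content-Type": "application/pdf"},
    )


def extract_metadata(pdf_path: str, server: str = TIKA_SERVER) -> dict:
    """Send an already downloaded PDF to Tika and return its metadata as a dict."""
    with open(pdf_path, "rb") as fh:
        request = build_meta_request(fh.read(), server)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Calling `extract_metadata("paper.pdf")` (with a hypothetical local file) should return whatever metadata keys Tika finds in the file; which keys appear depends entirely on the PDF.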

> To get started with collecting document metadata, it looks like this tool
> can help.
> Postgres does support fuzzy text search, so I think dumping the metadata and
> abstracts into PostgreSQL and then using trigram matching (pg_trgm), full
> text search (tsearch) and the like should work well for a POC.
> This being a pg mailing list :) what are your expectations for the type of
> data, the growth of the data, and the kind of queries you will run?
> If you need to store papers in multiple languages, will PostgreSQL be able
> to handle that?
> Ideally the documents themselves would be stored somewhere like object
> storage, with only a link kept in the DB for when someone asks to read the
> whole paper.
>
> A long time ago I read this:
> https://www.citusdata.com/blog/2017/04/20/analyzing-postgresql-email-archives/
>
> So if this could work, your POC should too :) with postgresql.




Re: Ideas for building a system that parses medical research publications/articles

2021-06-05 Thread Adrian Klaver

On 6/5/21 9:56 AM, Achilleas Mantzios wrote:
> On 5/6/21 6:34 PM, Adrian Klaver wrote:
>
>> A quick search found this:
>>
>> https://solutionsreview.com/data-management/the-best-open-source-data-catalog-tools-to-consider/
>>
>> Might be a good starting point on what is already out there.
>
> This is interesting, so the keywords are "Data Catalog" ?

What I searched on was 'open source article catalog'.

>> There is also this:
>>
>> The Directory of Open Access Journals
>> https://doaj.org/
>
> This seems very very poor. Just try a search there and then repeat in
> PMC (PubMed Central).

This is down to copyright issues I'm sure. For PubMed Central see:

https://www.ncbi.nlm.nih.gov/pmc/about/copyright/

for the ifs/ands/buts that restrict what you can do with the information 
and stay legal.

>> It seems to be a service, not downloadable software.












--
Adrian Klaver
adrian.kla...@aklaver.com




Re: Ideas for building a system that parses medical research publications/articles

2021-06-05 Thread Achilleas Mantzios



On 5/6/21 8:03 PM, Adrian Klaver wrote:
> On 6/5/21 9:56 AM, Achilleas Mantzios wrote:
>> On 5/6/21 6:34 PM, Adrian Klaver wrote:
>>
>>> A quick search found this:
>>>
>>> https://solutionsreview.com/data-management/the-best-open-source-data-catalog-tools-to-consider/
>>>
>>> Might be a good starting point on what is already out there.
>>
>> This is interesting, so the keywords are "Data Catalog" ?
>
> What I searched on was 'open source article catalog'.
>
>>> There is also this:
>>>
>>> The Directory of Open Access Journals
>>> https://doaj.org/
>>
>> This seems very very poor. Just try a search there and then repeat in
>> PMC (PubMed Central).
>
> This is down to copyright issues I'm sure. For PubMed Central see:
>
> https://www.ncbi.nlm.nih.gov/pmc/about/copyright/
>
> for the if/ands/buts that restrict what you can do with the
> information and stay legal.

Maybe, but still:

https://www.ncbi.nlm.nih.gov/pmc/?term=open+access%5Bfilter%5D+PANDAS+IVIG

is better than (>)

https://doaj.org/search/articles?ref=homepage-box&source=%7B%22query%22%3A%7B%22query_string%22%3A%7B%22query%22%3A%22IVIG%20PANDAS%22%2C%22default_operator%22%3A%22AND%22%7D%7D%7D

>>> It seems to be a service, not downloadable software.

















Re: Ideas for building a system that parses medical research publications/articles

2021-06-05 Thread Adrian Klaver

On 6/5/21 10:39 AM, Achilleas Mantzios wrote:
> On 5/6/21 8:03 PM, Adrian Klaver wrote:
>> On 6/5/21 9:56 AM, Achilleas Mantzios wrote:
>>> On 5/6/21 6:34 PM, Adrian Klaver wrote:
>>>
>>>> A quick search found this:
>>>>
>>>> https://solutionsreview.com/data-management/the-best-open-source-data-catalog-tools-to-consider/
>>>>
>>>> Might be a good starting point on what is already out there.
>>>
>>> This is interesting, so the keywords are "Data Catalog" ?
>>
>> What I searched on was 'open source article catalog'.
>>
>>>> There is also this:
>>>>
>>>> The Directory of Open Access Journals
>>>> https://doaj.org/
>>>
>>> This seems very very poor. Just try a search there and then repeat in
>>> PMC (PubMed Central).
>>
>> This is down to copyright issues I'm sure. For PubMed Central see:
>>
>> https://www.ncbi.nlm.nih.gov/pmc/about/copyright/
>>
>> for the if/ands/buts that restrict what you can do with the
>> information and stay legal.
>
> maybe but still :
>
> https://www.ncbi.nlm.nih.gov/pmc/?term=open+access%5Bfilter%5D+PANDAS+IVIG

Yeah it is nice to have the resources of the NIH behind you. Still I 
would point out under Copyright and License information:

"This article is made available via the PMC Open Access Subset for 
unrestricted research re-use and secondary analysis in any form or by 
any means with acknowledgement of the original source. These permissions 
are granted for the duration of the World Health Organization (WHO) 
declaration of COVID-19 as a global pandemic."

Further on PMC Open Access Subset:

https://www.ncbi.nlm.nih.gov/pmc/tools/openftlist/

Again more ifs/ands/buts.

The point being, dealing with articles is a descent into legalese.  I am 
not saying this is a show stopper, just that it will consume considerable 
resources to sort out. I for one applaud your effort, and given what I 
have seen you do with the shipping software over the years I don't see 
this project as out of the realm of possibility.

> is better than (>)
>
> https://doaj.org/search/articles?ref=homepage-box&source=%7B%22query%22%3A%7B%22query_string%22%3A%7B%22query%22%3A%22IVIG%20PANDAS%22%2C%22default_operator%22%3A%22AND%22%7D%7D%7D

>>>> It seems to be a service, not downloadable software.


















--
Adrian Klaver
adrian.kla...@aklaver.com




Re: strange behavior of WAL files

2021-06-05 Thread Tom Lane
Atul Kumar  writes:
> Please check my findings below

> older
> -rw--- 1 enterprisedb enterprisedb 16777216 Jun  2 02:47
> 000136CF00A4
> -rw--- 1 enterprisedb enterprisedb 16777216 Jun  2 02:45
> 000136CF00A3
> -rw--- 1 enterprisedb enterprisedb 16777216 Jun  2 02:44
> 000136CF00A5

I suspect these files were archived awhile ago (with different
names) and have already been renamed in preparation for using
them as future WAL segments ...

> -rw--- 1 enterprisedb enterprisedb 16777216 Jun  4 08:23
> 000136CF00A3
> -rw--- 1 enterprisedb enterprisedb 16777216 Jun  4 08:23
> 000136CF00A4

... and here we see that they just got overwritten with new WAL data,
which would make their new contents eligible for archiving.

Have you made any attempt to correlate your observations with
the actual WAL write position?  (pg_controldata would give you
at least a rough approximation of that, i.e. the WAL write
position as of the most recent checkpoint.  I think you can
get a more up-to-date result from one or another system view,
but I don't remember which.)

regards, tom lane




Re: Ideas for building a system that parses medical research publications/articles

2021-06-05 Thread Achilleas Mantzios



On 5/6/21 10:12 PM, Adrian Klaver wrote:
> On 6/5/21 10:39 AM, Achilleas Mantzios wrote:
>> On 5/6/21 8:03 PM, Adrian Klaver wrote:
>>> On 6/5/21 9:56 AM, Achilleas Mantzios wrote:
>>>> On 5/6/21 6:34 PM, Adrian Klaver wrote:
>>>>
>>>>> A quick search found this:
>>>>>
>>>>> https://solutionsreview.com/data-management/the-best-open-source-data-catalog-tools-to-consider/
>>>>>
>>>>> Might be a good starting point on what is already out there.
>>>>
>>>> This is interesting, so the keywords are "Data Catalog" ?
>>>
>>> What I searched on was 'open source article catalog'.
>>>
>>>>> There is also this:
>>>>>
>>>>> The Directory of Open Access Journals
>>>>> https://doaj.org/
>>>>
>>>> This seems very very poor. Just try a search there and then repeat
>>>> in PMC (PubMed Central).
>>>
>>> This is down to copyright issues I'm sure. For PubMed Central see:
>>>
>>> https://www.ncbi.nlm.nih.gov/pmc/about/copyright/
>>>
>>> for the if/ands/buts that restrict what you can do with the
>>> information and stay legal.
>>
>> maybe but still :
>>
>> https://www.ncbi.nlm.nih.gov/pmc/?term=open+access%5Bfilter%5D+PANDAS+IVIG
>
> Yeah it is nice to have the resources of the NIH behind you. Still I
> would point out under Copyright and License information:
>
> "This article is made available via the PMC Open Access Subset for
> unrestricted research re-use and secondary analysis in any form or by
> any means with acknowledgement of the original source. These
> permissions are granted for the duration of the World Health
> Organization (WHO) declaration of COVID-19 as a global pandemic."
>
> Further on PMC Open Access Subset:
>
> https://www.ncbi.nlm.nih.gov/pmc/tools/openftlist/
>
> Again more ifs/ands/buts.
>
> The point being, dealing with articles is a descent into legalese.  I
> am not saying this is show stopper, just that it will consume
> considerable resources to sort out. I for one applaud your effort and
> given what I have seen you do with the shipping software over the
> years I don't see this project as out of the realm of possibility.

Thank you Adrian, there is no money in this project, but the stakes are 
much much higher.

>> is better than (>)
>>
>> https://doaj.org/search/articles?ref=homepage-box&source=%7B%22query%22%3A%7B%22query_string%22%3A%7B%22query%22%3A%22IVIG%20PANDAS%22%2C%22default_operator%22%3A%22AND%22%7D%7D%7D

>>>>> It seems to be a service, not downloadable software.