Re: Is it possible to compare a long text string and fuzzy match only phrases contained in?

2021-01-18 Thread Tim Clarke

On 17/01/2021 23:09, Benedict Holland wrote:
You want to do NLP in postgres? I would say that you would need a tool like 
opennlp to get your tokens and phases, then run a fuzzy matching algorithm.

Unless postgres has nlp capabilities but I am not sure I would use them. You 
actually want something fairly complex.

Thanks,
Ben

On Sun, Jan 17, 2021, 4:55 PM Shaozhong SHI 
mailto:shishaozh...@gmail.com>> wrote:
We are looking for working examples of comparing a long text string and 
fuzzy-matching multiple words (namely, phrases) contained in.

Any such work examples?

Regards,

David


We've had excellent results with

https://www.postgresql.org/docs/13/fuzzystrmatch.html

and you may find

https://www.postgresql.org/docs/13/unaccent.html

very useful in this area too

Tim Clarke MBCS
IT Director
Direct: +44 (0)1376 504510 | Mobile: +44 (0)7887 563420



Telephone: Witham: +44(0)1376 503500 | London: +44 (0)20 3009 0853 | Frankfurt: 
+49 (0)69 7191 6000 | Hong Kong: +852 5803 1687 | Toronto: +1 647 503 2848
Web: https://www.manifest.co.uk/



Minerva Analytics Ltd - A Solactive Company
9 Freebournes Court | Newland Street | Witham | Essex | CM8 2BL | United Kingdom



Copyright: This e-mail may contain confidential or legally privileged 
information. If you are not the named addressee you must not use or disclose 
such information, instead please report it to 
ad...@minerva.info
Legal: Minerva Analytics is the trading name of: Minerva Analytics Ltd: 
Registered in England Number 11260966 & The Manifest Voting Agency Ltd: 
Registered in England Number 2920820 Registered Office at above address. Please 
Click Here https://www.manifest.co.uk/legal/ for further information.


Re: System (and DB) upgrade problem., "part 2"

2021-01-18 Thread Peter J. Holzer
On 2021-01-18 05:33:05 +0100, W.P. wrote:
> Now I have problem with 9.6 -> 10.7 (Fedora 27 -> 28) upgrade.
> As expected after system upgrade, database fails to start "files
> incompatible with binaries".
> 
> Found pg_upgrade, command line options "-b /usr/lib/pgsql/postgresql-9.6/bin
> -B /usr/bin -d /var/lib/pgsql/data/" BUT what should I put for option -D?
> ("new cluster data") Was (directory) already created for me (during system
> upgrade), or have I to create it somewhere (where? is best practice).

The Debian/Ubuntu package contains a script pg_upgradecluster which
knows about the distribution-specific directory layout. You would
normally use that script instead pg_upgrade directly.

Maybe the Fedora package has something similar?

hp

-- 
   _  | Peter J. Holzer| Story must make more sense than reality.
|_|_) ||
| |   | h...@hjp.at |-- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |   challenge!"


signature.asc
Description: PGP signature


Re: Do we need a way to moderate mailing lists?

2021-01-18 Thread Peter J. Holzer
On 2021-01-16 14:01:44 -0700, Rob Sargent wrote:
> Mail has always been well threaded, retaining which message lead to which
> replies.  How did we get away from relying on that (naked posting)?

Has that ever been a thing? Quoting (and trimming) the message you are
replying to has been normal since at least the late 1980's (when I
started to use E-Mail). Unix-based mailers (at least since elm, not sure
about mailx) automatically quoted the previous mail with the ">" prefix.
Eudora (on Windows) did that also, if I remember correctly. Few people
deleted everything. Probably because most MUAs displayed only one
message at a time. The first MUA I've seen that displayed an entire
thread at once was Gmail.

hp

-- 
   _  | Peter J. Holzer| Story must make more sense than reality.
|_|_) ||
| |   | h...@hjp.at |-- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |   challenge!"


signature.asc
Description: PGP signature


Designing a better connection pool for psycopg3

2021-01-18 Thread Daniele Varrazzo
Hello,

I've been gathering a few ideas about the connection pool I would like
to provide with psycopg3. I would be happy if you would like to take a
look and give some feedback.

https://www.psycopg.org/articles/2021/01/17/pool-design/

Thank you very much,

-- Daniele




Parallelism on Partitioning .

2021-01-18 Thread Brajendra Pratap Singh
Hi ,

We are trying to assign the parallel worker or execute the query in
parallel manner on partitioned Tables and partitioned master table but not
able to do that ,could u plz suggest .

Current Postgresql version :- 9.6

Fyi, We have tried with all parameters which can allow the parallel query
execution.

Thanks..


Re: Parallelism on Partitioning .

2021-01-18 Thread Thomas Markus

Hi


Hi ,

We are trying to assign the parallel worker or execute the query in 
parallel manner on partitioned Tables and partitioned master table but 
not able to do that ,could u plz suggest .


Current Postgresql version :- 9.6

Fyi, We have tried with all parameters which can allow the parallel 
query execution.


Thanks..


Your PG version is too old. 9.6 is not able to this (as I remember)

Thomas




Re: Do we need a way to moderate mailing lists?

2021-01-18 Thread Peter J. Holzer
On 2021-01-17 11:23:37 +0100, Thiemo Kellner wrote:
> > Also, could it be possible to make messages plain text? I see a lot of
> > varying fancy fonts and I hate that. I even hate it more when people
> > post messages not properly trimmed or messages that need formatting
> > preserved such as select output, i.e. table data, explain plans, etc.
> > Proportional fonts (Outlook with its darn Arial) is one of the worst...
> 
> Well, one could say that with html messages one can "force" a monospaced
> font like consolas or courier. Then again, there is no guarantee that the
> receiving end does have it installed.

"font-family: monospace" should always work. A MUA sending HTML mail can
(and should) therefore specify e.g. "font-family: Consolas, monospace",
not just "font-family: Consolas" to provide for a fallback.

(Do Webfonts work in Email? Do we even *want* them to work?)


> And on top, everyone is free to have her/his mail client to display
> plain text in monospaced font and is only to blame her/himself if not
> doing so.

Yeah. Unfortunately many people don't know how to configure that (or use
a MUA where it can't be configured). So unfortunatly we can't assume
that plain text == monospaced. As someone who likes to underscore stuff
in quotes and draw ASCII (or Unicode) diagrams, I find that annoying,
but I can't change it.


> > I think, an automatic conversion of incoming posts to plain text and
> > dropping all non plain text attachments would help a lot already.
> 
> I would not do that. It is the work on the wrong end with doubtful result.
> Wouldn't it be better to reject non-plain-text postings?

ACK.


> While at it, is there a rule of thumb for the length of inline code - in
> comparison to attaching code files in comparison to using something like
> pastebin.com? I only found very coarse instructions on what to do on the
> lists. Have I been missing a link to a netiquette page?

I don't think there is. My rule od thumb is that it should be short
enough to read as part of the message. Of course this is very subjective
and may even depend on my mood (Sometimes I find a 20 line SQL query too
long, sometimes I'm happy to dig through 200 lines of Perl code ...). It
also depends very much on coding style: If code is badly formatted or
organized, I'll give up much sooner.

hp

-- 
   _  | Peter J. Holzer| Story must make more sense than reality.
|_|_) ||
| |   | h...@hjp.at |-- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |   challenge!"


signature.asc
Description: PGP signature


Re: Do we need a way to moderate mailing lists?

2021-01-18 Thread Peter J. Holzer
On 2021-01-17 11:11:01 -0800, Adrian Klaver wrote:
> On 1/17/21 11:04 AM, Matthias Apitz wrote:
> > > Always reply-to-all.
> > 
> > Why? Why I (and other subscribers) have to have the same mail twice in
> > the mbox?
> 
> You can prevent that by going here:
> 
> https://lists.postgresql.org/manage/
> 
> and checking:
> 
> Don't receive an extra copy of mails when listed in To or CC fields

For me that works in the wrong direction. I absolutely want to get the
mails through the mailing-list, so that they are correctly filtered. I
would prefer to not get an extra copy directly. (but I can live with
that).

Of course the mailing list server can't filter mails it never sees.

Mutt adds a header to indicate the preferences of the sender, but I
think that is only recognized by mutt, so it's not a general solution.

hp

-- 
   _  | Peter J. Holzer| Story must make more sense than reality.
|_|_) ||
| |   | h...@hjp.at |-- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |   challenge!"


signature.asc
Description: PGP signature


Re: System (and DB) upgrade problem., "part 2"

2021-01-18 Thread Laurenz Albe
On Mon, 2021-01-18 at 05:33 +0100, W.P. wrote:
> For 9.5 to 9.6 transition, it worked like charm. (except some problems 
> with parsing json data in a view, but this I will address later, after 
> upgrade to final version - Fedora 30, probably PG 10.x).
> 
> Now I have problem with 9.6 -> 10.7 (Fedora 27 -> 28) upgrade.
> As expected after system upgrade, database fails to start "files 
> incompatible with binaries".
> 
> Found pg_upgrade, command line options "-b 
> /usr/lib/pgsql/postgresql-9.6/bin -B /usr/bin -d /var/lib/pgsql/data/" 
> BUT what should I put for option -D? ("new cluster data") Was 
> (directory) already created for me (during system upgrade), or have I to 
> create it somewhere (where? is best practice).
> 
> There are NO logs for today trial to start in 
> /var/lib/pgsql/data/pg_log/. Where they could be?

You would call "initdb" or "postgresql-10-setup" to create a new, empty
cluster in version 10 and use that with -D.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: Do we need a way to moderate mailing lists?

2021-01-18 Thread Laurenz Albe
On Sat, 2021-01-16 at 18:22 +1100, Tim Cross wrote:
> Hemil Ruparel  writes:
> > Exactly my point. We need to raise the bar of the behavior we tolerate.
> > This should not be tolerated. We need to set an example. The person in
> > question clearly understood english and I have never seen a person who
> > could use mailing lists but not google. So that's out of the question.
> > We are not free consultants. And you are not entitled to shit. You are
> > probably being paid to work on that project. We are not. Your problem. Fix
> > it yourself. Or at least have to courtesy to google it.
> 
> While I can understand your frustration, I disagree with your position.
> 
> It is too subjective and difficult to apply/adopt such a strong position
> and could too easily backfire, resulting in a perception of an elitist,
> unwelcoming and unfriendly community.
> 
> Banning should be reserved for the most serious and abusive cases.

+1

I personally find Hemil's attitude and his persistent refusal to
bottom-post more disruptive than the original question.

pgsql-general should remain a low-threshold forum where nothing about
PostgreSQL is off-topic.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: Do we need a way to moderate mailing lists?

2021-01-18 Thread Peter J. Holzer
On 2021-01-17 21:25:29 +0100, Matthias Apitz wrote:
> El día domingo, enero 17, 2021 a las 12:23:23p. m. -0700, David G. Johnston 
> escribió:
> 
> > Its 2021;
> 
> Yes, and for what this argument is good for? Is 2021 better than 2020 or
> even worth?

It's not about being better, but about what infrastructure or
capabilities you can expect. 

In 1990, MIME didn't exist. E-Mail was by definition US-ASCII text
(except for uuencode or by private arrangement). People often used text
terminals and may not have any Internet access.

In 2000, MIME existed, some MUAs supported HTML (at quite different
levels), most people who had Email also had Internet (though they might
not stay online while reading mail to save phone costs) and they would
typically read mail on a bitmapped display (though possibly in a
terminal emulator).

In 2021, MIME is actually implemented correctly in most MUAs, HTML/CSS
is very widely supported (but not nearly at the level supported by
browsers), most people (well, their devices) are online 24x7, and
terminal emulators are still a thing (I write this in vim called from
mutt in an xterm). Oh, and any bandwidth wasted by HTML or images is
negligible compared to cat videos. OTOH, they might read the mail on a
device with a very small screen and an atrocious "keyboard". Or use a
MUA which can't quote correctly.

So my expectations on what a recipient can or cannot read or how they
can format their mails are somewhat different than they were in 1990 or
2000 or 2010. They will again be different in 2030.


> > if an image is useful for the topic at hand (say designing a
> > system and having a diagram showing that design) then that image should be
> > allowed.
> 
> If someone needs really an image to show a problem, it can be put on
> some server and the link could be posted, like this one showing a PANIC
> of a system http://www.unixarea.de/fbsd-panic-20210110.jpg

That has the disadvantage of not being archived.

hp

-- 
   _  | Peter J. Holzer| Story must make more sense than reality.
|_|_) ||
| |   | h...@hjp.at |-- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |   challenge!"


signature.asc
Description: PGP signature


Re: Do we need a way to moderate mailing lists?

2021-01-18 Thread Tom Lane
"Peter J. Holzer"  writes:
> On 2021-01-17 21:25:29 +0100, Matthias Apitz wrote:
>> If someone needs really an image to show a problem, it can be put on
>> some server and the link could be posted, like this one showing a PANIC
>> of a system http://www.unixarea.de/fbsd-panic-20210110.jpg

> That has the disadvantage of not being archived.

Yeah, we actually pretty strongly discourage posts that provide links
to external pages instead of being self-contained.  The PG mailing
list archives go back nearly 25 years at this point, and it's mighty
handy to be able to read old messages without having to constantly
consult the Wayback Machine (and hope it captured $whatever).
There's only a *hard* policy of that sort with respect to patch
submissions; but bug reports are likewise more likely to be taken
seriously and acted on if they don't require looking at external
resources.

regards, tom lane




Re: Do we need a way to moderate mailing lists?

2021-01-18 Thread Paul Förster
Hi Peter,

> On 18. Jan, 2021, at 17:34, Peter J. Holzer  wrote:
> 
> In 1990, MIME didn't exist. E-Mail was by definition US-ASCII text
> (except for uuencode or by private arrangement). People often used text
> terminals and may not have any Internet access.

yes, those were the days! :-) As for people using computers these days, I often 
wish I could go back in time. :-P

Cheers,
Paul



Re: Do we need a way to moderate mailing lists?

2021-01-18 Thread Matthias Apitz
On Mon, 18 Jan 2021 18:05:34 +0100, Paul Förster wrote:
> Hi Peter,
>
>> On 18. Jan, 2021, at 17:34, Peter J. Holzer  wrote:
>>
>> In 1990, MIME didn't exist. E-Mail was by definition US-ASCII text
>> (except for uuencode or by private arrangement). People often used text
>> terminals and may not have any Internet access.
>
> yes, those were the days! :-) As for people using computers
> these days, I often wish I could go back in time. :-P
>
> Cheers,
> Paul
>
>

This was my first Internet around 1990: a 1200 baud modem and UUCP for mail and 
News.

matthias


--
Matthias Apitz, ✉ g...@unixarea.de, http://www.unixarea.de/ +49-176-38902045
Public GnuPG key: http://www.unixarea.de/key.pub
Без книги нет знания, без знания нет коммунизма (Влaдимир Ильич Ленин)
Without books no knowledge - without knowledge no communism (Vladimir Ilyich 
Lenin)
Sin libros no hay saber - sin saber no hay comunismo. (Vladimir Ilich Lenin)




Re: System (and DB) upgrade problem., "part 2"

2021-01-18 Thread W.P.

W dniu 18.01.2021 o 17:19, Laurenz Albe pisze:

On Mon, 2021-01-18 at 05:33 +0100, W.P. wrote:

For 9.5 to 9.6 transition, it worked like charm. (except some problems
with parsing json data in a view, but this I will address later, after
upgrade to final version - Fedora 30, probably PG 10.x).

Now I have problem with 9.6 -> 10.7 (Fedora 27 -> 28) upgrade.
As expected after system upgrade, database fails to start "files
incompatible with binaries".

Found pg_upgrade, command line options "-b
/usr/lib/pgsql/postgresql-9.6/bin -B /usr/bin -d /var/lib/pgsql/data/"
BUT what should I put for option -D? ("new cluster data") Was
(directory) already created for me (during system upgrade), or have I to
create it somewhere (where? is best practice).

There are NO logs for today trial to start in
/var/lib/pgsql/data/pg_log/. Where they could be?

You would call "initdb" or "postgresql-10-setup" to create a new, empty
cluster in version 10 and use that with -D.

Ok, so "step-by-step":
1), I copy / move "somewhere" OLD DB files (*/pgsql/data/* for -d option),
2). Do initdb / postgresql-10-setup to create NEW empty base (in 
/var/lib/pgsql/ or  somewhere, for -D option),

3). do pg_upgrade.

Is that correct?

Is there somewhere "guide for 9.x -> 10.x CONCEPTS changes (and upgrade) 
guide"? (clusters etc).


Laurent

Yours,
Laurenz Albe







Re: System (and DB) upgrade problem., "part 2"

2021-01-18 Thread Bruce Momjian
On Mon, Jan 18, 2021 at 09:53:33PM +0100, W.P. wrote:
> W dniu 18.01.2021 o 17:19, Laurenz Albe pisze:
> > On Mon, 2021-01-18 at 05:33 +0100, W.P. wrote:
> > > For 9.5 to 9.6 transition, it worked like charm. (except some problems
> > > with parsing json data in a view, but this I will address later, after
> > > upgrade to final version - Fedora 30, probably PG 10.x).
> > > 
> > > Now I have problem with 9.6 -> 10.7 (Fedora 27 -> 28) upgrade.
> > > As expected after system upgrade, database fails to start "files
> > > incompatible with binaries".
> > > 
> > > Found pg_upgrade, command line options "-b
> > > /usr/lib/pgsql/postgresql-9.6/bin -B /usr/bin -d /var/lib/pgsql/data/"
> > > BUT what should I put for option -D? ("new cluster data") Was
> > > (directory) already created for me (during system upgrade), or have I to
> > > create it somewhere (where? is best practice).
> > > 
> > > There are NO logs for today trial to start in
> > > /var/lib/pgsql/data/pg_log/. Where they could be?
> > You would call "initdb" or "postgresql-10-setup" to create a new, empty
> > cluster in version 10 and use that with -D.
> Ok, so "step-by-step":
> 1), I copy / move "somewhere" OLD DB files (*/pgsql/data/* for -d option),
> 2). Do initdb / postgresql-10-setup to create NEW empty base (in
> /var/lib/pgsql/ or  somewhere, for -D option),
> 3). do pg_upgrade.
> 
> Is that correct?
> 
> Is there somewhere "guide for 9.x -> 10.x CONCEPTS changes (and upgrade)
> guide"? (clusters etc).

The pg_upgade docs have all the steps.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee





Re: System (and DB) upgrade problem., "part 2"

2021-01-18 Thread Ron

On 1/18/21 2:58 PM, Bruce Momjian wrote:

On Mon, Jan 18, 2021 at 09:53:33PM +0100, W.P. wrote:

[snip]

Ok, so "step-by-step":
1), I copy / move "somewhere" OLD DB files (*/pgsql/data/* for -d option),
2). Do initdb / postgresql-10-setup to create NEW empty base (in
/var/lib/pgsql/ or  somewhere, for -D option),
3). do pg_upgrade.

Is that correct?

Is there somewhere "guide for 9.x -> 10.x CONCEPTS changes (and upgrade)
guide"? (clusters etc).

The pg_upgade docs have all the steps.


The documents tend to assume the reader thoroughly knows Postgresql, and 
that's manifestly Not True.


--
Angular momentum makes the world go 'round.




Re: System (and DB) upgrade problem., "part 2"

2021-01-18 Thread David G. Johnston
On Mon, Jan 18, 2021 at 2:13 PM Ron  wrote:

> The documents tend to assume the reader thoroughly knows Postgresql, and
> that's manifestly Not True.
>

Maybe not, but users are still expected to read the documentation.  If
there remains questions or concerns after doing so then by asking such
there is a chance for someone to decide to volunteer an improvement to the
documentation so that the same question doesn't have to be asked in the
future.

David J.


Re: System (and DB) upgrade problem., "part 2"

2021-01-18 Thread Bruce Momjian
On Mon, Jan 18, 2021 at 03:13:13PM -0600, Ron wrote:
> On 1/18/21 2:58 PM, Bruce Momjian wrote:
> > On Mon, Jan 18, 2021 at 09:53:33PM +0100, W.P. wrote:
> [snip]
> > > Ok, so "step-by-step":
> > > 1), I copy / move "somewhere" OLD DB files (*/pgsql/data/* for -d option),
> > > 2). Do initdb / postgresql-10-setup to create NEW empty base (in
> > > /var/lib/pgsql/ or  somewhere, for -D option),
> > > 3). do pg_upgrade.
> > > 
> > > Is that correct?
> > > 
> > > Is there somewhere "guide for 9.x -> 10.x CONCEPTS changes (and upgrade)
> > > guide"? (clusters etc).
> > The pg_upgade docs have all the steps.
> 
> The documents tend to assume the reader thoroughly knows Postgresql, and
> that's manifestly Not True.

What is your point?  Maybe they shouldn't be using pg_upgrade then,
right?  If the documentaiton is unclear, please explain why.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee





Re: Do we need a way to moderate mailing lists?

2021-01-18 Thread Rob Sargent



Has that ever been a thing? Quoting (and trimming) the message you are
replying to has been normal since at least the late 1980's (when I
started to use E-Mail).

Peter,
My memory of email spans only a slightly larger period but no doubt 
yours is better than mine.  My main reader was emac's rmail until too 
much mail wasn't text.  In my memory you had to turn on quoting the 
original (though irrc rmail had an insert-original command to allow one 
to do it as necessary - and at "point").  I see today that Thunderbird 
sets quoting on by default (per account) but it is optional. But it can 
also display an entire thread in the correct who-replied-to-whom. I get 
that often there's a particular sub-point of the original which needs to 
be at hand in the reply but rarely the entire message - so maybe we're 
back to it's a Netiquette thing (and in my case at least often a 
laziness thing).