Re: [postgis-users] One line solution: select a dirty full address and get a clean address

2020-12-21 Thread Stephen Woodbridge
I would start with the address standardizer and try to compile and install it. I'm pretty sure it compiled and installed in PostgreSQL up through 11, but more recent versions may need a little tweaking. Anyway, it handles splitting the address string into tokens. It works internally by building and

Re: [postgis-users] One line solution: select a dirty full address and get a clean address

2020-12-21 Thread Shaozhong SHI
Hi, Steve, If I can find budget or a big enough project, your geocoder is certainly very useful. Regards, David On Mon, 21 Dec 2020 at 21:40, Shaozhong SHI wrote: > Hi, Steve, > > Alternatively, one could try to do something like the following to > generate values to mark each row with a g

Re: [postgis-users] One line solution: select a dirty full address and get a clean address

2020-12-21 Thread Shaozhong SHI
Hi, Steve, Alternatively, one could generate values to mark each row with a group number, by using something like the following: SELECT t.value from regexp_matches('1234567 - 7654321 - some - more - text', '\d+', 'g') with ordinality as t(value,idx) where
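For readers who want to try the numbering idea outside the database, here is a minimal Python sketch of the same technique: find every match of a pattern and pair it with its 1-based position, roughly what `with ordinality` does for the set returned by `regexp_matches`. The function name `numbered_matches` is made up for this illustration, and the truncated `where` clause of the query above is not reproduced.

```python
import re

def numbered_matches(text, pattern=r"\d+"):
    """Return (value, idx) pairs for every match of pattern in text.

    idx starts at 1, mirroring the ordinality column in the SQL query.
    numbered_matches is a hypothetical helper, not part of any library.
    """
    return [
        (match.group(0), idx)
        for idx, match in enumerate(re.finditer(pattern, text), start=1)
    ]

pairs = numbered_matches("1234567 - 7654321 - some - more - text")
print(pairs)  # [('1234567', 1), ('7654321', 2)]
```

With the position available, any element can be picked by its index, for example the one before the last via `pairs[-2]` when at least two matches exist.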

Re: [postgis-users] One line solution: select a dirty full address and get a clean address

2020-12-21 Thread Shaozhong SHI
Hi, Steve, I do respect your work. I am exploring a much simpler way, due to current circumstances and research curiosity. I have been struggling to find a way to locate and access regex elements. For instance, how to refer to and access the one before the last. The idea is precisely about y

Re: [postgis-users] One line solution: select a dirty full address and get a clean address

2020-12-21 Thread Stephen Woodbridge
David, in terms of geocoding, which is how I think about these issues, a street can have a range of house numbers distributed over it. When you talk about "101d--120a Some Great Street" as a street you have to decide: 1. is this one location or many? 2. how is it distributed over the street

Re: [postgis-users] One line solution: select a dirty full address and get a clean address

2020-12-21 Thread Shaozhong SHI
Hi, Steve, Thanks. I think the key matter comes down to one thing: getting SAO and PAO numbers and text separated, so that we can intersect SAO and PAO number arrays. By doing so, we can match addresses to known address points. This could offer a much simpler alternative. Wh

Re: [postgis-users] Upgrading PostGIS

2020-12-21 Thread Clifford Snow
Thanks for all the suggestions. I decided to take the safe route by running pg_dumpall then turning on PostgreSQL 13 with PostGIS 3.1. Once the data was loaded the system was working perfectly. I didn't mention this in my original email but one of my databases has nearly 100M records. I didn't wan

Re: [postgis-users] One line solution: select a dirty full address and get a clean address

2020-12-21 Thread Stephen Woodbridge
On 12/21/2020 11:40 AM, Shaozhong SHI wrote: Hi, Steve W, Many thanks. How best to see whether two addresses match? For instance, one would say: 101d--120a Some Great Street and 104d-110d Some Great Street match. How best to do it? Would it be possible to turn both ranges into arrays, and t

Re: [postgis-users] One line solution: select a dirty full address and get a clean address

2020-12-21 Thread Shaozhong SHI
Hi, Steve W, Many thanks. How best to see whether two addresses match? For instance, one would say: 101d--120a Some Great Street and 104d-110d Some Great Street match. How best to do it? Would it be possible to turn both ranges into arrays, and then intersect the range to check out? Please en
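One way to picture the range comparison asked about here is a small Python sketch, under some loud assumptions: suffix letters such as "d" and "a" are ignored, the range is taken from the start of the string, and two addresses "match" when the street text is identical and the numeric ranges overlap. The helper names `parse_range_address` and `addresses_match` are invented for this example; a real matcher would need to handle many more address forms.

```python
import re

# Leading house-number range such as "101d--120a" or "104d-110d".
# Assumption: at most one suffix letter per number, one or more hyphens between.
RANGE_RE = re.compile(r"\s*(\d+)[A-Za-z]?\s*-+\s*(\d+)[A-Za-z]?\s*")

def parse_range_address(address):
    """Split an address into (low, high, street), ignoring suffix letters."""
    match = RANGE_RE.match(address)
    if match is None:
        raise ValueError(f"no leading house-number range in {address!r}")
    first, second = int(match.group(1)), int(match.group(2))
    street = address[match.end():].strip().casefold()
    return min(first, second), max(first, second), street

def addresses_match(a, b):
    """True when both addresses share a street and their ranges overlap."""
    low_a, high_a, street_a = parse_range_address(a)
    low_b, high_b, street_b = parse_range_address(b)
    return street_a == street_b and low_a <= high_b and low_b <= high_a

print(addresses_match("101d--120a Some Great Street",
                      "104d-110d Some Great Street"))  # True
```

Comparing the range endpoints avoids building the full arrays of house numbers; expanding both ranges and intersecting them would give the same yes/no answer for plain integer ranges.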

Re: [postgis-users] ST_ClusterDBSCAN for "geography" data type?

2020-12-21 Thread Marco Boeringa
Hi Darafei, Thanks for the suggestion. I must admit I have a bit of difficulty visualizing what this exactly does, and had to look up that EPSG:4978 projection you mention, but do I understand it right that the solution you suggest should work for global data sets, and isn't influenced or lim

Re: [postgis-users] ST_ClusterDBSCAN for "geography" data type?

2020-12-21 Thread Marco Boeringa
Hi Giuseppe, Thanks for confirming what I already suspected, that "geography" is not supported. Maybe, as a first step, just assuming the globe is a perfect sphere instead of a spheroid would make it easier to implement something like this for "geography", and provide

Re: [postgis-users] ST_ClusterDBSCAN for "geography" data type?

2020-12-21 Thread Komяpa
Hi, My last exercise in KMeans showed that it's enough to add support for 3D distances instead of 2D distances in the code and transform your geometry into EPSG:4978 (after Force3D). That will cluster in a 3D XYZ coordinate system using straight lines in 3D, which is usually good enough. On Mon,
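To make the EPSG:4978 idea concrete, here is a minimal Python sketch of what the transform does: it turns longitude/latitude/height on the WGS 84 ellipsoid into geocentric X, Y, Z in metres, after which distances are plain 3D straight lines (chords through the Earth). The function names `lonlat_to_ecef` and `chord_distance` are made up for this illustration and are not PostGIS functions; inside the database the conversion would be done by the transform itself.

```python
import math

# WGS 84 ellipsoid constants
WGS84_A = 6378137.0                    # semi-major axis in metres
WGS84_F = 1.0 / 298.257223563          # flattening
WGS84_E2 = WGS84_F * (2.0 - WGS84_F)   # first eccentricity squared

def lonlat_to_ecef(lon_deg, lat_deg, height_m=0.0):
    """Convert geodetic lon/lat/height to geocentric X, Y, Z in metres."""
    lon = math.radians(lon_deg)
    lat = math.radians(lat_deg)
    # Prime vertical radius of curvature at this latitude
    n = WGS84_A / math.sqrt(1.0 - WGS84_E2 * math.sin(lat) ** 2)
    x = (n + height_m) * math.cos(lat) * math.cos(lon)
    y = (n + height_m) * math.cos(lat) * math.sin(lon)
    z = (n * (1.0 - WGS84_E2) + height_m) * math.sin(lat)
    return x, y, z

def chord_distance(p, q):
    """Straight-line 3D distance in metres between two (lon, lat) points."""
    return math.dist(lonlat_to_ecef(*p), lonlat_to_ecef(*q))

# Two points one degree of longitude apart on the equator
print(chord_distance((0.0, 0.0), (1.0, 0.0)))
```

Because the chord is always shorter than the path along the surface, a clustering distance expressed this way slightly underestimates ground distance, which matters little for nearby points and more for points far apart.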

Re: [postgis-users] ST_ClusterDBSCAN for "geography" data type?

2020-12-21 Thread Giuseppe Broccolo
Hi Marco, On Sun, 20 Dec 2020 at 10:03, Marco Boeringa <ma...@boeringa.demon.nl> wrote: > Hi, > > Reading through the PostGIS documentation, I noticed the > "ST_ClusterDBSCAN" function takes a distance as one of the inputs. Now > the docs suggest the current algorithm only takes