[GENERAL] streaming replication does not work across datacenter with 20ms latency?

2011-07-23 Thread Yan Chunlu
I was doing postgresql streaming replication, which was fine when two
machine in the same datecenter. but recently I was planning to deploy
new slave at a different datecent, the latency between the master and
slave is 20ms;
below is the related configurateion:
Both master and slave have below configuration:
hot_standby = on
wal_level = hot_standby
max_wal_senders = 5

checkpoint_segments = 64
wal_keep_segments = 128

I am using pgpool to automation but the method is similar to the
method described here:
http://wiki.postgresql.org/wiki/Streaming_Replication

the data dir size is about 30G,  I have tried many times but every
time after the sync was over and slave was started,  postgresql  is
just hanging there with error message(attached below), while trying to
connect it returns error message "psql: FATAL:  the database system is
starting up"


the strange part is with same configuration, other slaves in the same
datacenter works fine...



what does invalid record length and invalid magic number  normally
means? xlog  corrupted?
Thanks for any further help!
the log message with debug5 level was like this(just clips, I could
upload full log file if necessary):



17997 2011-07-23 07:12:19 CDT 2011-07-23 07:12:19 CDT @ LOG:  database
system was interrupted; last known up at 2011-07-23 07:07:57 CDT
17828 2011-07-23 07:12:19 CDT 2011-07-23 07:12:19 CDT @ DEBUG:  forked
new backend, pid=17998 socket=8
17828 2011-07-23 07:12:19 CDT 2011-07-23 07:12:19 CDT @ DEBUG:  forked
new backend, pid=17999 socket=8
17999 2011-07-23 07:12:19 CDT 2011-07-23 07:12:19 CDT
postgres@postgres [local]FATAL:  the database system is starting up
17999 2011-07-23 07:12:19 CDT 2011-07-23 07:12:19 CDT
postgres@postgres [local]DEBUG:  shmem_exit(1): 0 callbacks to make
17999 2011-07-23 07:12:19 CDT 2011-07-23 07:12:19 CDT
postgres@postgres [local]DEBUG:  proc_exit(1): 1 callbacks to make
17999 2011-07-23 07:12:19 CDT 2011-07-23 07:12:19 CDT
postgres@postgres [local]DEBUG:  exit(1)
17999 2011-07-23 07:12:19 CDT 2011-07-23 07:12:19 CDT
postgres@postgres [local]DEBUG:  shmem_exit(-1): 0 callbacks to make
17999 2011-07-23 07:12:19 CDT 2011-07-23 07:12:19 CDT
postgres@postgres [local]DEBUG:  proc_exit(-1): 0 callbacks to make
17828 2011-07-23 07:12:19 CDT 2011-07-23 07:12:19 CDT @ DEBUG:
reaping dead processes
17828 2011-07-23 07:12:19 CDT 2011-07-23 07:12:19 CDT @ DEBUG:  server
process (PID 17999) exited with exit code 1
17998 2011-07-23 07:12:19 CDT 2011-07-23 07:12:19 CDT
postgres@template1 10.28.53.11(33647)FATAL:  the database system is
starting up
17998 2011-07-23 07:12:19 CDT 2011-07-23 07:12:19 CDT
postgres@template1 10.28.53.11(33647)DEBUG:  shmem_exit(1): 0
callbacks to make
17998 2011-07-23 07:12:19 CDT 2011-07-23 07:12:19 CDT
postgres@template1 10.28.53.11(33647)DEBUG:  proc_exit(1): 1 callbacks
to make
17998 2011-07-23 07:12:19 CDT 2011-07-23 07:12:19 CDT
postgres@template1 10.28.53.11(33647)DEBUG:  exit(1)
17998 2011-07-23 07:12:19 CDT 2011-07-23 07:12:19 CDT
postgres@template1 10.28.53.11(33647)DEBUG:  shmem_exit(-1): 0
callbacks to make
17998 2011-07-23 07:12:19 CDT 2011-07-23 07:12:19 CDT
postgres@template1 10.28.53.11(33647)DEBUG:  proc_exit(-1): 0
callbacks to make
17828 2011-07-23 07:12:19 CDT 2011-07-23 07:12:19 CDT @ DEBUG:
reaping dead processes
17828 2011-07-23 07:12:19 CDT 2011-07-23 07:12:19 CDT @ DEBUG:  server
process (PID 17998) exited with exit code 1
17997 2011-07-23 07:12:19 CDT 2011-07-23 07:12:19 CDT @ DEBUG:
standby_mode = 'on'
17997 2011-07-23 07:12:19 CDT 2011-07-23 07:12:19 CDT @ DEBUG:
primary_conninfo = 'host=jefferson port=5432 user=postgres'
17997 2011-07-23 07:12:19 CDT 2011-07-23 07:12:19 CDT @ DEBUG:
trigger_file = '/var/log/pgpool/trigger/trigger_file1'
17997 2011-07-23 07:12:19 CDT 2011-07-23 07:12:19 CDT @ LOG:  entering
standby mode
17997 2011-07-23 07:12:19 CDT 2011-07-23 07:12:19 CDT @ DEBUG:  could
not open file "pg_xlog/0003005400DB" (log file 84, segment
219): No such file or directory


17997 2011-07-23 07:27:32 CDT 2011-07-23 07:27:32 CDT @ DEBUG:  record
known xact 36933672 latestObservedXid 36933674
17997 2011-07-23 07:27:32 CDT 2011-07-23 07:27:32 CDT @ CONTEXT:  xlog
redo commit: 2011-07-23 06:41:41.264405-05
17997 2011-07-23 07:27:32 CDT 2011-07-23 07:27:32 CDT @ DEBUG:  remove
KnownAssignedXid 36933672
17997 2011-07-23 07:27:32 CDT 2011-07-23 07:27:32 CDT @ CONTEXT:  xlog
redo commit: 2011-07-23 06:41:41.264405-05
17997 2011-07-23 07:27:32 CDT 2011-07-23 07:27:32 CDT @ DEBUG:  record
known xact 36933674 latestObservedXid 36933674
17997 2011-07-23 07:27:32 CDT 2011-07-23 07:27:32 CDT @ CONTEXT:  xlog
redo insert: rel 1663/16386/17404; tid 18378/37
17997 2011-07-23 07:27:32 CDT 2011-07-23 07:27:32 CDT @ LOG:  invalid
record length at 54/DDFE4010

17997 2011-07-23 07:13:26 CDT 2011-07-23 07:13:26 CDT @ DEBUG:  remove
KnownAssignedXid 36929085
17997 2011-07-23 07:13:26 CDT 2011-07-23 07:13:26 CDT @ CONTEXT:  xlog
redo commit: 2011-07-23 06:33:29.760915-05
17997 2011-07-23 

Re: [GENERAL] streaming replication does not work across datacenter with 20ms latency?

2011-07-23 Thread Yan Chunlu
thanks for the help!
are there any other possible reasons?

both system are using Debian amd64, as uname -a shows:
Linux washington 2.6.26-2-amd64 #1 SMP Tue Mar 9 22:29:32 UTC 2010
x86_64 GNU/Linux

and using the following program it tells both of them are little-endian
#include 
#include 


bool isBigEndian()
{
int no = 1;
char *chk = (char *)&no;

if (chk[0] == 1)
{
return 0;
}
else
{
return 1;
}
}

main()
{
printf("this is %d \n",(int)isBigEndian());
return 0;
}
~

On Sat, Jul 23, 2011 at 11:55 PM, Scott Ribe
 wrote:
> On Jul 23, 2011, at 6:50 AM, Yan Chunlu wrote:
>
>> what does invalid record length and invalid magic number  normally
>> means? xlog  corrupted?
>> Thanks for any further help!
>
> It means your build settings for pg are not compatible across the 2 machines. 
> For instance, one machine is 32-bit and the other is 64-bit, or one machine 
> is big-endian and the other is little-endian...
>
> --
> Scott Ribe
> scott_r...@elevated-dev.com
> http://www.elevated-dev.com/
> (303) 722-0567 voice
>
>
>
>
>



-- 
闫春路

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] streaming replication does not work across datacenter with 20ms latency?

2011-07-23 Thread Yan Chunlu
I used apt-get to install postgresql, running pg_config showing they
are exactly the same...


running on master:
https://gist.github.com/1102148


running on slave:
https://gist.github.com/1102151

On Sun, Jul 24, 2011 at 2:44 AM, Scott Marlowe  wrote:
> On Sat, Jul 23, 2011 at 11:55 AM, Tomas Vondra  wrote:
>> On 23 Červenec 2011, 18:14, Yan Chunlu wrote:
>>> thanks for the help!
>>> are there any other possible reasons?
>>>
>>> both system are using Debian amd64, as uname -a shows:
>>> Linux washington 2.6.26-2-amd64 #1 SMP Tue Mar 9 22:29:32 UTC 2010
>>> x86_64 GNU/Linux
>>
>> It is not just about the architecture, it means the PostgreSQL was
>> configured somehow differently during the build. E.g. a different block
>> size or WAL block size would make such problems.
>>
>> Or maybe one of the buils might be 32-bit for some reason (you can run
>> 32-bit system in a 64-bit environment). You can do this
>>
>> $ less postgres | grep Class
>>
>> to check this (ELF32 => 32bit, ELF64 => 64bit).
>>
>> Did you use the same binary packages or have you built the server yourself?
>
> Different date formats too.
>

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] streaming replication does not work across datacenter with 20ms latency?

2011-07-23 Thread Yan Chunlu
less postgres  didn't showing anything... cause it's binary, I tried with -a
 less postgres |grep -a ELF

less postgres | grep -a Class

nothing related to (ELF32 => 32bit, ELF64 => 64bit).





On Sun, Jul 24, 2011 at 1:55 AM, Tomas Vondra  wrote:
> On 23 Červenec 2011, 18:14, Yan Chunlu wrote:
>> thanks for the help!
>> are there any other possible reasons?
>>
>> both system are using Debian amd64, as uname -a shows:
>> Linux washington 2.6.26-2-amd64 #1 SMP Tue Mar 9 22:29:32 UTC 2010
>> x86_64 GNU/Linux
>
> It is not just about the architecture, it means the PostgreSQL was
> configured somehow differently during the build. E.g. a different block
> size or WAL block size would make such problems.
>
> Or maybe one of the buils might be 32-bit for some reason (you can run
> 32-bit system in a 64-bit environment). You can do this
>
> $ less postgres | grep Class
>
> to check this (ELF32 => 32bit, ELF64 => 64bit).
>
> Did you use the same binary packages or have you built the server yourself?
>
> Tomas
>
>



--

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] streaming replication does not work across datacenter with 20ms latency?

2011-07-23 Thread Yan Chunlu
the system date formats? looks the same:

master:
#date
Sat Jul 23 21:53:34 CDT 2011

slave:
#date
Sat Jul 23 21:52:56 CDT 2011



On Sun, Jul 24, 2011 at 2:44 AM, Scott Marlowe  wrote:
> On Sat, Jul 23, 2011 at 11:55 AM, Tomas Vondra  wrote:
>> On 23 Červenec 2011, 18:14, Yan Chunlu wrote:
>>> thanks for the help!
>>> are there any other possible reasons?
>>>
>>> both system are using Debian amd64, as uname -a shows:
>>> Linux washington 2.6.26-2-amd64 #1 SMP Tue Mar 9 22:29:32 UTC 2010
>>> x86_64 GNU/Linux
>>
>> It is not just about the architecture, it means the PostgreSQL was
>> configured somehow differently during the build. E.g. a different block
>> size or WAL block size would make such problems.
>>
>> Or maybe one of the buils might be 32-bit for some reason (you can run
>> 32-bit system in a 64-bit environment). You can do this
>>
>> $ less postgres | grep Class
>>
>> to check this (ELF32 => 32bit, ELF64 => 64bit).
>>
>> Did you use the same binary packages or have you built the server yourself?
>
> Different date formats too.
>



-- 
闫春路

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] streaming replication does not work across datacenter with 20ms latency?

2011-07-23 Thread Yan Chunlu
thanks for all the help!

@Adrian:  yes, only one instance on each machine

not the slave finally started and could be connect, replication didn't
begin, just following errors:
https://gist.github.com/1102225





BTW: is that possible that rsync has finished but the data didn't
flush to disk, so when postgresql started it was seeing corrupted
files?


On Sun, Jul 24, 2011 at 11:23 AM, Scott Ribe
 wrote:
> On Jul 23, 2011, at 8:43 PM, Yan Chunlu wrote:
>
>> I used apt-get to install postgresql, running pg_config showing they
>> are exactly the same...
>
> BTW, forgot to mention this in my first message: I run streaming replication 
> across the country with latency well over 100ms and no problems.
>
> --
> Scott Ribe
> scott_r...@elevated-dev.com
> http://www.elevated-dev.com/
> (303) 722-0567 voice
>
>
>
>
>

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] streaming replication does not work across datacenter with 20ms latency?

2011-07-24 Thread Yan Chunlu
is there anything special you have configured on master and slave?

could I see the related configuration on your master and slave? such
as wal_keep_segments,checkpoint_segments or any other might be
related? thanks a lot!

On Sun, Jul 24, 2011 at 11:23 AM, Scott Ribe
 wrote:
> On Jul 23, 2011, at 8:43 PM, Yan Chunlu wrote:
>
>> I used apt-get to install postgresql, running pg_config showing they
>> are exactly the same...
>
> BTW, forgot to mention this in my first message: I run streaming replication 
> across the country with latency well over 100ms and no problems.
>
> --
> Scott Ribe
> scott_r...@elevated-dev.com
> http://www.elevated-dev.com/
> (303) 722-0567 voice
>
>
>
>
>



-- 
闫春路

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] streaming replication does not work across datacenter with 20ms latency?

2011-07-24 Thread Yan Chunlu
checkpoint_segments = 64
wal_keep_segments = 128

On Sun, Jul 24, 2011 at 8:25 PM, Tomas Vondra  wrote:
> On 24 Červenec 2011, 6:09, Yan Chunlu wrote:
>> thanks for all the help!
>>
>> @Adrian:  yes, only one instance on each machine
>>
>> not the slave finally started and could be connect, replication didn't
>> begin, just following errors:
>> https://gist.github.com/1102225
>
> These errors just mean the master already removed WAL segments, so the
> slave can't actually start the replication because there'd be a gap. This
> usually happens with enough write activity (inserts, updates) when the
> slave is being setup.
>
> Whaht is your wal_keep_segments value? Increase it or set up WAL
> archiving, so that the slave can get the data.
>
> Tomas
>
>

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] streaming replication does not work across datacenter with 20ms latency?

2011-07-24 Thread Yan Chunlu
I did the SR procedure again, still no luck:

is that normal that after start slave postgresql, the first line of log is:
database system was interrupted; last known up at 2011-07-24 10:53:38 CDT??



4760 2011-07-24 10:55:58 CDT 2011-07-24 10:55:58 CDT @ LOG:  database
system was interrupted; last known up at 2011-07-24 10:53:38 CDT
4760 2011-07-24 10:55:58 CDT 2011-07-24 10:55:58 CDT @ LOG:  entering
standby mode
4762 2011-07-24 10:55:59 CDT 2011-07-24 10:55:59 CDT postgres@postgres
[local]FATAL:  the database system is starting up
4761 2011-07-24 10:55:59 CDT 2011-07-24 10:55:59 CDT @ LOG:  streaming
replication successfully connected to primary
4764 2011-07-24 10:55:59 CDT 2011-07-24 10:55:59 CDT postgres@postgres
10.28.53.11(53442)FATAL:  the database system is starting up
4770 2011-07-24 10:56:00 CDT 2011-07-24 10:56:00 CDT postgres@postgres
[local]FATAL:  the database system is starting up
4802 2011-07-24 10:56:01 CDT 2011-07-24 10:56:01 CDT postgres@postgres
[local]FATAL:  the database system is starting up
4760 2011-07-24 10:56:01 CDT 2011-07-24 10:56:01 CDT @ LOG:  redo
starts at 57/6B002028
4760 2011-07-24 10:56:01 CDT 2011-07-24 10:56:01 CDT @ LOG:  invalid
record length at 57/6B20E010
4761 2011-07-24 10:56:01 CDT 2011-07-24 10:56:01 CDT @ FATAL:
terminating walreceiver process due to administrator command
4760 2011-07-24 10:56:01 CDT 2011-07-24 10:56:01 CDT @ LOG:  invalid
magic number  in log file 87, segment 107, offset 2490368
4847 2011-07-24 10:56:02 CDT 2011-07-24 10:56:02 CDT postgres@postgres
[local]FATAL:  the database system is starting up
4850 2011-07-24 10:56:02 CDT 2011-07-24 10:56:02 CDT postgres@postgres
10.28.53.11(53443)FATAL:  the database system is starting up
4851 2011-07-24 10:56:03 CDT 2011-07-24 10:56:03 CDT postgres@postgres
[local]FATAL:  the database system is starting up
4860 2011-07-24 10:56:04 CDT 2011-07-24 10:56:04 CDT postgres@postgres
[local]FATAL:  the database system is starting up
4865 2011-07-24 10:56:05 CDT 2011-07-24 10:56:05 CDT postgres@postgres
[local]FATAL:  the database system is starting up
4859 2011-07-24 10:56:05 CDT 2011-07-24 10:56:05 CDT @ LOG:  streaming
replication successfully connected to primary
4874 2011-07-24 10:56:06 CDT 2011-07-24 10:56:06 CDT postgres@postgres
[local]FATAL:  the database system is starting up
4869 2011-07-24 10:56:06 CDT 2011-07-24 10:56:06 CDT
postgres@template1 10.28.53.11(53444)FATAL:  the database system is
starting up
4879 2011-07-24 10:56:07 CDT 2011-07-24 10:56:07 CDT postgres@postgres
[local]FATAL:  the database system is starting up
4760 2011-07-24 10:56:07 CDT 2011-07-24 10:56:07 CDT @ LOG:  invalid
record length at 57/6B2BA010
4859 2011-07-24 10:56:07 CDT 2011-07-24 10:56:07 CDT @ FATAL:
terminating walreceiver process due to administrator command
4760 2011-07-24 10:56:07 CDT 2011-07-24 10:56:07 CDT @ LOG:  invalid
magic number  in log file 87, segment 107, offset 2883584
4887 2011-07-24 10:56:08 CDT 2011-07-24 10:56:08 CDT postgres@postgres
[local]FATAL:  the database system is starting up
4888 2011-07-24 10:56:08 CDT 2011-07-24 10:56:08 CDT @ LOG:  streaming
replication successfully connected to primary
4892 2011-07-24 10:56:09 CDT 2011-07-24 10:56:09 CDT postgres@postgres
[local]FATAL:  the database system is starting up
4896 2011-07-24 10:56:09 CDT 2011-07-24 10:56:09 CDT
postgres@template1 10.28.53.11(53445)FATAL:  the database system is
starting up
4901 2011-07-24 10:56:10 CDT 2011-07-24 10:56:10 CDT postgres@postgres
[local]FATAL:  the database system is starting up
4906 2011-07-24 10:56:11 CDT 2011-07-24 10:56:11 CDT postgres@postgres
[local]FATAL:  the database system is starting up
4760 2011-07-24 10:56:11 CDT 2011-07-24 10:56:11 CDT @ LOG:  invalid
record length at 57/6B486010
4888 2011-07-24 10:56:11 CDT 2011-07-24 10:56:11 CDT @ FATAL:
terminating walreceiver process due to administrator command
4760 2011-07-24 10:56:11 CDT 2011-07-24 10:56:11 CDT @ LOG:  invalid
magic number  in log file 87, segment 107, offset 4849664



On Sun, Jul 24, 2011 at 8:46 PM, Yan Chunlu  wrote:
> checkpoint_segments = 64
> wal_keep_segments = 128
>
> On Sun, Jul 24, 2011 at 8:25 PM, Tomas Vondra  wrote:
>> On 24 Červenec 2011, 6:09, Yan Chunlu wrote:
>>> thanks for all the help!
>>>
>>> @Adrian:  yes, only one instance on each machine
>>>
>>> not the slave finally started and could be connect, replication didn't
>>> begin, just following errors:
>>> https://gist.github.com/1102225
>>
>> These errors just mean the master already removed WAL segments, so the
>> slave can't actually start the replication because there'd be a gap. This
>> usually happens with enough write activity (inserts, updates) when the
>> slave is being setup.
>>
>> Whaht is your wal_keep_segments value? Increase it or set up WAL
>> archiving, so that the slave can get the data.
>>
&g

Re: [GENERAL] streaming replication does not work across datacenter with 20ms latency?

2011-07-24 Thread Yan Chunlu
 checkpoint_segments = 64
wal_keep_segments = 128
this setting seems is for 5GB capacity, I think there is noway I would
ever write 5GB data during the rsync progress.


I think the problem is still "invalid record length" and "invalid
magic number", it start showing right after I complete sync data and
start slave.  If I stop slave later and restart, yes it could show
xlog not found and can not catch master. but why the "invalid" things
in the first place?


On Mon, Jul 25, 2011 at 4:28 AM, Tomas Vondra  wrote:
> Dne 24.7.2011 14:46, Yan Chunlu napsal(a):
>> checkpoint_segments = 64
>> wal_keep_segments = 128
>
> This information alone is not sufficient - we don't know how much write
> activity is on the primary system, so we can't say if those number are
> sufficient or not. You have to tune them according to write activity on
> the primary server.
>
> For example let's suppose the current WAL segment on the primary is "1"
> and that it's configured with wal_keep_segments = 5 (i.e. about 80MB of
> data).
>
> Before you prepare and start the slave machine, someone writes 100MB of
> data to the primary database (one big insert/update or a lot of small
> ones, doesn't matter). 100MB is about 6 WAL segments, so the current WAL
> segment on the primary is 7, and because of wal_keep_segments there are
> segments 3,4,5,6,7 available.
>
> But when the slave connects, it asks for segment no. 2 and it's not
> available. It's not possible to skip that segment so the replication
> fails to start.
>
> If the primary only received 60MB of data, it'd probably worked (there'd
> be enough segments kept on the primary).
>
> Those 128 segments is about 2GB of data. How much data is written on the
> primary between creating a filesystem copy and starting the slave?
>
> You don't neet to keep the files on the master, you can set up archiving
> and keep them somewhere else (on a different system etc.).
>
> Tomas
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] streaming replication does not work across datacenter with 20ms latency?

2011-07-25 Thread Yan Chunlu
I am using debian ant apt-get to install postgresql, dpkg list shows
they are the same?  is there anyway to tell what's version it is
compiled from? thanks!

Master# dpkg -l |grep post
ii  postgresql-9.0  9.0.4-1+b1
  object-relational SQL database, version 9.0 server
ii  postgresql-client-9.0   9.0.4-1+b1
  front-end programs for PostgreSQL 9.0
ii  postgresql-client-common118
  manager for multiple PostgreSQL client versions
ii  postgresql-common   118
  PostgreSQL database-cluster manager
ii  postgresql-contrib-9.0  9.0.4-1+b1
  additional facilities for PostgreSQL
ii  postgresql-server-dev-9.0   9.0.4-1+b1
  development files for PostgreSQL 9.0 server-side programming


Slave $ dpkg -l |grep post
ii  postgresql-9.0  9.0.4-1+b1
object-relational SQL database, version 9.0 server
ii  postgresql-client-9.0   9.0.4-1+b1
front-end programs for PostgreSQL 9.0
ii  postgresql-client-common118
manager for multiple PostgreSQL client versions
ii  postgresql-common   118
PostgreSQL database-cluster manager
ii  postgresql-contrib-9.0  9.0.4-1+b1
additional facilities for PostgreSQL



On Mon, Jul 25, 2011 at 3:38 PM, Fujii Masao  wrote:
> On Mon, Jul 25, 2011 at 10:56 AM, Yan Chunlu  wrote:
>> I think the problem is still "invalid record length" and "invalid
>> magic number", it start showing right after I complete sync data and
>> start slave.  If I stop slave later and restart, yes it could show
>> xlog not found and can not catch master. but why the "invalid" things
>> in the first place?
>
> You might have the same problem which was reported before.
> http://archives.postgresql.org/pgsql-hackers/2011-06/msg00661.php
>
> That problem was fixed, and the fix will be included in next minor
> update (i.e., 9.0.5).
> http://archives.postgresql.org/pgsql-committers/2011-06/msg00101.php
>
> Of course, you can avoid the problem by building PostgreSQL with
> gcc != 4.6.0, I think.
>
> Regards,
>
> --
> Fujii Masao
> NIPPON TELEGRAPH AND TELEPHONE CORPORATION
> NTT Open Source Software Center
>

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] streaming replication does not work across datacenter with 20ms latency?

2011-07-25 Thread Yan Chunlu
sorry for the typo, "debian and apt-get"

On Mon, Jul 25, 2011 at 5:39 PM, Yan Chunlu  wrote:
> I am using debian and apt-get to install postgresql, dpkg list shows
> they are the same?  is there anyway to tell what's version it is
> compiled from? thanks!
>
> Master# dpkg -l |grep post
> ii  postgresql-9.0                      9.0.4-1+b1
>  object-relational SQL database, version 9.0 server
> ii  postgresql-client-9.0               9.0.4-1+b1
>  front-end programs for PostgreSQL 9.0
> ii  postgresql-client-common            118
>  manager for multiple PostgreSQL client versions
> ii  postgresql-common                   118
>  PostgreSQL database-cluster manager
> ii  postgresql-contrib-9.0              9.0.4-1+b1
>  additional facilities for PostgreSQL
> ii  postgresql-server-dev-9.0           9.0.4-1+b1
>  development files for PostgreSQL 9.0 server-side programming
>
>
> Slave $ dpkg -l |grep post
> ii  postgresql-9.0                      9.0.4-1+b1
> object-relational SQL database, version 9.0 server
> ii  postgresql-client-9.0               9.0.4-1+b1
> front-end programs for PostgreSQL 9.0
> ii  postgresql-client-common            118
> manager for multiple PostgreSQL client versions
> ii  postgresql-common                   118
> PostgreSQL database-cluster manager
> ii  postgresql-contrib-9.0              9.0.4-1+b1
> additional facilities for PostgreSQL
>
>
>
> On Mon, Jul 25, 2011 at 3:38 PM, Fujii Masao  wrote:
>> On Mon, Jul 25, 2011 at 10:56 AM, Yan Chunlu  wrote:
>>> I think the problem is still "invalid record length" and "invalid
>>> magic number", it start showing right after I complete sync data and
>>> start slave.  If I stop slave later and restart, yes it could show
>>> xlog not found and can not catch master. but why the "invalid" things
>>> in the first place?
>>
>> You might have the same problem which was reported before.
>> http://archives.postgresql.org/pgsql-hackers/2011-06/msg00661.php
>>
>> That problem was fixed, and the fix will be included in next minor
>> update (i.e., 9.0.5).
>> http://archives.postgresql.org/pgsql-committers/2011-06/msg00101.php
>>
>> Of course, you can avoid the problem by building PostgreSQL with
>> gcc != 4.6.0, I think.
>>
>> Regards,
>>
>> --
>> Fujii Masao
>> NIPPON TELEGRAPH AND TELEPHONE CORPORATION
>> NTT Open Source Software Center
>>
>

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] streaming replication does not work across datacenter with 20ms latency?

2011-07-25 Thread Yan Chunlu
gcc compiler on my machine is 4.6.1, but I didn't compile it myself.
just installed the binary from apt-get. I will try to compile it by
myself to see what will happen

On Mon, Jul 25, 2011 at 8:18 PM, Tomas Vondra  wrote:
> On 25 Červenec 2011, 11:39, Yan Chunlu wrote:
>> I am using debian ant apt-get to install postgresql, dpkg list shows
>> they are the same?  is there anyway to tell what's version it is
>> compiled from? thanks!
>
> AFAIK there's no way to find out which compiler was used to build
> PostgreSQL binaries (IIRC this is not in pg_config). So you can do this
>
> gcc --version
>
> That should print version of the default compiler, and it's very likely it
> was used to build the PostgreSQL binaries. If it's 4.6.0, you've been hit
> by the problem Fujii Masao described. Use a different gcc version and it
> should work fine.
>
> Tomas
>
>

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] streaming replication does not work across datacenter with 20ms latency?

2011-07-25 Thread Yan Chunlu
seems the Master server is compiled using 4.6.0:
  version

 PostgreSQL 9.0.4 on x86_64-pc-linux-gnu, compiled by GCC gcc-4.6.real
(Debian 4.6.0-6) 4.6.1 20110428 (prerelease), 64-bit


and there is no way  to know what slave is using since I have remove it.

I am using a compiled version(gcc 4.3.2)  of postgresql as slave,
seems working now.

I think the problem maybe is like Fujii said, does that bug only
effect hot-stanby server?  seems master is okay.


On Mon, Jul 25, 2011 at 10:06 PM, Tom Lane  wrote:
> "Tomas Vondra"  writes:
>> On 25 Červenec 2011, 11:39, Yan Chunlu wrote:
>>> I am using debian ant apt-get to install postgresql, dpkg list shows
>>> they are the same?  is there anyway to tell what's version it is
>>> compiled from? thanks!
>
>> AFAIK there's no way to find out which compiler was used to build
>> PostgreSQL binaries (IIRC this is not in pg_config).
>
> No, but "SELECT version();" should tell the tale.
>
>                        regards, tom lane
>

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] streaming replication does not work across datacenter with 20ms latency?

2011-07-25 Thread Yan Chunlu
oh god...thanks a lot for the tip. I did actually lost some data, the
master server has crashed two times. every time it comes back, the
index were broken. I need to reindex it. I have already set fsync=on.
  just thought it was normal behavior

about gcc version, only 4.6.0 effected?4.6.1 is okay? then I could
compile my own version with 4.6.1

but how about the data? from the bug information, the data file seems
not compatible
I need to do pg_dump and pg_restore?   what a nightmare.

On Mon, Jul 25, 2011 at 10:47 PM, Tom Lane  wrote:
> Yan Chunlu  writes:
>> seems the Master server is compiled using 4.6.0:
>
>>  PostgreSQL 9.0.4 on x86_64-pc-linux-gnu, compiled by GCC gcc-4.6.real
>> (Debian 4.6.0-6) 4.6.1 20110428 (prerelease), 64-bit
>
> Hmm.  Given the datestamp, that version of gcc almost certainly does
> have the bug.  I wonder whether Martin Pitt knows about this issue
> and the workaround we put in --- I'd have thought he'd push updated
> .debs with a workaround, as I did for Fedora ...
>
> Martin: see
> http://archives.postgresql.org/pgsql-hackers/2011-06/msg00890.php
> http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=c2ba0121c73b7461331104a46d140156e847572a
>
>> and there is no way  to know what slave is using since I have remove it.
>
> If it was installed from the same .deb then it'd be the same build.
>
>> I think the problem maybe is like Fujii said, does that bug only
>> effect hot-stanby server?  seems master is okay.
>
> Well, actually, the bug affects WAL replay of any sort, which means
> if your master were to crash and restart you'd be at risk of data
> corruption on the master.  I'd replace the master build too, ASAP.
>
>                        regards, tom lane
>

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] streaming replication does not work across datacenter with 20ms latency?

2011-07-25 Thread Yan Chunlu
how about 4.3.2?

 I have gcc 4.3.2 compiled postgresql 9.0.4 as slave, is that okay
that I turn the slave into master?  so the switch will be a lot more
smooth.

On Tue, Jul 26, 2011 at 12:08 AM, Tom Lane  wrote:
> Yan Chunlu  writes:
>> oh god...thanks a lot for the tip. I did actually lost some data, the
>> master server has crashed two times. every time it comes back, the
>> index were broken. I need to reindex it. I have already set fsync=on.
>>   just thought it was normal behavior
>
> Uh, no.
>
>> about gcc version, only 4.6.0 effected?    4.6.1 is okay? then I could
>> compile my own version with 4.6.1
>
> It's supposed to be fixed in 4.6.1 according to
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49390
>
> Personally I'd use 4.6.1 *and* apply the Postgres patch.
>
>                        regards, tom lane
>

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] wiki.postgresql.org down?

2011-07-27 Thread Yan Chunlu
I tried several times to visit the site recently but the domain always
can not be resolved. is there something wrong?

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] wiki.postgresql.org down?

2011-07-27 Thread Yan Chunlu
thanks! working now

On Wed, Jul 27, 2011 at 3:54 PM, John R Pierce  wrote:
> On 07/27/11 12:48 AM, John R Pierce wrote:
>>
>> ...hopefully its reserved...
>
> errr.  i meant to type RESOLVED.  hah.
>
> --
> john r pierce                            N 37, W 122
> santa cruz ca                         mid-left coast
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] how to adjust auto increment id offset?

2011-11-15 Thread Yan Chunlu
I would like to implement two master db with even-odd id sharding.   in
mysql it is fairly easy by using the configuration:

auto_increment_offset = 1
auto_increment_increment = 2


but I have searched a lot didn't find anything related to this, some users
doing this via trigger like "rubyrep".

is there an easy way to do this?  thanks!


Re: [GENERAL] how to adjust auto increment id offset?

2011-11-15 Thread Yan Chunlu
thanks a lot for the tip!

sorry for used the wrong word, it is just multi-master but not sharding,  I
would like to setup two master server across two datacenter.  one's id
increased by 1, and the other by 2.
so I could have a queue sync the record in the background by myself.  kind
of a dumb way but it seems they only choice for me,  the delay within
several minutes is acceptable.

On Tue, Nov 15, 2011 at 5:12 PM, John R Pierce  wrote:

> On 11/15/11 12:56 AM, Bèrto ëd Sèra wrote:
>
>> Hi
>>
>>
>> On 15 November 2011 11:44, Yan Chunlu > springri...@gmail.com>**> wrote:
>>
>>I would like to implement two master db with even-odd id sharding.
>>  in mysql it is fairly easy by using the configuration:
>>
>>auto_increment_offset = 1
>>auto_increment_increment = 2
>>
>>
>>but I have searched a lot didn't find anything related to this,
>>some users doing this via trigger like "rubyrep".
>>
>>is there an easy way to do this?  thanks!
>>
>>
>> http://www.postgresql.org/**docs/8.1/static/sql-**createsequence.html<http://www.postgresql.org/docs/8.1/static/sql-createsequence.html>
>>
>>
>>
> also see ALTER SEQUENCE.
>
> basically, you'll need to fix up every sequence (these are created
> automatically if you have fields of type SERIAL)   on your 2nd server,
>
>ALTER SEQUENCE somesequencename INCREMENT BY 2 RESTART WITH 2;
>
> and on your 1st server,
>
>ALTER SEQUENCE somesequencename INCREMENT BY 2;
>
> do this before inserting any data.
>
> thats a fairly unusual sharding technique, how do you plan on doing
> queries across both sets of data?
>
>
> --
> john r pierceN 37, W 122
> santa cruz ca mid-left coast
>
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/**mailpref/pgsql-general<http://www.postgresql.org/mailpref/pgsql-general>
>


[GENERAL] how could duplicate pkey exist in psql?

2011-11-16 Thread Yan Chunlu
recently I have found several tables has exactly the same pkey,  here is
the definition:
"diggcontent_data_account_pkey" PRIMARY KEY, btree (thing_id, key)


the data is like this:

   159292 | funnypics_link_point   | 41

| num
   159292 | funnypics_link_point   | 40

| num


I could not even update this record.

really confused about how could this happen... thanks!


Re: [GENERAL] how could duplicate pkey exist in psql?

2011-11-17 Thread Yan Chunlu
 I am using pgpool's replication feature, it does copy pg_xlog from one
server to another, was that possible cause of the problem?

thanks for the help!

On Thu, Nov 17, 2011 at 5:38 PM, Edson Richter wrote:

>
> Em 17-11-2011 03:19, Yan Chunlu escreveu:
>
>  recently I have found several tables has exactly the same pkey,  here is
>> the definition:
>> "diggcontent_data_account_**pkey" PRIMARY KEY, btree (thing_id, key)
>>
>>
>> the data is like this:
>>
>>   159292 | funnypics_link_point   | 41
>>
>>   | num
>>   159292 | funnypics_link_point   | 40
>>
>>   | num
>>
>>
>> I could not even update this record.
>>
>> really confused about how could this happen... thanks!
>>
>
> I know one scenario this can happen on Linux. In my case, it was caused by
> a "rsync"... instead copy to a different location, script was copying
> pg_xlog over own pg_xlog.
>
> I did this stupidity once, and learned for a life time. Lost two hours of
> work to recover everything (from backup, at least I had one).
>
> Be careful with rsync and cp, since Linux does not block files from being
> overwriten even when they are in use.
>
>
> Regards,
>
> Edson.
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/**mailpref/pgsql-general<http://www.postgresql.org/mailpref/pgsql-general>
>


Re: [GENERAL] how could duplicate pkey exist in psql?

2011-11-17 Thread Yan Chunlu
seems they are identical:
   159292 | |funnypicscn_link_karma|
   159292 | |funnypicscn_link_karma|

On Thu, Nov 17, 2011 at 4:07 PM, Szymon Guz  wrote:

>
>
> On 17 November 2011 06:19, Yan Chunlu  wrote:
>
>> recently I have found several tables has exactly the same pkey,  here is
>> the definition:
>> "diggcontent_data_account_pkey" PRIMARY KEY, btree (thing_id, key)
>>
>>
>> the data is like this:
>>
>>159292 | funnypics_link_point   | 41
>>
>>   | num
>>159292 | funnypics_link_point   | 40
>>
>>   | num
>>
>>
>> I could not even update this record.
>>
>> really confused about how could this happen... thanks!
>>
>
> Hi,
> could you send us result of the query:
> select thing_id, '|'||key||'|' from table?
> Maybe there are some more spaces in the key column which were hidden by
> table alignment in the client?
>
> regards
> Szymon
>


Re: [GENERAL] how could duplicate pkey exist in psql?

2011-11-21 Thread Yan Chunlu
got it.  thank you very much for you help.   I found out this problem too
late, and there is no backup.

luckily there was not too much data for this, and my app keeps running
without error.

I am not sure if they are related but I could not use pg_restore to import
data dumped by "pg_dump -Fc";

pg_restore will print some error message about "duplicate primary key", and
the table is empty. no data has been imported.

pg_restore supposed to import the data and ignore the errors.does any
one have the similar problem?

On Thu, Nov 17, 2011 at 11:08 PM, Edson Richter wrote:

>  Em 17-11-2011 09:21, Yan Chunlu escreveu:
>
> I am using pgpool's replication feature, it does copy pg_xlog from one
> server to another, was that possible cause of the problem?
>
>
> I did not mean that this IS your problem, I just gave you a tip regarding
> a problem I had in the past, that eventually has same simptom.
>
> This scenario only happens when your script is copy data over own data...
> like in "rsync -ar 
> root@127.0.0.1:/var/lib/pgsql/9.0/data/*/var/lib/pgsql/9.0/data/"
>
> the command above is highly dangerous because it copies data over the
> network link over its own data... if you have transactions runing during
> the command above, you will get a crash (and, in my case, I had duplicate
> primary keys).
>
> Would be better to check if this could be happening to you... some script
> overwriting data using rsync, cp, etc... I had no other situation where
> Postgresql allowed duplicate keys.
>
> Hope this helps,
>
> Edson.
>
>
>
>
>  thanks for the help!
>
> On Thu, Nov 17, 2011 at 5:38 PM, Edson Richter wrote:
>
>>
>> Em 17-11-2011 03:19, Yan Chunlu escreveu:
>>
>>  recently I have found several tables has exactly the same pkey,  here is
>>> the definition:
>>> "diggcontent_data_account_pkey" PRIMARY KEY, btree (thing_id, key)
>>>
>>>
>>> the data is like this:
>>>
>>>   159292 | funnypics_link_point   | 41
>>>
>>>   | num
>>>   159292 | funnypics_link_point   | 40
>>>
>>>   | num
>>>
>>>
>>> I could not even update this record.
>>>
>>> really confused about how could this happen... thanks!
>>>
>>
>>  I know one scenario this can happen on Linux. In my case, it was caused
>> by a "rsync"... instead copy to a different location, script was copying
>> pg_xlog over own pg_xlog.
>>
>> I did this stupidity once, and learned for a life time. Lost two hours of
>> work to recover everything (from backup, at least I had one).
>>
>> Be careful with rsync and cp, since Linux does not block files from being
>> overwriten even when they are in use.
>>
>>
>> Regards,
>>
>> Edson.
>>
>>
>> --
>> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-general
>>
>
>


Re: [GENERAL] how could duplicate pkey exist in psql?

2011-11-21 Thread Yan Chunlu
and database will stop receiving the following data after detected an
error?
that means while using pg_restore, no error allowed to happen, otherwise
the database will stop receiving data and the import will fail.

I found only one record in psql's log:

 duplicate key value violates unique constraint "account_pkey"

does that means one duplicate record will prevent all other records to
import?

On Mon, Nov 21, 2011 at 10:55 PM, Adrian Klaver wrote:

> On Monday, November 21, 2011 6:39:55 am Yan Chunlu wrote:
> > got it.  thank you very much for you help.   I found out this problem too
> > late, and there is no backup.
> >
> > luckily there was not too much data for this, and my app keeps running
> > without error.
> >
> > I am not sure if they are related but I could not use pg_restore to
> import
> > data dumped by "pg_dump -Fc";
> >
> > pg_restore will print some error message about "duplicate primary key",
> and
> > the table is empty. no data has been imported.
> >
> > pg_restore supposed to import the data and ignore the errors.does any
> > one have the similar problem?
> >
>
> pg_restore may ignore the error and keep on going but the database will
> not. In
> other words when pg_restore receives the error it will continue on to the
> next
> item (unless you have the -e switch on). As far as the server(database) is
> concerned duplicate primary key is still an error and the data will not be
> loaded.
>
>
> --
> Adrian Klaver
> adrian.kla...@gmail.com
>


[GENERAL] pg_dump and pg_restore make data dir doubled size?

2011-12-18 Thread Yan Chunlu
I just tried several times to make sure I didn't do anything wrong.

I use
pg_dump -U postgres  -E UTF8  -F c -b -v -a  dbcontent -f  data.backup
then
pg_restore -a -d dbcontent data.backup

to restore a db, and the data dir's size on the source machine is
56GB,  after I restored data on the target machine, it 130GB,  I did
vacuum full but nothing changed.

is there anything wrong with the db?

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] pg_dump and pg_restore make data dir doubled size?

2011-12-18 Thread Yan Chunlu
never mind. just found I changed the wal_keep_segments to 5000, it's
the size of pg_xlog dir who used a lot of space.



On Sun, Dec 18, 2011 at 10:18 PM, Yan Chunlu  wrote:
> I just tried several times to make sure I didn't do anything wrong.
>
> I use
> pg_dump -U postgres  -E UTF8  -F c -b -v -a  dbcontent -f  data.backup
> then
> pg_restore -a -d dbcontent data.backup
>
> to restore a db, and the data dir's size on the source machine is
> 56GB,  after I restored data on the target machine, it 130GB,  I did
> vacuum full but nothing changed.
>
> is there anything wrong with the db?

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] ignore duplicate key while using COPY?

2011-12-18 Thread Yan Chunlu
I am using COPY public.table_name FROM STDIN  to import data.  it is
very efficient, but if there's any duplicate key exists, the whole
procedure has been stopped.  is there anyway to around this?

why does not postgresql just give a warning and continue the copy?

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] ignore duplicate key while using COPY?

2011-12-21 Thread Yan Chunlu
thanks a lot!

On Mon, Dec 19, 2011 at 10:19 PM, Adrian Klaver wrote:

> On Sunday, December 18, 2011 10:54:21 pm Yan Chunlu wrote:
> > I am using COPY public.table_name FROM STDIN  to import data.  it is
> > very efficient, but if there's any duplicate key exists, the whole
> > procedure has been stopped.  is there anyway to around this?
> >
> > why does not postgresql just give a warning and continue the copy?
>
> Take a look at pgloader:
> http://pgfoundry.org/projects/pgloader/
>
> --
> Adrian Klaver
> adrian.kla...@gmail.com
>


[GENERAL] select statment going slow and slow while using IN (xx,xx)

2012-01-12 Thread Yan Chunlu
I am transforming a db with millions records to anther schema.  for
some reason I need to select the records using IN (xx,xx).

the ids in the IN was about 1000 recored every time, but I found the
query was getting slow while the selection moving on.

the shared buffer is 2048M.   and the cpu and io usage is as normal.

But while I am processing a smaller db on another machine, which only
has the default 24MB, the selection went very smooth.


here is the logs:

start... 0 limit:1000
selection time: 0.140721082687
sort_options
msgtime
start... 1000 limit:1000
selection time: 0.122759103775
start... 2000 limit:1000
selection time: 0.150802850723
start... 3000 limit:1000
selection time: 0.173918008804
start... 4000 limit:1000
selection time: 0.212812900543
start... 5000 limit:1000
selection time: 0.255054950714
start... 6000 limit:1000
selection time: 0.230540037155
start... 7000 limit:1000
selection time: 0.24426317215
start... 8000 limit:1000
selection time: 0.326669931412
start... 9000 limit:1000
selection time: 0.351358175278
start... 1 limit:1000
selection time: 0.386382102966
start... 11000 limit:1000
selection time: 0.440491914749
start... 12000 limit:1000
selection time: 0.443608045578
start... 13000 limit:1000
selection time: 0.49751496315
start... 14000 limit:1000
selection time: 12.0050361156
start... 15000 limit:1000
selection time: 26.3596658707

start... 16000 limit:1000
selection time: 43.5269529819

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] select statment going slow and slow while using IN (xx,xx)

2012-01-12 Thread Yan Chunlu
I also tried explain but found nothing special:

explain select * from data_table where thing_id in
(164438,112478,102941,112377,164442,181764,104028);
  QUERY PLAN
---
 Bitmap Heap Scan on data_table  (cost=31.75..579.10 rows=141 width=53)
   Recheck Cond: (thing_id = ANY
('{164438,112478,102941,112377,164442,181764,104028}'::bigint[]))
   ->  Bitmap Index Scan on idx_data_table  (cost=0.00..31.71 rows=141 width=0)
 Index Cond: (thing_id = ANY
('{164438,112478,102941,112377,164442,181764,104028}'::bigint[]))



On Thu, Jan 12, 2012 at 4:47 PM, Yan Chunlu  wrote:
> I am transforming a db with millions records to anther schema.  for
> some reason I need to select the records using IN (xx,xx).
>
> the ids in the IN was about 1000 recored every time, but I found the
> query was getting slow while the selection moving on.
>
> the shared buffer is 2048M.   and the cpu and io usage is as normal.
>
> But while I am processing a smaller db on another machine, which only
> has the default 24MB, the selection went very smooth.
>
>
> here is the logs:
>
> start... 0 limit:1000
> selection time: 0.140721082687
> sort_options
> msgtime
> start... 1000 limit:1000
> selection time: 0.122759103775
> start... 2000 limit:1000
> selection time: 0.150802850723
> start... 3000 limit:1000
> selection time: 0.173918008804
> start... 4000 limit:1000
> selection time: 0.212812900543
> start... 5000 limit:1000
> selection time: 0.255054950714
> start... 6000 limit:1000
> selection time: 0.230540037155
> start... 7000 limit:1000
> selection time: 0.24426317215
> start... 8000 limit:1000
> selection time: 0.326669931412
> start... 9000 limit:1000
> selection time: 0.351358175278
> start... 1 limit:1000
> selection time: 0.386382102966
> start... 11000 limit:1000
> selection time: 0.440491914749
> start... 12000 limit:1000
> selection time: 0.443608045578
> start... 13000 limit:1000
> selection time: 0.49751496315
> start... 14000 limit:1000
> selection time: 12.0050361156
> start... 15000 limit:1000
> selection time: 26.3596658707
>
> start... 16000 limit:1000
> selection time: 43.5269529819

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] select statment going slow and slow while using IN (xx,xx)

2012-01-13 Thread Yan Chunlu
seems similar to this problem:
http://stackoverflow.com/questions/5198380/improving-postgres-psycopg2-query-performance-for-python-to-the-same-level-of-ja

but no solution yet.

On Thu, Jan 12, 2012 at 5:00 PM, Yan Chunlu  wrote:
> I also tried explain but found nothing special:
>
> explain select * from data_table where thing_id in
> (164438,112478,102941,112377,164442,181764,104028);
>                                              QUERY PLAN
> ---
>  Bitmap Heap Scan on data_table  (cost=31.75..579.10 rows=141 width=53)
>   Recheck Cond: (thing_id = ANY
> ('{164438,112478,102941,112377,164442,181764,104028}'::bigint[]))
>   ->  Bitmap Index Scan on idx_data_table  (cost=0.00..31.71 rows=141 width=0)
>         Index Cond: (thing_id = ANY
> ('{164438,112478,102941,112377,164442,181764,104028}'::bigint[]))
>
>
>
> On Thu, Jan 12, 2012 at 4:47 PM, Yan Chunlu  wrote:
>> I am transforming a db with millions records to anther schema.  for
>> some reason I need to select the records using IN (xx,xx).
>>
>> the ids in the IN was about 1000 recored every time, but I found the
>> query was getting slow while the selection moving on.
>>
>> the shared buffer is 2048M.   and the cpu and io usage is as normal.
>>
>> But while I am processing a smaller db on another machine, which only
>> has the default 24MB, the selection went very smooth.
>>
>>
>> here is the logs:
>>
>> start... 0 limit:1000
>> selection time: 0.140721082687
>> sort_options
>> msgtime
>> start... 1000 limit:1000
>> selection time: 0.122759103775
>> start... 2000 limit:1000
>> selection time: 0.150802850723
>> start... 3000 limit:1000
>> selection time: 0.173918008804
>> start... 4000 limit:1000
>> selection time: 0.212812900543
>> start... 5000 limit:1000
>> selection time: 0.255054950714
>> start... 6000 limit:1000
>> selection time: 0.230540037155
>> start... 7000 limit:1000
>> selection time: 0.24426317215
>> start... 8000 limit:1000
>> selection time: 0.326669931412
>> start... 9000 limit:1000
>> selection time: 0.351358175278
>> start... 1 limit:1000
>> selection time: 0.386382102966
>> start... 11000 limit:1000
>> selection time: 0.440491914749
>> start... 12000 limit:1000
>> selection time: 0.443608045578
>> start... 13000 limit:1000
>> selection time: 0.49751496315
>> start... 14000 limit:1000
>> selection time: 12.0050361156
>> start... 15000 limit:1000
>> selection time: 26.3596658707
>>
>> start... 16000 limit:1000
>> selection time: 43.5269529819

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general