2015-02-09 18:40 GMT+01:00 Guillaume Drolet <droletguilla...@gmail.com>:
> I tried starting the cluster again. Once again everything looked fine at > the start (first three lines of this log, in English this time): > > 2015-02-09 11:40:55 EST LOG: database system was shut down at 2015-02-06 > 09:50:21 EST > 2015-02-09 11:40:55 EST LOG: database system is ready to accept > connections > 2015-02-09 11:40:55 EST LOG: autovacuum launcher started > > Since it seemed to work, I opened the terminal and tried connecting to the > database: > > C:\Users\admlocal>psql -U postgres -d mortalite > > So far so good, I got a connection: > > psql (9.3.5) > Attention : l'encodage console (850) diffère de l'encodage Windows (1252). > Les caractères 8 bits peuvent ne pas fonctionner correctement. > Voir la section « Notes aux utilisateurs de Windows » de la > page > référence de psql pour les détails. > Saisissez « help » pour l'aide. > > mortalite=# > > I tried the help command and it worked: > > mortalite=# help > Vous utilisez psql, l'interface en ligne de commande de PostgreSQL. > Saisissez: > \copyright pour les termes de distribution > \h pour l'aide-mémoire des commandes SQL > \? pour l'aide-mémoire des commandes psql > \g ou point-virgule en fin d'instruction pour exécuter la requête > \q pour quitter > > > But then when I tried to query the db, it crashed: > > mortalite=# \dt > la connexion au serveur a été coupée de façon inattendue > Le serveur s'est peut-être arrêté anormalement avant ou durant le > traitement de la requête. > La connexion au serveur a été perdue. Tentative de réinitialisation : > Échec. > !> > > And here's the rest of the log file after the crash: > > 2015-02-09 12:29:19 EST LOG: server process (PID 2240) was terminated by > exception 0xC0000005 > 2015-02-09 12:29:19 EST DETAIL: Failed process was running: SELECT > n.nspname as "Schema", > c.relname as "Name", > CASE c.relkind WHEN 'r' THEN 'table' WHEN 'v' THEN 'view' WHEN 'm' > THEN 'materialized view' WHEN 'i' THEN 'index' WHEN 'S' THEN 'sequence' > WHEN 's' THEN 'special' WHEN 'f' THEN 'foreign table' END as "Type", > pg_catalog.pg_get_userbyid(c.relowner) as "Owner" > FROM pg_catalog.pg_class c > LEFT JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace > WHERE c.relkind IN ('r','') > AND n.nspname <> 'pg_catalog' > AND n.nspname <> 'information_schema' > AND n.nspname !~ '^pg_toast' > AND pg_catalog.pg_table_is_visible(c.oid) > ORDER BY 1,2; > 2015-02-09 12:29:19 EST HINT: See C include file "ntstatus.h" for a > description of the hexadecimal value. > 2015-02-09 12:29:19 EST LOG: terminating any other active server processes > 2015-02-09 12:29:19 EST WARNING: terminating connection because of crash > of another server process > 2015-02-09 12:29:19 EST DETAIL: The postmaster has commanded this server > process to roll back the current transaction and exit, because another > server process exited abnormally and possibly corrupted shared memory. > 2015-02-09 12:29:19 EST HINT: In a moment you should be able to reconnect > to the database and repeat your command. > 2015-02-09 12:29:19 EST LOG: archiver process (PID 4576) exited with exit > code 1 > 2015-02-09 12:29:19 EST LOG: all server processes terminated; > reinitializing > 2015-02-09 12:29:29 EST FATAL: pre-existing shared memory block is still > in use > 2015-02-09 12:29:29 EST HINT: Check if there are any old server processes > still running, and terminate them. > > According to this page > <https://msdn.microsoft.com/en-ca/library/cc704588.aspx>, exception > 0xC0000005 means STATUS_ACCESS_VIOLATION - The instruction at 0x%08lx > referenced memory at 0x%08lx. The memory could not be %s. This is not of > much help to me. > > I hope these additional bits of information can help someone figuring out > a solution to get my cluster up and running again. > > PS. I was thinking of reinstalling PGSQL over my current install but > keeping my PGDATA. I've done it in the past for fixing problems with > starting the service and it worked. What do you think? > > The SQL query you see in your log is the result of your \dt. It should work. I don't see why reinstalling PostgreSQL will fix anything here. You said you did a pg_basebackup. The cluster you try to start is the cluster used with pg_basebackup or a restore done with the pg_basebackup? Cheers, > > Guillaume > > > > > > 2015-02-09 11:37 GMT-05:00 Guillaume Drolet <droletguilla...@gmail.com>: > > >> >> 2015-02-07 1:24 GMT-05:00 Guillaume Lelarge <guilla...@lelarge.info>: >> >>> Le 6 févr. 2015 17:31, "Adrian Klaver" <adrian.kla...@aklaver.com> a >>> écrit : >>> > >>> > On 02/06/2015 05:03 AM, Guillaume Drolet wrote: >>> >> >>> >> Hi, >>> >> >>> >> Yesterday I ran a pg_basebackup of my cluster. Since it has completed, >>> >> my cluster doesn't work properly. I tried restarting the computer (or >>> >> service) a few times but I always get the same messages in my logs >>> (it's >>> >> in French. If someone is willing to help me I can try to translate the >>> >> logs. Just ask): >>> > >>> > >>> > Enter Google Translate:) >>> > >>> >>> But first, Guillaume, do yourself and everyone else a favor: turn the >>> dam log into English. Set lc_messages to 'C' in postgresql.conf. >>> >> Thanks for this! I didn't know about this great feature. >> >> >>> > First some questions: >>> > >>> > 1) What Postgres version? >>> > >>> > 2) What OS(s)? I am assuming Windows from the log info below, but we >>> all know what assuming gets you. >>> > >>> > 3) Where were you backing up from and to? >>> > >>> > 4) Which cluster does not start, the master or the child you created >>> with pg_basebackup? >>> > >>> > >>> >> >>> >> 2015-02-06 07:11:38 EST LOG: le système de bases de données a été >>> >> interrompu ; dernier lancement connu à 2015-02-06 07:05:05 EST >>> >> 2015-02-06 07:11:38 EST LOG: le système de bases de données n'a pas >>> été >>> >> arrêté proprement ; restauration >>> >> automatique en cours >>> >> 2015-02-06 07:11:38 EST LOG: record with zero length at 24B/2C000160 >>> >> 2015-02-06 07:11:38 EST LOG: la ré-exécution n'est pas nécessaire >>> >> 2015-02-06 07:11:38 EST LOG: le système de bases de données est prêt >>> >> pour accepter les connexions >>> >> 2015-02-06 07:11:38 EST LOG: lancement du processus autovacuum >>> >> 2015-02-06 07:11:38 EST FATAL: le rôle « 208375PT$ » n'existe pas >>> > >>> > >>> > So where is role 208375PT$ supposed to come from? >>> > >>> > >>> >> >>> >> Then if I start pgAdmin I get a series of pop-ups I have to click OK >>> to >>> >> to continue: >>> >> >>> >> An error has ocurred: Column not found in pgSet: "datlastsysoid" >>> >> An error has ocurred: Column not found in pgSet: datlastsysoid >>> >> An error has ocurred: Column not found in pgSet: oid >>> >> An error has ocurred: Column not found in pgSet: encoding >>> >> An error has ocurred: Column not found in pgSet: Connection to >>> database >>> >> broken >>> > >>> > >>> > Not sure about that this, someone more versed in pgAdmin will have to >>> answer. >>> > >>> >>> Usually you see these messages when you're using a pgadmin major release >>> older than a PostgreSQL make release. For a 9.3 release, that would mean a >>> pgadmin older than 1.18. >>> >> >> I'm running pgadmin 1.18.1 >> >>> > >>> >> >>> >> And after that, I went back to the log file and there's new >>> information >>> >> added: >>> >> >>> >> 2015-02-06 07:51:05 EST LOG: processus serveur (PID 184) a été arrêté >>> >> par l'exception 0x80000004 >>> >> 2015-02-06 07:51:05 EST DÉTAIL: Le processus qui a échoué exécutait : >>> >> SELECT version(); >>> >> 2015-02-06 07:51:05 EST ASTUCE : Voir le fichier d'en-tête C « >>> >> ntstatus.h » pour une description de la valeur >>> >> hexadécimale. >>> > >>> > >>> > Well according to here: >>> > >>> > https://msdn.microsoft.com/en-us/library/cc704588.aspx >>> > >>> > 0x80000004 >>> > STATUS_SINGLE_STEP >>> > >>> > >>> > {EXCEPTION} Single Step A single step or trace operation has just been >>> completed. >>> > >>> > A developer is going to have explain what that means. >>> > >>> > >>> > >>> >> 2015-02-06 07:51:05 EST LOG: arrêt des autres processus serveur >>> actifs >>> >> 2015-02-06 07:51:05 EST ATTENTION: arrêt de la connexion à cause de >>> >> l'arrêt brutal d'un autre processus serveur >>> >> 2015-02-06 07:51:05 EST DÉTAIL: Le postmaster a commandé à ce >>> processus >>> >> serveur d'annuler la transaction >>> >> courante et de quitter car un autre processus serveur a quitté >>> >> anormalement >>> >> et qu'il existe probablement de la mémoire partagée corrompue. >>> >> 2015-02-06 07:51:05 EST ASTUCE : Dans un moment, vous devriez être >>> >> capable de vous reconnecter à la base de >>> >> données et de relancer votre commande. >>> >> 2015-02-06 07:51:05 EST LOG: processus d'archivage (PID 692) quitte >>> >> avec le code de sortie 1 >>> >> 2015-02-06 07:51:05 EST LOG: tous les processus serveur se sont >>> >> arrêtés, réinitialisation >>> >> 2015-02-06 07:51:15 EST FATAL: le bloc de mémoire partagé >>> pré-existant >>> >> est toujours en cours d'utilisation >>> >> 2015-02-06 07:51:15 EST ASTUCE : Vérifier s'il n'y a pas de vieux >>> >> processus serveur en cours d'exécution. Si c'est le >>> >> cas, fermez-les. >>> >> >>> >> I was about to try restarting postgresql using the base backup I made >>> >> yesterday but since this means I'll have to copy my database again >>> (700 >>> >> GB takes a while...) I am looking for a better solution from more >>> >> experienced people. >>> > >>> > >>> > >>> > My suspicion is you copied at least partly over a running server. >>> > >>> >> >> > -- Guillaume. http://blog.guillaume.lelarge.info http://www.dalibo.com