Greenplum: analysis and fix for "Unable to connect to database" during a full recovery with gprecoverseg -F (Part 1)

2015-11-21 01:54:03
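Two acquaintances of mine ran into a case where, after changing the system architecture of their Greenplum cluster, even a full recovery with gprecoverseg -F would not run.
The error reported was: Unable to connect to database. Retrying 1
gprecoverseg failed. (Reason='Unable to connect to database and start transaction') exiting...
I was lucky enough to get a copy of all the files from the virtual machines, so I analyzed them.
It turned out that fixing this only requires modifying the CT_METADATA file under pg_changetracking, or, put another way, copying that file from another healthy primary instance onto the affected primary instance.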
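
The copy step described above could be sketched roughly as follows. This is only an illustration under several assumptions that are not in the original: the helper name replace_ct_metadata is hypothetical, the data directory paths are placeholders, the cluster is assumed to be stopped before any file is touched, and the healthy CT_METADATA is assumed to be reachable on the local filesystem (for example after copying it over from the other segment host). The existing file is backed up first so the change can be undone.

```shell
#!/bin/sh
# Hypothetical helper (not part of Greenplum): replace the CT_METADATA file
# of an affected primary with the one from a healthy primary.
# Assumption: the cluster is stopped and both directories are local paths.
replace_ct_metadata() {
    good_dir=$1   # data directory of a healthy primary (placeholder)
    bad_dir=$2    # data directory of the affected primary (placeholder)
    src="$good_dir/pg_changetracking/CT_METADATA"
    dst="$bad_dir/pg_changetracking/CT_METADATA"

    # Refuse to continue if the healthy source file is missing.
    if [ ! -f "$src" ]; then
        echo "source file not found: $src" >&2
        return 1
    fi
    # Refuse to continue if the target directory does not exist.
    if [ ! -d "$bad_dir/pg_changetracking" ]; then
        echo "target directory not found: $bad_dir/pg_changetracking" >&2
        return 1
    fi
    # Keep a timestamped backup of the existing file, if there is one.
    if [ -f "$dst" ]; then
        cp -p "$dst" "$dst.bak.$(date +%Y%m%d%H%M%S)" || return 1
    fi
    # Copy the healthy file into place, preserving mode and timestamps.
    cp -p "$src" "$dst"
}

# Example call with placeholder paths (adjust to the real data directories):
# replace_ct_metadata /tmp/healthy_primary_copy /data/primary/gpsegN
```

Keeping the backup next to the original makes it easy to restore the previous state if the replacement does not help.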


What follows is a rough account of the analysis, for anyone who is interested.


-- Start the database; one mirror is marked down.
[gpadmin@gpmaster ~]$ gpstart -a
20150727:22:28:21:001922 gpstart:gpmaster:gpadmin-[INFO]:-Starting gpstart with args: -a
20150727:22:28:21:001922 gpstart:gpmaster:gpadmin-[INFO]:-Gathering information and validating the environment...
20150727:22:28:28:001922 gpstart:gpmaster:gpadmin-[INFO]:-Greenplum Binary Version: 'postgres (Greenplum Database) 4.3.4.1 build 2'
20150727:22:28:28:001922 gpstart:gpmaster:gpadmin-[INFO]:-Greenplum Catalog Version: '201310150'
20150727:22:28:29:001922 gpstart:gpmaster:gpadmin-[INFO]:-Starting Master instance in admin mode
20150727:22:28:33:001922 gpstart:gpmaster:gpadmin-[INFO]:-Obtaining Greenplum Master catalog information
20150727:22:28:33:001922 gpstart:gpmaster:gpadmin-[INFO]:-Obtaining Segment details from master...
20150727:22:28:34:001922 gpstart:gpmaster:gpadmin-[INFO]:-Setting new master era
20150727:22:28:34:001922 gpstart:gpmaster:gpadmin-[INFO]:-Master Started...
20150727:22:28:34:001922 gpstart:gpmaster:gpadmin-[INFO]:-Shutting down master
20150727:22:28:36:001922 gpstart:gpmaster:gpadmin-[WARNING]:-Skipping startup of segment marked down in configuration: on gpslave-2 directory /data/mirror/gpseg0 <<<<<
20150727:22:28:36:001922 gpstart:gpmaster:gpadmin-[INFO]:-Commencing parallel primary and mirror segment instance startup, please wait...
...............................................
20150727:22:29:23:001922 gpstart:gpmaster:gpadmin-[INFO]:-Process results...
20150727:22:29:23:001922 gpstart:gpmaster:gpadmin-[INFO]:-----------------------------------------------------
20150727:22:29:23:001922 gpstart:gpmaster:gpadmin-[INFO]:- Successful segment starts = 3
20150727:22:29:23:001922 gpstart:gpmaster:gpadmin-[INFO]:- Failed segment starts = 0
20150727:22:29:23:001922 gpstart:gpmaster:gpadmin-[WARNING]:-Skipped segment starts (segments are marked down in configuration) = 1 <<<<<<<<
20150727:22:29:23:001922 gpstart:gpmaster:gpadmin-[INFO]:-----------------------------------------------------
20150727:22:29:23:001922 gpstart:gpmaster:gpadmin-[INFO]:-
20150727:22:29:23:001922 gpstart:gpmaster:gpadmin-[INFO]:-Successfully started 3 of 3 segment instances, skipped 1 other segments
20150727:22:29:23:001922 gpstart:gpmaster:gpadmin-[INFO]:-----------------------------------------------------
20150727:22:29:23:001922 gpstart:gpmaster:gpadmin-[WARNING]:-****************************************************************************
20150727:22:29:23:001922 gpstart:gpmaster:gpadmin-[WARNING]:-There are 1 segment(s) marked down in the database
20150727:22:29:23:001922 gpstart:gpmaster:gpadmin-[WARNING]:-To recover from this current state, review usage of the gprecoverseg
20150727:22:29:23:001922 gpstart:gpmaster:gpadmin-[WARNING]:-management utility which will recover failed segment instance databases.
20150727:22:29:23:001922 gpstart:gpmaste