Notes on Installing Oracle 12.2.0.1 RAC on Red Hat 7.6

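1. Introduction

Writing a blog helps you summarize and distill skills, organize your thinking, and branch out into related topics, but it also takes a lot of time. For small tips and tricks I usually just keep private notes, and I only write a post when something is more involved and not covered online. I hope this helps whoever needs it.

Many of this customer's 11g environments have reached the end of Oracle support and are being, or have already been, upgraded to 19c. However, the customer has several 12.2.0.1 systems on win8 that will not be upgraded for now, so a new 12.2.0.1 environment had to be installed on Linux to connect with them. The server resource pool runs the latest Red Hat 7 series, and the installation hit quite a few problems. Most write-ups online cover 7.3/7.4 and did not really apply to 7.6 here. The installation did succeed in the end, so I am summarizing it afterwards; corrections are welcome.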

2. Environment

Operating system: Red Hat 7.6
Kernel version: 3.10.0-1062.9.1.el7.x86_64
GI version: 12.2.0.1
Nodes: 2
Installation mode: silent

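According to the official 12.2.0.1 GI installation requirements for Red Hat 7: Red Hat Enterprise Linux 7: 3.10.0-123.el7.x86_64 or later

The requirement is met, but the kernel in this environment had been patched to the latest level, 3.10.0-1062, which is far newer than -123, and that led to many problems during installation.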
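The minimum-kernel requirement can be checked mechanically before starting. The following is only a minimal sketch, assuming GNU `sort -V` is available; the function name `version_ge` is made up for illustration, and passing this check says nothing about whether a much newer kernel will behave well, as this post shows.

```shell
#!/usr/bin/env bash
# Sketch: check that the running kernel is at least the documented minimum.
# MIN_KERNEL is the requirement quoted in the post; version_ge is an illustrative name.
MIN_KERNEL="3.10.0-123.el7.x86_64"

# Return 0 if version $1 >= version $2, using GNU sort's version ordering.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

current="$(uname -r)"
if version_ge "$current" "$MIN_KERNEL"; then
    echo "kernel $current meets the minimum $MIN_KERNEL"
else
    echo "kernel $current is older than $MIN_KERNEL"
fi
```

The comparison only proves "at least the minimum"; it does not flag kernels that are newer than what the installer was tested against.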

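3. GI Installation

1. Basic environment preparation: omitted (kernel parameters, system tuning, users and directories, storage, network, installation media, patches, ...)

2. Edit the GI installation response file: omitted

3. Pre-installation checks passed: omitted (the resolve/name-resolution check failure was ignored)

4. Start the GI installation: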

./gridSetup.sh -silent -responseFile /home/grid/dba_scripts/grid.12.2.rsp -ignorePrereqFailure |tee -a /home/grid/dba_scripts/grid.install.log

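## Software installation finished; now configure the cluster (the installer prompts for the following 3 scripts, to be run in order)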
As a root user, execute the following script(s):
1. /u01/app/oraInventory/orainstRoot.sh
2. /u01/app/12.2.0/grid/root.sh

Execute /u01/app/oraInventory/orainstRoot.sh on the following nodes:
[node01, node02]
Execute /u01/app/12.2.0/grid/root.sh on the following nodes:
[node01, node02]

Run the script on the local node first. After successful completion, you can start the script in parallel on all other nodes.

Successfully Setup Software.
As install user, execute the following command to complete the configuration.
/u01/app/12.2.0/grid/gridSetup.sh -executeConfigTools -responseFile /home/grid/dba_scripts/grid.12.2.rsp [-silent]
1. Running /u01/app/12.2.0/grid/root.sh fails: CLSRSC-400: A system reboot is required to continue installing.
/u01/app/oraInventory/orainstRoot.sh # completed on node01
/u01/app/oraInventory/orainstRoot.sh # completed on node02
/u01/app/12.2.0/grid/root.sh # on node01, fails as follows:

2020/09/25 15:57:43 CLSRSC-594: Executing installation step 13 of 19: 'InstallAFD'.
2020/09/25 15:57:46 CLSRSC-594: Executing installation step 14 of 19: 'InstallACFS'.
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'node01'
CRS-2673: Attempting to stop 'ora.gpnpd' on 'node01'
CRS-2673: Attempting to stop 'ora.evmd' on 'node01'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'node01'
CRS-2677: Stop of 'ora.gpnpd' on 'node01' succeeded
CRS-2677: Stop of 'ora.evmd' on 'node01' succeeded
CRS-2677: Stop of 'ora.mdnsd' on 'node01' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'node01'
CRS-2677: Stop of 'ora.gipcd' on 'node01' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'node01' has completed
CRS-4133: Oracle High Availability Services has been stopped.
CRS-4123: Oracle High Availability Services has been started.
2020/09/25 16:00:38 CLSRSC-400: A system reboot is required to continue installing.
The command '/u01/app/12.2.0/grid/perl/bin/perl -I/u01/app/12.2.0/grid/perl/lib -I/u01/app/12.2.0/grid/crs/install /u01/app/12.2.0/grid/crs/install/rootcrs.pl ' execution failed

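## Note: it fails at step 14, while installing ACFS, and cannot get past it: CLSRSC-400: A system reboot
## After rebooting I ran root.sh again; it still failed, so I started searching on Google.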
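When root.sh is rerun many times, it can help to pull out just the last step reached and the non-routine CLSRSC messages. This is a rough sketch that assumes the root.sh output was saved to a file and relies only on the message formats visible in the output above; the function names `last_step` and `other_clsrsc` are illustrative.

```shell
#!/usr/bin/env bash
# Sketch: summarise a saved root.sh log. Only the message formats shown in
# the post are assumed; function names are illustrative.

# Print the last "installation step N of M: 'Name'" marker seen in the log.
last_step() {
    grep -o "installation step [0-9]* of [0-9]*: '[^']*'" "$1" | tail -n 1
}

# Print CLSRSC messages other than the CLSRSC-594 step markers.
other_clsrsc() {
    grep -o 'CLSRSC-[0-9]*: .*' "$1" | grep -v '^CLSRSC-594:'
}

# Usage: pass the path of the saved log as the first argument.
log="${1:-}"
if [ -n "$log" ] && [ -r "$log" ]; then
    echo "Last step reached: $(last_step "$log")"
    other_clsrsc "$log" || true
fi
```

Not every CLSRSC message other than 594 is an error (some are informational), so the second function is a filter for a human to read, not a pass/fail test.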
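2. Apply the ACFS patch 25078431 (did not fix it this time)

The MOS article covers 7.3, and blog posts say it applies to 7.x. I applied patch 25078431 accordingly, but it still did not work: step 14 (installing ACFS) still asked for a reboot.

I suspect the 1062 minor release of my 7.6 kernel is simply too new; the environment here is strictly managed, so there was nothing I could do about that.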

./gridSetup.sh -applyOneOffs /u01/software/25078431/25078431
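3. Running root.sh after applying an RU or PSU fails: CLSRSC-175: Failed to write the checkpoint 'ROOTCRS_FIRSTNODE' with status 'SUCCESS' (error code 1)
  • ACFS no longer reports an error, but the last step now fails.

The RU downloaded here was the 12.2.0.1 RU of 2018-10-16 and the PSU was the 12.2.0.1 PSU of 2020-07-14. Either one resolves the ACFS reboot issue. The July 2020 PSU should already include the 2018 RU, and applying the RU first and then the PSU does not conflict. Installing the latest RU or PSU is recommended and should be enough.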

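# Installation steps
# Note: starting with 18c, gridSetup.sh supports RU, PSU, and one-off patches; the 12c gridSetup.sh does not support RUs.
1. Unzip the GI zip into the chosen grid_home

2. ./gridSetup.sh -silent -responseFile /home/grid/dba_scripts/grid.12.2.rsp -ignorePrereqFailure |tee -a /home/grid/dba_scripts/grid.install.log # software installed on both nodes

3. Replace OPatch with the latest version on both node01 and node02 (the latest PSU will be applied, and the default OPatch version is too old)

4. As root, run orainstRoot.sh on node01 (success), then on node02 (success)

5. Apply the PSU with opatchauto on node01 (success), then the same on node02 (success):
# Run as root. Note: the latest PSU, 31326390, contains two sub-patches because it is a GI and OJVM combo patch; point opatchauto at the GI one only: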
./opatchauto apply /u01/software/31326390/31305382 -oh /u01/app/12.2.0/grid -analyze
./opatchauto apply /u01/software/31326390/31305382 -oh /u01/app/12.2.0/grid

6. Run root.sh on node01; all 19 steps run, then the final post operation fails because a checkpoint cannot be written:
/u01/app/12.2.0/grid/root.sh
##########################################################################################
CLSRSC-175: Failed to write the checkpoint 'ROOTCRS_FIRSTNODE' with status 'SUCCESS' (error code 1)
Died at /u01/app/12.2.0/grid/crs/install/crsutils.pm line 13413.
The command '/u01/app/12.2.0/grid/perl/bin/perl -I/u01/app/12.2.0/grid/perl/lib -I/u01/app/12.2.0/grid/crs/install /u01/app/12.2.0/grid/crs/install/rootcrs.pl' execution failed
##########################################################################################
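Following the error message, I looked at /u01/app/12.2.0/grid/crs/install/crsutils.pm line 13413. That code writes the transfile checkpoint. I do not know whether this is a bug, or whether applying the PSU after the gridSetup installation but before running root.sh affected the integrity check of grid_home.
I then analyzed the installation session log, /u01/app/grid/crsdata/node02/crsconfig/rootcrs_node02_2020-09-28_03-55-35PM.log, and found errors for two checkpoints: crs_asmbackup and transfile. Looking further at the error entries and the checkpoint files for these two checkpoints, I found that 12c keeps two checkpoint files, a global one and one local to node01. The crs_asmbackup checkpoint was written to the global file successfully but is missing from node01's local file, which is why the installation log reports a crs_asmbackup checkpoint error; that can probably be ignored. The real problem is most likely the transfile checkpoint. Google and MOS turned up nothing, so I tried running root.sh on node01 again.

7. Run root.sh on node01 again. This time it succeeded, and the overall crs_stack checkpoint in the local checkpoint file changed from fail to success.
# It seems that since 11g the CRS installation uses checkpoints, so root.sh for grid can be run multiple times without harm: fix the underlying problem, then run it again and it resumes.

8. Run root.sh on node02; it fails: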
##########################################################################################
2020/09/28 15:55:44 CLSRSC-594: Executing installation step 1 of 19: 'SetupTFA'.
2020/09/28 15:55:45 CLSRSC-4001: Installing Oracle Trace File Analyzer (TFA) Collector.
2020/09/28 15:55:45 CLSRSC-4004: Failed to install Oracle Trace File Analyzer (TFA) Collector. Grid Infrastructure operations will continue.
2020/09/28 15:55:45 CLSRSC-594: Executing installation step 2 of 19: 'ValidateEnv'.
2020/09/28 15:55:46 CLSRSC-363: User ignored prerequisites during installation
2020/09/28 15:55:46 CLSRSC-594: Executing installation step 3 of 19: 'CheckFirstNode'.
2020/09/28 15:55:52 CLSRSC-507: The root script cannot proceed on this node node02 because either the first-node operations have not completed on node node01 or there was an error in obtaining the status of the first-node operations.
Died at /u01/app/12.2.0/grid/crs/install/crsutils.pm line 4119.
The command '/u01/app/12.2.0/grid/perl/bin/perl -I/u01/app/12.2.0/grid/perl/lib -I/u01/app/12.2.0/grid/crs/install /u01/app/12.2.0/grid/crs/install/rootcrs.pl ' execution failed
##########################################################################################
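# Stuck: although the second root.sh run on node01 reported success, root.sh on node02 still fails its check of node01.
# Google and MOS did not help, and altering the check on node02 did not seem wise, so I tried another approach: leave node02 aside, finish installing node01, then delete node02 and add it back into the cluster.

9. Run the final configuration on node01 (success):
/u01/app/12.2.0/grid/gridSetup.sh -silent -executeConfigTools -responseFile /home/grid/dba_scripts/grid.12.2.rsp
# This step configures the mgmt db.

# From the start of configuration to the end of installation, the CRS logs under $ORACLE_HOME/diag/crs/trace on node01 showed no errors, so I judge that node01 finished installing in a healthy state.
# olsnodes -n: shows only node01; node02 has not joined the cluster yet.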
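The membership check mentioned in the last comment can be scripted. This is a small sketch, assuming each line of `olsnodes -n` output is a node name followed by its node number; the function name `node_listed` is illustrative, and the node names are the ones used in this post.

```shell
#!/usr/bin/env bash
# Sketch: check whether a node name appears in `olsnodes -n` output.
# Assumes each output line is "<node name> <node number>"; node_listed is an illustrative name.

# Read olsnodes output on stdin; return 0 if node $1 is listed.
node_listed() {
    awk -v node="$1" '$1 == node { found = 1 } END { exit !found }'
}

# Only query the cluster when olsnodes is actually on PATH.
if command -v olsnodes >/dev/null 2>&1; then
    for n in node01 node02; do
        if olsnodes -n | node_listed "$n"; then
            echo "$n is a cluster member"
        else
            echo "$n is NOT listed by olsnodes"
        fi
    done
fi
```

Matching on the first field exactly avoids false positives such as node01 matching node010.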
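Sometimes, when installing or upgrading GI, patches need to be applied before running root.sh, rootupgrade.sh, or gridsetup.bat. gridSetup.sh can be used to apply them; see the MOS note Doc ID 1410202.1 for details.

# Here opatchauto was used to apply the PSU. gridSetup.sh can also apply it, but gridSetup.sh cannot apply an RU. Note also that gridSetup.sh can apply patches and run the silent installation in a single command. When I tried that, it applied the patch first and then installed the GI files, and printed many OPatch-related warnings before the installation, because the OPatch directory had been replaced before patching and because patching modifies GRID_HOME. I was worried about the risk, so I installed first, then applied the PSU, and ran root.sh last.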
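Since the bundled OPatch was too old for the latest PSU, a quick version check on each node before patching may save a failed run. The sketch below assumes `opatch version` prints a line of the form "OPatch Version: <version>"; `REQUIRED_OPATCH` is a placeholder value chosen for illustration (the real minimum comes from the patch README), and both function names are illustrative.

```shell
#!/usr/bin/env bash
# Sketch: compare the OPatch version in a Grid home with a required minimum.
# REQUIRED_OPATCH is a placeholder; take the real value from the patch README.
GRID_HOME="/u01/app/12.2.0/grid"
REQUIRED_OPATCH="12.2.0.1.17"   # hypothetical value, for illustration only

# Extract the version from `opatch version` output read on stdin
# (assumes a line of the form "OPatch Version: <version>").
opatch_version() {
    sed -n 's/^OPatch Version: *//p' | head -n 1
}

# Return 0 if version $1 >= version $2 (GNU sort -V ordering).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Only run the check when an OPatch binary exists in the Grid home.
if [ -x "$GRID_HOME/OPatch/opatch" ]; then
    have="$("$GRID_HOME/OPatch/opatch" version | opatch_version)"
    if version_ge "$have" "$REQUIRED_OPATCH"; then
        echo "OPatch $have is new enough"
    else
        echo "OPatch $have is older than $REQUIRED_OPATCH; replace it first"
    fi
fi
```

Running the same check on node01 and node02 would also catch the case where OPatch was replaced on only one node.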
4. Delete node02 and add it back into the cluster

Reference (official documentation): add and del cluster node on unix:12.2

## Clean up node02
-----------------------------------------------------------------------------------------
# On node02, as the grid user:
./deinstall -local

# Remove all leftover files on node02
rm -fr /etc/ora* /tmp/* /opt/ORCLfmap /usr/local/bin/* /u01/app/oraInventory/* /u01/app/grid/* /u01/app/12.2.0/grid/* /var/tmp/.oracle/* /var/tmp/.oracle /u01/app/12.2.0/grid/.patch_storage /u01/app/12.2.0/grid/.opatchauto_storage

# On node01, as root
./crsctl delete node -n node02

# On node01, as the grid user
cluvfy stage -post nodedel -n node02 -verbose
-----------------------------------------------------------------------------------------

## Add node02
-----------------------------------------------------------------------------------------
# On node01, as the grid user
./addnode/addnode.sh -silent "CLUSTER_NEW_NODES={node02}" "CLUSTER_NEW_VIRTUAL_HOSTNAMES={node02-vip}" "CLUSTER_NEW_NODE_ROLES={HUB}" -skipPrereqs -ignoreSysPrereqs -ignoreInternalDriverError

# On node02, as root
/u01/app/oraInventory/orainstRoot.sh # success
/u01/app/12.2.0/grid/root.sh # success

# On node01, as the grid user
cluvfy stage -post nodeadd -n node02 -verbose
-----------------------------------------------------------------------------------------
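The cleanup command in the node02 removal block deletes broad globs such as /tmp/* and /usr/local/bin/*, so it may be worth previewing what actually exists before running it. The following is just a sketch with an illustrative function name, `preview_targets`; it only prints paths and never deletes anything.

```shell
#!/usr/bin/env bash
# Sketch: list which cleanup targets currently exist before deleting anything.
# preview_targets is an illustrative name; it only prints, it never removes files.

# Print every argument that exists as a file, directory, or symlink.
preview_targets() {
    local p
    for p in "$@"; do
        if [ -e "$p" ] || [ -L "$p" ]; then
            printf '%s\n' "$p"
        fi
    done
}

# Example with two of the paths from the cleanup command; the unquoted glob expands here.
preview_targets /u01/app/oraInventory/* /var/tmp/.oracle
```

A glob that matches nothing is passed through literally and simply prints nothing, so the function is safe to call on a host where the paths do not exist.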

5. Other errors encountered when running root.sh during grid installation:

1. root.sh on node01 fails while configuring the OLR: CLSRSC-169: Failed to create or upgrade OLR
2020/09/27 14:50:29 CLSRSC-594: Executing installation step 8 of 19: 'SetupLocalGPNP'.
2020/09/27 14:50:53 CLSRSC-594: Executing installation step 9 of 19: 'ConfigOLR'.
PROTL-4: Failed to retrieve data from the local registry
2020/09/27 14:51:01 CLSRSC-169: Failed to create or upgrade OLR
Died at /u01/app/12.2.0/grid/crs/install/oraolr.pm line 496.
The command '/u01/app/12.2.0/grid/perl/bin/perl -I/u01/app/12.2.0/grid/perl/lib -I/u01/app/12.2.0/grid/crs/install /u01/app/12.2.0/grid/crs/install/rootcrs.pl ' execution failed

# Fix (deconfigure the node, then run root.sh again):
cd /u01/app/12.2.0/grid/crs/install
./rootcrs.sh -deconfig -force
2. root.sh on node01 fails in the final post step: CLSRSC-614: failed to get the list of configured diskgroups
2020/09/27 16:20:25 CLSRSC-594: Executing installation step 18 of 19: 'ConfigNode'.
CRS-2672: Attempting to start 'ora.ASMNET1LSNR_ASM.lsnr' on 'node01'
CRS-2672: Attempting to start 'ora.ASMNET2LSNR_ASM.lsnr' on 'node01'
CRS-2676: Start of 'ora.ASMNET1LSNR_ASM.lsnr' on 'node01' succeeded
CRS-2676: Start of 'ora.ASMNET2LSNR_ASM.lsnr' on 'node01' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'node01'
CRS-2676: Start of 'ora.asm' on 'node01' succeeded
CRS-2672: Attempting to start 'ora.OCRDG.dg' on 'node01'
CRS-2676: Start of 'ora.OCRDG.dg' on 'node01' succeeded
2020/09/27 16:21:39 CLSRSC-594: Executing installation step 19 of 19: 'PostConfig'.
2020/09/27 16:27:13 CLSRSC-614: failed to get the list of configured diskgroups
Died at /u01/app/12.2.0/grid/crs/install/oraasm.pm line 2069.
The command '/u01/app/12.2.0/grid/perl/bin/perl -I/u01/app/12.2.0/grid/perl/lib -I/u01/app/12.2.0/grid/crs/install /u01/app/12.2.0/grid/crs/install/rootcrs.pl ' execution failed

# Fix:
Apply the latest RU or PSU

6. Summary of what the GI root.sh script does:

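1. Perform miscellaneous tasks needed before the cluster is configured.
2. Initialize node settings.
3. Initialize the OLR.
4. Initialize the gpnp wallet and gpnp profile.
5. Configure ohasd and start ohas.
6. Add the initial resources to the cluster.
7. Start css in exclusive mode and format the vf.
# Except on the first node to run root.sh, the exclusive-mode start fails on the other nodes, which then start in normal mode and join the cluster. Because node1's root.sh has to initialize the cluster, it runs much slower than on node2.
8. Start the cluster in normal mode.
9. Add the CRS-related resources to the cluster.
10. Restart the cluster.

There may also be some newer, additional steps, such as tfa, afd, acfs, ika, and so on.

4. Afterword

Maybe I was just dizzy: I installed GI so many times I was sick of it. On one attempt, after applying the PSU, root.sh for GI actually completed without any error, but by mistake I ran root.sh on node02 before applying the PSU there, and it failed. After uninstalling and reinstalling, it never worked again: node01 always failed in the final post step with a checkpoint write error. I built 2 clusters and both behaved the same, so I do not know whether it is a bug, haha.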