About Klaas-jan Jongsma

Oracle DBA and ACE Associate with main interests in Exadata, engineered systems, everything database, and networking. I am currently employed at VX Company.

A GoldenGate profile & environment script

I have been working a lot with Oracle GoldenGate over the last year or so and I am loving the product more and more. I really enjoy its modular design, which lets you shape it into exactly the thing that you need. However, one thing about working with GoldenGate has always annoyed me a bit: there is no good way of finding out where the binaries are if the GoldenGate processes are not running. Coming from an Oracle database background, I am used to an oratab file or a cluster registry to see where a database is running (or not running).

I often have to log on to a server with GoldenGate installed, forget where it is installed and find myself doing something like ps -ef | grep mgr to see what the $OGG_HOME is. It becomes a little more of a challenge when all GoldenGate processes are down and you can't really remember the $OGG_HOME anymore. One option would be to take a look into the inventory.xml:

<INVENTORY>
<VERSION_INFO>
   <SAVED_WITH>11.2.0.3.0</SAVED_WITH>
   <MINIMUM_VER>2.1.0.6.0</MINIMUM_VER>
</VERSION_INFO>
<HOME_LIST>
<HOME NAME="OraGI12Home1" LOC="/u01/app/12.1.0.2/grid" TYPE="O" IDX="1">
   <NODE_LIST>
      <NODE NAME="rac1"/>
      <NODE NAME="rac2"/>
   </NODE_LIST>
</HOME>
<HOME NAME="OraDB12Home1" LOC="/u01/app/oracle/product/12.1.0.2/dbhome_1" TYPE="O" IDX="2">
   <NODE_LIST>
      <NODE NAME="rac1"/>
      <NODE NAME="rac2"/>
   </NODE_LIST>
</HOME>
<HOME NAME="OraHome1" LOC="/u01/app/oracle/product/12.1.2/oggcore_12c" TYPE="O" IDX="3"/>
</HOME_LIST>
<COMPOSITEHOME_LIST>
</COMPOSITEHOME_LIST>
</INVENTORY>

In the example above it is obvious that the GoldenGate software was installed in the 3rd home, simply by looking at the name of the directory, but there is of course nothing that stops someone from installing GoldenGate in a completely anonymous path. The Oracle Inventory does not specify what type of software is installed in a specific home, it just records that a home is installed at that location.
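A quick way to at least list all registered homes is to grep the central inventory directly (a one-liner sketch; the inventory path is an assumption and depends on your installation):

# List all registered Oracle homes straight from the central inventory; the
# inventory path below is an assumption, check /etc/oraInst.loc for yours.
grep "HOME NAME=" /u01/app/oraInventory/ContentsXML/inventory.xml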

So I decided to create a small script that keeps track of the GoldenGate environments on a server. After setting up this script you will see the following when you log on to your server:

Alias  MgrUp  OGG_HOME
ogg1   No     /u01/app/oracle/product/12.1.2/oggcore_11g
ogg2   No     /u01/app/oracle/product/12.1.2/oggcore_12c

The script has created 2 aliases, ogg1 and ogg2; each alias will set up the GoldenGate environment for that specific home (a small sketch of a few of these aliases follows the list below). The setup of the environment consists of a couple of things:

  • It sets $OGG_HOME to the home that was just selected
  • Expands $PATH with $OGG_HOME so you can run the tools directly
  • Checks whether it can find the shared libraries of the Oracle RDBMS and immediately warns if no RDBMS home has been set. It specifically checks whether the libnnz*.so libraries (GoldenGate needs these for cipher/wallet support) are there.
  • Sets up a bunch of handy aliases:
    • ggstart: Starts the MGR process
    • gg-startall: Starts all processes that are configured
    • ggstop: Stops the MGR process
    • gg-stopall: Stops all processes that are configured
    • gginfo: Shows the output of info all from ggsci
    • ggversion: Shows some version information of the system
    • ggcom: Generic wrapper for ggsci, handy for scripting. Usage: ggcom "info all"
    • ggreprofile: Reloads the oggprofile script after you have added an additional home or made modifications to the script.
    • rgg: Wraps ggsci with rlwrap so you have command line history in ggsci (please Oracle, update ggsci)
    • rlogdump: rlwrap alias for logdump, same as with rgg.
    • gglog: Opens the GoldenGate logfile for viewing
    • gghome: Changes directory to $OGG_HOME
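To give an idea of what these aliases boil down to, here is a minimal sketch (my own simplification, not the actual oggprofile contents) for one selected home:

# Minimal sketch of a few of the aliases for one selected GoldenGate home
export OGG_HOME=/u01/app/oracle/product/12.1.2/oggcore_12c
export PATH=$PATH:$OGG_HOME

# generic ggsci wrapper, used as: ggcom "info all"
ggcom () {
  echo "$*" | "$OGG_HOME"/ggsci
}

alias gginfo='ggcom "info all"'
alias ggstart='ggcom "start mgr"'
alias ggstop='ggcom "stop mgr"'
alias rgg='rlwrap $OGG_HOME/ggsci'    # command line history in ggsci
alias gghome='cd $OGG_HOME'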

How does this script work? After it is copied to a directory on your server, change the variable OGG_TOOLS_HOME at the top of the script to the correct path:

### OGG Tools home location - CHANGE THIS ###
OGG_TOOLS_HOME=/u01/tools/ogg

Then add the full location of the script to the profile of the user that runs GoldenGate, like this: . /u01/tools/ogg/oggprofile (note it is: dot[space][path]), and log out and in again. As soon as you have logged in again the script checks whether a file called oggtab exists in the $OGG_TOOLS_HOME location; if it doesn't exist, it creates the file:

if ! [[ -f $OGG_TOOLS_HOME/oggtab ]] ; then touch $OGG_TOOLS_HOME/oggtab ; fi;

In the oggtab the profile script keeps track of which GoldenGate homes it has detected. It does this by comparing the contents of the oggtab with the running MGR processes. If it finds a new home, it adds it to the oggtab:

V_OGG_LOC=`ps -ef | grep [\.]\/mgr | awk '{print $10}' | sed 's/\/dirprm\/mgr\.prm//'`

if ! [[ -z $V_OGG_LOC ]]; then 
	for OGG_HOME in $V_OGG_LOC;
	do
		grep -q -F  "$OGG_HOME" $OGG_TOOLS_HOME/oggtab || echo $OGG_HOME >> $OGG_TOOLS_HOME/oggtab
	done
fi;

Be aware that the script has no way of knowing whether a home has been deleted; if you remove an $OGG_HOME you must also manually remove that home from the oggtab file. With the oggtab file the script can loop through each of the homes, check whether the MGR is up and build the alias for that $OGG_HOME:

for TAB_HOME in `cat $OGG_TOOLS_HOME/oggtab`;
do
	if [[ " ${V_OGG_LOC[*]} " == *" $TAB_HOME "* ]]; then
		MGR_UP="Yes"
	else
		MGR_UP="No"
	fi
	
	printf " $(tput bold)%-16s$(tput sgr0) %-11s %-16s \n" $(echo ogg$HOME_COUNT) $(echo $MGR_UP) $(echo $TAB_HOME)
	alias ogg$HOME_COUNT="export OGG_HOME=$TAB_HOME; export PATH=$PATH:$OGG_HOME; export JAGENT_HOME=$V_JAGENT; echo \"GoldenGate home set to: $TAB_HOME\"; f_dbhome_set"
	if [ -n "$JAGENT_HOME" ]; then
		echo "Jagent location set to: $JAGENT_HOME"
	fi;
	((HOME_COUNT++))
done

The script prints out all this information and immediately shows you in the output (see the top of the post) what the alias name is, whether the manager is up and running, and what the path is. Personally this script solves a bunch of small nuisances I have with GoldenGate and makes management a lot easier for me. I find myself making additional aliases for specific repeating tasks at customers with the ggcom function quite regularly.
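For example, two hypothetical aliases of that kind (the group name EXT1 and the checks are just illustrations):

# Hypothetical task-specific aliases built on top of the ggcom wrapper
alias ggcheck='ggcom "info all" | grep -E "ABENDED|STOPPED"'   # quick health check
alias ggparams='ggcom "view params EXT1"'                      # show params of extract EXT1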

You can find the GoldenGate environment script here


Can we trust the Exadata ISO image?

Recently I had to reimage a couple of Exadata compute nodes for different reasons, such as an almost simultaneous failure of 2 hard disks in a compute node.

After the compute node came back online with a fresh image and we restored settings like cellip.ora, DNS, NTP and cellinit.ora, we got to the point where we needed to reinstall the Oracle components; this cluster has a 12.1.0.2 Grid Infrastructure, an 11.2.0.4 RDBMS home as well as a 12.1.0.2 RDBMS home. Adding the new node back to the cluster and installing the homes should not be much of an issue, as the image versions of the compute nodes are the same:

[root@dm01dbadm01 ~]# dcli -g /root/dbs_group -l root imageinfo | grep "Image version: "
dm01dbadm01: Image version: 12.1.2.2.1.160119
dm01dbadm02: Image version: 12.1.2.2.1.160119
dm01dbadm03: Image version: 12.1.2.2.1.160119
dm01dbadm04: Image version: 12.1.2.2.1.160119
dm01dbadm05: Image version: 12.1.2.2.1.160119
dm01dbadm06: Image version: 12.1.2.2.1.160119
dm01dbadm07: Image version: 12.1.2.2.1.160119
dm01dbadm08: Image version: 12.1.2.2.1.160119
[root@dm01dbadm01 ~]#

Let’s run the cluster verification utility and see if there are indeed no issues:

[oracle@dm01dbadm01 [+ASM1] ~]$ cluvfy comp peer -refnode dm01dbadm02 -n dm01dbadm02 -orainv oinstall -osdba dba | grep -B 3 -A 2 mismatched
Compatibility check: Package existence for "glibc (x86_64)" [reference node: dm04dbadm02]
Node Name Status Ref. node status Comment
------------ ------------------------ ------------------------ ----------
dm01dbadm02 glibc-2.12-1.166.el6_7.3 (x86_64) glibc-2.12-1.166.el6_7.7 (x86_64) mismatched
Package existence for "glibc (x86_64)" check failed

Compatibility check: Package existence for "glibc-devel (x86_64)" [reference node: dm04dbadm02]
Node Name Status Ref. node status Comment
------------ ------------------------ ------------------------ ----------
dm01dbadm02 glibc-devel-2.12-1.166.el6_7.3 (x86_64) glibc-devel-2.12-1.166.el6_7.7 (x86_64) mismatched
Package existence for "glibc-devel (x86_64)" check failed

That is a bit of a nasty surprise: cluvfy has found some mismatches in the glibc versions between our reference node and our freshly imaged node. It looks like our re-imaged node has a lower release of glibc installed than the rest of the cluster:

[root@dm01dbadm01 ~]# dcli -g /root/dbs_group -l root "yum info glibc | grep ^Release"
dm01dbadm01: Release : 1.166.el6_7.7
dm01dbadm01: Release : 1.192.el6
dm01dbadm01: Release : 1.192.el6
dm01dbadm02: Release : 1.166.el6_7.3
dm01dbadm03: Release : 1.166.el6_7.7
dm01dbadm03: Release : 1.192.el6
dm01dbadm03: Release : 1.192.el6
dm01dbadm04: Release : 1.166.el6_7.7
dm01dbadm04: Release : 1.192.el6
dm01dbadm04: Release : 1.192.el6
dm01dbadm05: Release : 1.166.el6_7.7
dm01dbadm05: Release : 1.192.el6
dm01dbadm05: Release : 1.192.el6
dm01dbadm06: Release : 1.166.el6_7.7
dm01dbadm06: Release : 1.192.el6
dm01dbadm06: Release : 1.192.el6
dm01dbadm07: Release : 1.166.el6_7.7
dm01dbadm07: Release : 1.192.el6
dm01dbadm07: Release : 1.192.el6
dm01dbadm08: Release : 1.166.el6_7.7
dm01dbadm08: Release : 1.192.el6
dm01dbadm08: Release : 1.192.el6

[root@dm01dbadm01 ~]# dcli -g /root/dbs_group -l root "yum info glibc | grep ^Version | head -1"
dm01dbadm01: Version : 2.12
dm01dbadm02: Version : 2.12
dm01dbadm03: Version : 2.12
dm01dbadm04: Version : 2.12
dm01dbadm05: Version : 2.12
dm01dbadm06: Version : 2.12
dm01dbadm07: Version : 2.12
dm01dbadm08: Version : 2.12
[root@dm01dbadm01 ~]#

The glibc release is lower on our new node than on every other node in the rack. This customer has been using Exadata for a long time and is quite experienced in maintaining it. We have good insight into what additional software was installed and we are sure that no installed component updated glibc to the higher release. The most logical explanation would be that glibc was updated during dbnodeupdate.sh patching. After some digging around we found out that the newer version was indeed installed by dbnodeupdate.sh because of this glibc vulnerability (an Oracle Support account is needed):

glibc vulnerability (CVE-2015-7547) patch availability for Oracle Exadata Database Machine (Doc ID 2108582.1)
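The manual fix then comes down to something like this (a sketch; the exact package set and file names depend on the patch from the note above and are assumptions here):

# A sketch of the manual glibc update on the re-imaged node; the RPM file names
# are assumptions based on the release the rest of the cluster is running.
rpm -Uvh glibc-2.12-1.166.el6_7.7.x86_64.rpm \
         glibc-common-2.12-1.166.el6_7.7.x86_64.rpm \
         glibc-devel-2.12-1.166.el6_7.7.x86_64.rpm \
         glibc-headers-2.12-1.166.el6_7.7.x86_64.rpm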

After manually installing the updated glibc, cluvfy was happy and we managed to install Grid Infrastructure again. Unfortunately our issues didn't stop there: after successfully installing GI and RDBMS 11.2.0.4, we tried installing the 12c RDBMS home:

[oracle@dm01dbadm01 [oracle] addnode]$ ./addnode.sh -silent -waitforcompletion "CLUSTER_NEW_NODES={dm01dbadm02}"
Starting Oracle Universal Installer...

Checking Temp space: must be greater than 120 MB. Actual 13549 MB Passed
Checking swap space: must be greater than 150 MB. Actual 24176 MB Passed
[FATAL] [INS-30160] Installer has detected that the nodes [dm01dbadm02] specified for addnode operation have uncleaned inventory.
ACTION: Ensure that the inventory location /u01/app/oraInventory is cleaned before performing addnode procedure.
[oracle@dm01dbadm01 [oracle] addnode]$

Wait a minute, I am the only person working on this node; 15 minutes ago I managed to install an 11g home and now the inventory is suddenly corrupted? A check confirmed that we did not have malformed XML in our inventory.xml or any non-printable ASCII characters in it. To completely rule out a corrupted XML we rebuilt the Oracle Inventory, all without success. Time for some debugging of the OUI:

[oracle@dm01dbadm01 [oracle] addnode]$ ./addnode.sh -silent -waitforcompletion "CLUSTER_NEW_NODES={dm01dbadm02}" -debug -logLevel finest

Hidden somewhere in the output there are these few lines:

[main] [ 2017-02-08 11:33:17.523 CET ] [RuntimeExec.runCommand:207] runCommand: Waiting for the process
[Thread-158] [ 2017-02-08 11:33:17.523 CET ] [StreamReader.run:61] In StreamReader.run
[Thread-157] [ 2017-02-08 11:33:17.523 CET ] [StreamReader.run:61] In StreamReader.run
[Thread-158] [ 2017-02-08 11:33:17.601 CET ] [StreamReader.run:65] ERROR>bash: /usr/bin/ksh: No such file or directory
[main] [ 2017-02-08 11:33:17.602 CET ] [RuntimeExec.runCommand:209] runCommand: process returns 127
[main] [ 2017-02-08 11:33:17.603 CET ] [RuntimeExec.runCommand:226] RunTimeExec: output>
[main] [ 2017-02-08 11:33:17.603 CET ] [RuntimeExec.runCommand:235] RunTimeExec: error>
[main] [ 2017-02-08 11:33:17.603 CET ] [RuntimeExec.runCommand:238] bash: /usr/bin/ksh: No such file or directory
[main] [ 2017-02-08 11:33:17.604 CET ] [RuntimeExec.runCommand:257] Returning from RunTimeExec.runCommand
[main] [ 2017-02-08 11:33:17.604 CET ] [UnixSystem.pathExists:1013] pathExists..RuntimeExec returned 127

The OUI is trying to find the Korn shell at /usr/bin/ksh, and a double check on our new image confirms that the ISO gave us the Korn shell in /bin/ksh:

[root@dm02dbadm02 ~]# ls -l /bin/ksh*
lrwxrwxrwx 1 root root 21 okt 30 02:35 /bin/ksh -> /etc/alternatives/ksh
-rwxr-xr-x 1 root root 1501800 sep 22 2015 /bin/ksh93
[root@dm02dbadm02 ~]# ls -l /etc/alternatives/ksh
lrwxrwxrwx 1 root root 10 okt 30 02:35 /etc/alternatives/ksh -> /bin/ksh93
[root@dm02dbadm02 ~]# ls -l /usr/bin/ksh
ls: cannot access /usr/bin/ksh: No such file or directory

It looks like the OUI for 12c is hard-coded to look for ksh in /usr/bin; having ksh in /bin results in the 12c OUI complaining about an issue with the inventory instead of throwing an error about not being able to locate ksh. After consulting with Oracle Support, we heard that:

After ol5 to ol6 migration, we need to ensure two things:
a. ol6’s ksh rpm package is installed. And /bin/ksh exists.
b. /usr/bin/ksh exists. Otherwise, create a softlink to /bin/ksh.

Support is referring to an issue that appeared when Exadata moved from OEL5 to OEL6, which is quite a while ago. It is surprising that a fresh image still has this bug, making it impossible to install 12c without fixing the ksh location. The easy fix is to make a symlink to ksh in /usr/bin:

[root@dm01dbadm02 ~]# ln -s /bin/ksh /usr/bin/ksh
[root@dm01dbadm02 ~]# ls -l /usr/bin/ksh
lrwxrwxrwx 1 root root 8 feb 8 18:37 /usr/bin/ksh -> /bin/ksh

Et voilà… our addnode.sh works and we have our 12c home back. What worries me is that we can't trust the ISO image of an Exadata release to be up to date with what is delivered by the regular patch cycles. Both the glibc CVE and the ksh issue had already been patched.

So please Oracle make the Exadata ISO great again.

GoldenGate Activity Logging Tracing

A lot of my customers who are starting with Oracle GoldenGate have issues when it comes to troubleshooting it. It is the most common question I get: “I have set up GoldenGate, it was working fine but now process X abended… now what do I do?”. The OGG error log is a good starting point but does not always show the real issue. There are a lot of different ways to obtain more information on why you are running into an issue. In this blogpost I want to introduce a perhaps not so well-known feature called GoldenGate Activity Logging Tracing (and yes, that is really what it is called).

The most obvious way to enable tracing is the TRACE/TRACE2 parameter, which is fairly well documented in the OGG documentation. You can either send the command directly to a running process like this:

GGSCI (rac1) 3> send extract EXT1 trace /tmp/trace_me.trc

Sending trace request to EXTRACT EXT1 ...
Trace file /tmp/trace_me.trc opened.

If we then do an insert into a table that is being captured by this Extract, the process will write the following data to /tmp/trace_me.trc:

12:20:26.259 (397290) * --- entering DataSource::readLCR() --- *
12:20:28.161 (399192) Extract pointer info reused [0], [file 0][src 0][tgt 0][tab PDB1.PDB1ADMIN.T1]. File [PDB1.PDB1ADMIN.T1], targ_idx [4294967295], object type [2], num_extracts [1]
12:20:28.163 (399194) exited DataSource::readLCR (stat=0, seqno=0, rba=0 io_type=5)
12:20:28.163 (399194) processing record for PDB1.PDB1ADMIN.T1
12:20:28.163 (399194) Writing seqno [21], rba [1465], table [PDB1.PDB1ADMIN.T1]
12:20:28.164 (399195) * --- entering DataSource::readLCR() --- *

The TRACE parameter shows what the extract is reading from the database and writing into the OGG trail file. This gives us some useful information we can use for troubleshooting OGG. To disable tracing on our extract we send it the TRACE OFF command:

GGSCI (rac1) 5> send extract EXT1 trace off

Sending trace request to EXTRACT EXT1 ...
Closing all trace files..

Keep in mind though that on a busy system with lots of DML, tracing can generate a huge amount of data. Besides sending the TRACE/TRACE2 command to a process, you can add it to a parameter file directly so tracing starts as soon as the extract or replicat has finished parsing the parameter file. The actual tracing output from TRACE is limited; its main purpose is to show where GoldenGate is spending most of its time:

SUMMARY STATISTICS

General statistics:
0.00% Checking messages (includes checkpointing)
0.00% Checking periodic tasks
0.00% Waiting for more data
0.00% Converting ASCII data to internal
10.45% Reading input records
0.00% Writing output records (replicate_io)
0.00% Mapping columns
0.00% Outputting data records
0.00% Performing SQL statements
0.00% Executing BATCHSQL statements
0.00% Executing SQL statements
0.00% Preparing SQL statements
0.00% Commit operations
0.00% Rollback operations
0.00% Checkpointing

The difference between TRACE and TRACE2 is minimal: TRACE2 also shows how long your process is spending in the different code segments:

17:00:07.961 (2814715) * --- entering DataSource::readLCR() --- *
17:00:07.961 (2814715) Reading input records 10.357% (execute=291.531,total=2814.713,count=15230)
17:00:07.961 (2814715) Checking messages (includes checkpointing) 0.005% (execute=0.143,total=2814.713,count=15230)

If you want more detailed tracing in GoldenGate you have to look elsewhere. In the past GoldenGate had features such as tltrace and traceini, but they are deprecated or being deprecated. GoldenGate introduced Activity Logging Tracing in release 11.1.1.0.0. This feature has been out for a long while but is only documented in Doc ID 1204284.1, and it can be very handy at times. Activity Logging Tracing is built into the OGG kernel but must be enabled, which can be done in two ways. Activity Logging Tracing can generate a dump file as soon as one of your OGG binaries abends. The other way to enable it is by placing an XML file in $OGG_HOME with this naming convention: gglog-<process_name>.xml, where process_name can be the name of the group you want to trace or the name of the binary. Doc ID 1204284.1 is not very clear about this, but you can actually trace almost every OGG binary, including ggsci and server. Some examples to clarify how to name your Activity Logging Tracing XML file (a small shell sketch of enabling this for a single extract follows the list below):

  • If you want to trace your extract EXT1, name the file gglog-EXT1.xml
  • If you want to trace a replicat called REP1, call it gglog-REP1.xml
  • For tracing every extract, name it gglog-extract.xml
  • Similarly, for tracing ggsci call it gglog-ggsci.xml
  • If you want to trace everything, name the file gglog.xml; be warned that this will generate a huge, huge, huge amount of data.
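As a minimal sketch, assuming an extract group named EXT1 and that you have a copy of the gglog-debug.xml example (here assumed to be staged in /tmp), enabling and inspecting the trace looks like this:

# Start tracing extract EXT1 by dropping a logging profile with the right name
# into $OGG_HOME; gglog-debug.xml is one of the example files from Doc ID 1204284.1.
cp /tmp/gglog-debug.xml $OGG_HOME/gglog-EXT1.xml

# The trace ends up in gglog-EXT1_extract.log (naming convention explained below)
tail -f $OGG_HOME/gglog-EXT1_extract.log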

As soon as you place your XML file in $OGG_HOME, GoldenGate will start tracing, and depending on your XML file contents these files can become very big very quickly. Activity Logging Tracing generates a file named gglog-<name>_<program>.log, so tracing extract EXT1 will generate the file gglog-EXT1_extract.log and tracing ggsci will generate gglog-ggsci_ggsci.log. This naming behaviour can be changed in the XML file if you want the trace files to be generated on e.g. a different filesystem:

<param name="File" value="gglog-%I_%A.log"/>

The default logging format in OGG is log4j, and the documentation gives some clues about the general working of this XML file. When a process sees the XML file it calls the functions from libgglog.so and starts processing whatever is configured in the XML file. Doc ID 1204284.1 gives you some hints on what can be used for tracing. The example file gglog-full.xml shows that by only adding this:

<level value="all"/>

GoldenGate will immediately start tracing everything, but be warned: this will output a lot of trace information. Personally I tend to use gglog-debug.xml as a starting point for GoldenGate tracing as it already gives some more fine-grained control. As an example, the values below give you fine-grained control over the redo generation tracing:

<logger name="er.redo.ora">
<level value="all"/>
</logger>
<logger name="er.redo.ora.data">
<level value="all"/>
</logger>
<logger name="er.redo.ora.rtc">
<level value="off"/>
</logger>
<logger name="er.redo.ora.rtcfm">
<level value="off"/>
</logger>
<logger name="er.redo.ora.thread">
<level value="all"/>
</logger>
<logger name="er.redo.ora.thread.checkpoint">
<level value="off"/>
</logger>
<logger name="er.redo.ora.thread.blocks">
<level value="off"/>
</logger>
<logger name="er.redo.ora.thread.logswitch">
<level value="all"/>
</logger>
<logger name="er.redo.ora.thread.unpack">
<level value="all"/>

Enabling all of them lets you trace redo information in real time. It dumps data in an output similar to this:

IXAsyncTrans 2907 oracle/ltwtcapture.c | >>>>>> lcrcnt=1031 received lcr type Internal ROW
IXAsyncTrans 1254 oracle/ltwtcapture.c | kngl_flag 0x00 kngl_xflag 0x0000 krvx_dflag 0x0002
IXAsyncTrans 1079 oracle/ltwtcapture.c | lcrvsn: [5] unpacked scn: [0x0000000000477cee]
IXAsyncTrans 1848 oracle/ltwtcapture.c | Ignore opcode 1 (orig op: 7)
IXAsyncTrans 2950 oracle/ltwtcapture.c | received commit, return
IXAsyncTrans 455 oracle/IXAsyncTrans.cpp | Queueing committed record with tid '2.30.1940'
IXAsyncTrans 580 oracle/IXAsyncTrans.cpp | Queueing result (DataBuffer,452,0x0000000004456b80)
IXAsyncTrans 2115 oracle/redooraix.c | Try to get record from logmining server.
IXAsyncTrans 2968 oracle/ltwtcapture.c | end of lcr buffer, return nodata
IXAsyncTrans 2973 oracle/ltwtcapture.c | Release lcr buffer
IXAsyncTrans 2151 oracle/redooraix.c | no lcr returned
IXAsyncTrans 2115 oracle/redooraix.c | Try to get record from logmining server.
main 325 oracle/IXAsyncTrans.cpp | Data buffer 0x0000000004456b80 returned for 0x1c4 bytes
main 10848 oracle/redoora.c | Processing next transaction with xid '2.30.1940'
main 10626 oracle/redoora.c | Processing first record for committed transaction 2.30.1940, scn 4685038
main 4530 oracle/redooraix.c | Try to get committed lcr from COM 0x00000000031f5750 with TID 0002.01e.00000794
main 1705 oracle/redooraix.c | maintain_rc_count(-1), xid=0x0002.01e.00000794 op=INSERT lcr_flag=0x8 xflag=0x0 row_cnt=0 chunk_cnt=0 txn_flag=0x0 groupRBA=0 groupthread=0
main 11032 oracle/redooraix.c | retrieved item# 1 with op 2 (INSERT), still have 0 row, 0 chunk in COM, source COM
main 5006 oracle/redooraix.c | Retrieved item# 1 from COM with TID 0x0002.01e.00000794, skip 0, more_record 0, tind 3
main 5015 oracle/redooraix.c | Exhausted COM, no more items, delete tran now.
main 10751 oracle/redoora.c | first record from commit txn with scn 4685038, seqno 53, rba 32348688, thread_id 1, jts_time_stamp 212351410734000000
main 7753 oracle/redoora.c | FORMAT XOUT LCR:
main 1638 oracle/redooraix.c | === NEW column list (num_columns=1) ===
main 1650 oracle/redooraix.c | Col[1](id,dty,len,uid,sid,iid,flag,flg,flg2,flg3,type): 0 0 1 0 1 1 0 0 0 0 1
main 1664 oracle/redooraix.c | 000000: 58 |X |
main 1638 oracle/redooraix.c | === OLD column list (num_columns=0) ===
main 6837 oracle/redooraix.c | XOUT_INSERT: format insert
main 5566 oracle/redooraix.c | Format col[1] NEW (id,dty,len,key,flag,flg3,type): 0 1 1 1 0x0 0x0 1
main 6396 oracle/redooraix.c | AFC/CHR after format value: 000000: 58 |X |
main 7010 oracle/redooraix.c | After format lcr, len 9
main 7012 oracle/redooraix.c | 000000: 00 00 05 00 00 00 01 00 58 |........X |
main 5388 oracle/redooraix.c | After format record at 53 - 32346988, get trail record with len 9
main 5389 oracle/redooraix.c | 000000: 00 00 05 00 00 00 01 00 58 |........X |
main 349 oracle/IXAsyncTrans.cpp | Data buffer 0x0000000004456b80 released
main 567 oracle/IXAsyncTrans.cpp | Queueing free buffer 0x0000000004456b80
IXAsyncTrans 2863 oracle/ltwtcapture.c | Receive 336 bytes from LCRCaptureReader
IXAsyncTrans 2907 oracle/ltwtcapture.c | >>>>>> lcrcnt=1032 received lcr type Internal ROW

There are more loggers than the ones mentioned by Oracle in Doc ID 1204284.1, and as far as I know they are not publicly documented. For some of the loggers it is quite easy to figure out what they do, like tracing DDL in an extract process, for example:

<logger name="gglog.ddl.std">
<level value="all"/>

But mostly it is guesswork and/or a lot of testing to see what they actually do (and what they are). In general you don't really need them; in most cases the loggers in the example files are enough to troubleshoot most issues.

Creating a RAC cluster using Ansible (part 2)

In my previous 2 blogposts I explained how to set up a Vagrantfile that deploys 2 VirtualBox machines that can be used as the basis for a RAC cluster, and how you can use Ansible to deploy RAC on those VMs. In this post I want to dive a bit deeper into how I set up Ansible and why; keep in mind that this is just one way of doing it.

The GitHub repository containing all the files can be found here

I am not going to go over the contents of the common role, as it probably speaks for itself. After the common role we first need to set up networking, which we need for the interconnects. Because we have added extra network devices, we need to make sure that the device we configured as the interconnect interface always remains the interconnect; the reason is that we have configured VirtualBox so that these interfaces are on their own network. To keep the device naming persistent we configure udev. This works because in our Vagrantfile we have set the MAC addresses of the interfaces to fixed values:

- name: Setup udev for network devices
  replace: dest=/etc/udev/rules.d/70-persistent-net.rules regexp='ATTR.*{{ item.device }}' replace='ATTR{address}=="{{ item.mac|lower }}", ATTR{type}=="1", KERNEL=="eth*", NAME="{{ item.device }}'
  with_items: "{{ network_ether_interfaces }}"
  when: network_ether_interfaces is defined
  register: udev_net

The array that is referenced on the with_items line is a host-specific setting, so its value is taken from that host's file in the host_vars directory. This means that the task above will create a udev rule for every network interface on the host that Ansible is running on.

Ansible uses Jinja2 as the format to dynamically generate files, and we use it for our network configuration files. With the Ansible template module, Ansible reads the ethernet.j2 file and the Jinja2 engine uses the template to create the correct file. In the case of the action below, ethernet.j2 is uploaded as an ifcfg-<device> file, and it does so on every host as long as it can find data for that host in the host_vars directory:

- name: Create the network configuration for ethernet devices
  template: src=ethernet.j2 dest=/etc/sysconfig/network-scripts/ifcfg-{{ item.device }}
  with_items: "{{ network_ether_interfaces }}"
  when: network_ether_interfaces is defined
  register: ether_result

The Jinja2 code is quite simple to read; the ethernet.j2 file looks like this:

# {{ ansible_managed }}
{% if item.bootproto == 'static' %}
DEVICE={{ item.device }}
BOOTPROTO=static
{% if item.address is defined %}
IPADDR={{ item.address }}
{% endif %}
{% if item.onboot is defined %}
ONBOOT={{ item.onboot }}
{% endif %}
{% if item.peerdns is defined %}
PEERDNS={{ item.peerdns }}
{% endif %}
{% if item.defroute is defined %}
DEFROUTE={{ item.defroute }}
{% endif %}
{% if item.netmask is defined %}
NETMASK={{ item.netmask }}
{% endif %}
{% if item.gateway is defined %}
GATEWAY={{ item.gateway }}
{% endif %}
{% if item.mac is defined %}
HWADDR={{ item.mac }}
{% endif %}
{% endif %}

{% if item.bootproto == 'dhcp' %}
DEVICE={{ item.device }}
BOOTPROTO=dhcp
{% endif %}

It is basically divided into two parts: the first part tells Jinja2 what to do when there is a static device configuration, the second handles DHCP-enabled devices. Jinja2 creates a line for every item found in the host_vars. The results of this action are registered by Ansible with the line “register: ether_result”; we use these results in the next action:

- name: bring up network devices
  shell: ifdown {{ item.item.device }}; ifup {{ item.item.device }}
  with_items: "{{ ether_result.results }}"
  when: ether_result is defined and item.changed

Here we restart only those interfaces that are registered in the ether_result variable and have changed. A more complex use of Ansible is the template for the hosts file. The hosts file is the basis for dnsmasq, which is used as a simpler alternative to BIND DNS. The template for the hosts file is built with information from the Ansible facts. These facts are gathered automatically by Ansible as soon as a playbook begins, but we have already changed things during this run, so right after bringing up the network interfaces we gather the facts again; with these updated facts we can now build our hosts file.

127.0.0.1 localhost
{% for host in groups['all'] %}
{{ hostvars[host]['ansible_eth2']['ipv4']['address'] }} {{ hostvars[host]['ansible_hostname'] }}
{{ hostvars[host]['ansible_eth2']['ipv4']['address'] | regex_replace('(^.*\.).*$', '\\1') }}{{ hostvars[host]['ansible_eth2']['ipv4']['address'].split('.')[3] | int + 10 }} {{ hostvars[host]['ansible_hostname'] }}-vip
{{ hostvars[host]['ansible_eth1']['ipv4']['address'] }} {{ hostvars[host]['ansible_hostname'] }}-priv
{% endfor %}
{% set count = 1 %}
{% for i in range(1,4) %}
{{ hostvars[inventory_hostname]['ansible_eth2']['ipv4']['address'] | regex_replace('(^.*\.).*$', '\\1') }}{{ count | int +250 }} rac-cluster-scan
{% set count = count + 1 %}
{% endfor %}

The for loop goes through all the facts, gets the IPv4 address and generates the correct hosts file entry for every node in the cluster. It then takes the IPv4 address and adds 10 to the last octet to create the address for the VIP interface. The eth1 address is used for the interconnect in our cluster. The last part of the file is a loop that generates 3 additional IPv4 addresses based on the address of eth2, adding 250 plus the loop counter to the last octet; these are the SCAN addresses of our cluster. Now that the hosts file is set up, we can install dnsmasq and have DNS ready. We ping the interfaces just to make sure they are up; if the ping fails, Ansible stops the playbook.
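A quick manual check (a sketch; the names come from the hosts template above) that the generated entries resolve on a node looks like this:

# Verify that the generated host, VIP, private and SCAN names resolve,
# either from /etc/hosts or via dnsmasq
for name in rac1 rac1-vip rac1-priv rac-cluster-scan; do
  getent hosts "$name"
done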

Our network is now set up the way we want it, and we can go on to configure the storage. Vagrant already created four shared disks for us, which are presented to both virtual machines. We now have to make sure that we have persistent device naming, which we can do with either ASMLib or udev. For both methods the disks need to be partitioned; I am using sfdisk as an easy way to create partitions on all 4 disks:

- name: Create disk partitions
  shell: echo "0," | sfdisk -q {{item.1}}
  with_indexed_items: "{{ device_list.stdout_lines }}"
  when: "{{ item.0 }} > 0"
  become: true
  run_once: true
  register: sfdisk_output

The with_indexed_items statement gives me an index number, which is used to make sure we are not destroying the partitions on /dev/sda, where the OS is installed. Because the OS is installed on the first disk, we can start sfdisk at index 1. When Ansible needs to install ASMLib, it installs the needed RPMs and then uploads and runs a shell script. The reason for this script is that configuring ASMLib requires some interactivity; this can be solved in regular shell scripting with a here document, and as far as I know there is no equivalent in Ansible. For udev we can more or less copy what we did to make our network interfaces persistent.
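As a sketch, the kind of here document the uploaded script could use to answer the interactive prompts of oracleasm configure looks like this (the oracle/dba answers are assumptions):

# Answer oracleasm's interactive prompts (owner, group, start on boot, scan on boot)
/usr/sbin/oracleasm configure -i <<EOF
oracle
dba
y
y
EOF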

After we have installed a toolset we can start with the installation of the Grid Infrastructure. The ASM diskstring and prefix depend on whether we configured the disks earlier with udev or with ASMLib. In order to use the correct values, I am adding them to the Ansible facts as custom facts. Depending on which role it finds, Ansible loads these values into the facts:

- set_fact: asm_diskstring="/dev/oracleasm/disks"
  when: "'configure_asmlib' in role_names"

- set_fact: asm_diskstring="/dev/ASMDISK*"
  when: "'configure_udev' in role_names"

Now we can create our SSH keys on both nodes and download the public keys back to our VirtualBox host. After we have uploaded the public keys to all the hosts, we can add the hosts to our known_hosts using ssh-keyscan.
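Roughly what that known_hosts step boils down to, as a shell one-liner (a sketch; host names assumed to be rac1 and rac2):

# Collect the host keys of both nodes and append them to known_hosts
ssh-keyscan rac1 rac2 >> ~/.ssh/known_hosts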

We upload the Grid Infrastructure zipfiles, extract them and install the cvuqdisk RPM on all nodes. For the installation of the Grid Infrastructure software itself we use the network_ether_interfaces array to set the client and interconnect interfaces correctly. The response file is generated from a Jinja2 template so we can easily customise some settings depending on our needs. Because we are doing a silent install from a response file, we need to run configToolAllCommands, which is normally an interactive step when you use the OUI. Finally we create the FRA diskgroup and clean up everything we uploaded and no longer need.

The installation of the RDBMS software with Ansible is now very straightforward. It is just a case of adjusting limits.conf, creating the response file using the Jinja2 template and installing the software like a regular silent install. The same goes for the actual creation of the database: there are the checks, the response file creation and the database creation itself. For database creation I currently use the response file method; as soon as I have some time I will switch this out for database creation using a template, to have more fine-grained control over the created database.

If you take just the Vagrantfile and the Ansible files and start installing from scratch, it will take you a couple of hours (about 3) to download the Vagrant box and the software, create the machines and do the provisioning with Ansible.

  • The Github repository can be found here
  • The blogpost about Vagrant and the vagrantfile can be found here
  • Creating a RAC cluster using Ansible (part 1) can be found here

Creating a RAC cluster using Ansible (part 1)

In my previous blog post I explained how Vagrant can be used to create a 2-node VM environment for deploying a RAC cluster. In this post I want to dive into how to actually do the RAC install using Ansible, so you can easily create a RAC test environment on e.g. your laptop.

I have created a Git repo with the files that I am using for this blog post; it can be found here

I am not going over the basics of Ansible in this blogpost; I am assuming you have a basic understanding of Ansible and the YAML markup language. If you need a quick refresher on how Ansible works you can take a look here or here, for example.

When trying to automate a RAC deployment you need to think about which steps to take and in what order. Also, out of the box Vagrant will do a serial deployment of your cluster, which for obvious reasons will not work; the solution for this is in my previous blogpost, which explains how to set up Vagrant so Ansible can do a “parallel” deployment. I will use a slightly modified version of the Vagrantfile I created in that blogpost.

If you go through the new Vagrantfile you will see a few changes; the most obvious one is all the new parameters at the top, which can be used to easily tweak the RAC setup to your liking. You can change things like CDB or not, memory settings, domain name etc. These parameters are all passed on to Ansible:

ansible.extra_vars = {
devmanager: "#{devmanager}",
db_create_cdb: "#{db_create_cdb}",
db_pdb_amount: "#{db_pdb_amount}",
db_name: "#{db_name}",
db_total_mem: "#{db_total_mem}",
gi_first_node: "#{host_prefix}1",
gi_last_node: "#{host_prefix}#{servers}",
domain_name: "#{domain_name}"
}

The final change I made to my Vagrantfile is in the network section: I am giving the VMs fixed MAC addresses so I can refer to them later on when using udev as the device manager. I will get back to the MAC addresses (in the next post) when we get to the part that configures device persistency:

vb.customize ['modifyvm', :id, '--macaddress2', "080027AAAA#{rac_id}1"]
vb.customize ['modifyvm', :id, '--macaddress3', "080027AAAA#{rac_id}2"]

Vagrant is now able to create 2 VMs that we can use for our RAC cluster. As you can imagine, building a 2-node RAC cluster with Ansible results in a rather large playbook that is a bit cumbersome to work with. Also, you might want to re-use certain parts later on for different deployments. To solve these issues Ansible has the ability to create roles, where every role is a defined set of actions, each with its own set of files, templates and variables. For this RAC playbook I have created a set of roles to handle specific parts of the installation:

  • common – sets up some common Linux settings we want to have no matter what
  • configure_network – sets up Linux networking and configures things like DNS
  • configure_udev – if selected in the Vagrantfile, this configures the system for udev
  • configure_asmlib – if selected in the Vagrantfile, this configures ASMLib
  • tools_install – installs a toolset with several scripts, installs rlwrap etc.
  • rac_install_gi – installs and configures Oracle Grid Infrastructure
  • rac_install_db – installs and configures the Oracle Database home
  • rac_create_db – creates the RAC database

The reason to use roles is to keep a better overview of your Ansible scripts, but even more so that you can reuse roles in other playbooks. These roles are called in the main Ansible file, which is defined in the Vagrantfile at line 88:

ansible.playbook   = "ansible/rac_gi_db.yml"

This YAML file kicks off the entire installation of our RAC cluster; if you look in rac_gi_db.yml you will notice that there are only a few lines there. This file references the different roles. If Ansible sees the role common, it expects a subdirectory called roles, which should contain a subdirectory with the role name, with another nested subdirectory tasks in it. In that subdirectory Ansible looks for a file called main.yml and runs it. So to visualize this, Ansible expects the following directory structure as a minimum for a role:

ansible/roles/<role name>/tasks/main.yml

The Ansible documentation itself has a more in-depth explanation of this. Another thing you might notice are the roles for configuring either ASMLib or udev:

- { role: configure_udev, when: devmanager == "udev" }
- { role: configure_asmlib, when: devmanager == "asmlib" }

As mentioned above, a role is a set of subdirectories that hold the information needed to perform the tasks of that specific role. The directories must have specific names in order for Ansible to recognise them. Roles can have the following subdirectories, depending on your needs:

  • defaults/ – where variables local to this role are stored
  • files/ – needed if you want to upload files with Ansible to a host, e.g. zipfiles for an Oracle GI install
  • tasks/ – this directory contains at least a file named main.yml, which holds all the tasks that this role has to execute.
  • templates/ – as the name suggests, it contains templates to be converted into files by Ansible, for example a template for a response file that needs to be created with the correct database name and settings, or a hosts file with the correct host names. These files are in the Jinja2 format.

If you look at line 11 in the Vagrantfile you can see that there is a variable devmanager defined, which is passed into Ansible at line 90. So depending on the value of the variable devmanager, Ansible will either configure udev or ASMLib. Global variables are furthermore set in the group_vars/all.yml file; these variables are passed to all roles in the group ‘all’. In this Ansible setup variables are set at different levels; I made it so that the most important variables can be set at the top of the Vagrantfile in lines 11 to 16, and this can of course be expanded if needed. Besides passing variables directly to Ansible, specific roles can have their own variables, which are set in ansible/roles/<role name>/defaults/main.yml. These variables are local to the execution of that specific role and can't be referenced outside it. The final group of variables we need are host-specific variables like IP addresses and network configuration in general. For these settings we have a file per host, where every host has its own file in the subdirectory host_vars.

Now that we know how Ansible gets its parameters and what the roles are doing, we can actually start deploying. As soon as you start Vagrant and it is done creating the initial VMs (covered in my previous blogpost), it starts Ansible, goes through the roles and runs all the YAML files in the tasks directories. At some point you will find yourself needing certain configuration files during the install that depend on e.g. IP addresses, hostnames, database name etc. For example, you can use this task to create a dnsmasq configuration based on variables in an array called ether_result.results (which we created somewhat earlier in the tasks):

- name: Create dnsmasq configuration
  template: src=dnsmasq.conf.j2 dest=/etc/dnsmasq.conf
  with_items: "{{ ether_result.results }}"

In order to get these dynamic configurations we need to write these files in a templating language called Jinja2. You can take your normal configuration file and replace the variables with something that Jinja2 can work with. As an example, let's look at the dnsmasq.conf file in Jinja2 form:

# {{ ansible_managed }}
#/etc/dnsmasq.conf
domain-needed
bogus-priv

domain={{ domain_name }}
expand-hosts
local=/{{ domain_name }}/

server=10.0.2.3

interface=eth0
{% for msg in ether_result.results %}interface={{ msg.item.device }}
{% endfor %}

As you can see it is almost your regular dnsmasq.conf file, but with some changes. At the top we have a variable {{ ansible_managed }}, which will be replaced with a string containing the source file name, creation date and the host it was created from. The {{ domain_name }} variable is passed on to Ansible from Vagrant and is actually set in the Vagrantfile. The last lines of the file are a Jinja2 loop going through whatever is in ether_result.results, taking every msg.item.device, which contains the names of the network devices I have configured, and placing it after the string “interface=”. After Ansible is done with the task above you end up with a file that looks something like this:

# Ansible managed: /my/path/t0/dnsmasq.conf.j2 modified on 2016-08-09 12:24:40 by klaasjan on exabook.local
#/etc/dnsmasq.conf
domain-needed
bogus-priv

domain=mydomain.local
expand-hosts
local=/mydomain.local/

server=10.0.2.3

interface=eth0
interface=eth1
interface=eth2

If everything is configured correctly, installation of the software can begin. So we need to move some zipfiles from your host to the guest VMs; Ansible does this with the copy module:

- name: Copy RDBMS zipfiles
  copy: src={{ item }} dest={{ ora_stage }} owner={{ oracle_user }} group={{ oracle_group }} mode=0644
  with_items: "{{ rdbms_zips }}"
  when: inventory_hostname == "{{ gi_first_node }}" and rdbms_check.stdout != rdbms_home

The task above copies the files in the variable {{ item }}, which is read from the array {{ rdbms_zips }} on the with_items: line; it copies these files to whatever location is set in {{ ora_stage }}, with the correct user and group permissions. The final line means that it will only copy the files to the first node (we do not need them twice) and only when there is not already an Oracle home installed.

I mentioned above that Ansible looks for the files listed in rdbms_zips in the ./files subdirectory, so before installing make sure that all the needed files are there. To make this a bit easier for myself I have created a small script called rebuild.sh, which will download these files for you (you need to have an MOS account). It does so using Maris Elsins' excellent getMOSPatch tool, which is now a jar file and no longer a shell script. After downloading, rebuild.sh does what the name suggests and removes and rebuilds your Vagrant RAC environment. Use at your own risk; if it breaks a network config for you in VirtualBox or something in Vagrant… I have warned you.

In the next blogpost I will explain in somewhat more detail what the different steps are doing.

Vagrant for your RAC test environment

Creating your own RAC test VM environment is a fun exercise; however, after you have rebuilt your environment a couple of times it becomes very tiresome and you want to start automating your deployments. People have already written several blogposts about orchestrating the GI and RDBMS installs and the several tools that are available for this. Within the Oracle community, Ansible seems to be a very popular choice for getting your test environment up and running. But what about the part that gets your VMs up and running, a very repetitive and, to say the least, not very interesting task?

One of your options would be to script the creation of your VMs and the installation of Linux afterwards. If you are using VirtualBox you could do this by writing a script around the VBoxManage command line tool and create everything from there. For the Linux deployment, PXE boot seems the logical choice to me, but it still involves running a DHCP and TFTP server locally (hint: dnsmasq), getting the ISO, bootloaders etc. A fun exercise to try, but still quite a lot of work to automate or keep up and running.

So this is where Vagrant comes in: it can easily configure and deploy virtual machines for you. Basically it is “nothing more” than a Ruby shell around several (VM) providers like VirtualBox, AWS, VMware, Hyper-V and Docker. Vagrant works with so-called boxes, which are nothing more than compressed VMs that get modified for your needs at the moment you spin them up. You can choose to let Vagrant download a box from the Vagrant Cloud or you can make your own box if you want. Running this:

vagrant init hashicorp/precise64

followed by:

vagrant up

This will give you a VirtualBox VM running Ubuntu, downloaded from the Vagrant Cloud. Out of the box Vagrant assumes you have VirtualBox installed. You can then ssh into the box with “vagrant ssh” or, in a multiple-host scenario, with “vagrant ssh nodename”. Stopping and starting your Vagrant boxes can be done with “vagrant halt” and “vagrant up” respectively. If you are done and want to remove the VM, run “vagrant destroy”.
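Summarised as shell commands, that basic lifecycle looks like this:

# The basic Vagrant lifecycle described above
vagrant init hashicorp/precise64   # write a Vagrantfile for this box
vagrant up                         # download the box (first run) and boot the VM
vagrant ssh                        # log in to the VM
vagrant halt                       # stop the VM
vagrant destroy                    # remove the VM completely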

What if you want to do something more interesting, like deploying a RAC cluster, which needs shared storage and multiple network interfaces? You need to create a file called Vagrantfile in your working directory. This file contains the code that Vagrant uses to modify your boxes. A very basic Vagrantfile looks something like this:

Vagrant.configure("2") do
  config.box = "hashicorp/precise64"
end

Let's assume we want to create 2 VMs with 4 disks of storage shared between them, and for networking we want a management interface, a public interface and an interconnect network. We will end up with this file: here on GitHub Gist.

Let's break this file down so you get an understanding of what is going on. First of all, this file is, just like the rest of Vagrant, written in Ruby, so all Ruby syntax works in this file as well.

At the top I have defined some variables, such as the number of servers I want to generate, hardware dimensions, shared disks etc. The API version is needed so Vagrant knows what syntax it can expect.

VAGRANTFILE_API_VERSION = "2"
ASM_LOC     = "/pathto/vagrant/rac/asmdisk"
num_disks   = 4
servers     = 2
mem         = 4096
cpu         = 2

The first step is to tell Vagrant how you want to set up your environment for this Vagrantfile. I am telling Vagrant I want to use a box called oel68 (which is a custom Vagrant box I made) and I want X11 forwarding to be enabled for ease of use in case I need to run DBCA or something similar:

Vagrant.configure(VAGRANTFILE_API_VERSION) do |config|
config.vm.box = "oel68"
config.ssh.forward_x11 = true
config.ssh.forward_agent = true

Now for the interesting stuff: creating multiple VMs for our RAC cluster. I didn't want to copy and paste several server configurations and make small adjustments to them; instead I wanted to make it a bit more flexible, so I use an each iterator that loops until it reaches the “servers” variable. I am creating VMs called rac1 up to racn.

(1..servers).each do |rac_id|
config.vm.define "rac#{rac_id}" do |config|
config.vm.hostname = "rac#{rac_id}"

The next step is to create a Ruby block that does the VirtualBox configuration. I am adding 2 NICs: NIC 1 is already in the box by default and is an interface connected to my host via a NAT network; NIC 2 is for the interconnects and NIC 3 is my public interface. Furthermore I am setting all the NICs to an Intel PRO/1000 MT Server card, changing the CPU and memory settings and updating the SATA port count to 5 so we can add the shared storage later on.

# Do Virtualbox configuration
config.vm.provider :virtualbox do |vb|
	vb.customize ['modifyvm', :id, '--nic2', 'intnet', '--intnet2', 'rac-priv']
	vb.customize ['modifyvm', :id, '--nic3', 'hostonly', '--hostonlyadapter3', 'vboxnet0']

	# Change NIC type (https://www.virtualbox.org/manual/ch06.html#nichardware)
	vb.customize ['modifyvm', :id, '--nictype1', '82545EM']
	vb.customize ['modifyvm', :id, '--nictype2', '82545EM']
	vb.customize ['modifyvm', :id, '--nictype3', '82545EM']  

	# Change RAC node specific settings
	vb.customize ['modifyvm', :id, '--cpus', cpu]
	vb.customize ['modifyvm', :id, '--memory', mem]  

	# Increase SATA port count
	vb.customize ['storagectl', :id, '--name', 'SATA', '--portcount', 5]

We can now create the shared storage for our RAC cluster. We want to create 4 disks, so we can use the same trick as for the server creation: an each iterator. We do need to take care of a few things here: we don't want to overwrite an existing disk, and we only want to create and attach the disks once, when we give the “vagrant up” command (this line). To be more precise, I only need one VM to create the disks with VBoxManage createmedium, but I need all VMs to attach these disks. The next IF block makes sure that only the first node creates the disks and every other node only attaches the storage.

(1..num_disks).each do |disk|
	if ARGV[0] == "up" &amp;&amp; ! File.exist?(ASM_LOC + "#{disk}.vdi")
		if rac_id == 1
			vb.customize ['createmedium',
						'--filename', ASM_LOC + "#{disk}.vdi",
						'--format', 'VDI',
						'--variant', 'Fixed',
						'--size', 5 * 1024]
			vb.customize ['modifyhd',
						 ASM_LOC + "#{disk}.vdi",
						'--type', 'shareable']
		end # End createmedium on rac1

		vb.customize ['storageattach', :id,
				'--storagectl', 'SATA',
				'--port', "#{disk}",
				'--device', 0,
				'--type', 'hdd',
				'--medium', ASM_LOC + "#{disk}.vdi"]
	end  # End if exist
end    # End of EACH iterator for disks

The code below is a workaround for a nasty bug with my CPU which I hit with both VMware Fusion and VirtualBox. It is well documented by Laurent Leturgez and Danny Bryant:

# Workaound for Perl bug with root.sh segmentation fault,
# see this blogpost from Danny Bryant http://dbaontap.com/2016/01/13/vbox5/
vb.customize ['setextradata', :id, "VBoxInternal/CPUM/HostCPUID/Cache/Leaf", "0x4"]
vb.customize ['setextradata', :id, "VBoxInternal/CPUM/HostCPUID/Cache/SubLeaf", "0x4"]
vb.customize ['setextradata', :id, "VBoxInternal/CPUM/HostCPUID/Cache/eax", "0"]
vb.customize ['setextradata', :id, "VBoxInternal/CPUM/HostCPUID/Cache/ebx", "0"]
vb.customize ['setextradata', :id, "VBoxInternal/CPUM/HostCPUID/Cache/ecx", "0"]
vb.customize ['setextradata', :id, "VBoxInternal/CPUM/HostCPUID/Cache/edx", "0"]
vb.customize ['setextradata', :id, "VBoxInternal/CPUM/HostCPUID/Cache/SubLeafMask", "0xffffffff"]

We now have our VMs ready and we can start provisioning them. If we just add a provisioning block like this, Vagrant will do the provisioning serially: create VM rac1, provision it, create rac2, provision it:

# Create disk partitions
if rac_id ==  1
       config.vm.provision "shell", inline: <<-SHELL
if [ -f /etc/SFDISK_CREATE_DATE ]; then
echo "Partition creation already done."
exit 0
fi
for i in `ls /dev/sd* | grep -v sda`;  do echo \\; | sudo sfdisk -q $i; done
date > /etc/SFDISK_CREATE_DATE
       SHELL
end # End create disk partitions

In most cases you want to start the provisioning when all VMs are ready. Vagrant supports several provisioning methods like Ansible, shell scripting, Puppet, Chef etc. If we are installing a $GI_HOME we need both nodes to be up, with all the interfaces up and IPs assigned etc.

if rac_id == servers
	# Start Ansible provisioning
	config.vm.provision "ansible" do |ansible|
		#ansible.verbose = "-v"
		ansible.limit = "all"
		ansible.playbook = "ansible/rac_gi_db.yml"
	end # End of Ansible provisioning
end

Above I am only starting the provisioning block once my rac_id equals the servers variable, meaning when I have created all my RAC nodes. Now Ansible can provision my servers in parallel because the Ansible limit variable is set to all. Vagrant creates an Ansible inventory file with all the hosts, which you can use for the provisioning. The provisioning of the RAC cluster itself is outside the scope of this blogpost. If you want to give Vagrant a go, you can download it here

The link to the full Vagrantfile on Gist is here

Patching the Big Data Appliance

I have been patching engineered systems since the launch of the Exadata V2, and recently I had the opportunity to patch the BDA we have in house. As far as comparisons go, that is where the similarities between Exadata and Big Data Appliance (BDA) patching stop.
Our BDA is a so-called starter rack consisting of 6 nodes running a Hadoop cluster; for more information about this, read my First Impressions blog post. On Exadata, patching consists of a whole set of different patches and tools like patchmgr and dbnodeupdate.sh; on the Big Data Appliance we patch with a tool called Mammoth. In this blogpost we will use the upgrade from BDA software release 4.0 to 4.1 as an example to describe the patching process. You can download the BDA Mammoth bundle patch through Doc ID 1485745.1 (Oracle Big Data Appliance Patch Set Master Note). This patchset consists of 2 zipfiles, which you should upload to the primary node in the cluster (usually node number 1 or 7); if you are not sure which node to use, you can use bdacli to find out:

[root@bda1node01 BDAMammoth-ol6-4.1.0]# bdacli getinfo cluster_primary_host
bda1node01

After determining which node to use you can upload the 2 zipfiles from MOS and unzip them somewhere on that specific node:

[root@bda1node01 patch]# for i in `ls *.zip`; do unzip $i; done
Archive:  p20369730_410_Linux-x86-64_1of2.zip
  inflating: README.txt
   creating: BDAMammoth-ol6-4.1.0/
  inflating: BDAMammoth-ol6-4.1.0/BDAMammoth-ol6-4.1.0.run
Archive:  p20369730_410_Linux-x86-64_2of2.zip
   creating: BDABaseImage-ol6-4.1.0_RELEASE/
...
..
.
  inflating: BDABaseImage-ol6-4.1.0_RELEASE/json-select
[root@bda1node01 patch]#

After unzipping these files you end up with a huge file: a shell script with a uuencoded binary payload appended to it:

[root@bda1node01 BDAMammoth-ol6-4.1.0]# ll
total 3780132
-rwxrwxrwx 1 root root 3870848281 Jan 16 10:17 BDAMammoth-ol6-4.1.0.run
[root@bda1node01 BDAMammoth-ol6-4.1.0]# 

This .run file does a couple of checks, like determining which version of Linux we are running (this BDA is on OEL6); it installs a new version of BDAMammoth and moves the previous version to /opt/oracle/BDAMammoth/previous-BDAMammoth/. It also updates the Cloudera parcels and the yum repository in /opt/oracle/BDAMammoth/bdarepo and places a new base image ISO file on this node, which is needed if you are adding new nodes to the cluster or need to reimage them. Finally it also updates all the Puppet files in /opt/oracle/BDAMammoth/puppet/manifests; Oracle uses Puppet a lot for deploying software on the BDA, which means that patching the BDA is also done by a collection of Puppet scripts. When this is done we can start the patch process by running mammoth -p, so here we go:

[root@bda1node01 BDAMammoth]# ./mammoth -p
INFO: Logging all actions in /opt/oracle/BDAMammoth/bdaconfig/tmp/bda1node01-20150130103200.log and traces in /opt/oracle/BDAMammoth/bdaconfig/tmp/bda1node01-20150130103200.trc
INFO: all_nodes not generated yet, skipping check for password-less ssh
ERROR: Big Data SQL is not supported on BDA v4.1.0
ERROR: Please uninstall Big Data SQL before upgrade cluster
ERROR: Cannot continue with operation
INFO: Running bdadiagcluster...
INFO: Starting Big Data Appliance diagnose cluster at Fri Jan 30 10:32:12 2015
INFO: Logging results to /tmp/bda_diagcluster_1422610330.log
SUCCESS: Created BDA diagcluster zipfile on node bda1node01
SUCCESS: Created BDA diagcluster zipfile on node bda1node02
SUCCESS: Created BDA diagcluster zipfile on node bda1node03
SUCCESS: Created BDA diagcluster zipfile on node bda1node04
SUCCESS: Created BDA diagcluster zipfile on node bda1node05
SUCCESS: Created BDA diagcluster zipfile on node bda1node06
SUCCESS: bdadiagcluster_1422610330.zip created
INFO: Big Data Appliance diagnose cluster complete at Fri Jan 30 10:32:54 2015
INFO: Please get the Big Data Appliance cluster diagnostic bundle at /tmp/bdadiagcluster_1422610330.zip
Exiting...
[root@bda1node01 BDAMammoth]# 

This error is obvious: we have Big Data SQL installed on our cluster (it was introduced with the BDA 4.0 software), but the version we are running is not supported on BDA 4.1. Unfortunately we are running BDSQL 1.0, which is the only version of BDSQL there is at this point; this is also a known issue, described in MOS Doc ID 1964471.1. So we have 2 options: wait for a new release of BDSQL and postpone the upgrade, or remove BDSQL and continue with the upgrade. We decided to continue with the upgrade and uninstall BDSQL for now; it was not working on the Exadata side anyway, due to conflicts with the latest RDBMS 12.1.0.2 patch. Removing BDSQL can be done with bdacli or with mammoth-reconfig; bdacli calls mammoth-reconfig under the covers, so as far as I know it does not really matter which one you use. The bdacli form is sketched below for reference.
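Purely for reference, and with the caveat that I have not verified this exact syntax against every BDA release (so treat it as an assumption and check the bdacli help first), the bdacli route should look roughly like this:

[root@bda1node01 ~]# bdacli disable big_data_sql

We went with mammoth-reconfig, so let's give that a try: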

[root@bda1node01 BDAMammoth]# ./mammoth-reconfig remove big_data_sql
INFO: Logging all actions in /opt/oracle/BDAMammoth/bdaconfig/tmp/bda1node01-20150130114222.log and traces in /opt/oracle/BDAMammoth/bdaconfig/tmp/bda1node01-20150130114222.trc
INFO: This is the install of the primary rack
ERROR: Version mismatch between mammoth and params file
ERROR: Mammoth version: 4.1.0, Params file version: 4.0.0
ERROR: Cannot continue with install #Step -1#
INFO: Running bdadiagcluster...
INFO: Starting Big Data Appliance diagnose cluster at Fri Jan 30 11:42:25 2015
INFO: Logging results to /tmp/bda_diagcluster_1422614543.log
SUCCESS: Created BDA diagcluster zipfile on node bda1node01
SUCCESS: Created BDA diagcluster zipfile on node bda1node02
SUCCESS: Created BDA diagcluster zipfile on node bda1node03
SUCCESS: Created BDA diagcluster zipfile on node bda1node04
SUCCESS: Created BDA diagcluster zipfile on node bda1node05
SUCCESS: Created BDA diagcluster zipfile on node bda1node06
SUCCESS: bdadiagcluster_1422614543.zip created
INFO: Big Data Appliance diagnose cluster complete at Fri Jan 30 11:43:03 2015
INFO: Please get the Big Data Appliance cluster diagnostic bundle at /tmp/bdadiagcluster_1422614543.zip
Exiting...
[root@bda1node01 BDAMammoth]#

Because we already extracted the new version of Mammoth when we executed the .run file, we now have a version mismatch between the Mammoth software and the params file /opt/oracle/BDAMammoth/bdaconfig/VERSION. We know that when we ran BDAMammoth-ol6-4.1.0.run, the old Mammoth software was backed up to /opt/oracle/BDAMammoth/previous-BDAMammoth/. Our first thought was to replace the new version with the previous one:

[root@bda1node01 oracle]# mv BDAMammoth BDAMammoth.new
[root@bda1node01 oracle]# cp -R ./BDA
BDABaseImage/           BDABaseImage-ol6-4.0.0/ BDABaseImage-ol6-4.1.0/ BDAMammoth.new/
[root@bda1node01 oracle]# cp -R ./BDAMammoth.new/previous-BDAMammoth/ ./BDAMammoth

Running it again resulted in lots more errors; I am just pasting parts of the output here to give you an idea:

[root@bda1node01 BDAMammoth]# ./mammoth-reconfig remove big_data_sql
INFO: Logging all actions in /opt/oracle/BDAMammoth/bdaconfig/tmp/bda1node01-20150130113539.log and traces in /opt/oracle/BDAMammoth/bdaconfig/tmp/bda1node01-20150130113539.trc
INFO: This is the install of the primary rack
INFO: Checking if password-less ssh is set up
...
..
.
json-select: no valid input found for "@CIDR@"

*** json-select *** Out of branches near "json object <- json value
     OR json array <- json value" called from "json select command line"
*** json-select *** Syntax error on line 213:
}
...
..
.
ERROR: Puppet agent run on node bda1node01 had errors. List of errors follows

************************************
Error [6928]: Report processor failed: Permission denied - /opt/oracle/BDAMammoth/puppet/reports/bda1node01.oracle.vxcompany.local
************************************

INFO: Also check the log file in /opt/oracle/BDAMammoth/bdaconfig/tmp/pagent-bda1node01-20150130113621.log
...
..
.

So after fiddling around with this we found out that the solution was actually extremely simple: there is no need to move the old 4.0 Mammoth software back into place at all. All we had to do was run the mammoth-reconfig script directly from the backup directory /opt/oracle/BDAMammoth/previous-BDAMammoth/, as sketched below.
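In short, the working invocation came down to something along these lines (after putting the directories back the way the .run file had left them):

[root@bda1node01 ~]# cd /opt/oracle/BDAMammoth/previous-BDAMammoth
[root@bda1node01 previous-BDAMammoth]# ./mammoth-reconfig remove big_data_sql

With that, we finally have BDSQL (mainly the cellsrv software) disabled on the cluster: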

[root@bda1node01 BDAMammoth]# bdacli status big_data_sql_cluster
ERROR: Service big_data_sql_cluster is disabled. Please enable the service first to run this command.
[root@bda1node01 BDAMammoth]#

With BDSQL disabled we can give mammoth -p another shot, upgrade the Hadoop cluster and wait for all the Puppet scripts to finish. The output of the mammoth script is long, so I won't post it here for readability reasons. All the logging from the mammoth script and the Puppet agents is written to /opt/oracle/BDAMammoth/bdaconfig/tmp/ during the patch process, and when the patching is done it is all nicely bundled into a single zipfile, /tmp/cdhctr1-install-summary.zip. If you want to follow along while Mammoth is running, you can simply tail the logs in that directory, as sketched below.
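A trivial way to keep an eye on progress from a second session (the log file name contains the node name and a timestamp, as shown in the Mammoth output above, so the glob here is just an illustration):

[root@bda1node01 ~]# tail -f /opt/oracle/BDAMammoth/bdaconfig/tmp/bda1node01-*.log

The whole patch process on our starter rack took about 2 hours in total, and about 40 minutes into the patching process we get this: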

WARNING: The OS kernel was updated on some nodes - so those nodes need to be rebooted
INFO: Nodes to be rebooted: bda1node01,bda1node02,bda1node03,bda1node04,bda1node05,bda1node06
Proceed with reboot? [y/n]: y


Broadcast message from root@bda1node01.oracle.vxcompany.local
	(unknown) at 13:09 ...

The system is going down for reboot NOW!
INFO: Reboot done.
INFO: Please wait until all reboots are complete before continuing
[root@bda1node01 BDAMammoth]# Connection to bda1node01 closed by remote host.

Don’t be fooled into thinking that the reboot is the end of the patch process; the interesting part is just about to start. After the systems are back online you can see that we are still on the old BDA software version:

[root@bda1node01 ~]# bdacli getinfo cluster_version
4.0.0

This is not documented very clearly by Oracle: there is no special command to continue the patch process (like dbnodeupdate.sh -c on Exadata), you simply run mammoth -p again and Mammoth picks up the patching process where it left off, as sketched below. In this second part of the patching process Mammoth actually updates the Cloudera stack and all its components like Hadoop, Hive and Impala. This takes about 90 minutes to finish, including a whole bunch of test runs: MapReduce jobs, Oozie jobs, TeraGen sorts and so on (in my case it reported a Flume agent in concerning health after the upgrade).
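Continuing after the reboot is therefore nothing more than running the same command again from the same directory:

[root@bda1node01 ~]# cd /opt/oracle/BDAMammoth
[root@bda1node01 BDAMammoth]# ./mammoth -p

After all this is done the mammoth script finishes and we can verify whether we actually have an upgraded cluster: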

[root@bda1node01 BDAMammoth]# bdacli getinfo cluster_version
4.1.0

[root@bda1node01 BDAMammoth]# bdacli getinfo cluster_cdh_version
5.3.0-ol6

Apart from the issues surrounding BDSQL, patching the BDA worked like a charm, although Oracle could have documented a bit better how to continue after the OS upgrade and reboot. Personally I really like the fact that Oracle uses Puppet for orchestration on the BDA. The next step is waiting for the BDSQL update so we can do some proper testing.