About cellcli calibrate

With cellcli you can run the calibrate command, which measures the performance of your hard disks and flash cards. It reports the throughput and the number of IOPS measured during the run, and it also reports which LUNs are not performing up to par.

Calibrate has a few simple options:

CellCLI> help calibrate

  Usage: CALIBRATE [HARDDISK | FLASHDISK | lun1 [, lun2]] [FORCE]
  Purpose: Runs raw performance tests on disks.
           CELLSRV must be shutdown.  User id root is required.

  Options:
    [FLASHDISK]: option to specify that only flash LUNs be calibrated.
    [HARDDISK] : option to specify that only hard disk LUNs be calibrated.
    lun1 [, lun2] : option to specify list of LUNs by name
    [FORCE]    : runs test even while CELLSRV is running.

  Examples:
    CALIBRATE
    CALIBRATE '0_8', '0_10' force

CellCLI>

So let’s do a test run of calibrate on one of the hard disks in a cell of this V2 half rack:

CellCLI> calibrate 0_2 force
Calibration will take a few minutes...
LUN 0_2 on drive [20:2] random read throughput: 168.21 MBPS, and 419 IOPS
CALIBRATE results are within an acceptable range.

CellCLI>
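If you want to collect these numbers from several LUNs or cells, the result line is easy to pick apart. The following is only a small sketch: the pattern is modelled on the single result line shown above, and the names CalibrateResult and parse_calibrate_line are my own, not part of any Oracle tooling. Other cell software versions may word the line differently.

```python
import re
from typing import NamedTuple, Optional


class CalibrateResult(NamedTuple):
    lun: str      # LUN name as used in cellcli, e.g. "0_2"
    drive: str    # physical drive slot, e.g. "20:2"
    mbps: float   # reported random read throughput
    iops: int     # reported IOPS


# Assumption: the result line always looks like the one in this post.
_RESULT_RE = re.compile(
    r"LUN (?P<lun>\S+) on drive \[(?P<drive>[^\]]+)\] "
    r"random read throughput: (?P<mbps>[\d.]+) MBPS, and (?P<iops>\d+) IOPS"
)


def parse_calibrate_line(line: str) -> Optional[CalibrateResult]:
    """Return the parsed result, or None if the line is not a result line."""
    match = _RESULT_RE.search(line)
    if match is None:
        return None
    return CalibrateResult(
        lun=match["lun"],
        drive=match["drive"],
        mbps=float(match["mbps"]),
        iops=int(match["iops"]),
    )


sample = "LUN 0_2 on drive [20:2] random read throughput: 168.21 MBPS, and 419 IOPS"
result = parse_calibrate_line(sample)
print(result)
```

Lines that do not match, such as the "Calibration will take a few minutes..." banner, simply return None, so you can feed the whole cellcli output through the function line by line.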

Now this tells us something about the throughput and the IOPS that were measured, but it does not tell us HOW cellcli measured them. Let’s have a peek into how cellcli works. In order to see that, we have to increase the log level and start cellcli in debugging mode:

[root@dm01cel01 ~]# cellcli -vvv -d
CellCLI: Release 11.2.3.2.1 - Production on Sun Oct 27 15:19:55 CET 2013

Copyright (c) 2007, 2012, Oracle.  All rights reserved.
Cell Efficiency Ratio: 236

This dumps a lot of extra information on your screen about the calls that cellcli makes to ossmgmt.jar. cellcli itself is little more than a wrapper script that calls that jar file. With cellcli now running in debug mode and the log level set to the highest, we can run our calibrate command again:

CellCLI> calibrate 0_2 force
loop
yyn:0  state:0  yychar:-1
tok: -3 sval: calibrate
yy:calibrate
 next yychar:289
reduce
state 0, reducing 0 by rule 2 ($$1 :)
...
..
.
state 37, reducing 4 by rule 4 (line : $$1 cli $$2 EOL)
CLI command: CALIBRATE(289)
SPOOL arg : NONE(0)
CLI verb: CALIBRATE(289)
CLI object: NONE(0) 0_2
CLI addverb:
attr#1: FORCE(334)
++++Over Value is: 0
ForAll is false
CLI addverb:
attr#1: FORCE(334)
reduce
After reduction, shifting from state 0 to state 2
tok: -1 sval: null
state 2, reading 0 (end-of-file)
Getting the state of CELLSRV services...  running
Calibration will take a few minutes...
LUN 0_2 on drive [20:2] random read throughput: 168.33 MBPS, and 420 IOPS
CALIBRATE results are within an acceptable range.

CellCLI>

I have truncated the output above a little to keep it readable. This does not tell us a whole lot more, but we can now find some interesting information in the cellcli logfile, /var/log/oracle/deploy/cellcli.lst.0*, where the last part of the name can vary on your system depending on how many times you have started cellcli. Scrolling through the output that cellcli wrote to that logfile, you will find our calibrate command:

Oct 27, 2013 3:21:33 PM oracle.ossmgmt.ms.cli.CellCLI readAndProcessLine
INFO: Executing cellcli command: "calibrate 0_2 force"
Oct 27, 2013 3:21:33 PM oracle.ossmgmt.ms.cli.CellCLI invokeSingleRSCmd
FINE: Executing RS command: 0.0STATE CELLSRV!!!!!!!!!!!!!!.
Oct 27, 2013 3:21:33 PM oracle.ossmgmt.ms.cli.CellCLI invokeSingleRSCmd
FINE: RS comamnd return status: 2.

You can see that the oracle.ossmgmt.ms.cli.CellCLI class checks whether or not cellsrv is running (I added the force parameter so that I can run calibrate without stopping cellsrv).
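The logfile uses two-line records: a header with timestamp, class and method, followed by a line with the log level and the message. If you would rather filter the log than scroll through it, something like the sketch below could help. It is based only on the entries shown in this post; the function name parse_log and the dictionary layout are my own choices, and it assumes the English month and AM/PM names seen here.

```python
import re
from datetime import datetime
from typing import Dict, List

# Header line, e.g.:
# "Oct 27, 2013 3:21:33 PM oracle.ossmgmt.ms.cli.CellCLI readAndProcessLine"
HEADER_RE = re.compile(
    r"^(?P<ts>[A-Z][a-z]{2} \d{1,2}, \d{4} \d{1,2}:\d{2}:\d{2} [AP]M) "
    r"(?P<cls>\S+) (?P<method>\S+)$"
)
# Message line, e.g. "INFO: Executing cellcli command: ..."
LEVEL_RE = re.compile(r"^(?P<level>[A-Z]+): (?P<msg>.*)$")


def parse_log(text: str) -> List[Dict[str, object]]:
    """Group header and message lines into one dictionary per log record."""
    records: List[Dict[str, object]] = []
    for line in text.splitlines():
        if not line.strip():
            continue  # blank lines carry no information here
        header = HEADER_RE.match(line)
        if header:
            records.append({
                "time": datetime.strptime(header["ts"], "%b %d, %Y %I:%M:%S %p"),
                "class": header["cls"],
                "method": header["method"],
                "level": None,
                "message": "",
            })
            continue
        if not records:
            continue  # text before the first header is ignored
        current = records[-1]
        level = LEVEL_RE.match(line)
        if current["level"] is None and level:
            current["level"] = level["level"]
            current["message"] = level["msg"]
        else:
            # Continuation of a multi-line message (e.g. a tab-separated listing)
            current["message"] = f'{current["message"]}\n{line}'
    return records


sample_log = """\
Oct 27, 2013 3:21:33 PM oracle.ossmgmt.ms.cli.CellCLI readAndProcessLine
INFO: Executing cellcli command: "calibrate 0_2 force"
Oct 27, 2013 3:21:33 PM oracle.ossmgmt.ms.cli.CellCLI invokeSingleRSCmd
FINE: Executing RS command: 0.0STATE CELLSRV!!!!!!!!!!!!!!.
Oct 27, 2013 3:21:33 PM oracle.ossmgmt.ms.cli.CellCLI invokeSingleRSCmd
FINE: RS comamnd return status: 2.
"""

records = parse_log(sample_log)
for rec in records:
    print(rec["time"], rec["level"], rec["method"], "-", rec["message"])
```

Because each record keeps its timestamp, subtracting the "time" values of two consecutive records gives a rough idea of how long each step of calibrate took.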

Oct 27, 2013 3:21:33 PM oracle.ossmgmt.ms.cli.Calibrate fetchLuns
INFO: disknames.size: 1 listluncmd - list lun "0_2" attributes name, devicename, physicaldrives, disktype, lunsize
Oct 27, 2013 3:21:33 PM oracle.ossmgmt.ms.cli.Calibrate fetchCliResults
FINER: fetchCli cmd: MOD CALIBRATE :list lun "0_2" attributes name, devicename, physicaldrives, disktype, lunsize
Oct 27, 2013 3:22:14 PM oracle.ossmgmt.ms.cli.Calibrate fetchCliResults
FINER: fetchCli result: .	 0_2	 /dev/sdc	 20:2	 HardDisk	 557.861328125G

Next you see that cellcli uses oracle.ossmgmt.ms.cli.Calibrate to translate my LUN name 0_2 to a block device name, in my case /dev/sdc. It then goes on to gather the names of all the physical disks and their LUNs:

Oct 27, 2013 3:22:14 PM oracle.ossmgmt.ms.cli.Calibrate fetchLuns
INFO: listluns: 	 0_2	 /dev/sdc	 20:2	 HardDisk	 557.861328125G

Oct 27, 2013 3:22:14 PM oracle.ossmgmt.ms.cli.Calibrate fetchCliResults
FINER: fetchCli cmd: MOD CALIBRATE :list physicaldisk attributes name, luns, diskMediaGroup
Oct 27, 2013 3:22:20 PM oracle.ossmgmt.ms.cli.Calibrate fetchCliResults
FINER: fetchCli result: .	 20:0	 0_0	 DMG_SYSTEM_HDD1
	 20:1	 0_1	 DMG_SYSTEM_HDD1
	 20:2	 0_2	 DMG_DATA_HDD1
	 20:3	 0_3	 DMG_DATA_HDD1
...
..
.
Oct 27, 2013 3:22:20 PM oracle.ossmgmt.ms.cli.Calibrate fetchLuns
INFO: listpds: 	 20:0	 0_0	 DMG_SYSTEM_HDD1
	 20:1	 0_1	 DMG_SYSTEM_HDD1
	 20:2	 0_2	 DMG_DATA_HDD1
	 20:3	 0_3	 DMG_DATA_HDD1
...
..
.

The output above is shortened for readability. Next, oracle.ossmgmt.ms.cli.Calibrate runs scripts_common.sh, which is also run during cell deployment. The call scripts_common.sh tune_block_io /dev/sdc sets nr_requests in /sys/block/sdc/queue/nr_requests to 4096:

Oct 27, 2013 3:22:20 PM oracle.ossmgmt.ms.cli.Calibrate calibrate
FINER: Execute: /opt/oracle/cell11.2.3.2.1_LINUX.X64_130109/cellsrv/deploy/scripts/unix/hwadapter/diskadp/scripts_common.sh tune_block_io /dev/sdc

nr_requests sets the number of I/O request slots in the block device queue. With 4096 slots the deadline I/O scheduler has more opportunity to merge I/O requests. This should already be set to 4096 on all of your cells; the calibrate command sets it again anyway.
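To verify these settings on a cell yourself, you can read the values straight from sysfs. The sketch below is an illustration only: the helper names are made up, the expected value of 4096 is simply the one from this post, and the demo runs against a fake directory tree so that it works on any machine. On a real cell you would call check_device("sdc") with the default sysfs_root.

```python
import tempfile
from pathlib import Path


def read_queue_setting(device, setting, sysfs_root="/sys/block"):
    """Return the raw content of <sysfs_root>/<device>/queue/<setting>."""
    return (Path(sysfs_root) / device / "queue" / setting).read_text().strip()


def active_scheduler(raw):
    """Linux marks the active scheduler with brackets: 'noop [deadline] cfq'."""
    for token in raw.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return raw  # nothing bracketed: fall back to the raw value


def check_device(device, expected_nr_requests=4096, sysfs_root="/sys/block"):
    """Report nr_requests and the active scheduler for one block device."""
    nr_requests = int(read_queue_setting(device, "nr_requests", sysfs_root))
    scheduler = active_scheduler(read_queue_setting(device, "scheduler", sysfs_root))
    return {
        "nr_requests": nr_requests,
        "scheduler": scheduler,
        "nr_requests_ok": nr_requests == expected_nr_requests,
    }


def make_fake_sysfs(root, device, nr_requests, scheduler):
    """Build a tiny stand-in for /sys/block, for demonstration only."""
    queue = Path(root) / device / "queue"
    queue.mkdir(parents=True)
    (queue / "nr_requests").write_text(f"{nr_requests}\n")
    (queue / "scheduler").write_text(f"{scheduler}\n")


with tempfile.TemporaryDirectory() as fake_root:
    make_fake_sysfs(fake_root, "sdc", 4096, "noop anticipatory [deadline] cfq")
    report = check_device("sdc", sysfs_root=fake_root)
    print(report)
```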

Oct 27, 2013 3:22:21 PM oracle.ossmgmt.ms.cli.CalibrateTest executeOrion
INFO: execute: /opt/oracle/cell11.2.3.2.1_LINUX.X64_130109/cellsrv/bin/orion -run advanced -type rand -matrix point  -disk_start_offset 0 -duration 30 -num_disks 1 -testname rand_0 -disk_end_offset 285184m -size_large 8192 -num_small 0 -num_large 10
Oct 27, 2013 3:22:51 PM oracle.ossmgmt.ms.cli.CalibrateTest executeOrion
INFO: execute: /opt/oracle/cell11.2.3.2.1_LINUX.X64_130109/cellsrv/bin/orion -run advanced -type rand -matrix point  -disk_start_offset 0 -duration 30 -num_disks 1 -testname rand_IOPS_0 -disk_end_offset 285184m -size_small 8 -num_large 0 -num_small 20

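Finally, calibrate starts the actual measurement by calling Orion, which is located in $OSS_BIN. It performs two separate 30-second runs: the first issues large 8192 KB random I/Os with 10 outstanding requests (-size_large 8192 -num_large 10), and the second issues small 8 KB random I/Os with 20 outstanding requests (-size_small 8 -num_small 20).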
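To relate the two Orion runs to the numbers calibrate printed, it helps to split the command lines into their parameters and do a little arithmetic. This is a rough sketch under two assumptions of mine: that the reported MBPS comes from the large I/O run and the reported IOPS from the small I/O run (the test name rand_IOPS_0 suggests as much), and that 1 MB is 1024 KB. The function name parse_orion_args is hypothetical.

```python
import shlex
from typing import Dict


def parse_orion_args(command: str) -> Dict[str, str]:
    """Turn '-flag value' pairs of an orion command line into a dictionary."""
    tokens = shlex.split(command)
    args: Dict[str, str] = {}
    i = 1  # skip the orion binary itself
    while i < len(tokens):
        if tokens[i].startswith("-") and i + 1 < len(tokens):
            args[tokens[i].lstrip("-")] = tokens[i + 1]
            i += 2
        else:
            i += 1
    return args


ORION = "/opt/oracle/cell11.2.3.2.1_LINUX.X64_130109/cellsrv/bin/orion"

# Both command lines are copied from the logfile entries above.
large = parse_orion_args(
    ORION + " -run advanced -type rand -matrix point  -disk_start_offset 0"
    " -duration 30 -num_disks 1 -testname rand_0 -disk_end_offset 285184m"
    " -size_large 8192 -num_small 0 -num_large 10"
)
small = parse_orion_args(
    ORION + " -run advanced -type rand -matrix point  -disk_start_offset 0"
    " -duration 30 -num_disks 1 -testname rand_IOPS_0 -disk_end_offset 285184m"
    " -size_small 8 -num_large 0 -num_small 20"
)

# Figures reported by the first calibrate run in this post.
reported_mbps = 168.21
reported_iops = 419

large_io_mb = int(large["size_large"]) / 1024   # size of one large I/O in MB
small_io_kb = int(small["size_small"])          # size of one small I/O in KB

# Assumption: MBPS belongs to the large run, IOPS to the small run.
large_ios_per_sec = reported_mbps / large_io_mb
small_run_mbps = reported_iops * small_io_kb / 1024

print(f"large run: about {large_ios_per_sec:.1f} I/Os per second of {large_io_mb:.0f} MB each")
print(f"small run: about {small_run_mbps:.2f} MB per second at {small_io_kb} KB per I/O")
```

Under those assumptions the throughput figure corresponds to only a couple of dozen large I/Os per second, while the IOPS figure moves just a few MB per second, which is why calibrate reports both numbers.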

2 thoughts on “About cellcli calibrate”

  1. Hi,
    cellcli is a bash script which calls a jar, which in turn communicates with MS via HTTP. Also, calibrate applies a different configuration on different hardware versions. In general it sets nr_requests to 4096, max_hw_sectors_kb to 1024 and the scheduler to deadline. But once again, it depends on the hardware version. Check the scripts directly. The tune_block_io in scripts_common.sh calls tune_block_io from scripts_common_lib.sh.

    Cheers,
    Kovachev

    • Hi Nikolay,

      Thanks for reading. You are correct about scripts_common.sh, it does call scripts_common_lib.sh. I did not want to dive into that too much, but it is good that you mentioned it. Also, scripts_common_lib.sh distinguishes between hard disks and flash disks, setting some different deadline parameters. Thanks for pointing that out.
