 |
|
Oracle Tips by Burleson |
The
Balancing Act
In order to determine the best utilization of SSD
resources, the DBA must completely understand the various latencies in
their system. RAC related latencies fall under two broad categories:
Interconnect latenciesare calculated based on the Oracle internal views. Contention in
blocks can be measured using the block transfer time. To determine
block transfer time, the statistics
gc cr block receive timeand gc cr blocks received should
be examined. The time is determined by calculating the ratio of
gc cr block receive time to
gc cr blocks received. The
following script shows this calculation.
column "AVG RECEIVE TIME (ms)" format 9999999.9
col inst_id for 9999
prompt GCS
CR BLOCKS
select b1.inst_id, b2.value "RECEIVED",
b1.value "RECEIVE TIME",
((b1.value / b2.value) * 10) "AVG RECEIVE TIME
(ms)"
from gv$sysstat b1, gv$sysstat
b2
where b1.name = 'gc cr block receive time' and
b2.name = 'gc cr blocks received' and
b1.inst_id = b2.inst_id
An example from the actual system on which the
TPC-C tests were performed is shown below.
INST_ID RECEIVED RECEIVE TIME AVG RECEIVE
TIME (ms)
------- ---------- ------------
---------------------
2 86867
16663 1.9
1 85555
17148 2.0
In this example, it is taking nearly two
milliseconds to transfer blocks from one RAC node to another. The
average read time for a block from SSD by Oracle10g was between 0.091
and .000 milliseconds:
Date:
07/29/05 Page: 1
Time: 06:51
AM IO Timing Analysis TPCC
ssd
database
File PHY PHY PHY PHY
Inst File
Name RDS WRTS RDTM WRTM
---- ----
------------------------------------------ ------ ----- ---- ----
2 1
UNDOTBS1 /oracle2/oradata/ssd/temp01.dbf 11 0 .091 .000
1 4 USERS
/oracle2/oradata/ssd/ssd_data01.dbf 189357 84285 .032 .014
2 1
UNDOTBS1 /oracle2/oradata/ssd/system01.dbf 2835 82 .031 .000
2 4 USERS
/oracle2/oradata/ssd/ssd_data01.dbf 162718 66945 .029 .014
2 3 TEMP
/oracle2/oradata/ssd/sysaux01.dbf 681 306 .025 .016
1 2 SYSAUX /oracle2/oradata/ssd/undotbs01.dbf 609 4972 .021
.021
1 1 UNDOTBS1 /oracle2/oradata/ssd/system01.dbf 3434 190 .018
.005
1 3 TEMP /oracle2/oradata/ssd/sysaux01.dbf 490 169 .016
.006
2 2 SYSAUX /oracle2/oradata/ssd/undotbs01.dbf 148 8 .014
.125
2 5
UNDOTBS2 /oracle2/oradata/ssd/undotbs02.dbf 524 4147 .013 .020
1 1
UNDOTBS1 /oracle2/oradata/ssd/temp01.dbf 9 0 .000 .000
1 5
UNDOTBS2 /oracle2/oradata/ssd/undotbs02.dbf 6 4 .000 .000
From these results, it is clear that one will get
a factor of 100-200 increase in speed by reading the blocks from SSD
rather than across this particular interconnect. Of course, many
interconnects can beat this type of transfer speed, so results may
vary. The results shown above are from the actual SSD testing and
reflect up to a 600 user load on the system. The typical latencies for
various interconnect technologies are shown in Table 3.1.
|
MEASUREMENT |
SMP BUS |
MEMORY CHANNEL |
MYRINET |
SUN SCI |
GIG ENET |
|
Latency (uS) |
0.5 |
3 |
7 to 9 |
10 |
100 |
|
CPU Overhead |
<1 |
<1 |
<1 |
|
~100 |
|
Msg/Sec(million) |
>10 |
>2 |
|
|
<0.1 |
|
Bandwidth MB/s |
>500 |
>100 |
~250 |
~70 |
~50 |
The statistics in Table 3.1 can be misleading. The
latency must be measured inside the Oracle system not just at the
network. Oracle internal messaging, latching, and locking all add to
the Oracle latency as reported in previous results. The reported
latency over the gigabyte Ethernet is 100
uS, yet ten times that, 1.9
to 2.0 milliseconds, is seen as the block transfer latency.
Obviously, in the above system, moving the reading
of blocks to SSD makes sense. In most systems, the same I/O timings
for a RAID disk array will range from one to two milliseconds to over
20 depending on how full the disk is and how many users need access to
the same disks at the same time. The output below shows a typical
response profile for a RAID based fibre channel array.
Date: 08/12/05 Page: 1
Time: 10:40
AM IO Timing Analysis PERFSTAT
atltest database
Phys. Phys. Avg.Read
Avg.Write
File#
Name Reads Wrts Time
Time
-----
--------------------------------------- ------ ----- --------
---------
29 /u03/oradata/atltest/eebase_dm.dbf
3945 2824 1.318377 149.03718
1 /u02/oradata/atltest/temp.dbf
38590 25594 1.301528 86.38509
1 /u01/oradata/atltest/system01.dbf
6234 5122 .827879 137.78680
30 /u03/oradata/atltest/eebase_dl.dbf
14513 1955 .716461 59.00716
31 /u03/oradata/atltest/eebase_is.dbf
176 4365 .636363 434.92348
5 /u01/oradata/atltest/BFAPP_STATIC_D.dbf
927 283 .633225 4.59717
19 /u01/oradata/atltest/BFLOB_D.dbf
4874 5690 .615510 3.00615
7 /u01/oradata/atltest/BFAUD_OBJECTS_D.dbf
599 238 .525876 2.75210
3 /u01/oradata/atltest/BFAPP_D.dbf
9582 688 .494051 10.68313
35 /u03/oradata/atltest/eeaudit_im.dbf
4852 29644 .481038 132.40284
9 /u01/oradata/atltest/BFAUD_STATIC_D.dbf
682 232 .439882 2.53879
4 /u02/oradata/atltest/BFAPP_I.dbf
1077 648 .436397 5.00925
13 /u01/oradata/atltest/BFEDI_D.dbf
7630 174 .432110 9.10344
28 /u03/oradata/atltest/3x_is.dbf
186 13953 .419354 271.65455
22 /u02/oradata/atltest/indx01.dbf
382 2230 .403141 49.80896
11 /u02/oradata/atltest/BFAUDIT_D.dbf 49326 1743 .393017
130.45438
17 /u01/oradata/atltest/BFERROR_D.dbf
11401 1577 .392684 145.88839
24 /u01/oradata/atltest/tools01.dbf
358 302 .388268 27.23841
14 /u02/oradata/atltest/BFEDI_I.dbf
299 247 .344481 4.83400
6 /u02/oradata/atltest/BFAPP_STATIC_I.dbf
431 312 .338747 5.54807
For a RAID type array, the interconnect performs
faster, so best performance comes from fully caching the database.
One additional issue with disks versus SSD
technology deals with how the amount of data stored on a disk affects
performance. As the amount of data on a single disk increases, the
performance of that disk will decrease. This is due to the increased
positioning and rotational latencies required as the storage of
information moves away from the outer and middle cylinders towards the
inner cylinders on the drive. Back in the old days, system
administrators spent hours repositioning files into the sweet spots of
disk performance and away from the dead zone. Now with automated load
balancing and RAID, system administrators have little real control
over file placement. Experts agree that for optimal performance, no
disk should be filled to more than 60% of its total capacity, which is
rather like buying a six passenger car and being told that it only
four people should be transported in it.
Also, no matter how fast the disk spins, it can
still only service a single read per operation. Now, modern disk
controllers may optimize these single reads into a series of reads,
but the heads can only be in one position at a time. This requires
consideration of the I/O per second needed for the disk array with
this sometimes overshadowing disk size as the predominant
consideration in buying an array. For example, a 100 gigabyte database
will entirely fit on a single 148 GB disk drive; however, at least
500-1000 I/O’s per second will be required to satisfy user demands
depending on transactions per minute and transaction size. To satisfy
500-1000 I/O’s per second, five to ten disk drives would be required.
On SSD technology, the complete capacity is usable
since there is no positional or rotational latency, and the number of
simultaneous reads/writes is only dependent on bandwidth since there
is no head that needs to be repositioned after each read or write
operation. For the 100 GB database in the above example, 100 GBs of
SSD storage would likely be purchased.
Disk latencies are often quoted at around four to
seven milliseconds on most modern hardware. With arrays with large
caches, the read times can be reduced to near SSD values. Of course,
this approach of placing large caches in front of the disk array is
essentially using solid-state technology with the disks as a backing
store, so better performance would be an expectation. The main problem
with most disk arrays with large caches is that the caches are usually
limited to about 32 GBs, and again, for the cost includes a memory
cache as well as expensive disks.
The above book excerpt is from:
Oracle RAC & Tuning with
Solid State Disk
Expert Secrets for High
Performance Clustered Grid Computing
ISBN
0-9761573-5-7
Donald K. Burleson & Mike Ault
http://www.rampant-books.com/book_2005_2_rac_ssd_tuning.htm
|