IBM Shared-Nothing Configuration
Oracle Tips by Burleson
IBM has traditionally used the shared nothing
disk architecture for all versions of its DB2 shared database except
OS/390 (see Figure 14.6).
The shared nothing configuration utilizes
isolated disk assets for each server and the database (in the IBM
implementation) is hashed across all of the servers. The database
image is perceived by the users to be a single database and each
machine has access to its own data. However, this implementation of
the parallel database server falls prey to what is known as the
"convoy effect". In a convoy of trucks or ships, the speed of the
convoy is limited by the speed of the convoy's slowest member. It is
the same with a database that is spread across isolated disk assets by
hashing: the speed of database processing is limited by the speed of
the slowest member of the cluster.
Figure 14.6 IBM's Shared Nothing
Configuration.
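The convoy effect can be illustrated with a small sketch. It assumes
that a query must wait for every node to finish scanning its own hashed
partition; the function name and the per-node timings are hypothetical
and are not taken from any DB2 measurement.

```python
def parallel_elapsed(node_times):
    """Elapsed time of a query that must wait for every node's partition."""
    # The query completes only when the slowest node has finished.
    return max(node_times)


# Hypothetical per-node scan times in seconds; the last node has slow disks.
node_times = [1.0, 1.1, 0.9, 4.0]

average = sum(node_times) / len(node_times)
elapsed = parallel_elapsed(node_times)
print(f"average node time: {average:.2f}s, query elapsed: {elapsed:.2f}s")
```

Even though three of the four nodes finish in about a second, the query
as a whole takes as long as the slowest node.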
Another problem with the hashed spread of a
database across multiple isolated disk assets is that the loss of one
of the servers means loss of the entire database until the server or
its disk assets are rebuilt and/or reassigned.
In addition, because of the way the data is hashed
across the servers and disks, standard methods for maintaining data
integrity won't work. This means that for a hashed storage
database cluster much, if not all, of the data integrity constraints
must be programmed into the applications using complex two-phase
commit algorithms instead of relying on built-in constraint
mechanisms.
The final problem with the shared nothing
approach used by IBM is that to add a new cluster member the data must
be re-hashed across all members, requiring database downtime.
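To see why adding a member forces so much data movement, consider a
minimal sketch that places rows by key modulo the number of nodes. This
modulo scheme is only an assumption for illustration; it is not IBM's
actual hashing method, and the function names are hypothetical.

```python
def node_for(key, node_count):
    """Modulo placement; a simple stand-in for a real hash function."""
    return key % node_count


def fraction_moved(keys, old_count, new_count):
    """Fraction of keys whose home node changes when the cluster is resized."""
    moved = sum(
        1 for k in keys if node_for(k, old_count) != node_for(k, new_count)
    )
    return moved / len(keys)


keys = range(10_000)  # hypothetical row keys
print(f"{fraction_moved(keys, 4, 5):.0%} of rows change nodes")
```

Under this assumption, growing a cluster from four nodes to five
relocates most of the rows, which is why the re-hash is so disruptive.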
Microsoft Federated Servers Databases
Microsoft actually supports two architectures.
The first is limited to a two-node structure if you use
Windows 2000 Advanced Server and a four-node structure if you use a
Windows 2000 Datacenter implementation. This initial configuration
allows for automatic failover of one server to another and is shown in
Figure 14.7.
Figure 14.7 Example of Microsoft's Failover
Cluster
While this architecture allows for failover,
it doesn't improve scalability, since each cluster member acts
independently until the need for failover; then the server that
receives the failover activity has to essentially double its
processing load. Each server in this configuration maintains an
independent copy of the database that must be kept in sync with
the other copies and has the same restrictions as to referential
integrity as the DB2 database architecture. This architecture also
suffers from the constraint that to add a server the databases have to
be properly resynced and balanced.
The second architecture used by Microsoft is
the SQL Server 2000 federated databases architecture shown in Figure
14.8.
Figure 14.8 Microsoft Federated Databases
In the Microsoft federated database
architecture the data from a single database is spread across multiple
separate databases. Again, if you want to guarantee data integrity you
will have to implement it by:
* Replicating all data used for integrity,
such as lookup tables, across all databases in the cluster
* Ensuring all changes to lookup tables are
replicated across the cluster
* Creating triggers in each database that ensure
integrity is maintained across all databases in the cluster
* Creating application triggers that check the
federated database for integrity violations, but be aware that this
may fail in certain multi-user situations
However, even the most complex schemes of
referential triggers and replication have been known to fail in
federated systems.
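One way such a check can fail in a multi-user situation is a
check-then-insert race. The sketch below is a toy model only: the two
federation members are plain dictionaries, the table and function names
are hypothetical, and the interleaving of the two sessions is written
out by hand rather than produced by real concurrency.

```python
# Two hypothetical federation members modeled as plain dictionaries.
customers_db = {101: "Acme"}  # member holding the parent rows
orders_db = {}                # member holding the child rows


def parent_exists(customer_id):
    """Application-level check standing in for a cross-database trigger."""
    return customer_id in customers_db


def insert_order(order_id, customer_id):
    """Insert a child row on the second member."""
    orders_db[order_id] = customer_id


def orphaned_orders():
    """Orders whose customer no longer exists on the other member."""
    return [o for o, c in orders_db.items() if c not in customers_db]


# Session 1 checks the parent, session 2 deletes it, session 1 then inserts.
check_passed = parent_exists(101)  # session 1: the check succeeds
del customers_db[101]              # session 2: parent removed meanwhile
if check_passed:
    insert_order(5001, 101)        # session 1: insert based on a stale check

print("orphaned orders:", orphaned_orders())
```

Because no single lock or constraint spans both databases in this
model, the check and the insert are not atomic, and an orphaned child
row slips through.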
Seeing the High Availability Spectrum
All database systems fall somewhere on the
high availability spectrum (see Figure 14.9), from low-criticality,
low-availability systems to systems that absolutely, positively cannot
fail. You must be aware of where your system falls, whether it is at
the low end of the spectrum or at the mythical five-nines (99.999%)
end and beyond.
Figure 14.9 The High Availability Spectrum
Many systems, while claiming to be
high-availability, mission-critical systems, really aren't. If you try
to force a system that by its nature cannot be high on the high
availability spectrum into being a high availability system, you are
opening yourself up to many frustrations and failures. In order to
meet the requirements of a highly available system, a database must
have the architectural components shown under each area of the high
availability spectrum in Figure 14.9. In other words, without
using distributed DB RAC clusters in a multi-site configuration you
would be hard pressed to guarantee zero minutes per year of
downtime. If you can handle 60 minutes per year of downtime, then local
DB clusters on a single site, barring site disasters, would be
adequate; and if you can stomach more than 60 minutes of downtime per
year, then various implementations of a single server database will
meet your requirements.
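The downtime figures above map onto availability percentages by simple
arithmetic. The following sketch assumes a 365-day year; the function
names are illustrative only.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes; leap years ignored


def downtime_minutes(availability):
    """Allowed downtime per year for a given availability fraction."""
    return MINUTES_PER_YEAR * (1.0 - availability)


def availability_for(downtime_min):
    """Availability fraction implied by a yearly downtime budget."""
    return 1.0 - downtime_min / MINUTES_PER_YEAR


for label, a in [
    ("three nines", 0.999),
    ("four nines", 0.9999),
    ("five nines", 0.99999),
]:
    print(f"{label}: about {downtime_minutes(a):.1f} minutes per year")

print(f"60 minutes per year is about {availability_for(60):.4%} availability")
```

Five nines works out to a little over five minutes of downtime per
year, while a 60-minute yearly budget sits between three and four
nines.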
Real Application Clusters
Let's take a more detailed look at Oracle
RAC. The RAC architecture provides:
* Cache Fusion
* True scalability
* Enhanced reliability
RAC gives the DBA true transparent
scalability. In order to increase the number of servers in almost all
other architectures, including Oracle's previous OPS, data and
application changes were required, or in many cases performance would
actually get worse. With RAC:
* All Applications Scale – No tuning required
* No Physical Data Partitioning required
* ISV Applications Scale out of the box
This automatic, transparent scaling is due
almost entirely to RAC's cache fusion and the unique parallel
architecture of the RAC implementation on Oracle. Since the
processing of requests is spread evenly across the RAC instances and
all of them access a single database image instead of multiple images,
the addition of a server or servers requires no architecture changes,
no remapping of data and no recoding. In addition, the failure of a
single node results only in the loss of its contribution to
scalability, not in the loss of its data, since a single database image
is used. This is demonstrated in Figure 14.10.
Figure 14.10 RAC Architecture
Cache fusion is implemented through the
high speed cluster interconnect that runs between the servers. This is
demonstrated in Figure 14.11. Without the high speed interconnect, cache
fusion would not be possible. Each instance in Oracle RAC is
independent in that it uses its own shared memory area (which can be
structured differently for each instance), its own processes, redo
logs, control files, etc. However, cache fusion unites these
shared memory structures such that block images are rapidly
transferred from one shared memory area to another through the high
speed interconnect, providing a virtual shared area that is the sum of
the individual areas.
Each query is divided amongst the parallel
instances and processing is done in parallel. Loss of a single server
spreads the processing across all remaining nodes, not just a single
designated failure target node. Addition of a processor is quick and
easy.
Figure 14.11 High Speed Interconnect
Processing Prior to Cache Fusion
Prior to the introduction of cache fusion,
blocks had to be ping-ponged between instances through disk as they
were needed. For example, if a block had been read by Instance A and
Instance B then required the same block:
* Instance A would get the request from
Instance B
* Instance A would write the block to disk and
notify Instance B
* Instance B would then read the block from
disk into its cache
This was called a block ping and, as you can
imagine, was horrible for performance. Data had to be partitioned to
the instance where it was used most.
Processing with Cache Fusion
Cache fusion replaces older methods of
concurrency that required disk writes and reads between instances.
Each time a block had to be written to disk so that another instance
could re-read it was called a ping, and each ping was a performance hit.
By sharing blocks between the caches of multiple
instances in a RAC environment through the high speed interconnect and
controlling access to the block through cache latches, enqueues and
locks, RAC achieves performance similar to that of a single instance.
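The difference between the two approaches can be sketched by counting
I/O operations as one block is handed back and forth between two
instances. This is a toy model under stated assumptions: the class and
function names are hypothetical, latches, enqueues and locks are left
out entirely, and a handoff simply moves the block from the holder to
the requester.

```python
class Instance:
    """An instance reduced to a name and a block cache."""

    def __init__(self, name):
        self.name = name
        self.cache = {}


def ping_via_disk(block_id, holder, requester, disk, stats):
    """Pre-cache-fusion handoff: holder writes to disk, requester re-reads."""
    disk[block_id] = holder.cache.pop(block_id)
    stats["disk_writes"] += 1
    requester.cache[block_id] = disk[block_id]
    stats["disk_reads"] += 1


def ship_via_interconnect(block_id, holder, requester, stats):
    """Cache-fusion style handoff: the block image moves cache to cache."""
    requester.cache[block_id] = holder.cache.pop(block_id)
    stats["interconnect_transfers"] += 1


def simulate(handoffs, use_cache_fusion):
    """Hand one block back and forth and count the I/O it costs."""
    a, b = Instance("A"), Instance("B")
    disk = {42: "block image"}
    stats = {"disk_reads": 0, "disk_writes": 0, "interconnect_transfers": 0}

    a.cache[42] = disk[42]  # Instance A reads the block from disk first
    stats["disk_reads"] += 1

    holder, requester = a, b
    for _ in range(handoffs):
        if use_cache_fusion:
            ship_via_interconnect(42, holder, requester, stats)
        else:
            ping_via_disk(42, holder, requester, disk, stats)
        holder, requester = requester, holder  # roles swap after each handoff
    return stats


print("with pings:       ", simulate(1000, use_cache_fusion=False))
print("with cache fusion:", simulate(1000, use_cache_fusion=True))
```

In this model every ping costs one disk write and one disk read, while
the cache fusion path touches disk only for the initial read.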
Be careful if you are upgrading from OPS to
RAC: cache fusion only works with the default resource control scheme.
If GC_FILES_TO_LOCKS is set, the old pre-cache fusion behavior is
used. In other words, forced disk writes will be used, resulting in
pings.
Cache fusion all but eliminates the need for
the data partitioning that was required in early OPS releases. In those
releases, data was partitioned by use, such that all of the accounts
receivable data was isolated to a set of tablespaces, as were general
ledger and other application subsets. Then the node that processed that
data could be relatively assured that pinging would not occur.
This is an excerpt
from Mike Ault, bestselling author of "Oracle
10g Grid and Real Application Clusters".