Information systems have become more demanding than ever before.
End users demand instantaneous response time, day or night, and
Oracle Corporation is challenged to provide continuous availability in
its database products. One tool for this is Oracle’s Real Application
Clusters (RAC).
RAC has been
used for more than a decade (via its predecessor, Oracle Parallel Server) to
help provide scalability and high availability in mission-critical databases.
In a nutshell, RAC is a software tool that allows a single database to
be accessed by many Oracle instances, each running on its own server.
If one server fails, transactions can be redirected to a
surviving server with a minimum of downtime.
As you might
expect, Oracle Corporation advertises RAC as a cure for many ailments, and
shops can be misled by the marketing hype and fail to recognize the costs and
benefits of using RAC in a high availability (HA) environment.
This article
explores best practices with Oracle RAC and shows some of the common mistakes
when using this powerful cluster technology.
- RAC Planning Best Practices
- RAC Implementation Best Practices
- RAC Infrastructure considerations
- Hardware architecture and RAC performance
- RAC Backup & Recovery best practices
- Performance & Tuning best practices
Let’s take a
closer look at these issues and understand the best practices for using a RAC
database.
RAC planning best practices
One of the
most common mistakes with RAC is misunderstanding the functions and
limitations of Oracle RAC. RAC is used as part of a comprehensive capacity
planning strategy, but the strengths and limitations of RAC are not always
understood, especially from reading only the Oracle marketing hyperbole.
Some of the
most common misconceptions about RAC include these points:
Myth: RAC is ideal for scalability
Even though
Oracle Corporation wants you to buy tiny “blade servers” and use their grid
computing solution for “horizontal scaling”, that’s not how most shops use
RAC, and RAC is only a legitimate scalability option for super-large shops
that need more horsepower than a single server can deliver.
Instead, it’s an Oracle best practice to scale up first and then scale out.
Build up within a single server first (“vertical scaling”); it’s only after
you have saturated a large server that you need to use RAC to “scale out”
the application with multiple servers.
Today, a single server is capable of huge expansion of RAM and CPU
resources, and it’s far easier to add resources to a single server than to
bring a new server into the RAC environment.
In real-world environments a single server can handle thousands of
transactions per second, and only the world’s largest Oracle databases need
to scale out using RAC nodes.
Myth: RAC is a standalone high availability solution
Remember,
RAC only protects you against instance failure, and that’s only one of many
components that can cause an unplanned outage.
For true continuous availability we must deploy triple-mirrored disks
(with a mean time to failure expressed in centuries) and redundant network
components.
For complete
high availability on each RAC server (node), you will want multiple Host Bus
Adapters, multiple network cards, and multiple power sources.
Just as we have failover at the instance layer, you need to purchase
software that allows the multiple Host Bus Adapter cards to fail over
automatically and issue a notification that one has failed.
RAC systems require a cluster interconnect in order to accommodate
RAM-to-RAM transfers of data blocks in the RAC cache fusion layer.
This interconnect must be very fast, with high bandwidth and low
latency. Interconnects include:
· Dark fibre: Dense Wavelength Division Multiplexing (DWDM) technology
· InfiniBand
· Myrinet
This cache
fusion bottleneck is another reason why RAC scale-out (horizontal
scalability) is problematic. If
your cluster interconnect cannot handle the traffic, extra servers will
actually degrade your performance instead of helping it.
The only way around this issue is to change your entire application to
accommodate RAC, or to purchase faster storage such as
Solid State Disk.
Myth: RAC ensures fast response time
Response
time for transactions is always important, but it’s especially important for
RAC databases because it’s the connection wait time that is used to detect if
a RAC node has failed (a server is called a “node” in Oracle RAC).
Hence, you must plan to ensure that new transactions are serviced in
less than one second wall-clock time, so that you can set a failover time of
two seconds.
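To make the relationship between response time and failover time concrete, here is a minimal Python sketch. It assumes you have already collected wall-clock response times from a healthy node; the function names and the sample values are made up for illustration. It checks the one-second target and shows how many healthy responses a two-second failover timeout would wrongly treat as a dead node.

```python
def meets_response_target(response_times_s, target_s=1.0):
    """True when every sampled transaction finished under the target."""
    return all(t < target_s for t in response_times_s)


def false_failover_rate(response_times_s, failover_timeout_s=2.0):
    """Fraction of healthy-node responses that would trip the failover timeout."""
    if not response_times_s:
        raise ValueError("need at least one response-time sample")
    slow = sum(1 for t in response_times_s if t >= failover_timeout_s)
    return slow / len(response_times_s)


# Hypothetical wall-clock samples, in seconds, from a node that is up.
samples = [0.12, 0.30, 0.45, 0.80, 2.40]

print(meets_response_target(samples))  # False: one sample is over a second
print(false_failover_rate(samples))    # 0.2: one in five looks like a failure
```

If the second number is anything other than zero, a slow but healthy node will be mistaken for a failed one, which is why the response time target has to sit comfortably below the failover time.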
Myth: RAC does not need a disaster recovery component
Except in the
rare cases where you can deploy
Dense Wavelength Division
Multiplexing (DWDM) technology (dark fiber), you will still need to create a
disaster recovery solution.
Because the RAC nodes are normally located within a few miles of each other,
a natural disaster like a hurricane would still cause a global outage.
Hence, it’s a RAC best practice to also deploy a fast-failover
geographical solution, like Data Guard or, better still, n-way Streams.
Now that we
understand the planning best practices, let’s take a closer look at RAC best
practices issues after we have implemented our shiny new database.
RAC implementation best practices
Operational
RAC databases follow many of the same best practices as any Oracle database,
but there are some that are unique to Oracle RAC systems. First, it’s an
important best practice to plan the RAC servers such that we minimize the
geographical distance between the RAC nodes while still keeping them separate
enough to avoid a wholesale failure of all nodes.
See my notes on RAC implementation guidelines.
In a busy
RAC database, the speed of the server interconnect is critical for fast
response time, and it’s a best practice to use the fastest possible
interconnect, usually a fiber optics solution like dark fiber.
Some shops will
place RAC nodes in separate buildings in the same neighborhood, but with the
advent of the superfast dark fiber interconnect (Dense
Wavelength Division Multiplexing (DWDM) technology),
you can use “Extended RAC” and place RAC nodes up to 100 miles apart.
This allows you to combine high availability with disaster recovery.
However, dark fiber is super-expensive, and most shops adopt a best
practice where they combine RAC with disaster recovery solutions like
n-way Streams replication.
The whole
point of RAC is to make end-users automatically re-connect to a surviving
server when one server fails.
This is either done at the webserver level or with the Oracle Transparent
Application Failover (TAF) option.
Whatever tool you choose, you want to wait a small amount of time
(usually less than three seconds) before assuming that the server is dead and
re-trying a new RAC server.
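The retry logic described above can be sketched in a few lines of Python. This is not how TAF itself is implemented; it only illustrates the idea of waiting a short time and then moving on to a surviving node. The `connect` callable, the node names and the `NodeUnavailable` exception are all hypothetical, and the sketch assumes that `connect` honors the timeout it is given.

```python
import time


class NodeUnavailable(Exception):
    """Raised when none of the RAC nodes could be reached."""


def connect_with_failover(nodes, connect, max_wait_s=3.0):
    """Try each node in order and return (node, connection) for the first success.

    `connect` is a caller-supplied function (hypothetical) that takes a node
    name and a timeout in seconds, and either returns a connection or raises.
    """
    errors = {}
    for node in nodes:
        started = time.monotonic()
        try:
            conn = connect(node, max_wait_s)
        except Exception as exc:  # any failure means "assume this node is dead"
            waited = time.monotonic() - started
            errors[node] = f"{exc} (gave up after {waited:.2f}s)"
            continue
        return node, conn
    raise NodeUnavailable(errors)


# Stand-in for a real driver call: rac1 is "down", rac2 answers.
def fake_connect(node, timeout_s):
    if node == "rac1":
        raise TimeoutError("no answer")
    return f"session on {node}"


print(connect_with_failover(["rac1", "rac2"], fake_connect))
```

The wait time is a parameter rather than a constant because the right value depends on how quickly a healthy node normally answers.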
Next, let’s
take a closer look at specific RAC technical best practices.
RAC interconnect best practices
Since RAC is
a method to have many instances share the same database, shared data blocks
are transferred between the servers by “cache fusion”, which ships blocks
across a high-speed interconnect.
In order to keep performance fast, it’s critical that you pay close
attention to the interconnect layer:
· RAC likes small blocksizes
· The interconnect must have super-fast (dark fiber) network hardware
· RAC load balancing is critical to performance
RAC node load balancing best practices
I disagree
with the Oracle practice of load balancing using a least-loaded approach
because of the overhead that it causes to the cache fusion layer.
In the real world, like-minded end-users are directed to the same RAC
server. If we have a RAC system
that has different types of end-users, we would want to load balance
according to their data needs.
For example, customer processing might be on node one, order
processing on node two, and product processing on node three.
Grouping RAC end-users by data needs ensures that cache fusion
overhead is minimized.
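A small Python sketch shows why grouping by data needs helps. It counts how many different nodes end up handling each type of work under two routing policies. The node names and the request list are invented, and plain round-robin is used here only as a simple stand-in for a policy that ignores data needs; it is not Oracle's least-loaded algorithm.

```python
from collections import defaultdict
from itertools import cycle

NODES = ["node1", "node2", "node3"]

# Each workload is pinned to one node, following the example in the text.
AFFINITY = {"customer": "node1", "order": "node2", "product": "node3"}


def by_affinity(workload):
    """Route a request to the node that owns its kind of data."""
    return AFFINITY[workload]


def make_round_robin():
    """Route requests to nodes in turn, ignoring what data they need."""
    ring = cycle(NODES)
    return lambda workload: next(ring)


def nodes_per_workload(requests, pick_node):
    """Count how many distinct nodes served each workload."""
    touched = defaultdict(set)
    for workload in requests:
        touched[workload].add(pick_node(workload))
    return {w: len(nodes) for w, nodes in touched.items()}


requests = ["customer", "customer", "order", "order", "product", "product"]

print(nodes_per_workload(requests, by_affinity))
# {'customer': 1, 'order': 1, 'product': 1}
print(nodes_per_workload(requests, make_round_robin()))
# {'customer': 2, 'order': 2, 'product': 2}
```

Every workload that is served by more than one node is a workload whose blocks may have to be shipped between those nodes.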
RAC disk storage management best practices
In order to
implement a RAC system you must use some sort of shared storage device
because many servers must have concurrent access to the disks.
Whereas a single instance database can use Direct Attached Storage
(DAS), which is an array of inexpensive disks connected to a single server,
you must now use what is known as a Storage Area Network (SAN).
A SAN is a disk array that is capable of connecting to many servers,
usually through fibre-channel connections, and it is much more expensive
and complex.
This requires a unique set of hardware, ranging from Host Bus Adapters
(HBA) to the SAN itself, and it’s important that your DBA has complete
knowledge of the internals of the data storage layer.
RAC blocksize best practices
It has become
a best practice in RAC to use a small 2k blocksize to minimize the “baggage”
shipped across the cache fusion layer.
Because the block is the unit of work, a smaller blocksize means finer
granularity in the data being transferred, with less overhead.
If you have long rows (greater than 2k), then you will want to move to
a 4k blocksize.
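The “baggage” argument is simple arithmetic, sketched below in Python. The 200-byte row is a made-up example, and the calculation ignores block header overhead and the fact that a block usually holds several rows, so treat the numbers as illustrative only.

```python
def recommend_blocksize(longest_row_bytes):
    """Apply the rule of thumb from the text: 2k, or 4k for rows over 2k."""
    return 2048 if longest_row_bytes <= 2048 else 4096


def baggage_bytes(row_bytes, blocksize):
    """Bytes shipped across the interconnect beyond the one row that was wanted."""
    return max(blocksize - row_bytes, 0)


# Hypothetical case: one node needs a single 200-byte row held by another node.
for blocksize in (2048, 4096, 8192):
    print(blocksize, baggage_bytes(200, blocksize))
# 2048 1848
# 4096 3896
# 8192 7992

print(recommend_blocksize(200))   # 2048
print(recommend_blocksize(3000))  # 4096
```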
The
implementation of a RAC cluster is only the beginning, and it’s critical to
constantly monitor the health of your RAC clusters so that you can spot and
fix impending problems before they inconvenience your end-users.
RAC monitoring best practices
In order to
ensure that a RAC node never experiences a global problem (and an unplanned
outage), a proper monitoring infrastructure is an absolute requirement.
RAC databases rarely fail without warning, and if the DBA understands
the proper metrics to watch, they can create an alert system that notifies
them of a pending problem so that they can fix it before the instance
crashes.
The DBA must
monitor the cluster, the shared disk setup, ASM (or OCFS), the database
instance, listeners, and more in-depth metrics such as cache coherency,
interconnect latency, disk times from multiple systems, and many other
things. While extra-cost
performance monitoring tools such as Oracle Grid Control can help perform
rudimentary RAC monitoring for beginners, a RAC DBA should have the coding
skills to build their own RAC monitoring infrastructure using dictionary
queries, dbms_scheduler and e-mail alert mechanisms.
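The alerting half of such a home-grown monitor can be very small. The Python sketch below assumes the metrics have already been collected (for example by scheduled dictionary queries) into a plain dictionary. The metric names, the limits and the instance name are all hypothetical, and sending the e-mail is left out.

```python
def find_alerts(metrics, thresholds):
    """Return (name, value, limit) for every metric above its limit, sorted by name."""
    alerts = []
    for name, limit in sorted(thresholds.items()):
        value = metrics.get(name)
        if value is not None and value > limit:
            alerts.append((name, value, limit))
    return alerts


def format_alert(instance, alerts):
    """Build the body of an alert message for one instance."""
    lines = [f"RAC alert for {instance}:"]
    lines += [f"  {name} = {value} (limit {limit})" for name, value, limit in alerts]
    return "\n".join(lines)


# Made-up limits and one made-up sample for a single instance.
thresholds = {"interconnect_latency_ms": 5, "disk_read_ms": 20}
metrics = {"interconnect_latency_ms": 12, "disk_read_ms": 8}

alerts = find_alerts(metrics, thresholds)
if alerts:
    print(format_alert("rac1", alerts))
```

Keeping the limits in data rather than in code makes it easy to tune them as you learn what normal looks like for your cluster.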
For a
complete treatment of RAC monitoring best practices, see my latest book
Oracle Tuning: The
Definitive Reference 2nd Edition,
all new for 2010.
Finally,
let’s wrap-up our discussion of RAC best practices by reviewing the best way
to define job roles for a RAC database.
RAC staffing best practices
One best
practice for RAC databases is to always hire an experienced RAC DBA to manage
your cluster and avoid people who have had the RAC training, but have no job
experience.
It’s
important to recognize that human resource costs are the most expensive part
of an Oracle shop. Over the
decades, hardware costs have fallen steadily while manpower costs (adjusted
in net present value dollars) have remained the same.
It’s also
important to note that those Oracle professionals with RAC skills command a
hefty premium over an ordinary DBA.
A recent Oracle salary survey
notes that an average American DBA earns about $97,000 per year, but RAC
experts command a 40% premium, commonly earning $140,000 a year, with those
who manage multi-billion dollar RAC databases commanding upwards of $250,000
per year.
Sadly, there
is no easy way to “grow your own” RAC DBA.
The training courses are very expensive and there is no substitute for
real-world experience. Plus,
training your own DBA in RAC may make them more marketable, and it’s not
uncommon to spend tens of thousands of dollars teaching RAC to your DBA, only
to lose them to a better job offer.
Next, let’s
examine other staffing issues with RAC.
RAC job role best practices
There is a
perpetual conflict between Systems Administrators (SAs), who traditionally
manage servers and disks, and the RAC Database Administrators (DBAs), who
have responsibility for managing the RAC database.
We also have clearly defined job roles for the Network Administrator,
who is especially challenged in a RAC database to manage the cluster
interconnect and packet shipping between servers.
If your DBA
is going to be held responsible for the performance of the RAC database, then
it’s only fair that they be given root access to the servers and disk storage
subsystem. However, not every
DBA will have a sufficient background in computer science to be able to
manage a complex server and SAN environment, so each shop makes this decision
on a case-by-case basis.
RAC training best practices
One of the
best ways to guarantee unplanned outages is to fail to properly train your
SA, DBA and Network Administrators.
Complex SAN environments like EMC, TagmaStore and NetApp have complex
architectures and they frequently require training classes.
Disk
configuration for RAC is also challenging, and RAC will only function when
using specific disk setups (ASM, OCFS, RAW, or a 3rd party cluster file
system), and these tools require training classes.
In addition,
Network Administrators will have to learn how to work with the cluster
interconnect, and specialized interconnects such as InfiniBand and
Dense Wavelength Division Multiplexing require specialized training.
Of all the
RAC staff, the DBAs will have the greatest learning curve.
They will have to understand how to set up and administer all of the
complex RAC components including the clusterware and filesystem storage.
Conclusion
In sum, while RAC offers continuous availability, it is not magic, and lots of work is required to ensure that your RAC database is always available. Every RAC database may be unique, but there are some well-known perils and pitfalls, especially for new shops, and using best practices from other shops is a must for ensuring success. RAC is special, and the vast majority of the best practices with RAC relate to properly planning the infrastructure and configuring and deploying the RAC database.
Copyright © 1996 - 2009 by
Burleson Enterprises, Inc. All rights reserved.
Oracle® is the registered trademark
of Oracle Corporation.