Monday, February 28, 2011
Why You Need to Partition the Database and Applications To Scale with Oracle RAC/Exadata
On Friday I talked about the fundamental difference between Oracle RAC / Exadata and DB2 pureScale. And now I want to dive deeper into why RAC applications need to be cluster-aware to perform and scale well.
Let’s use a small example to show the differences. Let’s consider a 2-server (node/member if you are RAC or pureScale) cluster and a database that is being accessed by applications connecting to these servers.
In the RAC case, if a user sends a request to server 1 to update a row, say for customer Smith, it must get that row from the database into it’s own memory, then it can work on that row (i.e. apply the transaction). Then another user sends a request to server 2 asking it to update the row for customer Jones in the database. First server 2 must read that row into memory and then it can work on it. So far there are no issues, but let’s go on.
Now what happens if another user wants to update the data for customer Jones, but is routed to server 1? In this case server 1 doesn’t have the row, it only has the row for customer Smith. So server one sends a message over to server two asking it to send the row for customer Jones over to server 1. Once server 1 has a copy of the row for customer Jones it can then work on that transaction. Now server 1 has both rows (Jones and Smith) so if a transaction affecting either customer comes to it, it can be processed right away.
The problem now is that any transaction (for customer Smith or Jones) that goes to server 2 requires that server to go to server 1 to get the resource since it has no rows that it can work on directly.
As transactions are randomly distributed amongst the two servers (in order to balance workload) the rows for the customers must be sent back and forth between the two servers. This results in very inefficient use of resources (too much network traffic and a lot of messages between the two servers to coordinate access to data). This limits the scalability of a RAC cluster and also impacts performance. To make RAC scale you have to find the bottlenecks and remove them. In most cases the bottlenecks are too much data being shipped back and forth between nodes (difficult to find in the first place because you now have to look in many different places across the cluster to find the hot spots). To solve the problem you have to repartition your application and your database to make it scale.
DB2 and pureScale on the other hand provide near linear scalability our to over 100 members (servers) with no partitioning of the application or the database.