ColdFusion RAID databases (or database pools)

You've tuned your databases queries, added all of the indexes you can think of and cached all of the queries you can but your database still isn't giving you the performance you need.

So it's time to split up your database into several databases and move each of these new databases off onto separate database servers. Or perhaps a RAID database is the solution to your performance issues.

With RAID hard disk mirroring you have multiple hard disks each with the same information on them. If you want to read information off the RAID disks you can read it off any of the disks and if you want to write information you need to write it to all disks. Basically you can do X times as many reads in the same amount of time for a X disk system as a one disk system. ie 2 mirrored hard disks give twice the read performance, 3 hard disks give three times the performance, etc etc

You can use a similar technique with databases. If you have multiple copies of the same database on separate servers you can implement this fairly easily in ColdFusion.

Building on the data source name singleton code I wrote the other week we have:

DSN.cfc

view plain print about
1<cfcomponent displayname="DSN" extends="singleton">
2    <cfset variables.DSN = arraynew()>
3    <cfset variables.length = 0>
4    
5    <cffunction name="getDSN" access="public" returntype="string" output="false">
6        <cfset var i = RandRange(1,variables.length)>
7        <cfreturn variables.DSN[i]>
8    </cffunction>
9    
10    <cffunction name="getAllDSNs" access="public" returntype="string" output="false">
11        <cfreturn arraytolist(variables.DSN)>
12    </cffunction>    
13    
14    <cffunction name="setDSNs" access="public" output="false">
15        <cfargument name="DSNs" type="string" required="yes">
16    
17        <cfset variables.DSN = listtoarray(arguments.DSN)>
18        <cfset variables.length = arraylen(variables.DSN)>
19    </cffunction>
20</cfcomponent>

For any select query with just use the getDSN method like before:

view plain print about
1<cfset dsn = getInstance("dsn")>
2
3 <cfquery name="myquery1" datasource="#dsn.getDSN()#">
4 select ....
5 </cfquery>

For any update, insert or delete queries you must alter all databases so you need to loop over them like so:

view plain print about
1<cfset dsn = getInstance("dsn")>
2
3    <cfloop item="ds" list="#dsn.getAllDSNs()#">
4        <cfquery name="myquery1" datasource="#ds#">
5         update ....
6        </cfquery>
7    </cfloop>

This technique would only be of benefit for databases that have more reads (ie selects) than writes(ie updates and deletes), but for most web applications that is the case.

The DSN component code could be modified so you can mark data sources as offline and flags datasources that are currently out of action. That way your system will continue to run off the other databases if one (or more) of them are down.

It's also fairly easy to add additional databases to cope with increased load to a running system.

One issue with this method is re syncing the databases once one of them has gone offline and then come back online again. It would be possible to do this re syncing via code by logging and rerunning all SQL update, insert and delete queries but it's probably better to just re sync with your native database tools.

TweetBacks
Comments (Comment Moderation is enabled. Your comment will not appear until approved.)
Good post, though the title was a bit misleading. I thought you were going to tell me that if my DB takes advantage of RAID, I had to do something about it in CF =).

Quick question: Is there a drawback (other than being slightly more complex) of keeping a "busy" flag for each datasource and picking a not-busy one as opposed to picking one randomly?
# Posted By Sammy Larbi | 4/29/07 5:18 AM
Perhaps I should of called them database pools?

Good idea with the busy flag. To keep track of of busy databases you could a) write your own custom tag and use that instead of cfquery or b) use cflock as you need to mark them as busy at the start of the query and not busy at the end.

However I'm not sure that you would gain a lot from it as most databases are built to handle simultaneous requests. You could count requests and share them out more evenly but randrange would give a good approximation anyway. You would need to test the system under load to see if it would help or not.
# Posted By Justin Mclean | 4/29/07 8:37 AM
I think DB pools is a better name, as its more inline with what's being done in similar things we might call pools (like, keeping a pool of objects we might use in an application, rather than a "single"ton or complete transients).

As for the busy flag - I am aware of simultaneous requests being handled by the DB (and CF, for that matter) better than I could, or would want to. I was speaking rather to only the CF side - if we are to choose a random one, why not just make sure its not "busy" first? (and on that note, not relying on locking it, since that won't give us an "isBusy" flag to pick the next not busy one)

I can see how randomly picking one would be a good approximation, but I was wondering specifically about the value of keeping a busy flag within our CF code to ensure we never hit a "busy" db (unless the pool was full, of course).

I'm not convinced there is value, but your post was timely because I've been thinking about doing what I've described lately, and not just for DBs, but pools of objects in general...
# Posted By Sammy Larbi | 4/29/07 1:52 PM
Sammy I think a usages counter may be of more use than a isbusy flag ie select the next datasource with the fewest no of current connections. I think a custom tag that replaces cfquery is the way to go I'll have a think about it and see what I can come up with.
# Posted By Justin Mclean | 4/29/07 7:30 PM
The usages counter - that's a good point
# Posted By Sammy Larbi | 4/30/07 12:37 AM
# Posted By Justin Mclean | 4/30/07 12:44 AM
ok, if you have a need for multiple DBs like this, I suspect you'd be using a LB in front of them to handle it all, eh? Clustering has to be more robust than this.

DK
# Posted By Douglas Knudsen | 5/1/07 7:48 AM
As far as I aware you can't use load balancing for database servers, because the databases would get out of sync as soon as a write is made to one of them.
# Posted By Justin Mclean | 5/1/07 8:25 AM
Oracle can get all clustered up with various LB schemes. I don't know much about others, but I'd be surprised if others can't too. Think myspace or adobe.com, I'd be surprised if they use a single DB or switch DSNs like this.
A quick google turned up
http://www.howtoforge.com/loadbalanced_mysql_clust...
a MySQL article. How these tools do their job I don't know for sure. I'd imagine all cluster members could manage the data on a SAN space communicating to each other about locks and such.

DK
# Posted By Douglas Knudsen | 5/1/07 1:07 PM
I've looked a bit further into it and some databases do support load balancing, however they generally require 3rd party hardware or software (eg http://www.datadirect.com), enterprise versions of OS or database software and are not without issues. (eg http://www.sql-server-performance.com/dk_massive_s...)

The above code gives you a simple way to load balance with any database at a low cost ie nothing extra to buy, install or configure other than standard database servers. Would it be suitable for everyone/every application? Probably not, but them not all of us are as big as Adobe or MySpace.

The load balanced mysql is interesting but looks complex to set up and maintain. eg How would you add an extra server if you needed it? With the code above it would be a matter of just adding a new datasource to the list in the application scope (as long as the DB was set up and populated).
# Posted By Justin Mclean | 5/1/07 2:52 PM
My guess is that something like this might be a good step between a single DB and full on LB, but I'm not an expert on that kind of stuff, so I may very well just be making stuff up. =)
# Posted By Sammy Larbi | 5/1/07 9:32 PM
It's easy to implement so possibly it's worth trying out before looking into other solutions?
# Posted By Justin Mclean | 5/2/07 12:58 AM