[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Synchronizing distant databases



Mathieu Parent wrote:

> Hi everyone,
>
> I'm currently working on a little project for a business.  I would 
> need to synchronize a few tables (maybe even only a few fields in 
> those tables) from 2 distant databases.  One database will be at a web 
> host, and the other one will be local.  For the databases, I was 
> planning on using either postgreSQL or mySQL, or maybe a mix of both.  
> (Personally I'd rather use postgreSQL, but most web host only 
> advertise mySQL)
>
> We taught we could mirror the entire database every so often.  But 
> that would be a big waste of ressources since only certain fields of a 
> few records would need to be updated.
>
> Does anyone have experience synchronizing databases (or parts of 
> databases)?
>
>
> Jd
>
> _________________________________________________________________
> Keep up with high-tech trends here at "Hook'd on Technology." 
> http://special.msn.com/msnbc/hookedontech.armx
>
If it is only a few records that need to be synchronised you could try 
adding a boolean field called synchronised in your tables.
When change or adding (mutating) a record you ensure that you set the 
synchronised value on 0. When you want to synchronise you databases use 
the following routine:
1) Lock your all your database tables,
2) Query all table for  records (select * from XYZ where 
synchronised=false;)
2) update all record where synchronised=false to synhronised=true. (All 
records should now be true since they either were true or were false and 
selected and marked as true.
3) unlock all your database tables
4) Put these selected records under step (2) results in a dump
5) Copy these delta's /dumps to you webserver
6)Lock all your webserver database tables
7) Import these delta's into your webserver. (Using some logic to find 
if it needs to be created or updated (primary keys can help to find this 
out since they must be unique) Otherwise use a field to indicate this 
(synchonised 0=Nothing changed, 1=Created, 2=changed)
8)Unlock all your webserver database tables.

I think this one way replication can help you with a tandem setup of 
master (read/write), slave (read) using a Mcguyver solution, assuming 
normal mysql replication systems can't be used due to your circumstances.

You could if you want to do more advanced MCGuyver replication, use a 
timestamp and a primary key to update keys. This way you could have 
write capability in both servers. In which case the modification of a 
record X on both servers wil result in the most recent change being the 
permanent one. If both time stamps are the same the server with for 
example the highest server ID wins (assume you have a table where the 
database can look this up. (self=1, other=2), in which case you need to 
send the server ID of the server sending the delta in the file as well. 
The server importing can then compare the ID of the sending server with 
the ID of self.
Finding out what deltas need to be send can be done by looking up the 
last timestamp used for delta synchronisation and selecting all records 
with a timestamp after this time. The receiving server will the look to 
see for each record that it receives if his timestamp for that record is
    lower ->Update you record with the new information
    same -> check if your server id is higher then that of the server 
sending the update if yes then update, if no then don't update.
    higher -.>Do not update since your local information is newer then that.
    Not availible -> Add new record into your local table.

Be advised that since these replication are loosly synchronised you MUST 
take care yourself that your querys/routines are not affected by this 
loosy coupled update system. Assuming your primary key is an 
autoincrement int, on both servers that is, then if a client creates a 
new record for a forum then he could get id=3 on your local server. 
Another client is inserting a new but tottaly different forum item on 
the other server and receives ID=3 for this entry as well, then after 
replicating the last inserted record is kept, the other is deleted, in 
this case you would need to make sure the the primary key is for example 
a combination of the server id and an autoincrement number. Otherwise 
you will loose data, but this loss of data can very often be accounted 
for if you design your database /tables with a loosly coupled 
replication in mind.  Asses what updates you are willing to lose, and if 
you are not willing to loose them adjust your design to compensate.

Hope this helps, otherwise move it into /dev/null

Cheers,
Marco Feenstra
Sysadmin @ Den Toom