File Replication on High Availability Load Balanced Systems

High Availability (HA) is becoming increasingly common for websites. The concept involves setting up a cluster of computers that is always ready, with a backup that takes over when the main server goes down. Extending this idea, an HA cluster is now commonly implemented with a load balancer and failover redundancy so that users do not experience any interruption of service. To implement such a complex setup on Linux, several open source technologies work hand in hand. One of the most important tasks in this implementation is keeping files in sync across the servers, that is, maintaining a replica of the file system on every server. We will go through some of the best techniques used for this.

  • DRBD (Distributed Replicated Block Device)
    DRBD stands for Distributed Replicated Block Device. As the name suggests, it forms an array of replicated block devices. Unlike the other methods covered here, it ships as a kernel module, and replication takes place at the block level. This makes it fast, but DRBD cannot work alone: it is usually controlled by cluster management software such as Pacemaker (configured with pcs) or Heartbeat. Under normal conditions DRBD works in a master-slave (primary/secondary) arrangement, and when the master disappears from the setup, the cluster management software promotes a slave to master. To support load-balanced systems, DRBD also has a dual master mode in which both nodes act as master and both are available for mounting. The drawback is that this mode also needs a cluster file system and a lock management system to coordinate concurrent writes and help avoid split-brain scenarios. Setting aside the number of supporting services it needs to work properly, DRBD remains one of the fastest replication methods available.
  • GlusterFS
    Next, we have Red Hat's candidate in this space, GlusterFS. As the name says, it is a file system, built on a NAS (Network-Attached Storage) background. The system is self-healing and can handle the tasks needed to restore replication after an issue. It works by building storage units called bricks on the servers. These bricks then act as a single unit wherever the volume is used. Put simply, it is like a networked LVM system from Red Hat. The advantages of this system are self-healing and the ease of adding bricks to a server; it is also easier to implement and keep track of than the DRBD method. Bricks can be combined in two ways: distributed and replicated. Alongside all the good sides of this technology comes a serious downside. Gluster is known for slow performance and for consuming a large amount of resources. Performance can be increased marginally by enabling compression in Gluster, but speed remains an issue. So, if you are considering a Gluster implementation, the servers should have considerable resources available, along with good network speed between them.
  • CEPH
    Now we can take a look at CEPH. CEPH is an all-in-one solution for a distributed file system. The best part of this technology is that it comes with its own file system, which makes it faster and more reliable than the others. Also, since it is a single piece of controller software, there are fewer processes to keep track of and less maintenance overhead. CEPH comes with a replicated and distributed architecture like Gluster, but with better performance. The downside is that a successful deployment requires four nodes: one used exclusively for administration and monitoring of the services, and the others serving as OSDs (Object Storage Daemons). There are stable two-node solutions, but the main node would require more memory in that case. Scaling on this platform is also quite advanced, as OSDs can be added and removed without any disruption.
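
The distributed and replicated layouts mentioned for GlusterFS (and the similar replicated and distributed architecture of CEPH) can be illustrated with a small Python sketch. This is only a toy model under stated assumptions: the function names, the brick paths, and the SHA-256 placement rule are all hypothetical and are not how GlusterFS or CEPH actually place data. It simply shows the idea that files are spread (distributed) across groups of bricks, while every brick inside a group keeps a full copy (replicated).

```python
import hashlib


def build_replica_sets(bricks, replica_count):
    """Group bricks, in order, into replica sets of `replica_count` bricks."""
    if replica_count < 1 or not bricks or len(bricks) % replica_count != 0:
        raise ValueError("brick count must be a positive multiple of replica_count")
    return [bricks[i:i + replica_count]
            for i in range(0, len(bricks), replica_count)]


def place_file(path, replica_sets):
    """Distribute: hash the path to choose one replica set.

    Replicate: every brick in the chosen set holds a full copy of the file.
    The hashing rule is an assumption made for illustration only.
    """
    digest = hashlib.sha256(path.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(replica_sets)
    return replica_sets[index]


def usable_capacity_gb(brick_size_gb, brick_count, replica_count):
    """Raw capacity divided by the number of copies kept of each file."""
    return brick_size_gb * brick_count / replica_count


if __name__ == "__main__":
    # Hypothetical brick paths on four servers.
    bricks = ["server1:/data/brick1", "server2:/data/brick1",
              "server3:/data/brick1", "server4:/data/brick1"]
    sets = build_replica_sets(bricks, replica_count=2)
    print(sets)
    print(place_file("/var/www/html/index.html", sets))
    print(usable_capacity_gb(100, len(bricks), 2))
```

With four bricks and a replica count of 2, the model yields two replica sets, so each file lives on two servers and the usable capacity is half of the raw capacity. That trade-off between redundancy and space is worth keeping in mind when sizing any of the setups above.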

To conclude, if performance is what you are looking for, go with a DRBD setup, since it ranks high on benchmarks. However, it is limited to a two-node system in dual master mode (replication). If you are looking for a multi-node implementation, CEPH stands out with good performance.
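
The rule of thumb in this conclusion can be written down as a tiny helper. This is a minimal sketch, assuming the thresholds as we read them from this article (two nodes for DRBD in dual master mode, four nodes for a CEPH deployment); the function name and parameters are hypothetical, and cases the conclusion does not cover return `None` rather than a guess.

```python
def suggest_replication(node_count, performance_first):
    """Return the technique this article's conclusion points to, or None.

    Assumptions taken from the article, not from vendor documentation:
    - DRBD in dual master mode is limited to two nodes.
    - A CEPH deployment is sized at four nodes or more.
    """
    if node_count < 2:
        raise ValueError("replication needs at least two nodes")
    if performance_first and node_count == 2:
        return "DRBD"
    if node_count >= 4:
        return "CEPH"
    # The conclusion gives no recommendation for the remaining cases.
    return None


if __name__ == "__main__":
    print(suggest_replication(2, performance_first=True))
    print(suggest_replication(5, performance_first=False))
```

A real decision would also weigh available memory, network speed between servers, and how much operational overhead the team can carry, as discussed for each method above.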

Skynats has solid knowledge of and experience in setting up and configuring the three methods above. If you have any queries or need any assistance in implementing them, contact our support team for a free consultation on our live chat or email at
