Data System Design - Reliability, Scalability and Maintainability
20 Nov 2021A successful data system should be able to meet various requirements while solving the data problems, including functional requirements and nonfunctional requirements. The functional requirements are often application specific, describing what to be done with the data and how the results can be achieved. In the nonfunctional requirements, there are many factors affecting the design and implementation, wherein three aspects are so important that all should consider throughout the development cycle: reliability, scalability and maintainability.
Any data system is a software developed by humans, deployed and run in the environment composed of hardwares. It is important to make the system work correctly even when faults occurs. The problem can be caused by hardware faults (e.g., network interruption, disk failure, power outage), software issues (e.g., bugs) and human errors (e.g., misconfiguration, mistakes in operation). Reliability introduces the concept and a guidance to exploit fault-tolerance techniques.
In the real world, the data system grows as it has an input with larger data or traffic volume, often more complex. We need a precise measurement of load and performance, based on which the strategies can be taken to keep performance constant, therefore achieve good scalability.
Additionally a data system often has a long life cycle, thus its maintainability plays a critical role in the course of evolution. As it suggests, “good operations can often work around the limitations of bad software, but good software cannot run reliably with bad operations”. Both engineering and operations teams ought to work together, sometimes grow with the system.
The book Designing Data-Intensive Applications gives a good discussion and helps with a guidance in the design of data system with reliability, scalability and maintainability considered. Here I presented the first chapter, as the start of a long journey.