The CAP Theorem in Distributed Systems
The CAP theorem, formulated by computer scientist Eric Brewer, states that distributed data stores cannot simultaneously guarantee all three of: Consistency (all nodes see the same data simultaneously), Availability (the system remains operational), and Partition tolerance (the system continues to operate despite network failures). This fundamental constraint shapes the design of every distributed system.
In practical terms, since network partitions are inevitable in distributed systems, you must choose between consistency and availability during a partition. Consistency-oriented systems (CP) like HBase or MongoDB in certain configurations ensure all nodes have the same data but may refuse requests during network issues. Availability-oriented systems (AP) like Cassandra or DynamoDB remain operational but may serve stale data until the partition heals.
The theorem doesn't mean you completely abandon one property - it's about trade-offs during failure scenarios. During normal operation, systems can provide all three properties. The real question is: what happens when things go wrong? An e-commerce site might choose availability, preferring to show possibly outdated inventory rather than refusing customer requests. A financial system might choose consistency, ensuring account balances are always accurate even if it means temporary unavailability.
Modern distributed systems offer tunable consistency levels, allowing developers to make CAP trade-offs on a per-operation basis. Cassandra, for example, lets you choose consistency levels from "ONE" (high availability, eventual consistency) to "ALL" (strong consistency, lower availability). Understanding CAP theorem helps architects make informed decisions about distributed system design and set appropriate expectations for system behavior during network failures.