Distributed Transactions — A way to have strong consistency in distributed system

Shilpa Thota
3 min readJust now

--

Distributed Transaction is a set of operations on data that is performed across two or more data repositories (especially databases). It is typically coordinated across separate nodes connected by a network, but may also span multiple databases on a single server.

There are two possible outcomes

  • All operations successfully complete
  • None of the operations are performed at all due to a failure somewhere in the system. The completed work should be reverted.

This type of operation is in compliance with the “ACID” principles of databases that ensure data integrity. ACID is most commonly associated with transactions on a single database server, but distributed transactions extend that guarantee across multiple databases.

The operation known as a “two-phase commit” (2PC) is a form of a distributed transaction. “XA transactions” are transactions using the XA protocol, which is one implementation of a two-phase commit operation.

A distributed transaction spans multiple databases and guarantees data integrity

How it works?

Distributed transactions is same as regular database transactions but must be managed across multiple resources. This adds more points of failures, such as separate software systems that run the resources, the extra hardware servers, and network failures. So safeguards must be in place to retain data integrity.

For this process to work, we have a distributed transaction manager that coordinates the resources. It can be one of the data repositories that will be updated as part of the transaction, or it can be a completely independent separate resource that is only responsible for coordination. The transaction manager decides whether to commit a successful transaction or rollback an unsuccessful transaction.

First the application requests the distributed transaction to the transaction manager. It then branches to each resource, which will have its own “resource manager” to help it participate in distributed transactions. Distributed transactions are often done in two phases to safeguard against partial updates that might occur when a failure is encountered. The first phase involves acknowledging an intent to commit or a prepare to commit phase. After all resources acknowledge, they are then asked to run a final commit, and then the transaction is completed.

Why do we need Distributed Transactions?

They are necessary when you need to quickly update related data that is spread across multiple databases. If we have multiple systems that track customer information and you need to make a universal update across all records, a distributed transaction will ensure that all records get updated. And if a failure occurs, the data is reset to its original state, and it is up to the originating application to resubmit the transaction

Distributed Transactions for Streaming Data

Distributed transactions are critical today in data streaming environments because of the volume of incoming data. Even a short-term failure in one of the resources can represent a potentially large amount of lost data. Sophisticated stream processing engines support “exactly-once” processing in which a distributed transaction covers the reading of data from a data source, the processing and the writing of data to a target destination (the “data sink”). The “exactly-once” term refers to the fact that every data point is processed, and there is no loss and no duplication. In exactly-once streaming architecture, the repositories for the data source and the data sink must have capabilities to support the exactly-once guarantee. In other words, there must be functionality in those repositories that lets the stream processing engine fully recover from failure, which does not necessarily have to be a true transaction manager, but delivers a similar end result.

When Distributed Transactions are not needed

In some environments, distributed transactions are not necessary and instead extra auditing activities are put in place to ensure data integrity when the speed of the transaction is not an issue. The transfer of money across banks is a good example. Each bank that participates in the money transfer tracks the status of the transaction, and when a failure is detected, the partial state is corrected. This process works well without distributed transactions because the transfer does not have to happen in real time. Distributed transactions are typically critical in situations where the complete update must be done immediately

--

--

Shilpa Thota
Shilpa Thota

Written by Shilpa Thota

Full Stack Developer#TechEnthusiast#Manager#BigFan of Learning AI#

No responses yet