The original purpose of developing the database analyzer is for the performance study of a few concurrency control algorithms. As shown in the figures, the concurrency control algorithms we have implemented in the software are 2PL (two phase locking), SI (Snapshot Isolation), WDL (Wait-Depth Limit), OCC (Optimistic Concurrency Control), ROCC (Read-commit Order Concurrency Control), and MROCC (Multi-version Read-commit Order Concurrency Control). 2PL and SI are two concurrency control algorithms widely used in commercial products. They guarantee transaction execution correctness at different isolation levels. WDL, developed by IBM researchers, is a lock-based algorithm while OCC needs no locks during transaction execution. ROCC is an algorithm we designed to improve performance of single-version database systems. MROCC is an algorithm we designed for multi-version database systems.

The figure above shows the SI and MROCC achieve much higher throughput than 2PL – the peak throughput that SI and MROCC can achieve are about 120 while the peak throughput that 2PL, a widely used CC algorithm in single-version database systems such as Microsoft SQL Server and IBM DB2, can achieve is only 9.5. SI is an algorithm proposed in 1995 for multi-version database systems. PostgresSQL and Oracle 8i/9i implemented SI at “serializable isolation level”, though they both have the “write skew” problem which may cause data corruptions to occur “thousands per day”. Another problem with SI is “snapshot too old” problem. In oracle, an error message “ORA-01555: snapshot too old” will be given if a transaction encounters “snapshot too old” problem. MROCC is mainly devised to fix the “write skew” problem with SI, though it also achieves slightly higher throughput than SI. We also developed a “dynamic” CC algorithm which can fix the write skew problem and partially fix “snapshot too old” problem. The “dynamic” CC algorithm uses “best-effort read” instead of “snapshot read” to diminish the “snapshot too old” problem with SI. MROCC is a special application case of the dynamic algorithm.

It is known that multi-version database systems require more system resources (CPU, memory and disk spaces). 2PL is the predominant CC algorithm in commercial products for single-version database systems. As shown in the figure, 2PL has very poor performance. To achieve better performance, commercial products usually set the default isolation level to “read committed”, which may cause data corruption much more severe than the isolation level enforced by SI algorithm. Our ROCC algorithm is devised to guarantee no data corruption while achieving much better performance than 2PL for single version systems. As we can see from the figure, the ROCC algorithm does demonstrate significant improvement. Among the 4 CC algorithms compared in our experiment, ROCC has the highest transaction throughput while 2PL has the poorest transaction throughput. The WDL algorithm was proposed by IBM researchers in 1992 and the OCC algorithm was proposed in the 80’s. One may wonder why 2PL is still the predominant algorithm in commercial products of single version database management systems. The answer lies in the transaction restart ratio. 2PL has much lower restart ratio than OCC and WDL, as we shall see in the figure below. The performance improvements attained from OCC and WDL are not good enough to the industry to compensate for the costs of increased restarts. Furthermore, more restarts consume more system resources, which was a big concern in the old days when CPU speed was slow and memory bars, disks were expensive.

In the figure we plotted the throughput performance against restart ratio to see more clearly the cost of performance improvements. The restart ratio is defined as the average number of restarts to commit a transaction. From the figure, we can see that 2PL has the lowest restart ratio when it reaches performance peak point while OCC has the highest restart ratios. There is an area in which WDL has higher throughput than OCC while maintaining a lower restart ratio. Considering the restart ratio can be controlled by confining the number of concurrent transactions competing for resources, it is fair to say that WDL is better than OCC if the system is controlled to operate in this area, though the performance gain is not attractive compared to 2PL due to the cost of restart ratio. Interestingly, in the same area, ROCC performs significantly better than WDL – almost doubles the throughput of WDL and quadruples the throughput that 2PL can achieve, which certainly would be a selling point, especially for the machines we have that are much faster and cheaper than before.

The cost of resources may no longer be a problem with today’s technology – machines are fast and amazingly cheap. As a result, multi-version systems become increasingly popular (for example, the impressive market share of Oracle 8i/9i). As shown in the figure, unlike WDL and OCC, SI achieves significantly better performance at little cost of restart ratio – SI’s throughput reaches its highest (120) at the restart ratio 0.1. Compared with OCC which reaches its highest throughput 63 at a restart ratio 0.94, SI holds an obvious advantage. This explains why OCC has failed and why SI is so successful in the industry. For the same reason, we believe our MROCC and dynamic CC will be great success in the future. MROCC and dynamic fix the problems with SI and achieve high performance at low cost. Also ROCC holds a selling point in the SV-DBMSs (single-version database management systems) because of its fairly low restart ratio and much higher throughput than 2PL.

The database analyzer is built in C++ with the help of Borland C++ Builder. Patent on ROCC, MROCC and DROCC (Dynamic CC) is pending.

For further information about the six CC algorithms, please see the following papers.

[1] V. T.-S Shi, W. Perrizo, "Read-commit Order for Concurrency Control in Centralized High Performance Database Systems", to appear, Information journal, International Information Institute.
[2] V. T.-S Shi, "Validation algorithms for transaction processing", submitted to IEEE transactions on knowledge and data engineering.
[3] V. T.S. Shi and W. Perrizo, "A New Method for Concurrency Control in Centralized High Performance Database Systems", in proceedings of ISCA CATA-2002
[4] V. T.-S Shi, Shan Duanmu and W. Perrizo, "Intervening Validations for Concurrency Control", in revision.


Thank you for your visiting!