Azer Bestavros
Boston University
(best@bu.edu)
Kwei-Jay Lin
University of California, Irvine
(klin@uci.edu)
Sang Son
University of Virginia
(son@cs.virginia.edu)
[A shorter version of this report will appear in ACM SIGMOD Record]
On March 7 and 8, 1996, the First International Workshop on Real-Time Database Systems (RTDB'96) was held in Newport Beach, California. There were about 50 workshop participants from many countries, including Germany, the Netherlands, Norway, Sweden, Finland, Korea, Japan, Hong Kong, Taiwan, and the USA. Twenty-two papers were presented in the two-day program and were actively discussed in six sessions. There were also two panel sessions, with four panelists each, to review and suggest the technology needed for real-time database applications.
One of the goals of RTDB'96 was to create a forum for recent advances in real-time databases, an area that is becoming more important as real-time computing is needed in our systems and environment. We hoped to bring together researchers and engineers from academia and industry to explore the best ideas in real-time database systems research and to evaluate the maturity and directions of real-time database systems technology, and indeed we did. The interaction among participants (discussing advanced functionalities and the timely management of data, arguing about the real-time requirements of practical systems, suggesting new issues to be investigated in future projects, and so on) provided a valuable and fruitful experience for everyone. The sunny weather and the 80-plus-degree temperature certainly made the workshop even more enjoyable, especially for people from the east coast, where some 12 inches of snow fell in those two days!
At RTDB'96, single-track sessions were scheduled to give all participants the opportunity to interact fully with all speakers and panelists, and to exchange opinions with other participants. The technical program covered a range of issues, such as temporal consistency, scheduling, models and benchmarks, concurrency control, and applications. In this report, we provide highlights of the papers presented at the workshop. PostScript files of the papers are available on the WWW from the URL below:
http://www.eng.uci.edu/ece/rtdb/rtdb96.html
This session was to be chaired by Workshop Chair Jane Liu, but she was caught in an ice storm in Illinois and did not arrive until the second day, so Program Co-Chair Kwei-Jay Lin chaired the session instead. The session was scheduled first so that all workshop participants would have a common understanding of, and agreement on, the temporal constraint aspects of RTDBS.
The first paper was presented by Ming Xiong of the University of Massachusetts at Amherst. The paper discusses models of RTDB transactions and temporal consistency. Several approaches to maintaining the temporal consistency of data were mentioned, including determining the periods for sensor transactions, selecting among data versions, and forcing user transactions to delay for a more up-to-date version. The paper also suggested a particular technique for assigning deadlines to transactions based on the temporal constraints imposed on data.
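As a rough illustration of the kind of temporal constraint discussed in this talk, the sketch below treats each data item as valid for a fixed interval after it is sampled, and caps a transaction's deadline at the earliest expiry among the items it reads. The names (`DataItem`, `validity`, `derive_deadline`) and the earliest-expiry rule are assumptions made for illustration; they are not the technique proposed in the paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DataItem:
    name: str
    sampled_at: float  # time at which the value was read from the sensor
    validity: float    # hypothetical length of time the value stays valid

    def expires_at(self) -> float:
        return self.sampled_at + self.validity

    def is_fresh(self, now: float) -> bool:
        return now <= self.expires_at()


def derive_deadline(user_deadline: float, read_set: List[DataItem]) -> float:
    """Cap the deadline at the earliest expiry of any item the transaction reads."""
    earliest_expiry = min((d.expires_at() for d in read_set), default=user_deadline)
    return min(user_deadline, earliest_expiry)


items = [
    DataItem("temperature", sampled_at=0.0, validity=5.0),
    DataItem("pressure", sampled_at=2.0, validity=10.0),
]
print(derive_deadline(20.0, items))  # 5.0: the temperature reading expires first
```

A transaction that finishes by the derived deadline has used only data that was still valid when it completed, which is one simple way of tying deadlines to temporal constraints on data.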
The second presentation was a position statement by Anindya Datta of the University of Arizona on issues involved in designing Active Rapidly Changing data Systems (ARCS). One of the messages from the paper is that "good-old" OCC concurrency control techniques may be inadequate for active RT databases because the "chaining" of transactions is not considered. The paper also raised the issue of how updates from the environment (through sensors) should be scheduled against user transactions and other updates.
The third paper was presented by Lei Zhou of the University of Michigan. The paper looks at a feedback control system in which data (e.g. sensor readings, derived data, state information) is shared among several processes. It introduces the concept of "completion probability", i.e. the probability that transactions complete before their deadlines. It also defines the notion of "interval constraints" for the interval during which a transaction may be executed. The paper presents a number of simulation studies for various scheduling algorithms (RM, EDF, FIFO) and concludes that none of them is really much better than the others.
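For readers unfamiliar with the policies named above, the following minimal sketch compares FIFO and EDF by counting missed deadlines on a single processor. The non-preemptive execution model, the `Job` fields, and the toy workload are all assumptions made for illustration; the workload is hand-picked and says nothing about the simulation results reported in the paper.

```python
from typing import Callable, List, NamedTuple


class Job(NamedTuple):
    arrival: float
    exec_time: float
    deadline: float  # absolute deadline


def count_misses(jobs: List[Job], key: Callable[[Job], float]) -> int:
    """Run jobs non-preemptively on one processor, always picking the
    ready job with the smallest key, and count missed deadlines."""
    pending = sorted(jobs, key=lambda j: j.arrival)
    now, misses = 0.0, 0
    while pending:
        ready = [j for j in pending if j.arrival <= now]
        if not ready:
            now = pending[0].arrival  # idle until the next arrival
            continue
        job = min(ready, key=key)
        pending.remove(job)
        now += job.exec_time
        if now > job.deadline:
            misses += 1
    return misses


def fifo(job: Job) -> float:
    return job.arrival   # first come, first served


def edf(job: Job) -> float:
    return job.deadline  # earliest deadline first


workload = [Job(0, 4, 10), Job(0, 2, 3), Job(1, 1, 8)]
print(count_misses(workload, fifo))  # 1
print(count_misses(workload, edf))   # 0
```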
This session was chaired by John Stankovic, who "challenged" the RTDB community to make the case for RTDB technology by showing the "value-added" it brings to as many applications as possible. The session was also concluded by John who presented a list of application domains that came up and added his belief that there are many more.
The first paper was presented by Oystein Torbjornsen of Telenor R&D in Norway. It describes the architecture of a fault-tolerant RTDB system for telecommunications, called ClustRa. A parallel database design on a shared-nothing architecture is used to achieve scalable throughput. Two-safe replication synchronization over two sites is used to enhance availability. Moreover, techniques such as main-memory-resident data, main-memory logging, and bounded transaction execution are used to provide real-time performance.
The second paper was presented by Antoni Wolski from VTT Finland. The emphasis in this talk was the difference between "Engineering and Science". The authors argued that, as an "engineering application", RTDB systems must provide the "best compromise" as opposed to the "most elegant" solution. Issues of cost consciousness are examined (e.g. the learning curve of new techniques and man/machine interaction issues are important for systems that involve a "person in the loop"; this is emphasized throughout the paper, as in section 3.5, where "user friendliness" is highlighted as a reason to modify their trigger definitions for composite events). One of the messages that came out of the discussion following the presentation was that deadlines are sometimes set based on what the system can do (in other words, if the system is built to be twice as fast, then laxities will be halved). The paper itself discussed the performance constraints that guided the design of the RAPID system, a Client/Server fast-response and active history database.
The third paper was presented by Anders Torne of Linkoping University, Sweden. The paper addresses the use of database "techniques" in process control simulations and systems. Instead of adding real-time functionality to a database management system, it tries to apply database technology selectively within the real-time system and to add some real-time functionality in the database. It discusses techniques used for indexing time-stamped data so that the speed of insertion and retrieval is improved.
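The report does not describe the index structure used in the paper, so the following is only a minimal sketch of one common way to index time-stamped data: a sorted array with binary search, where in-order arrivals are appended in constant time. The class and method names are hypothetical.

```python
import bisect
from typing import Any, List, Optional, Tuple


class TimeSeriesIndex:
    """Index over time-stamped samples, kept sorted by timestamp."""

    def __init__(self) -> None:
        self._times: List[float] = []
        self._values: List[Any] = []

    def insert(self, timestamp: float, value: Any) -> None:
        if not self._times or timestamp >= self._times[-1]:
            # Common case for process data: samples arrive in time order.
            self._times.append(timestamp)
            self._values.append(value)
        else:
            # Late sample: find its slot by binary search.
            pos = bisect.bisect_right(self._times, timestamp)
            self._times.insert(pos, timestamp)
            self._values.insert(pos, value)

    def range(self, start: float, end: float) -> List[Tuple[float, Any]]:
        """Return all samples with start <= timestamp <= end."""
        lo = bisect.bisect_left(self._times, start)
        hi = bisect.bisect_right(self._times, end)
        return list(zip(self._times[lo:hi], self._values[lo:hi]))

    def latest_at(self, timestamp: float) -> Optional[Any]:
        """Return the most recent value sampled at or before timestamp."""
        pos = bisect.bisect_right(self._times, timestamp)
        return self._values[pos - 1] if pos else None


index = TimeSeriesIndex()
for t, v in [(1.0, "a"), (3.0, "c"), (2.0, "b")]:
    index.insert(t, v)
print(index.range(1.5, 3.0))  # [(2.0, 'b'), (3.0, 'c')]
print(index.latest_at(2.5))   # b
```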
The fourth paper was presented by Holger Branding of Darmstadt, Germany. The paper identified a good application domain for RTDB technology, namely WWW applications. It argued for the "unbundling" of RTDBS technology to support WWW applications. In particular, it suggested predictive prefetching and contingency plans as important functionalities to be added to the WWW.
This was the first panel of the workshop. The title of the panel was "Are we looking at the right issues of RTDB?". The panel was moderated by Kwei-Jay Lin. The panelists were Doug Locke of Loral Federal Systems, Lui Sha of SEI & CMU, Krithi Ramamritham of the University of Massachusetts, and Brad Adelberg of Stanford University.
Doug Locke first spoke of the application requirements for aerospace RTDB systems. He emphasized that transactions that miss their deadlines "must" finish and argued for value-cognizant RTDB systems as opposed to the hard/firm deadline paradigm. Several workload requirements in air traffic control, spacecraft control, training simulation, and other domains were presented and discussed.
Lui Sha then talked about the need for an equivalent to the ACID properties for RTDB systems. He suggested that the notion of "stream data" is fundamental for RTDB. In other words, one may think of stream-data RTDB systems as different enough from traditional RTDB systems, which deal with closed systems where changes to data are only carried by transactions from within the system.
Brad Adelberg described the STRIP project at Stanford, which is aimed at financial applications (e.g. financial market monitoring). Much emphasis has been placed on the data update streams received by the database and how they can be handled effectively to provide a real-time view in the database.
Finally, Krithi Ramamritham discussed the impact that RTDB technology has made on commercial products. He pointed out that while temporal and active database ideas have found their way into commercial products as well as into SQL, real-time database ideas have not. One plausible reason is that developing time-cognizant extensions to database protocols requires a fairly substantial overhaul. The second reason is that a large proportion of the techniques developed thus far apply only to soft real-time constraints, with the percentage of missed deadlines being the metric. This implies that the protocols are intended more to improve performance than to increase functionality, unlike in temporal and active databases. He emphasized that RTDB researchers must aim at achieving greater predictability in real-time databases so that we get improved performance as well as predictability that is quantifiable; the latter is a property that is not achievable simply by "faster hardware".
This session was chaired by Al Mok of the University of Texas at Austin. The first paper was presented by Padron-McCarthy of Linkoping University in Sweden. This work applies the performance polymorphism ideas from imprecise computation research (e.g. FLEX language) to declarative query languages and to query optimization. The talk defines operations and transformations for producing imprecise results.
The second paper was presented by Azer Bestavros of Boston University. The paper presents a paradigm for admission control in RTDB. It suggests that admission control (and overload management in general) is much more important than other RTDB resource management techniques (e.g. concurrency control and scheduling). For example, the paper shows that proper admission control makes simple concurrency control protocols (e.g. 2PL-HP protocols) perform as well as sophisticated ones (e.g. Wait-50 and SCC protocols). The particular paradigm presented allows hard deadlines to be specified either on the termination of the main transaction, whose WCET and resources are not known a priori, or on the termination of a smaller compensating "recovery block", whose WCET and resources are all known a priori.
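To make the idea of admission control concrete, here is a minimal sketch that admits a transaction only if the recovery blocks of all admitted transactions could still finish by their deadlines when run in earliest-deadline order. The feasibility test, the single-processor assumption, and the names (`Txn`, `AdmissionController`, `try_admit`) are assumptions made for illustration, not the admission test used in the paper.

```python
from typing import List, NamedTuple


class Txn(NamedTuple):
    name: str
    deadline: float       # absolute hard deadline
    recovery_wcet: float  # known worst-case time of the recovery block


class AdmissionController:
    def __init__(self) -> None:
        self.admitted: List[Txn] = []

    def _feasible(self, txns: List[Txn], now: float) -> bool:
        # If every recovery block had to start right now, would each one
        # finish by its deadline when executed in deadline order?
        finish = now
        for txn in sorted(txns, key=lambda t: t.deadline):
            finish += txn.recovery_wcet
            if finish > txn.deadline:
                return False
        return True

    def try_admit(self, txn: Txn, now: float) -> bool:
        if self._feasible(self.admitted + [txn], now):
            self.admitted.append(txn)
            return True
        return False  # reject rather than risk a missed hard deadline


controller = AdmissionController()
print(controller.try_admit(Txn("t1", deadline=10, recovery_wcet=4), now=0))  # True
print(controller.try_admit(Txn("t2", deadline=6, recovery_wcet=3), now=0))   # True
print(controller.try_admit(Txn("t3", deadline=8, recovery_wcet=4), now=0))   # False
```

Only the recovery blocks enter the test, since in the paradigm described above their WCET is known a priori while that of the main transaction is not.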
The third paper was presented by Aerts of Eindhoven University, the Netherlands. This paper argued that approximate analytical models of performance for real-time schedulers are important. One interesting aspect of the paper is the use of probabilistic and Markov models to describe conflicts between transactions (for example when trying to model OCC). The paper showed that for some systems, performance prediction through the use of analytical models yields results that are quite good (compared to results obtained via simulations). Questions about the paper concentrated on the tractability of this approach. It may be possible to build models for simple protocols, but as system complexity increases, it is not clear whether such models can be generalized.
The fourth paper was presented by Henky Agusleo of the University of Michigan. This paper argued for the use of special hardware support for high-performance main-memory databases through the use of logic-enhanced memories. The hardware support may be used to improve performance, but not "predictability".
During the discussion that followed these presentations, there were questions about the "cost" of "fancy" RTDB protocols versus the cost of admission control. The motivation for the question was that RTDB must be "light weight" and that unless the solutions we provide are easily portable, they will never be implemented in real systems. Krithi Ramamritham intervened to give evidence of this from concurrency control research, where hundreds of algorithms have been proposed, but only a few (mainly 2PL) are implemented in real systems. In reference to admission control and overload management techniques in general, Azer Bestavros suggested that these techniques in effect reduce the overall overhead, because they make the use of other complicated real-time protocols (e.g. real-time concurrency control techniques) unnecessary! Another point brought up was that it must be possible to add the techniques as a "layer" on top of existing off-the-shelf DB systems. Admission control and overload management are good candidates for such a "layer".
To conclude this session, Al Mok challenged the participants to identify a set of properties that are worth proposing as "standard" properties of RTDBS (akin to the ACID properties of traditional DBMS). The discussion that ensued questioned whether a single model will ever be possible, given the richness of RTDB systems, a richness that comes from the disparity of application requirements. This point was very much emphasized by John Stankovic. Another thread in this discussion had to do with the "quantification" of predictability and perhaps the use of a probabilistic model (akin to the ideas from the third paper in the session) or the use of values and various QOS guarantees (akin to the ideas from the second paper in the session).
This session was chaired by J.Y. Chung of IBM. The first paper was presented by Marie-Anne Neimat from HP Labs. The paper addressed an important (and often neglected) aspect of RTDB systems: query optimization. It proposed a nice paradigm for cost modeling that abstracts away many of the details of the underlying machinery and algorithms. The paper focused on main-memory databases. It contrasted three different approaches, based on hardware costs, application costs, and execution costs, and argued that the last of these provides a rapid, accurate, and, most importantly, portable cost model. The cost modeling proceeds in two phases: model design, and model instantiation and verification (what we may want to call model validation). The second phase involves determining the relative execution times for the various "variables" or "parameters" in the model. This is done through the execution of queries that differ *only* in the variable/parameter to be measured. There were a few questions and suggestions. In particular, the work as it stands does not allow for absolute timing analysis because the cost of the various operations is measured in a relative fashion. A simple extension would allow the same paradigm to be used to estimate the "run time" of queries, which is crucial for ensuring predictability. Another suggestion concerned "load parametrization", which would allow the cost estimate to be parametrized by the "load" in the system.
The second paper was presented by Jonas Mellin of the University of Skovde in Sweden. The paper presents a model that could be used to derive the worst-case execution delay and maximum frequency of events in DeeDS, a prototype distributed active real-time database system being developed at the University of Skovde. Since concurrency control was not considered in the paper, several questions were raised about correctness issues.
The third paper was presented by Carolyn Boettcher of Hughes Aircraft Co. This paper offered a number of benchmarks that could be used to define the structure of an avionics database and test the various aspects of RTDB systems performance. The benchmarks were abstracted out from actual avionic systems. Seven test scenarios were defined ranging from periodic-readers-only to periodic-readers plus sporadic-readers plus periodic-updates. The paper fills in a void that was identified throughout the workshop---that of quantifying "RTDB" properties. The discussion that ensued reaffirmed the need for similar benchmarks for other applications. This paper fits well as a bridge between an "applications" paper and a "requirement specification for RTDB" paper. It could be improved by showing how these benchmarks could be extended for other applications.
This session was chaired by Kane Kim of the University of California at Irvine. The first paper was presented by K.Y. Lam from the City University of Hong Kong. The paper studies how priorities should be assigned to sub-transactions in distributed real-time databases. Two approaches are used: one is based on the real-time constraint of the base transaction, and the other on data contention with other transactions. The performance of the different strategies was studied by simulation.
The second paper was presented by James Anderson of the University of North Carolina. This paper presented an efficient implementation of Optimistic Concurrency Control with Broadcast Commit (OCC-BC) for main-memory RTDB systems, based on the idea of lock-free objects. Lock-free objects are an efficient implementation technique that is applicable not only to concurrency control for RTDB, but also to other problems involving synchronization.
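For context, the sketch below shows the basic broadcast-commit rule of OCC-BC in a single-threaded setting: the committing transaction always succeeds, and every active transaction that has read an item it wrote is restarted. It deliberately omits the lock-free object implementation that is the paper's contribution, and the class and method names are hypothetical.

```python
from typing import Any, Dict, List, Set


class Txn:
    def __init__(self, name: str) -> None:
        self.name = name
        self.read_set: Set[str] = set()
        self.writes: Dict[str, Any] = {}  # buffered until commit
        self.restarts = 0


class OccBcDatabase:
    def __init__(self) -> None:
        self.data: Dict[str, Any] = {}
        self.active: List[Txn] = []

    def begin(self, name: str) -> Txn:
        txn = Txn(name)
        self.active.append(txn)
        return txn

    def read(self, txn: Txn, key: str) -> Any:
        txn.read_set.add(key)
        return txn.writes.get(key, self.data.get(key))

    def write(self, txn: Txn, key: str, value: Any) -> None:
        txn.writes[key] = value

    def commit(self, txn: Txn) -> List[str]:
        """Install the writes, then restart every conflicting active transaction."""
        self.data.update(txn.writes)
        self.active.remove(txn)
        restarted = []
        for other in self.active:
            if other.read_set.intersection(txn.writes):
                other.read_set.clear()
                other.writes.clear()
                other.restarts += 1
                restarted.append(other.name)
        return restarted


db = OccBcDatabase()
db.data["x"] = 1
t1, t2, t3 = db.begin("t1"), db.begin("t2"), db.begin("t3")
db.read(t2, "x")
db.read(t3, "y")
db.write(t1, "x", 2)
print(db.commit(t1))  # ['t2']: only t2 read an item that t1 wrote
```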
The last paper was presented by Gultekin Ozsoyoglu of Case Western Reserve University. This paper presents a "model" for cooperation between real-time transactions for multimedia. An implementation of the ECA (Event-Condition-Action) rules used in active databases was presented for cooperative real-time transactions.
This session was chaired by Kang Shin of the University of Michigan. The first paper was presented by Michael Squadrito of MITRE. This paper presents an extension of the priority ceiling protocol for object-oriented databases. The idea is to extend the read/write priority ceiling protocol by defining a compatibility table for all the methods defined for an object, and then using this table to derive "affected set priority ceilings" (ASPC) that regulate access to the object to ensure consistency. During the discussion, there were questions about the overhead of such an approach, because it requires the designers to identify the "conflict modes" (and its correctness depends on their doing so). There were also questions regarding "automated" identification of a "default" compatibility table to assist the designers.
The second paper was presented by Le Gruenwald of the University of Oklahoma. It presented a simulation study of recovery in main-memory RTDBS. The main argument of the paper is that the rate of data checkpointing should be related to the time constraints associated with the data.
The third paper was presented by Sharma Chakravarthy of the University of Florida. This paper presents a concurrency control algorithm that attempts to reduce the hazards of blocking-based and restart-based algorithms by combining the two (using alternative "shadows"). The work is similar to the Speculative Concurrency Control (SCC) work by Bestavros and Braoudakis of Boston University in RTSS'94 and VLDB'95.
This last session of the workshop was a panel discussion moderated by Program Co-Chair Sang Son. The panelists were Jane Liu, Al Mok, Kang Shin, and John Stankovic. Janet Prichard of U. Rhode Island was also invited to give a review of the current real-time SQL effort.
Sang Son pointed out that the RTDB'96 workshop was timely, since the demand for advanced functionalities and timely management of data in new applications requires practical solutions. He then asked each panelist whether current research is on the right track and what critical issues remain to be addressed.
John Stankovic first identified the key issues for RTDB systems, including predictability, fault-tolerance, and QOS for multimedia management. He argued that the technologies developed by the RTDB research community should show that RTDB systems can do significantly better than traditional approaches vis-a-vis properties such as cost, performance, functionality, and availability. It was generally agreed that we should focus on a few driving applications that traditional DBMSs cannot serve or serve very inefficiently. He also pointed out that integrated solutions for distributed RTDB systems architecture are needed.
Jane Liu talked about the lessons we learned: how to schedule transactions using the timing constraints and how to maintain temporal consistency of data. She emphasized that we need to utilize semantic information and different query processing methods for QOS management. She also felt that deciding on a small set of effective concurrency control algorithms is important. Before this can be accomplished, however, some benchmarks for RTDB systems must be developed.
Kang Shin argued that we need to develop real systems, demonstrating usefulness using benchmarks and real applications. The first step is to form a consensus on terminology and concepts being used within the RTDB research community. He also discussed some technical issues that are yet to be addressed, including OS interface, ACID-equivalent properties for RT transactions, and fault-tolerance issues.
Finally, Al Mok pointed out that there are strong motivations behind the ACID properties: granularity, consistency, non-interference, and failure semantics. He argued that we need to consider what should be the right characterization of the requirements for real-time transactions.
At the end of this panel, the workshop was declared a success. Most participants strongly supported continuing the workshop the following year.