Examining Main Memory Databases
By Steve Graves
This concept has been extended recently with a new type of DBMS, designed to reside entirely in memory. Proponents of these main memory databases (MMDBs) point to groundbreaking improvements in database speed and availability, and claim the technology represents an important step forward in real-time data management.
The question is important for developers of communications switches, caching appliances, set-top boxes, consumer gear, and other Internet-enabled devices. After all, such devices often demand real-time responsiveness. On the other hand, many embedded systems engineers are already familiar with traditional disk-based data management, and will want to know the expected payoff, in terms of performance, from time invested in learning this new technology.
To examine this question, McObject recently benchmarked three approaches to data management. The first used db.linux, a disk-based DBMS, in the traditional fashion, with caching as the only means for reducing disk I/O. The second test differed only in deploying db.linux on a RAM-disk. For the third, application and database design were held constant, but the disk-based database was replaced by McObject's eXtremeDB main memory database.
The open source db.linux was chosen as a representative disk-based database, due to its longevity (first released in 1986 under the name db_VISTA) and also to this author's familiarity with its usage in several Internet-enabled device applications. Technically, eXtremeDB and db.linux presented an "apples to apples" comparison. They have similar database definition languages and are designed to be embedded in applications rather than provide a separately administered server, like Microsoft SQL Server or Oracle. Each has a relatively small footprint when compared to enterprise class databases, and offers a navigational API for precise control over database operations.
The benchmark used two nearly identical database structures and applications, developed to measure performance in reading and writing 30,000 records. Tests were performed on a PC running Red Hat Linux 6.2, with a 400 MHz Intel Celeron processor and 128 megabytes of RAM.
Figure 1 compares the performance of eXtremeDB and db.linux in a multi-threaded, transaction-controlled environment, with db.linux maintaining database files on disk, as it naturally does. The traditional DBMS lags significantly.
What happens when the db.linux DBMS moves to a RAM-drive, completely eliminating its physical disk access? As illustrated in Figure 2, db.linux gains performance improvements of almost 4X for read access and approximately 3X for writing to the database. Clearly, moving a disk-based database's files to a RAM-drive can improve performance.
But it is equally clear that while this approach reduces latency in the disk-based database, the database designed for in-memory use is still faster. The eXtremeDB MMDB outperforms the traditional database by 420X for database writes and more than 4X for database reads.
The RAM-drive approach eliminates physical disk access. So why does the disk-based database still lag? The problem is that disk-based databases incorporate processes that are irrelevant for main memory processing, and the RAM-drive deployment does not change such internal functioning. These processes cannot be "turned off," even when no longer needed, resulting in the following types of performance overhead:
Caching Overhead
To reduce physical disk access, virtually all disk-based databases use sophisticated I/O minimization techniques. Foremost among these is database caching, which keeps the most frequently used portions of the database in memory. Caching logic includes cache synchronization, which makes sure that an image of a database page in cache is consistent with the physical database page on disk, as well as cache lookup, which determines if data requested by the application is in cache and, if not, retrieves the page. Cache lookup also selects data to be removed from cache, to make room for incoming pages. If the outgoing page is "dirty" (holds one or more modified records), it protects other applications from seeing the modified data until the transaction is committed.
Caching functions present only minor overhead when considered individually, but dampen performance in aggregate. Each process plays out every time the application makes a function call to read a record from disk -- about 90,000 function calls in the test application. In contrast, main memory databases, such as eXtremeDB, eliminate the need for caching.
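The caching logic described above can be sketched in a few lines. The following is a minimal illustration, not db.linux's actual implementation: the PageCache class, the PAGE_SIZE constant, and the dict standing in for the disk are all assumptions made for the example. It also simplifies by writing dirty pages back on eviction and ignoring transaction isolation. The point is that every record read, even a cache hit, pays for a lookup and a bookkeeping update.

```python
from collections import OrderedDict

PAGE_SIZE = 4  # records per page (illustrative value, not from the benchmark)


class PageCache:
    """Tiny LRU page cache in front of a simulated disk (a dict of pages)."""

    def __init__(self, disk, capacity):
        self.disk = disk            # page number -> list of records
        self.capacity = capacity    # maximum number of pages held in memory
        self.pages = OrderedDict()  # page number -> [records, dirty flag]
        self.hits = 0
        self.misses = 0

    def _load(self, page_no):
        # Cache lookup: a hit returns the cached page, a miss reads from "disk".
        if page_no in self.pages:
            self.hits += 1
            self.pages.move_to_end(page_no)  # mark as most recently used
            return self.pages[page_no]
        self.misses += 1
        if len(self.pages) >= self.capacity:
            self._evict()
        entry = [list(self.disk[page_no]), False]
        self.pages[page_no] = entry
        return entry

    def _evict(self):
        # Remove the least recently used page; write it back if it is dirty
        # (a simplified form of cache synchronization).
        victim, (records, dirty) = self.pages.popitem(last=False)
        if dirty:
            self.disk[victim] = list(records)

    def read(self, record_no):
        page_no, slot = divmod(record_no, PAGE_SIZE)
        return self._load(page_no)[0][slot]

    def write(self, record_no, value):
        page_no, slot = divmod(record_no, PAGE_SIZE)
        entry = self._load(page_no)
        entry[0][slot] = value
        entry[1] = True  # page is now dirty


# Sequential scan of 32 records spread over 8 pages, with room for 2 pages.
disk = {p: [p * PAGE_SIZE + i for i in range(PAGE_SIZE)] for p in range(8)}
cache = PageCache(disk, capacity=2)
values = [cache.read(n) for n in range(32)]
```

In this sketch all 32 reads go through the lookup path, whether the page is already cached or not. A database designed to live in memory has no equivalent step.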
Transaction Processing Overhead
Transaction processing logic is a major source of processing latency. In the event of a catastrophic failure such as loss of power, a disk-based database recovers, upon restart, by committing or rolling back transactions from log files. Disk-based databases are hard-wired to keep these logs, and to flush log files and cache to disk after the transactions are committed. A disk-based database doesn't know that it is running in a RAM-drive, and this complicated processing continues, even when the log file exists only in memory and cannot aid in the event of system failure.
The MMDB used in this benchmark also provides transactional integrity, but does so by simply maintaining a before-image of the objects that are updated or deleted, and a list of database pages added during a transaction. When the application commits the transaction, the memory for before-images and page references returns to the memory pool (a very fast and efficient process). If an in-memory database must abort a transaction -- for example, if the in-bound data stream is interrupted -- the before-images are returned to the database and the newly inserted pages are returned to memory.
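The before-image scheme can be illustrated with a short sketch. This is not eXtremeDB's API or internal design: the InMemoryDB class and its method names are hypothetical, and a plain dict of keys stands in for database objects and pages. It assumes begin() is called before any put() or delete().

```python
class InMemoryDB:
    """Illustrative in-memory store with before-image transactions."""

    def __init__(self):
        self.objects = {}    # key -> value, the live database
        self._before = None  # key -> before-image, while a transaction is open
        self._added = None   # keys inserted during the open transaction

    def begin(self):
        self._before = {}
        self._added = set()

    def put(self, key, value):
        if key in self.objects:
            # Save a before-image the first time an existing object changes.
            if key not in self._before and key not in self._added:
                self._before[key] = self.objects[key]
        elif key not in self._before:
            # Brand-new object: remember it so an abort can remove it.
            self._added.add(key)
        self.objects[key] = value

    def delete(self, key):
        if key in self._added:
            self._added.discard(key)
        elif key not in self._before:
            self._before[key] = self.objects[key]
        del self.objects[key]

    def commit(self):
        # Changes are already in place; just release the bookkeeping.
        self._before = None
        self._added = None

    def abort(self):
        # Drop objects added in the transaction, then restore before-images.
        for key in self._added:
            self.objects.pop(key, None)
        self.objects.update(self._before)
        self._before = None
        self._added = None


db = InMemoryDB()
db.begin()
db.put("chan:1", "News")
db.commit()

db.begin()
db.put("chan:1", "Sports")   # update: before-image "News" is saved
db.put("chan:2", "Movies")   # insert: key is recorded as newly added
db.abort()                   # database returns to {"chan:1": "News"}
```

Note that commit does no copying and writes no log in this sketch. It only discards the saved before-images, which matches the cheap commit path the paragraph above describes.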
In the event of catastrophic failure, this database image is lost -- and this points to a difference between typical intended uses of main memory databases, and the more business-oriented tasks for which disk-based databases are chosen. If the system using the MMDB is turned off or some other event causes the in-memory image to expire, the database is simply re-provisioned upon restart. Examples of this include most real-time device-based applications, such as a program guide application in a set-top box that is continually downloaded from a satellite or cable head-end, a network switch that discovers network topology on startup, or a wireless access point that is provisioned by a server upstream.
This does not preclude the use of saved data. At startup or any other point, the application can open a stream (a socket, pipe, or a file pointer) and instruct eXtremeDB to read or write a database image from, or to, the stream. This feature can be used to create and maintain boot-stage data. The other end of the stream can be a pipe to another process, or a file system pointer (magnetic, optical, FLASH, etc.).
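As a rough illustration of saving and restoring a database image through a stream, the sketch below writes a length-prefixed image to any binary stream and reads it back. The save_image and load_image functions and the JSON encoding are assumptions made for this example and are not eXtremeDB's interface. The sketch also assumes a stream whose read(n) returns all n bytes, as a file or buffered stream does.

```python
import io
import json
import struct


def save_image(records, stream):
    """Write the whole database image to a binary stream, length-prefixed."""
    payload = json.dumps(records).encode("utf-8")
    stream.write(struct.pack("!I", len(payload)))  # 4-byte big-endian length
    stream.write(payload)


def load_image(stream):
    """Read a database image previously written by save_image."""
    (length,) = struct.unpack("!I", stream.read(4))
    return json.loads(stream.read(length).decode("utf-8"))


# An in-memory stream stands in here for a file, pipe, or socket.
boot_data = {"chan:1": "News", "chan:2": "Movies"}
stream = io.BytesIO()
save_image(boot_data, stream)

stream.seek(0)
restored = load_image(stream)
```

Because the functions only call write() and read(), the same code works whether the other end of the stream is a file on FLASH or a pipe to another process.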
Data Transfer Overhead
Data transfer also helps explain the performance disparity above. With a disk-based database, the application works with a copy of the data contained in a program variable that is several times removed from the database. Consider the "handoffs" shown in Figure 3 for an application to read a piece of data from the disk-based database, modify it, and write that piece of data back to the database.
In contrast, with the MMDB, the application may copy the data into local program variables, but is not required to by the database. Instead, the application can work with the data via a pointer that refers directly to the data item in the database.
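The difference between the two access styles can be sketched as follows. The record layout and buffer here are hypothetical, and Python's memoryview stands in for the direct pointer an embedded C application would use. The first update copies the record out and back; the second modifies the bytes where they live.

```python
import struct

# Hypothetical record layout: 4-byte id followed by a 16-byte name field.
RECORD = struct.Struct("<I16s")

# The "database": three fixed-size records in one contiguous buffer.
database = bytearray(RECORD.size * 3)
for i in range(3):
    RECORD.pack_into(database, i * RECORD.size, i, b"channel-%d" % i)

# Copy style: read record 1 into program variables, modify, write it back.
rec_id, name = RECORD.unpack_from(database, 1 * RECORD.size)   # copy out
RECORD.pack_into(database, 1 * RECORD.size, rec_id, b"renamed")  # copy back

# In-place style: a view onto record 2 refers directly to the database bytes.
view = memoryview(database)[2 * RECORD.size : 3 * RECORD.size]
view[4:20] = b"updated".ljust(16, b"\0")  # overwrite the name field in place
```

The in-place update touches only the bytes of the name field and creates no intermediate copy of the record.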
Operating System Dependency
Operating system dependency presents a final significant performance variable. Disk-based databases, whether deployed traditionally or on RAM-disk, still use the underlying file system to access data within the database. The quality of data-seeking functions provided by a particular OS (such as lseek() under Linux) will affect performance, for better or worse. In contrast, the main memory database operates independently of the OS file system and is highly optimized for data access.
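To make the file system dependency concrete, the sketch below fetches the same fixed-size record two ways: through the operating system with lseek() and read(), and by plain offset arithmetic on an in-memory buffer. The record size, record contents, and function names are assumptions for the example, and it makes no claim about relative timings.

```python
import os
import tempfile

RECORD_SIZE = 16  # bytes per record (illustrative)


def read_record_from_file(fd, n):
    """Fetch record n via the OS: one seek call plus one read call."""
    os.lseek(fd, n * RECORD_SIZE, os.SEEK_SET)
    return os.read(fd, RECORD_SIZE)


def read_record_from_memory(buf, n):
    """Fetch record n by offset arithmetic; no operating system call."""
    start = n * RECORD_SIZE
    return bytes(buf[start:start + RECORD_SIZE])


# Build 100 records of exactly RECORD_SIZE bytes each.
records = [b"record-%08d" % i + b"\0" for i in range(100)]
image = b"".join(records)

fd, path = tempfile.mkstemp()
try:
    os.write(fd, image)
    from_file = read_record_from_file(fd, 42)
finally:
    os.close(fd)
    os.remove(path)

from_memory = read_record_from_memory(bytearray(image), 42)
```

Both paths return the same bytes. The file-based path depends on how well the operating system implements seeking and reading, even when the file sits on a RAM-disk.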
Footprint and Reliability
While not directly tied to performance, two other distinctions between MMDBs and traditional databases are worth considering. One is footprint: the absence of caching functions and other unnecessary logic means that memory and storage demands are low. In this test, eXtremeDB maintained a total RAM footprint of 108K, and 20.85MB when fully loaded with data, compared to db.linux's footprint of 323K and 31.8MB with data. The second benefit is greater reliability stemming from a less complex database system architecture. It stands to reason that with fewer interacting processes, this streamlined database system should result in fewer negative surprises for end-users and developers.
Steve Graves is co-founder and CEO of McObject. A complete report on this comparison, with database schemas and application source code, is available at www.mcobject.com.