What is NoSQLMark?

NoSQLMark is a research project and benchmarking framework developed at the University of Hamburg. It builds on YCSB, the most popular and widely accepted framework for NoSQL OLTP benchmarking, keeps its best features (e.g. support for many different databases and a generic workload generator), and additionally provides:

Scalability

NoSQLMark is built on Akka and Scala and thus features scalable workloads compatible with YCSB. In addition, nodes can easily join the benchmarking cluster, and the workload can define how many nodes should be used to put the database under load. You no longer have to start different YCSB nodes manually, nor do you have to aggregate the different measurement results yourself.

Validated Measurement Methods

SickStore is our homegrown single-node inconsistent key-value store designed to validate different measurement methods. Originally developed to validate staleness measurement approaches (cf. Who Watches the Watchmen?), it can also simulate various aspects of system behavior and the corresponding anomalies in order to validate NoSQLMark's new distinctive measurement methods.

Coordinated Omission Avoidance

NoSQLMark avoids the Coordinated Omission Problem by measuring asynchronously by default. The Coordinated Omission Problem became widely known through a talk by Gil Tene, CTO and co-founder of Azul Systems, who also coined the name. In short, it describes a situation in which a benchmarking tool drops potential samples from the result set. In fact, most benchmarking tools are broken in this sense. This also applies to YCSB, but since 2015 some efforts have been made to correct the problem and to report corrected latencies alongside the actual measurements. In our publication Coordinated Omission in NoSQL Database Benchmarking, we show that this correction can produce worse results than the uncorrected measurements.
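The effect can be illustrated with a small, deterministic toy simulation. This is only a sketch under stated assumptions and not NoSQLMark's implementation: the function names, the 10 ms request interval, the 1 ms normal service time, and the 100 ms stall are all made up for illustration. A synchronous client waits for each response before sending the next request, so a single stall produces a single slow sample; a client that sends every request at its scheduled time observes the stall for every request that falls into it.

```python
# Toy model of coordinated omission. All names and numbers are hypothetical.

STALL_START, STALL_END = 100.0, 200.0  # the service is unresponsive in [100, 200) ms
NORMAL_SERVICE_TIME = 1.0              # ms per request outside the stall


def completion_time(send_time):
    """Return when a request sent at send_time (ms) receives its response."""
    if STALL_START <= send_time < STALL_END:
        # Requests arriving during the stall are answered only after it ends.
        return STALL_END + NORMAL_SERVICE_TIME
    return send_time + NORMAL_SERVICE_TIME


def synchronous_latencies(num_requests, interval):
    """One blocking client: the next request waits for the previous response."""
    latencies = []
    free_at = 0.0  # time at which the client is able to send again
    for i in range(num_requests):
        scheduled = i * interval
        sent = max(scheduled, free_at)  # delayed if the client is still blocked
        done = completion_time(sent)
        latencies.append(done - sent)
        free_at = done
    return latencies


def asynchronous_latencies(num_requests, interval):
    """Every request is sent at its scheduled time, regardless of pending ones."""
    return [completion_time(i * interval) - i * interval
            for i in range(num_requests)]


sync = synchronous_latencies(100, 10.0)
asyn = asynchronous_latencies(100, 10.0)

slow_sync = sum(1 for latency in sync if latency > 10.0)
slow_async = sum(1 for latency in asyn if latency > 10.0)
print("slow samples, synchronous client: ", slow_sync)
print("slow samples, asynchronous client:", slow_async)
```

In this toy setting both clients record the same number of samples and the same maximum latency, but the synchronous client sees only one request affected by the stall, while the asynchronous client sees every request scheduled during it. Percentiles and averages computed from the synchronous samples therefore look far better than what a user sending requests at the intended rate would experience.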

Valid Consistency Measurement

There are several academic papers about staleness-based consistency measurement methods. However, these approaches either rely on system clocks, whose drift distorts the measured values in a distributed setting, or use a measurement method that is itself faulty, as in the case of YCSB++ (see our paper Who Watches the Watchmen?). Akka's actor model enables us to implement our approach, as already proposed in our survey on NoSQL OLTP benchmarking, and thus to provide clearly defined lower and upper bounds for the reported staleness values in a given database setting.

Application Oriented Workloads

work in progress

Transactional Benchmarking

work in progress

Getting Started

Download the latest release: ...

							
Run
cd nosqlmark-1.0.1
bin/backbench
[2015-04-22 15:01:37,495] INFO ...
							
To start NoSQLMark in a cluster environment, you need to change the configuration file config/nosqlmark.conf.
								

Publications & Theses

S. Friedrich, W. Wingerath, N. Ritter
Coordinated Omission in NoSQL Database Benchmarking
Datenbanksysteme für Business, Technologie und Web (BTW 2017), Stuttgart, Germany. Workshopband, GI Bonn, 2017, 215-22
F. Gessert, W. Wingerath, S. Friedrich and N. Ritter
NoSQL database systems: a survey and decision guidance
Computer Science - Research and Development, 2016, 1-13

W. Wingerath, S. Friedrich, F. Gessert and N. Ritter
Who Watches the Watchmen? On the Lack of Validation in NoSQL Benchmarking
Datenbanksysteme für Business, Technologie und Web (BTW 2015), 16. Fachtagung des GI-Fachbereichs DBIS, 4.-6.3.2015 in Hamburg, Deutschland, pp. 351-360
S. Friedrich, W. Wingerath, F. Gessert and N. Ritter
NoSQL OLTP Benchmarking: A Survey
44. Jahrestagung der Gesellschaft für Informatik, Informatik 2014, Big Data - Komplexität meistern, 22.-26. September 2014 in Stuttgart, Deutschland, pp. 693-704

Master and Bachelor thesis topics

An overview of open topics at the Databases and Information Systems group (ISYS) at the University of Hamburg, with a focus on NoSQL database benchmarking.

Each thesis may be written in either English or German.

Quantifying Isolation Anomalies of Web-Scale Transactional Databases

How the phantom learns to fly

The aim of this thesis is to develop a transactional workload for NoSQLMark that quantifies isolation anomalies.
Background: Isolation levels and their ability to prevent concurrency anomalies have been studied in relational database theory [Adya, Berenson et al.]. In practice, however, many descriptions of isolation levels in database vendors' documentation are vague, and sometimes their names are ambiguous (e.g. DB2's Repeatable Read). Even worse, the gold standard of isolation levels, namely serializability, is rarely used, and some well-known databases such as Oracle 11g do not even provide it [see Bailis et al.]. Practitioners therefore have to decide which isolation level is appropriate for their application based on limited information about which guarantees existing databases actually provide.

However, only a few research efforts investigate how isolation anomalies can be quantified to give insight into the guarantees a system actually exhibits [Fekete et al., Zellag]. One project that manually tests which anomalies are actually prevented by the isolation levels of some relational database systems is Hermitage [Kleppmann].

In recent years, there has been increasing interest in multi-item transactions in highly scalable (NoSQL) database systems [Bailis et al., Dey]. To the best of our knowledge, there exists only one rudimentary work on quantifying transaction anomalies in these systems. The proposed extension YCSB+T [Dey *] implements a so-called closed economy workload, which simulates bank account transactions in a closed system. For validation, a simple anomaly score is computed, defined as the difference between the initial and final sum of all account balances, normalized by the number of executed operations. However, this score catches only a fraction of lost updates, namely those that change the overall sum. In conclusion, there is a need for a scalable transactional benchmark that quantifies the different isolation anomalies.
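The limitation of such a score can be made concrete with a small worked example. The sketch below is not the YCSB+T code: the function names, the use of the absolute difference, and counting one transfer as one operation are assumptions made for illustration. It forces a lost-update interleaving in which all transfers read the same initial balances before any of them writes, and then computes the score for two cases.

```python
# Worked example of a sum-based anomaly score. All names are hypothetical.

def anomaly_score(initial, final, operations):
    """Absolute change of the total balance, normalized by the operation count.

    Assumption: the absolute difference is used and each transfer counts as
    one operation.
    """
    return abs(sum(initial.values()) - sum(final.values())) / operations


def run_with_lost_updates(balances, transfers):
    """Apply transfers as if all of them read first and then all wrote.

    Every transfer computes its writes from the same stale snapshot, so a
    later write to an account overwrites an earlier one (a lost update).
    """
    snapshot = dict(balances)  # what every transaction has read
    result = dict(balances)
    for source, target, amount in transfers:
        result[source] = snapshot[source] - amount
        result[target] = snapshot[target] + amount
    return result


initial = {"A": 100, "B": 100, "C": 100}

# Case 1: both transfers touch B in opposite directions.
# The credit to B is lost, the total drops from 300 to 290.
detected = run_with_lost_updates(initial, [("A", "B", 10), ("B", "C", 10)])
score_detected = anomaly_score(initial, detected, operations=2)

# Case 2: two identical transfers from A to B.
# One transfer is lost entirely, yet the total stays at 300.
missed = run_with_lost_updates(initial, [("A", "B", 10), ("A", "B", 10)])
score_missed = anomaly_score(initial, missed, operations=2)

print("case 1:", detected, "score", score_detected)
print("case 2:", missed, "score", score_missed)
```

In the second case a serial execution would end with A = 80 and B = 120, but the interleaved run ends with A = 90 and B = 110. An update has been lost, yet the score is zero because the total balance is unchanged.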

Contact Info

The NoSQLMark team is Steffen Friedrich, Wolfram Wingerath, and Norbert Ritter.

  • University of Hamburg
    Department of Informatics
    Databases and Information Systems
    Vogt-Kölln-Straße 30
    22527 Hamburg
  • +49 40 428 83 2326
  • friedrich@informatik.uni-hamburg.de