MemSQL is a distributed, in-memory database that is part of the NewSQL movement.[2] It is a relational database management system (RDBMS) that complies with the ACID properties: atomicity, consistency, isolation, and durability. Most notably, it converts Structured Query Language (SQL) into C++ through automatic programming, a process termed code generation.[3] It is developed by MemSQL Inc., which was founded in 2011 and is a graduate of the Y Combinator startup program. MemSQL Inc. has raised more than $45 million to date from a variety of investors including First Round Capital, IA Ventures, NEA, and several prominent angel investors including Paul Buchheit, Max Levchin, Aaron Levie, and Ashton Kutcher.[4] MemSQL Inc. launched its database to the public on June 18, 2012.[5]

Core technology

MemSQL combines lock-free data structures with just-in-time (JIT) compilation to process highly volatile workloads.[6] More specifically, MemSQL implements lock-free hash tables and lock-free skip lists in memory for fast random access to data. Queries sent to the MemSQL server are converted into C++ and compiled with the GNU Compiler Collection (GCC).[7] Each query is stripped of its parameters, and the resulting query template is compiled and stored as a shared object, which is subsequently matched against incoming queries. Code generation and the execution of pre-compiled query plans remove interpretation along hot code paths, minimizing the number of central processing unit (CPU) instructions required to run a query.
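The template matching described above can be illustrated with a minimal Python sketch. It is not MemSQL's implementation: the names `parameterize`, `PlanCache`, and `toy_compile` are hypothetical, the literal-matching pattern is deliberately simplistic, and a plain Python callable stands in for the compiled shared object.

```python
import re

# Matches single-quoted strings or numeric literals. This is an assumption
# made for illustration; a real SQL tokenizer handles far more cases.
_LITERAL = re.compile(r"'(?:[^']|'')*'|\b\d+(?:\.\d+)?\b")


def parameterize(sql):
    """Split a query into a literal-free template and its parameter list."""
    params = []

    def _capture(match):
        params.append(match.group(0))
        return "?"

    template = _LITERAL.sub(_capture, sql)
    return template, params


class PlanCache:
    """Maps query templates to 'compiled' plans (Python callables here)."""

    def __init__(self, compile_fn):
        self._compile = compile_fn
        self._plans = {}
        self.compilations = 0  # how many templates had to be compiled

    def execute(self, sql):
        template, params = parameterize(sql)
        plan = self._plans.get(template)
        if plan is None:
            # First time this query shape is seen: pay the compile cost once.
            plan = self._compile(template)
            self._plans[template] = plan
            self.compilations += 1
        return plan(params)


def toy_compile(template):
    # Hypothetical stand-in for code generation: returns a callable that
    # simply echoes the template and the parameters it was run with.
    return lambda params: (template, tuple(params))


cache = PlanCache(toy_compile)
first = cache.execute("SELECT * FROM users WHERE id = 42")
second = cache.execute("SELECT * FROM users WHERE id = 7")
```

Both queries reduce to the same template, so only the first one triggers compilation and the second reuses the cached plan with a different parameter.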

MemSQL is wire-compatible with MySQL.[8] Applications can connect to MemSQL through standard Open Database Connectivity (ODBC) and Java Database Connectivity (JDBC) connectors, as well as through MySQL clients and drivers.[9]

Durability

Although MemSQL stores data in memory, it achieves durability through a write-ahead log and snapshots, which are similar to checkpoints. With default settings, as soon as a transaction is acknowledged in memory, the database writes the transaction to disk as fast as the disk allows.[10]
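The interplay of a write-ahead log and snapshots can be sketched in Python as follows. This is a generic illustration, not MemSQL's on-disk format: the class `DurableStore`, the file names, and the JSON encoding are all assumptions. For simplicity the sketch flushes each log record synchronously before applying it, whereas the default behaviour described above acknowledges the transaction in memory first and writes to disk afterwards.

```python
import json
import os
import tempfile


class DurableStore:
    """In-memory key-value store with a write-ahead log and snapshots."""

    def __init__(self, directory):
        self._log_path = os.path.join(directory, "wal.log")
        self._snap_path = os.path.join(directory, "snapshot.json")
        self.data = {}
        self._recover()
        self._log = open(self._log_path, "a", encoding="utf-8")

    def _recover(self):
        # Load the latest snapshot, then replay the log records on top of it.
        if os.path.exists(self._snap_path):
            with open(self._snap_path, encoding="utf-8") as f:
                self.data = json.load(f)
        if os.path.exists(self._log_path):
            with open(self._log_path, encoding="utf-8") as f:
                for line in f:
                    key, value = json.loads(line)
                    self.data[key] = value

    def put(self, key, value):
        # Append the change to the log and force it to disk, then apply it.
        self._log.write(json.dumps([key, value]) + "\n")
        self._log.flush()
        os.fsync(self._log.fileno())
        self.data[key] = value

    def snapshot(self):
        # Write the full state to a temporary file, swap it into place
        # atomically, and only then start a fresh, empty log.
        tmp = self._snap_path + ".tmp"
        with open(tmp, "w", encoding="utf-8") as f:
            json.dump(self.data, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, self._snap_path)
        self._log.close()
        self._log = open(self._log_path, "w", encoding="utf-8")

    def close(self):
        self._log.close()


with tempfile.TemporaryDirectory() as d:
    store = DurableStore(d)
    store.put("a", 1)
    store.snapshot()      # "a" now lives in the snapshot, the log is empty
    store.put("b", 2)     # "b" lives only in the log
    store.close()

    reopened = DurableStore(d)   # simulates a restart
    recovered = dict(reopened.data)
    reopened.close()
```

After the simulated restart, the state is rebuilt from the snapshot plus the log records written after it, so both keys are present.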

Replication

MemSQL supports a native replication protocol that ships its transactional log to slaves. MemSQL currently supports master-slave replication.
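Log shipping in general can be sketched as below. The source does not describe the wire protocol, so this Python example only illustrates the idea of a master forwarding its ordered log to replicas; the classes `Master` and `Replica` and their methods are hypothetical.

```python
class Replica:
    """Applies log records received from the master, in order."""

    def __init__(self):
        self.data = {}
        self.applied = 0  # number of log records applied so far

    def apply(self, records):
        for key, value in records:
            self.data[key] = value
            self.applied += 1


class Master:
    """Accepts writes, records them in a log, and ships the log onward."""

    def __init__(self):
        self.data = {}
        self.log = []
        self.replicas = []

    def write(self, key, value):
        self.log.append((key, value))
        self.data[key] = value

    def ship(self):
        # Send each replica only the records it has not yet applied.
        for replica in self.replicas:
            replica.apply(self.log[replica.applied:])


master = Master()
master.write("x", 1)
master.write("y", 2)

replica = Replica()
master.replicas.append(replica)  # a replica that joins late
master.ship()                    # catches up from the start of the log

master.write("x", 3)
master.ship()                    # ships only the new record
```

Because the replica applies the same records in the same order as the master, its state converges to the master's after each shipment.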

Distributed architecture

MemSQL is a distributed database built around two kinds of nodes: aggregators and leaf nodes.[11] An aggregator is responsible for breaking up a query across the relevant leaf nodes and aggregating the results before returning them to the client. A leaf node is itself a MemSQL database. MemSQL uses hash partitioning to distribute data uniformly across the leaf nodes.[12] MemSQL made the distributed version of its system generally available on April 23, 2013,[13] with a trial edition available for download on its website.[14]
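The division of labour between aggregators and leaf nodes can be illustrated with a small Python sketch. It is a simplification under stated assumptions: the classes `Leaf` and `Aggregator` are hypothetical, rows are plain dictionaries, and CRC-32 is used as the hash function only because it is deterministic; the source does not say which hash function MemSQL uses.

```python
import zlib


class Leaf:
    """Holds one partition of the data and answers partial queries."""

    def __init__(self):
        self.rows = []

    def insert(self, row):
        self.rows.append(row)

    def find(self, column, value):
        return [r for r in self.rows if r[column] == value]

    def partial_sum(self, column):
        return sum(r[column] for r in self.rows)


class Aggregator:
    """Routes rows to leaves by hashing a shard key and merges results."""

    def __init__(self, leaves, shard_key):
        self.leaves = leaves
        self.shard_key = shard_key

    def _leaf_for(self, key_value):
        # Hash partitioning: the same key always maps to the same leaf.
        h = zlib.crc32(str(key_value).encode("utf-8"))
        return self.leaves[h % len(self.leaves)]

    def insert(self, row):
        self._leaf_for(row[self.shard_key]).insert(row)

    def lookup(self, key_value):
        # A lookup on the shard key only needs to contact one leaf.
        return self._leaf_for(key_value).find(self.shard_key, key_value)

    def total(self, column):
        # Scatter the query to every leaf, then combine the partial sums.
        return sum(leaf.partial_sum(column) for leaf in self.leaves)


leaves = [Leaf() for _ in range(4)]
aggregator = Aggregator(leaves, shard_key="id")
for i in range(100):
    aggregator.insert({"id": i, "amount": i})

grand_total = aggregator.total("amount")
row_seven = aggregator.lookup(7)
```

A query that filters on the shard key is routed to a single leaf, while an aggregate such as a sum is computed partially on every leaf and combined by the aggregator.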

Version history

  • MemSQL 1b – first general availability, released June 2012.[15]
  • MemSQL 1c – minor feature update, released July 2012.
  • MemSQL 1.8 – replication and expanded SQL surface area, released December 2012.
  • MemSQL 2.0 – general availability of distributed system.[16] First release of MemSQL Watch operational dashboard.[17]
  • MemSQL 2.5 – JSON Data type[18]
  • MemSQL 3.0 – Columnar data store[19]
  • MemSQL 3.1 – Views, Cross-Datacenter replication[20]
  • MemSQL 3.2 – Improvements to column store engine[21]
  • MemSQL 4.0 – Geospatial support, distributed joins[22]
  • MemSQL 4.1 – Integration with Spark, CTEs[23]

References
