Mixing SQL with NoSQL using MariaDB and MongoDB

Suppose you have an application developed on Node.js (or any other platform). This application connects to a MongoDB database (NoSQL) that stores ratings of books (number of stars given and reviews). Let’s also say you have another application developed in Java (or Python, C#, TypeScript…etc.). This application connects to a MariaDB database (SQL, relational) and is used to manage the book catalog (title, year of publication, number of pages).

You are asked to create a report that displays the title and rating information for each book. Note that MongoDB databases do not contain book titles, and relational databases do not contain ratings. We need to mix data created by NoSQL applications and data created by SQL applications.

A common approach is to query both databases independently (using different data sources) and process the data to match, for example, by ISBN (ID of the book) and put the combined information into a new object. This needs to be done in Java, TypeScript, C#, Python, or any other imperative programming language capable of connecting to both databases.

[squids.cn] You can currently experience the lowest price RDS on the entire network, free migration tool DBMotion, SQL development tools, etc.

Picture

Multilingual applications

This approach is possible. However, the joining of data is the job of the database. They are created for this kind of data manipulation. Furthermore, with this approach, the SQL application is no longer just a pure SQL application; it becomes a database polyglot, which adds complexity and makes it harder to maintain.

Using a database agent like MaxScale, you can connect to this data at the database level using the language best suited for the data – SQL. Your SQL application does not need to be multilingual.

Although this requires additional elements in the infrastructure, you also get all the functionality provided by the database proxy. For example, automatic failover, transparent data masking, topological isolation, caching, security filters, etc.

MaxScale is a powerful and intelligent database agent that understands SQL and NoSQL. It also understands Kafka (for CDC or data ingestion), but that’s another topic. In short, with MaxScale, you can connect your NoSQL applications to a fully ACID compliant relational database and store the data next to the tables used by other SQL applications.

Picture

MaxScale allows SQL applications to consume NoSQL data

Let’s try this last method in a simple and easy-to-follow MaxScale experiment. You need to install the following on your computer:

  • Docker

  • mariadb-shell tool

  • mongosh tools

Setting up the MariaDB database

Using a simple text editor, create a new file and save it as docker-compose.yml. The file should contain the following content:

version: "3.9"
services:
  mariadb:
    image: alejandrodu/mariadb
    environment:
      - MARIADB_CREATE_DATABASE=demo
      - MARIADB_CREATE_USER=user:Password123!
      - MARIADB_CREATE_MAXSCALE_USER=maxscale_user:MaxScalePassword123!


  maxscale:
    image: alejandrodu/mariadb-maxscale
    command: --admin_host 0.0.0.0 --admin_secure_gui false
    ports:
      - "3306:4000"
      - "27017:27017"
      - "8989:8989"
    environment:
      - MAXSCALE_USER=maxscale_user:MaxScalePassword123!
      - MARIADB_HOST_1=mariadb 3306
      - MAXSCALE_CREATE_NOSQL_LISTENER=user:Password123!

This is a Docker Compose file. It describes a set of services created by Docker. We are creating two services (or containers) – a MariaDB database server and a MaxScale database agent. They will run locally on your machine, but in a production environment they will often be deployed on different physical machines. Please keep in mind that these Docker images are not intended for production environments! They are designed for quick demonstrations and testing. You can find the source code for these images on GitHub. To get the official Docker image for MariaDB, head to the MariaDB page on Docker Hub.

The preceding Docker Compose file configures a MariaDB database server with a database (or schema; they are synonyms in MariaDB) named demo. It also creates a user named user and password Password123!. This user has appropriate permissions on the demo database. There is an additional user named maxscale_user with the password MaxScalePassword123!. This is the user used by the MaxScale database agent to connect to the MariaDB database.

The Docker Compose file also works by disabling HTTPS (don’t do this in a production environment!), exposing a set of ports (more on that later), and configuring the location of the database user and MariaDB database agent (usually an IP address, but here we The database agent can be configured using the name of the container previously defined in the Docker file). The last line creates a NoSQL listener that we will use to connect as a MongoDB client on the default port (27017).

To start a service (container) using the command line, move to the directory where the Docker Compose file is saved and run the following command:

docker compose up -d

After downloading all the software and starting the container, you will have a MariaDB database and MaxScale agent, both pre-configured for this experiment.

Create SQL tables in MariaDB

First, let’s connect to a relational database. At the command line, execute the following command:

mariadb-shell --dsn mariadb://user:'Password123!'@127.0.0.1

Check if you can see the demo database:

show databases;

Switch to the demo database:

use demo;

Picture

Connect to the database using MariaDB Shell

Create books table:

CREATE TABLE books(</code><code> isbn VARCHAR(20) PRIMARY KEY,</code><code> title VARCHAR(256),</code><code> year INT</code><code>);

Insert some data. I’m going to use the cliche method of inserting my book:

INSERT INTO books(title, isbn, year)</code><code>VALUES</code><code> ("Vaadin 7 UI Design By Example", "978-1-78216-226-1", 2013),</code><code> ("Data-Centric Applications with Vaadin 8", "978-1-78328-884-7", 2018),</code><code> ("Practical Vaadin", " 978-1-4842-7178-0", 2021);

Check if the book is stored in the database by running the following command:

SELECT * FROM books;

Picture

Insert data using MariaDB Shell

Create a JSON collection in MariaDB

We don’t have MongoDB installed yet, but we can use a MongoDB client (or application) connection to create collections and documents as if we were using MongoDB, except the data is stored in a powerful, fully ACID-enabled, scalable relational database middle. Let’s give it a try!

From the command line, use the MongoDB shell tool to connect to MongoDB…wait for it…actually a MariaDB database! Execute the following command:

mongosh

By default, this tool attempts to connect to a MongoDB server (which happens to be MariaDB this time) running on your local machine (127.0.0.1), using the default port (20017). If all goes well, you should be able to see the demo database when you run the following command:

show databases

Switch to the demo database:

use demo

Picture

Connect to MariaDB using Mongo Shell

We have connected to a relational database from a non-relational client! Now, let’s create the ratings collection and insert some data:

db.ratings.insertMany([</code><code> {<!-- --></code><code> "isbn": "978-1-78216-226-1",</code><code> "starts": 5,</code><code> "comment": "A good resource for beginners who want to learn Vaadin"</code><code> },</code><code> {<!-- --></code><code> "isbn": "978-1-78328-884-7",</code><code> "starts": 4,</code><code> "comment": "Explains Vaadin in the context of other Java technologies"</code><code> },</code><code> {<!-- --></code><code> "isbn" : "978-1-4842-7178-0",</code><code> "starts": 5,</code><code> "comment": "The best resource to learn web development with Java and Vaadin" </code><code> }</code><code>])

Check if the rating has been saved in the database:

db.ratings.find()

Picture

Query MariaDB database using Mongo Shell

Using JSON functions in MariaDB

At this point, we have a single database that looks from the outside like a NoSQL (MongoDB) database and a relational (MariaDB) database. We are able to connect to the same database and read and write data from MongoDB client and SQL client. All data is stored in MariaDB, so we can use SQL to connect data from a MongoDB client or application with data from a MariaDB client or application. Let’s explore how MaxScale uses MariaDB to store MongoDB data (collections and documents).

Use a SQL client like mariadb-shell to connect to the database and display the table in demo mode:

show tables in demo;

You should see both the books and ratings tables listed. ratings are created as MongoDB collections. MaxScale interprets the commands sent from the MongoDB client and creates a table to store the data. Let’s look at the structure of this table:

describe demo.ratings;

Picture

A NoSQL collection is stored as a MariaDB relational table

The ratings table contains two columns:

  • id: object ID.

  • doc: Document in JSON format.

If we look at the contents of the table, we see that all data about ratings is stored in the doc column in JSON format:

SELECT doc FROM demo.ratings \G

Picture

NoSQL documents stored in MariaDB database

Let’s get back to our original goal – displaying book titles and their rating information. This is not the case below, but let’s assume for the moment that the ratings table is a regular table, with stars and comments columns. If so, it’s easy to join this table with the books table and our job is done: ?

/* this doesn't work */</code><code>SELECT b.title, r.stars, r.comment</code><code>FROM ratings r</code><code>JOIN books b USING(isbn)

Back to reality. We need to convert the doc column of the actual ratings table into a relational expression of the new table in the query. like this:?

/* this still doesn't work */</code><code>SELECT b.title, r.stars, r.comment</code><code>FROM ratings rt</code><code>JOIN ...something to convert rt.doc to a table... AS r</code><code>JOIN books b USING(isbn)

That thing is the JSON_TABLE function. MariaDB includes a comprehensive set of JSON functions for manipulating JSON strings. We will use the JSON_TABLE function to convert the doc column into a relational form that we can use to perform SQL joins. The general syntax of the JSON_TABLE function is as follows: ?

JSON_TABLE(json_document, context_path COLUMNS (</code><code> column_definition_1,</code><code> column_definition_2,</code><code> ...</code><code> )</code> <code>) [AS] the_new_relational_table

in:

  • json_document: String or expression to use to return a JSON document.

  • context_path: JSON Path expression that defines the node used as the row source.

Column definitions (column_definition_1, column_definition_2, etc.) have the following syntax:

new_column_name sql_type PATH path_in_the_json_doc [on_empty] [on_error]

Combining this knowledge, our SQL query would look like this: ?

SELECT b.title, r.stars, r.comment</code><code>FROM ratings rt</code><code>JOIN JSON_TABLE(rt.doc, '$' COLUMNS(</code><code> isbn VARCHAR(20) PATH '$.isbn',</code><code> stars INT PATH '$.starts',</code><code> comment TEXT PATH '$.comment'</code><code> )</code><code>) AS r</code><code>JOIN books b USING(isbn);

Picture

Join NoSQL and SQL data in one SQL query

We could have used the ISBN value as the MongoDB ObjectID and thus the id column in the ratings table, but I’ll leave that to you as an exercise (tip: use _id instead of isbn when inserting data using a MongoDB client or application).

A word about scalability

There is a misconception that relational databases do not scale horizontally (adding more nodes), while NoSQL databases do. However, relational databases do not sacrifice ACID properties when scaling. MariaDB has multiple storage engines suitable for different workloads. For example, you can use Spider to implement data sharding to scale a MariaDB database. You can also use various storage engines to handle different workloads on a per-table basis. Cross-engine joins are possible within a single SQL query.

Picture

Combine multiple storage engines in one logical MariaDB database

Another, more modern option is distributed SQL using MariaDB Xpand. A distributed SQL database is presented to the application as a single logical relational database through transparent sharding. It uses a shared-nothing architecture that can scale read and write operations.

Picture

Distributed SQL database deployment

Conclusion

Our work is done here! Now your system can have an ACID-compliant, extensible 360-degree view of data created independently from SQL or NoSQL applications. Reduces the need to migrate from NoSQL to SQL or make SQL applications polyglot for database users.

Author: Alejandro Duarte

For more technical information, please follow the official account “Cloud Native Database”

[squids.cn] You can currently experience the lowest price RDS on the entire network, free migration tool DBMotion, SQL development tools, etc.

The knowledge points of the article match the official knowledge files, and you can further learn relevant knowledge. MySQL entry skill treeSQL advanced skillsCTE and recursive query 68104 people are learning the system