
A NoSQL database is a different concept from a Relational Database Management System (RDBMS). NoSQL databases are built to handle very high volumes of read and write requests (> 1 million) that a regular relational database can’t handle. They can be used for caching with fast response times, like Redis and Couchbase, or to store huge amounts of data on disk, like Cassandra, MongoDB, and Elasticsearch. The term NoSQL is used because the interface to these databases is not SQL: it can be a REST API, GET and SET requests, or a syntax dedicated to the technology. Some of them do offer a basic SQL-like syntax, as Cassandra and Elasticsearch do.
A NoSQL database can be structured, like Cassandra, or, as in most use cases, an unstructured document database. In a structured database, the schema must be defined in advance. In an unstructured/document database (all of the above except Cassandra), the schema is dynamic: field types are detected automatically by the database engine.
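As a rough illustration of dynamic field typing, the sketch below guesses a field-to-type mapping from a single JSON document, roughly the way a schemaless engine might on first insert. The function name `infer_mapping` and the type labels are hypothetical and are not taken from any particular engine's API.

```python
import json

# Hypothetical type labels; real engines use their own type names.
TYPE_LABELS = {bool: "boolean", int: "long", float: "double", str: "text"}

def infer_mapping(document: dict) -> dict:
    """Guess a field -> type mapping from one JSON-like document."""
    mapping = {}
    for field, value in document.items():
        if isinstance(value, dict):
            mapping[field] = infer_mapping(value)  # nested object
        elif isinstance(value, list):
            mapping[field] = "array"
        elif value is None:
            mapping[field] = "unknown"  # no type can be inferred from null
        else:
            mapping[field] = TYPE_LABELS.get(type(value), "unknown")
    return mapping

doc = json.loads(
    '{"name": "Ada", "age": 36, "active": true, "address": {"city": "London"}}'
)
mapping = infer_mapping(doc)
print(mapping)
```

In a structured database the equivalent mapping would be written by hand before the first insert.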
Unstructured NoSQL databases store documents/objects in key-value format. The key (primary key) is used to retrieve the document, and the value is the document itself, which can be text or JSON containing all the object’s fields. The document is flexible: it can gain new fields on the fly or contain only the fields required for each object. Some NoSQL database engines, such as Elasticsearch (which by default indexes all fields) and MongoDB, allow creating secondary indexes on fields of the document itself.
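The key-value model with a secondary index can be sketched in a few lines of Python. This is an in-memory toy under stated assumptions, not how any real engine is implemented: the class `DocumentStore` and its methods `set`, `get`, and `find` are hypothetical names, and indexed field values are assumed to be hashable.

```python
from collections import defaultdict

class DocumentStore:
    """Toy key -> document store with optional secondary indexes."""

    def __init__(self, indexed_fields=()):
        self.docs = {}  # primary key -> document
        self.indexed_fields = set(indexed_fields)
        # field -> value -> set of primary keys holding that value
        self.indexes = {f: defaultdict(set) for f in self.indexed_fields}

    def set(self, key, document):
        old = self.docs.get(key)
        if old is not None:
            self._unindex(key, old)  # drop stale index entries on overwrite
        self.docs[key] = document
        for field in self.indexed_fields:
            if field in document:
                self.indexes[field][document[field]].add(key)

    def _unindex(self, key, document):
        for field in self.indexed_fields:
            if field in document:
                self.indexes[field][document[field]].discard(key)

    def get(self, key):
        """Primary-key lookup."""
        return self.docs.get(key)

    def find(self, field, value):
        """Secondary-index lookup; the field must have been indexed."""
        keys = self.indexes[field].get(value, set())
        return [self.docs[k] for k in sorted(keys)]

store = DocumentStore(indexed_fields=["city"])
store.set("user:1", {"name": "Ada", "city": "London"})
# Documents are flexible: this one carries an extra field.
store.set("user:2", {"name": "Linus", "city": "Helsinki", "os": "Linux"})
store.set("user:3", {"name": "Alan", "city": "London"})

print(store.get("user:2"))
print(store.find("city", "London"))
```

Without the index on `city`, finding all users in London would mean scanning every document, which is why engines that support secondary indexes are useful for queries that do not go through the primary key.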
The advantage of this architecture is that it can run on a cluster where data is distributed and replicated between nodes. If one node fails, the database can continue to operate without downtime, and it can scale out easily when more resources (such as disk, CPU, and RAM) are needed.
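A minimal sketch of replication and failover, assuming a fixed list of nodes and a simple hash-modulo placement, looks like this. The `Cluster` class, its methods, and the node names are hypothetical, and real databases use more elaborate placement and repair mechanisms.

```python
import hashlib

class Cluster:
    """Toy cluster: each key is written to `replication_factor` nodes."""

    def __init__(self, node_names, replication_factor=2):
        self.nodes = {name: {} for name in node_names}  # node -> local data
        self.order = sorted(node_names)
        self.alive = set(node_names)
        self.rf = replication_factor

    def _replicas(self, key):
        # Hash the key to pick a starting node, then take the next rf nodes.
        digest = hashlib.md5(key.encode()).hexdigest()
        start = int(digest, 16) % len(self.order)
        return [self.order[(start + i) % len(self.order)] for i in range(self.rf)]

    def put(self, key, value):
        for node in self._replicas(key):
            if node in self.alive:
                self.nodes[node][key] = value

    def get(self, key):
        # Try each replica in turn and skip nodes that are down.
        for node in self._replicas(key):
            if node in self.alive and key in self.nodes[node]:
                return self.nodes[node][key]
        return None

    def fail(self, node):
        self.alive.discard(node)

cluster = Cluster(["node-a", "node-b", "node-c"], replication_factor=2)
cluster.put("user:1", {"name": "Ada"})

# Take down the first replica of the key; the second copy still answers.
first_replica = cluster._replicas("user:1")[0]
cluster.fail(first_replica)
print(cluster.get("user:1"))
```

The hash-modulo placement keeps the sketch short, but it would move most keys whenever a node is added, so it should not be read as a scale-out strategy.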
Most NoSQL databases have no tables. Instead, they have collections, indexes, or buckets.
In most cases there is no option to join NoSQL data “tables”. Either a dedicated “table” is created for the query, or the “table” contains the complete object records, including related items.
This duplication can lead to data consistency issues. You need a good data architect to design the data model and size the cluster so that the NoSQL database delivers the best performance and meets the business requirements.
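To make the trade-off concrete, here is a small sketch that embeds related items in each document instead of joining at read time, and then shows how the embedded copies can go stale. The data, the field names, and the helper `build_order_documents` are all made up for illustration.

```python
import copy

# Two hypothetical source "tables".
customers = {"c1": {"name": "Ada", "city": "London"}}
orders = {
    "o1": {"customer_id": "c1", "total": 30},
    "o2": {"customer_id": "c1", "total": 12},
}

def build_order_documents(customers, orders):
    """Embed a copy of the customer into each order document."""
    documents = {}
    for order_id, order in orders.items():
        doc = dict(order)
        doc["customer"] = copy.deepcopy(customers[order["customer_id"]])
        documents[order_id] = doc
    return documents

order_docs = build_order_documents(customers, orders)

# Read path: one key lookup returns the order with its customer, no join.
print(order_docs["o1"]["customer"]["city"])

# Consistency risk: changing the source does not touch the embedded copies.
customers["c1"]["city"] = "Paris"
stale = [
    order_id
    for order_id, doc in order_docs.items()
    if doc["customer"]["city"] != customers[doc["customer_id"]]["city"]
]
print(stale)
```

Every document listed in `stale` has to be rewritten by the application, which is the kind of cost the data model design needs to account for.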
Some of the technologies we specialize in are listed below:
Elasticsearch is a free, open-source search engine based on Apache Lucene with many extras such as aggregations, analytics, ETL with Logstash, and dashboards with Kibana. These make it one of the most widely used search engines today, with enough capabilities to serve as a centralized database for many products and companies.
Read more about Elasticsearch
Apache Cassandra is a free, open-source, distributed wide-column store initially developed by Facebook.
Cassandra is masterless: there is no need to define a master node, and every node reads and writes its own local shard of the data, which provides high availability and speed.
Cassandra belongs to the NoSQL family but has the CQL (SQL-like) interface, which lets you create tables, primary keys, and clustering columns, and write “SQL” to insert and select data.
Read more about Cassandra