Big Data and Column Oriented Databases

Tuesday, November 25, 2014


 Computer Science
Efficient storage and retrieval of information is vital when working with big data. Database systems that use columnar structures offer tangible benefits under certain conditions.




In the era of big data, efficient storage and retrieval matters because of the massive amount of information that has to be processed. Databases are key to this, and the very structure of a database can have a dramatic impact on performance.

A column-oriented DBMS is a database management system (DBMS) that stores its content by column rather than by row. This translates into substantial advantages for data warehouses and library catalogs where aggregates are computed over large numbers of similar data items.

Rows versus columns may seem like a trivial distinction, but it is the most important underlying characteristic of columnar databases.

The main difference between a columnar database and the more traditional row-based structure lies in how records are written to storage. A row store writes all of the fields of one record successively; a column store instead keeps the values of each column together. This eliminates redundant metadata, which reduces the data management requirements of the system. It also means the database can be navigated and searched more rapidly when a query touches only a few columns.
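The layout difference described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample records and the `to_columns` helper are hypothetical, and plain in-memory lists stand in for on-disk storage.

```python
# Hypothetical sample data; the field names are illustrative only.
rows = [
    {"id": 1, "region": "east", "sales": 100},
    {"id": 2, "region": "west", "sales": 250},
    {"id": 3, "region": "east", "sales": 175},
]


def to_columns(records):
    """Pivot a list of row dicts into a dict of column lists."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)

# Row layout: the aggregate visits every record and picks out one field.
total_from_rows = sum(record["sales"] for record in rows)

# Column layout: the aggregate reads one list and never touches the others.
total_from_columns = sum(columns["sales"])

print(columns["sales"])
print(total_from_rows, total_from_columns)
```

Both totals are the same; what changes is how much unrelated data the aggregate has to step over to reach the values it needs.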

These features make columnar databases well suited to high-volume, incremental data gathering and processing, to real-time information exchange such as messaging, and to managing frequently changing content. These workloads also correspond to the three V's of big data: volume, velocity and variety.

Shutterstock has deployed a columnar database as the foundation for monitoring its platform, using it for immediate anomaly detection and real-time analysis of more than 20,000 data points per second.

Compared with relational row-based databases, columnar database systems offer better analytic performance when queries are not run simultaneously. The approach also allows faster joins and aggregation as data streams in incrementally.

The columnar approach is also highly suitable for compression, because the values in a single column share a type and are often similar, and it can eliminate the need for multiple indexes and views. With tools like this, the process of turning big data into information, and information into knowledge, comes another step closer.
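As a rough illustration of why a single column can compress well, here is a minimal sketch of run-length encoding applied to one column. Run-length encoding is chosen here only as a simple example; the article does not name a specific compression scheme, and the `region_column` data and both helper functions are hypothetical.

```python
from itertools import groupby


def rle_encode(values):
    """Collapse runs of equal adjacent values into (value, run_length) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


def rle_decode(pairs):
    """Expand (value, run_length) pairs back into the original list."""
    decoded = []
    for value, count in pairs:
        decoded.extend([value] * count)
    return decoded


# Hypothetical column: adjacent values repeat, as they often do in sorted data.
region_column = ["east", "east", "east", "west", "west", "east"]

encoded = rle_encode(region_column)
print(encoded)

# The encoding is lossless: decoding restores the original column.
assert rle_decode(encoded) == region_column
```

In a row layout the same values would be interleaved with the other fields of each record, so runs like these would not sit next to each other.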


By 33rd Square
