Big Data

Introduction to Big Data: Nowadays, large data sets are produced by social media, transport systems, black box recorders, stock exchanges and many other sources, and traditional tools are not efficient enough to handle data at this scale. Big data is characterised by high velocity, large volume and a wide variety of data. As web usage grows and technology advances, it has become possible to handle such large data sets, whether they are opinions about a particular product on Facebook or Twitter or performance data from aircraft. The data can take three forms: structured data, such as relational data; semi-structured data, such as XML; and unstructured data, such as Word documents, PDFs, plain text and media files. Big data is often described by four V's: volume, velocity, variety and value.

Tools like Hadoop and Spark can be used to manage big data. Hadoop handles large and complex data sets by storing them and running complex queries over them efficiently. It is a distributed technology that shares the work across many servers: a large job is split into small tasks, and these tasks run concurrently, so Hadoop works as a massively parallel processing system. The main programming model in Hadoop is MapReduce, which is implemented in Java. Hadoop also has some limitations: it does not offer real-time processing of data, its SQL support is limited, its execution can be inefficient for some workloads, and so on.
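To make the idea of splitting a large job into small concurrent tasks more concrete, here is a minimal sketch of the MapReduce pattern as a word count in plain Python. It is not the Hadoop API: the function names (map_task, shuffle, reduce_task, word_count) are made up for illustration, and a local thread pool stands in for the servers of a real cluster.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_task(chunk):
    """Map step: emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(mapped_chunks):
    """Shuffle step: group all emitted values by their key (the word)."""
    groups = defaultdict(list)
    for pairs in mapped_chunks:
        for word, count in pairs:
            groups[word].append(count)
    return groups


def reduce_task(word, counts):
    """Reduce step: combine all values for one key into a single result."""
    return word, sum(counts)


def word_count(chunks, workers=4):
    # Each chunk is an independent map task; the pool runs them concurrently.
    # Assumption: threads here only imitate the separate servers of a cluster.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = list(pool.map(map_task, chunks))
    grouped = shuffle(mapped)
    return dict(reduce_task(word, counts) for word, counts in grouped.items())


if __name__ == "__main__":
    chunks = [
        "big data needs big tools",
        "hadoop splits big jobs",
        "spark and hadoop handle data",
    ]
    print(word_count(chunks))
```

In a real Hadoop job the chunks would be blocks of a file stored across the cluster, and the map and reduce tasks would run on different machines, but the three stages (map, shuffle, reduce) follow the same outline.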
