Big Data, what it is and what it's for | SEIDOR

November 17, 2022

Big Data is a combination of structured, semi-structured and unstructured data collected by organisations, from which information can be extracted.

Although there is no specific threshold above which a data set counts as Big Data, most of these collections hold terabytes, petabytes or even exabytes of data created and collected over time.

These huge volumes of data can come from several sources (both internal and external) and be used for different projects, from machine learning systems to predictive modelling and other advanced analysis applications.

The V's of Big Data

To pin down what Big Data entails, it is common to refer to several V's that characterise these systems:

  • Volume of data
  • Variety of the types of data stored
  • Velocity with which this data is generated
  • Veracity of the data collected and used
  • Validity when using the data
  • Value this data has and provides
  • Variability in terms of its composition, frequency and availability
  • Volatility, since these systems are neither eternal nor immutable
  • Viability of the data
  • Visualisation of the data

What Big Data is for

All this information must be processed and analysed in order to extract value from it. How it is used depends heavily on the type of organisation that wants to leverage it, from creating customised marketing campaigns to cancer research. Ultimately, Big Data lets us meet our stated objectives faster, more efficiently and more effectively.

The success or failure of all these operations will also depend heavily on the quality of the data available, as well as on how it is filtered and processed, the questions that are asked of the systems responsible for processing all the information, and the analysis capacity that is available.

As noted above, a good Big Data system draws its information from multiple sources. Internally, much of it comes from transaction processing, customer databases, documents, emails, website click logs, mobile apps and social media. It also includes machine-generated data, such as network and server log files and sensor data from manufacturing machines, industrial equipment and Internet of Things devices.

In addition to the data from internal systems, Big Data environments often incorporate external data on consumers, financial markets, weather and traffic conditions, geographical information, scientific research, and more. Images, videos and audio files are also forms of Big Data.

Quickly analyse your business data

In any case, Big Data helps companies to quickly analyse every detail of their business so they can identify areas to improve and strengths to build on, in order to reduce expenses, increase income and maximise profits.

Big Data analytics is one of the most complicated and important parts of this field. Its purpose is to examine vast troves of data to uncover information that is not immediately visible, such as hidden patterns, correlations, market trends and customer or system preferences, which can lead to better, data-driven decisions.

In order to perform this analysis, data professionals (such as data analysts and scientists) collect, process, clean and analyse the information and correlate it with other data sets using specific applications.
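
As a rough illustration of the clean, combine and correlate steps described above, here is a minimal Python sketch. Everything in it is an assumption made for the example: the data sets (daily store visits and outside temperature), the field names and the helper functions `clean`, `join` and `pearson` are hypothetical, and a real Big Data workload would run on a distributed platform rather than in-memory lists.

```python
from math import sqrt

# Hypothetical internal data: daily store visits, as raw strings from a log export.
visits = [
    {"day": "2022-11-01", "visits": "120"},
    {"day": "2022-11-02", "visits": "135"},
    {"day": "2022-11-03", "visits": ""},  # missing value, dropped during cleaning
    {"day": "2022-11-04", "visits": "150"},
    {"day": "2022-11-05", "visits": "165"},
]

# Hypothetical external data: average temperature (degrees Celsius) per day.
weather = {
    "2022-11-01": 10.0,
    "2022-11-02": 12.0,
    "2022-11-03": 11.0,
    "2022-11-04": 14.0,
    "2022-11-05": 15.0,
}


def clean(records):
    """Drop rows with a missing count and convert the rest to integers."""
    return {r["day"]: int(r["visits"]) for r in records if r["visits"].strip()}


def join(left, right):
    """Keep only the days present in both data sets, as (left, right) pairs."""
    return [(left[day], right[day]) for day in sorted(left) if day in right]


def pearson(pairs):
    """Pearson correlation coefficient of a list of (x, y) pairs."""
    xs, ys = zip(*pairs)
    n = len(pairs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


pairs = join(clean(visits), weather)
r = pearson(pairs)
print(f"{len(pairs)} matched days, correlation = {r:.3f}")
```

A coefficient close to 1 or -1 only signals that the two series move together in this sample; it does not by itself show that one causes the other.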

In many cases, they can develop predictive models to automate these tasks and make the company even more efficient so it can make the most of all this information. This will sometimes require using technologies such as Machine Learning, Deep Learning, Artificial Intelligence, Business Intelligence and even visualisation applications.
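
To make the idea of a predictive model concrete, the sketch below fits a straight line to a handful of points using ordinary least squares and uses it to estimate an unseen value. It is only an illustration under stated assumptions: the figures (monthly advertising spend and sales) are invented, the names `fit_line` and `predict` are hypothetical, and production models are usually far richer than a single-variable line.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented example data: advertising spend and sales, both in thousands of euros.
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [12.1, 13.9, 16.2, 17.8, 20.0]

slope, intercept = fit_line(ad_spend, sales)


def predict(x):
    """Estimate sales for a given spend using the fitted line."""
    return slope * x + intercept


print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
print(f"estimated sales for a spend of 6.0: {predict(6.0):.2f}")
```

Once fitted, the same `predict` function can be applied automatically to new inputs, which is the sense in which such models automate part of the analysis.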

Finally, because Big Data operations usually need a large amount of computing capacity, most applications are hosted in the cloud, which makes the systems easier to scale.
