Big Data Management exists because all this data must be managed, stored and maintained for as long as it is needed, precisely so that queries, which are increasingly frequent and increasingly tied to artificial intelligence algorithms, can be answered.
If we reread a 2012 post from COMPUTERWORLD.COM, it already contained a prediction for 2020: each of us, whether man, woman or child, would have generated 5,200 GB of data, considering that most of it does not pass through our hands but is generated by systems talking to each other.
The numbers above are truly frightening, yet most of this data, even once generated, goes unused and sits idle even though it could be useful.
The amount of data generated daily is so high that the resources needed to manage all this information must be beyond imagination...
All these statistics, collected by the site INTERNETLIVESTATS.COM (if you visit it you will see they have already changed considerably), give an idea of how much data can be produced and of the scale of resources that must be deployed for data analysis alone.
Now let us give you a "sample" of these statistics at two dates, precisely to show how dramatically the Big Data related to user actions (in this case Big Data Analytics) grows: the first snapshot is from March 31, 2020 and the second from the day we are writing, October 5, 2020 (and this directly affects Big Data Management).
TOTAL INTERNET USERS: 4,517,295,074 internet users in the world as of March 31, 2020, out of a total population of 7,530,000,000 people
1,760,112,297 websites on the network
234,691,422,222 emails sent today alone
6,367,266,286 searches carried out on the Google search engine
6,089,703 posts written today alone
6,495,686,845 videos viewed on YouTube today
694,298,082 tweets sent by users today alone
130,661,933 posts written on Tumblr
76,418,304 photos uploaded to Instagram today
2,462,581,363 Facebook users currently active
807,074,031 Google+ users
284,244,253 active users on Pinterest
358,637,469 Twitter users to date
353,156,804 chats made on Skype
136,016 sites hacked today
3,845,503 smartphones sold today
620,949 computers sold today
6,996,126,755 GB of internet traffic moved today
380,066 tablets sold today
3,724,132 MWh of electricity consumed by the internet today
3,039,645 tons of CO2 emissions generated by the internet today
As the years go by, the amount of data collected at every level keeps growing, in order to study behaviors, analyze sales, measure the effects of decisions, and so on.
Data, and therefore Big Data, are fundamental today for understanding phenomena; data is an essential resource for economic growth, competitiveness, innovation, job creation and the advancement of society in general.
Let's see how the EUROPEAN COMMISSION has structured this infographic, which shows how many areas can be touched by the processing of Big Data:
Decoding the human genome, completed in 2003, took 10 years; with today's computing capabilities it takes one week, and in the future it could take only a few hours.
All this is possible because we have highways (broadband) that allow the exchange of large amounts of data; thanks to cloud-based collection we can also access that data remotely and, given the high performance of today's computers, we can generate reports, statistical models and so on with great ease.
In every aspect of our lives, data and the knowledge derived from it can positively influence improvements that already enhance, and will continue to enhance, our everyday life in every respect.
When traveling, data can help manage smart traffic lights and traffic flows; in healthcare it can improve the understanding and diagnosis of health problems, raising our average life expectancy both in quality, with fewer diseases, and in actual length.
In agri-food and livestock supply chains it can help make the use of natural resources ever more efficient, while in industrial production it helps improve efficiency and productivity.
In our homes it can help manage smart home systems.
The whole world that revolves around the processing of large amounts of data and artificial intelligence systems has benefited in the last 20 years from huge investments made by the main market players.
As the years pass, the companies that deal with big data gain importance and are in turn acquired by larger companies, since among the elements the future will not be able to do without we find:
Storage and cloud computing in general
Big data processing and management
The communication infrastructures
Artificial intelligence systems
CrowdStrike and Elastic achieved large valuations at their IPOs ($7 billion and $5 billion respectively).
Other IPOs included PagerDuty ($1.8 billion), Anaplan ($1.8 billion) and Domo ($500 million).
There have also been major acquisitions, such as Qualtrics (acquired by SAP for $8 billion), Medidata (acquired post-IPO by Dassault for $5.8 billion), Hortonworks (merged with Cloudera, adding $5.2 billion in value), Imperva (acquired by Thoma Bravo for $2.1 billion), AppNexus (acquired by AT&T for $2 billion), Cylance (acquired by BlackBerry for $1.4 billion), Datorama (acquired by Salesforce for $800 million), Treasure Data (acquired by Arm for $600 million), Attunity (acquired post-IPO by Qlik for $560 million), Dynamic Yield (acquired by McDonald's for $300 million), and the list goes on.
Even at the startup level, investments are increasingly large as more and more companies enter the market... our Big Data Innovation Group is concrete proof of this.
On the site MATTTURK.COM, Matt Turck annually updates the explanatory chart of all the groupings of companies operating in this world, which we republish here: Big_Data_Landscape_Final
Several major players in this field give almost identical definitions of Big Data but characterize its aspects in different ways.
According to IBM, one of the main operators in the world of big data, these large data sources are characterized by 4 Vs:
Volume, i.e. the sheer quantity of data; Variety, i.e. the different types of data; Veracity, i.e. its reliability; and Velocity, i.e. the speed of its collection and processing.
The next table shows all the essential characteristics according to IBM:
We have seen so far how the interactions we carry out every day with Facebook, WhatsApp, email, web browsing, purchases, etc. involve the collection of data by a myriad of parties and, beyond the proper handling required by the regulations in force in the various countries for the processing of sensitive data (the GDPR on privacy), its storage by just as many sources.
Just think of Google's Universal Analytics or Facebook Insights, which alone represent impressive amounts of Big Data to analyze and correlate with each other.
Today's computing capabilities have made it possible to relate almost inexhaustible data sources, structuring them or otherwise relating them after rationalizing them. Data is generally grouped into three categories:
Structured: data organized in a fixed schema, for example in an RDBMS
Semi-structured: partially organized data without a fixed schema, such as XML, JSON, etc.
Unstructured: unorganized data with no known schema, such as audio files, video files, etc.
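To make the three categories concrete, here is a minimal Python sketch (the names and values are invented purely for illustration) that handles one example of each: a structured row in an in-memory SQLite table, a semi-structured JSON document, and an unstructured free-text string:

```python
import json
import sqlite3

# Structured: a fixed schema enforced by the RDBMS.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, country TEXT)")
db.execute("INSERT INTO users VALUES (1, 'Ada', 'IT')")
row = db.execute("SELECT name, country FROM users WHERE id = 1").fetchone()
print(row)  # ('Ada', 'IT')

# Semi-structured: self-describing keys, but no schema is enforced;
# fields may be missing or nested differently from record to record.
doc = json.loads('{"name": "Ada", "tags": ["big-data", "analytics"]}')
print(doc.get("country", "unknown"))  # the 'country' field is simply absent here

# Unstructured: no schema at all; any structure must be inferred,
# for example by tokenizing raw text.
text = "Big Data keeps growing every single day."
words = text.lower().split()
print(len(words))  # 7
```

The practical consequence is that each category needs different tooling: SQL engines for structured data, document stores and parsers for semi-structured data, and analysis or machine-learning pipelines for unstructured data.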
As is increasingly said... the future lies in data and, above all, in the correlations between data.
Increasing computing capabilities will make it possible to relate sets of data and identify patterns.
Increasing storage capacity will allow us to extend analyses over longer periods, making the models themselves more reliable.
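As a toy illustration of "relating sets of data", the sketch below computes the Pearson correlation between two invented daily series (say, site visits and sales; all figures are made up for the example) using only the Python standard library:

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Invented daily figures: visits to a site and sales on the same days.
visits = [120, 150, 170, 200, 260]
sales = [12, 14, 18, 21, 25]

r = pearson(visits, sales)
print(round(r, 3))  # close to 1: the two series are strongly correlated
```

Real Big Data pipelines do the same thing at vastly larger scale, across billions of records and many variables at once, which is precisely why growing computing and storage capacity matters.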