Terms such as big data, machine learning, and artificial intelligence appear ever more often in the news. They all belong to a phenomenon that has grown enormously over the last decade as a result of the digitalisation of private, public, and business life, changing society and the way it works.

Every time we make a financial transaction, post something on social media, search for something in a search engine, or record our exercise on a wearable device, we leave a digital footprint that is stored somewhere as data. Analysing this data can reveal our habits, our circumstances, our shopping plans… and much more.

If we add to this the ever-increasing prevalence of smartphones, wearables, home automation devices, and IoT systems, we have a flood of data that is nigh on impossible to process using conventional means. Some companies have been storing their data for decades in the hope that one day it would become technologically possible to make use of it.

At first this data was processed on large, expensive, purpose-built computers, which were soon rendered obsolete by the growing volume of data and the speed at which it was created. Naturally, such machines were within reach only of large companies with significant resources. Distributed computing changed all that.

Groups of connected conventional computers working in unison could offer greater capacity than hugely expensive supercomputers, democratising the field of data processing that was emerging around the enormous amounts of data generated by social media and smartphones. Since then, technologies for extracting, storing, and analysing data have multiplied and improved continuously.
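The core idea behind this kind of distributed processing can be illustrated with a map-and-reduce word count: the work is split into chunks, each worker counts its own chunk, and the partial results are merged. The sketch below is a minimal, single-machine analogue using Python's standard library; frameworks such as Hadoop or Spark spread the same pattern across many machines, and the input lines and chunk count here are illustrative assumptions, not part of any particular system.

```python
from multiprocessing import Pool
from collections import Counter

def count_words(chunk):
    """Map step: count the words in one chunk of lines."""
    counts = Counter()
    for line in chunk:
        counts.update(line.lower().split())
    return counts

if __name__ == "__main__":
    # Illustrative input; in a real cluster each worker would read its own split.
    lines = ["big data everywhere", "data drives decisions", "big decisions"]
    chunks = [lines[i::4] for i in range(4)]  # split the work four ways

    with Pool(processes=4) as pool:
        partial_counts = pool.map(count_words, chunks)  # map step, in parallel

    total = sum(partial_counts, Counter())  # reduce step: merge partial results
    print(total.most_common(3))
```

The pattern scales because the map step needs no coordination between workers; only the final merge brings the results together, which is exactly what made clusters of cheap machines competitive with supercomputers.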

Nowadays, the extraction, transformation, and loading (ETL) of data gives governments and companies the chance to turn raw data into information and, from there, into knowledge. This knowledge is applied in every area: economics, medicine, social issues, and even the environment. It allows us to make better decisions, make processes more efficient, and predict the behaviour of all kinds of systems with greater accuracy: stock market movements, health problems, consumer trends, energy consumption and generation, and even bird migration.
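As a concrete, hedged illustration of such an ETL step, the sketch below extracts records from a CSV file, transforms them by dropping malformed rows and normalising types, and loads the result into a local SQLite table. The file name, column names, and cleaning rule are assumptions made for the example, not part of any particular pipeline.

```python
import csv
import sqlite3

# Hypothetical source file, assumed to have columns: date, amount, category.
SOURCE = "transactions.csv"

def extract(path):
    """Extract: read raw rows from the source file."""
    with open(path, newline="") as f:
        yield from csv.DictReader(f)

def transform(rows):
    """Transform: skip malformed rows and normalise types."""
    for row in rows:
        try:
            yield (row["date"], float(row["amount"]),
                   row["category"].strip().lower())
        except (KeyError, ValueError):
            continue  # drop rows that cannot be parsed

def load(records, db_path="warehouse.db"):
    """Load: write the cleaned records into a SQLite table."""
    con = sqlite3.connect(db_path)
    con.execute("CREATE TABLE IF NOT EXISTS transactions"
                " (date TEXT, amount REAL, category TEXT)")
    con.executemany("INSERT INTO transactions VALUES (?, ?, ?)", records)
    con.commit()
    con.close()

if __name__ == "__main__":
    load(transform(extract(SOURCE)))
```

Because each stage is a generator feeding the next, rows stream through the pipeline one at a time, which is what lets real ETL systems handle volumes of data far larger than memory.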

Yet beyond the benefits that data science brings, it raises challenges that go past the technological details into the field of ethics. Is it ethical to use an algorithm whose biases perpetuate inequalities? At what point does the use of my data violate my privacy? Are my rights being threatened in some cases? The Cambridge Analytica scandal brought to light the dangers of misusing data analysis. For those who want to delve further, these and other issues are covered in “Weapons of Math Destruction” by Cathy O’Neil, and in documentaries such as Citizenfour and The Great Hack, which cover the Snowden case and Cambridge Analytica respectively.

If, as they say, “data is the new oil”, the real challenge for European authorities is to create a framework that fosters growth, investment, and research in artificial intelligence whilst protecting the rights and freedoms of citizens. The race to lead the development of artificial intelligence is already underway, and the opportunities lost through the lack of a European big data and artificial intelligence strategy could set us back severely, with serious consequences for our economy.