
Data mining, AI and Media: A brave new world?

As the real world continues to take surreal twists and turns, our imaginations have been captured by something seemingly not-of-this-world: artificial intelligence (AI).

The mainstream press has become preoccupied with the questions it raises. Should we tax AI? Can we hold AIs to ethical standards? Is AI racist/sexist/leading us into a Kafkaesque future? What does AI mean for the future of work and social equality? And, inevitably, will AI take our jobs and turn on us?

What’s stopping us? 

In the media, entertainment and sport sectors, meaningful emerging uses of AI rarely exist independently of another major digital trend: data mining. Here we mean the process of extracting, compiling and analysing huge warehouses of data. With vast amounts of high-quality data to feed on and learn from, AI’s ability to structure, analyse and extract value from this data becomes crucial. AI’s potential to help businesses exploit this data like never before also becomes evident. Effortless talent identification across sport, broadcasting and music; AI-created content; social media image recognition for brands – the benefits could be otherworldly.

But what obstacles are standing between us and this potential? And even if we accept some degree of change, will this really eclipse the pace and depth of recent decades? We’ve watched the media, entertainment and sport industries be subverted and weather crises, so can they re-imagine themselves as digital industries in touch with the future?

And where in all this are the legislators, regulators and arbitrators of what’s right and reasonable? As we’ve already seen, there are few easy policy choices to be made between the various competing priorities and often unequal pressures in this new world.

Canary in the data mine: embryonic issues

Synthesising two cutting-edge technologies can bring a host of foundational challenges.

Data Mining: Dirty words

Businesses often shy away from the term ‘data mining’. It has become stained by the implication that miners might be engaging in unauthorised web scraping, which might infringe others’ intellectual property and proprietary rights in the data. This negative perception, unfair as it may be, does speak to a fundamental question: where do you get the massive volumes of high quality data that you need for AI to really reach its potential?

Important and complex legal implications flow from how data miners choose to answer this question, especially in the virtual world of the internet where physical jurisdictional boundaries just don’t exist. The sometimes overwhelming complexity – if not impossibility – of doing the right thing can easily lead to the view that there’s no alternative to innovating first and asking for forgiveness later. For embryonic businesses, how they address this question can prove decisive in whether they succeed or fail.

This issue of data ownership and copyright infringement has been in regulators’ crosshairs, with the recently approved text of the Copyright Directive containing a copyright exemption not only for data mining carried out for academic research, but also for wider commercial purposes. This broader exemption is subject to an important caveat: rightsholders can opt out of the commercial exemption if they reserve their rights. Rightsholders might simply try to prevent data miners from using their data for commercial purposes by opting out in their terms and conditions. But there will undoubtedly be fierce battles ahead over what is, and isn’t, effective for data owners to avoid this potentially important limitation to their rights, providing a chink of light for those advocating liberal use of data in the digital economy.

Massive is the new big

Once gathered, how is this data to be structured? An unorganised swamp of data is useless until ordered in a way that allows insights to be rapidly computed. And once you have a fully structured data warehouse, how will you analyse your data to harvest the most significant insights?

This is where AI comes in for many. But to what extent does AI need its hand held to ensure great-quality output? The sheer scale of data required to drive meaningful insight has become so overwhelming that an AI solution is the only realistic answer for most; as a Gartner analyst wrote in 2017 – “Big data became the new normal, and now it is just data.” This shift has made AI and machine learning a necessity if we are to unlock the insights within the data lakes that we’re creating.

What could we achieve?

Technologies using AI and advanced data mining techniques will soon become technically and financially accessible to most businesses. The question is whether the media, entertainment and sports sector will find captivating use cases for them.

We believe so. Early pioneers have demonstrated how they can be used to both streamline business processes and drive real innovation in content creation and distribution – and we are only glimpsing their full potential.

Early stage data mining and AI: knowing you better than you know yourself

Although at an early stage, data mining and AI are already changing the media, entertainment and sports industries in ways both obvious and subtle. Admittedly from a low base, recommendations across film/TV/music and games are improving. Existing technology can compute the complex matrix of your watching history, firmly understanding the mood, tone and genres of that content, and then use that data to suggest the irresistible sequel to your entire watching history. Recommendation engines per se are nothing new, but we are moving past engines which are simple algorithms to AI, all with the help of data sets that grow every minute of every day.

At a recent RTS London event, AI in Broadcasting, BBC Four demonstrated its evolving AI tool, which it has let loose to trawl through the BBC’s galactic video archives and attribute metadata to each scene, accurately understanding not only what is happening in each scene (e.g., “train travelling over bridge”) but also the so-called “energy” of each scene. In October 2018, this AI put together an entire hour of content by fitting footage together into (what it regarded as) a logical montage that stretched the boundaries of TV production. We might be some way from humans being cut out of the TV production process, but the implication of BBC Four’s experiment is that AI could soon play an essential part. Meanwhile, the production of formulaic film trailers – usually done by humans over weeks – was accomplished by AI in 24 hours back in 2016.

We can soon expect AI to be composing inoffensive corporate music to a brief, and filling businesses’ websites with sharp, self-descriptive copy.  AIs might well be some of the composers, copywriters and production assistants of the future.

Personal assistants like no other

Personal assistants might prove another vital component in AI and data mining. As your new best tech pal soaks up all your listening, watching, shopping and searching history, in some cases having the benefit of listening in to your private conversations, the data mine produced could surpass anything that media companies can currently tell you about yourself; Spotify might know that you’re a modular jazz aficionado, but they (probably) don’t yet know about the 80s hip hop you were chewing your partner’s ear off about on Friday night. Will this hand media dominance to the big tech companies currently at the forefront of the personal assistant market?

Not all emerging use cases are immediately visible to consumers, though. One of the most compelling uses we’ve seen is Instrumental, an AI tool which trawls and analyses streaming data and uses what it learns to predict the future success of artists. Clients include Sony and Warner Music, who use the insights to scout and sign up new talent. Surely it won’t be long before similar AI tools analyse athletes’ performance from video footage and performance data, attributing objective stats, and reproducing the work of hundreds of scouts?

Democracy at risk, technology to the rescue

Other developments are already having significant impacts on global democracy. Fake news and disinformation in our media are putting our democracy at risk, according to the Digital, Culture, Media and Sport Committee’s (DCMS) report of February 2019. Recently, OpenAI’s model proved so effective at creating believable fake news that the team resolved to keep their research private, warning that this technology could become mainstream within years.

But technology could also become the solution to its own problem. AIs trained on deliberate misinformation are currently no better than random in identifying disinformation, but should these tools advance as expected they will soon be deployed by social media companies to help identify and remove fake content, plugging a potentially fatal flaw in the vast information flows that now feed and influence voters, vital institutions and the media industry whose job it is to hold those in authority to account.

Towards unimaginable change?

The idea of massive data sets being effectively analysed by media, entertainment and sports businesses with the requisite ambition and investment promises to change not just the way we distribute and monetise content, but the content itself.  Media has always been a data driven business; with data mining and AI combining, the landscape is changing fast.

What is clear with each innovation is that both a mountain of data and highly developed AI are required if they are to reach a level of sophistication high enough to move the needle. An AI is worthless if it doesn’t have a trove of good quality data to act on. A data empire is worthless without the machine learning and analytical genius to tie seemingly disparate threads together and notice patterns that would escape the human eye.

Will AI and massive data drive change on a scale comparable to the first digital revolution of the past two decades? Probably. And will we slip painlessly into this new era, or will the battle for territory in the new world of data result in years of skirmishes ahead? Who knows?

Our experience is of a thriving industry in which rights holders and tech enthusiasts will continue to grapple with its nascent challenges, where legality, practicality and innovation collide. Those who survive this cocktail will probably change the media, entertainment and sport industries in ways which we weren’t even capable of imagining just a few years ago.

If you’d like to know more:

Please sign up for our Tessellate series here.