Maria Savona, Professor of the Economics of Innovation at the University of Sussex, discusses how people's data could remain a market commodity while ceasing to be the property of digital platforms.
The growth of interest in personal data has been unprecedented. Issues of privacy violation, abuse of power, the manipulation of electoral behaviour unveiled by the Cambridge Analytica scandal, and a sense of imminent threat to our democracies are at the forefront of policy debates. Yet, these concerns seem to overlook the concentration of equity value (which stems from data value; I use the two interchangeably here) that underpins the current structure of big tech business models. Whilst these quasi-monopolies own the digital infrastructure, they do not own the personal data that provide the raw material for data analytics.
The European Commission has been at the forefront of global action to promote convergence in the governance of data (privacy), including, but not limited to, the General Data Protection Regulation (GDPR) (European Commission 2016), enforced in May 2018. Attempts to introduce similar regulations are emerging around the world, including the California Consumer Privacy Act, which came into effect on 1 January 2020. Notwithstanding greater awareness among citizens of how their data are used, many companies regard complying with GDPR as, at best, a costly nuisance.
Data have been seen as ‘innovation investment’ since the beginning of the 1990s. The first edition of the Oslo Manual, the OECD’s international guidelines for collecting and using data on innovation in firms, dates back to 1992 and included the collection of databases on employee best practices as innovation investments. Data are also measured as an ‘intangible asset’ (Corrado et al. 2009 was one of the pioneering studies). What has changed over the last decade? The scale of data generation today is such that its management and control might have already gone well beyond the capacity of the very tech giants we are all feeding. Concerns around data governance and data privacy might be too little and too late.
In this column, I argue that economists have failed twice: first, to predict the massive concentration of data value in the hands of large platforms; and second, to account for the complexity of the political economy aspects of data accumulation. Based on a pair of recent papers (Savona 2019a, 2019b), I systematise recent research and propose a novel data rights approach to redistribute data value whilst not undermining the range of ethical, legal, and governance challenges that this poses.
Unpacking the data value chain
That data are part of the intangible capital of firms is not at all new (Corrado et al. 2009), but this has been largely unquestioned, until now. Economists have overlooked the evolving nature of data value chains, and missed the opportunity to identify, let alone quantify, both the final product and the sources of (re)production. The business models of big tech rely on a complex integration of layers. These include data gathering, accumulation, and in-house treatment; the involvement of third parties, intermediate users, and providers of data analytics; and interfaces that provide ‘free online services’ to individuals. We do not know whether the (equity) value of large platforms is truly aligned to the scale of data accumulation, to stick with the capital metaphor, or to something else. Intangible assets include, besides data, investments in R&D, patents and licenses, trademarks, organisational capital, training, engineering, design, and so on (OECD 2018). However, as most of the equity value from data analytics is in advertising, it is difficult to argue that all intangibles are knowledge-based capital. Economists have de facto legitimated the notion of data as intangible capital (Brynjolfsson et al. 2018, 2019), though they did not fully unpack the evolution of the data value chain until it was too late to prevent the current quasi-monopolistic market structure of large platforms.
The political economy of data: The basics
Economists have also underestimated the complexity of the political economy aspects of data accumulation, despite some attempts (Posner and Weyl 2018). Arguably, the political economy of data partly builds upon the economic nature of data in terms of rivalry and excludability.
Public knowledge is an example of a public good: neither excludable nor rivalrous. The granting of intellectual property rights (IPRs) aims to protect and remunerate intellectual creation as an individual incentive to contribute to collective use in the public interest, rather than to the exclusive enjoyment of a private good.
A club good is excludable but not rivalrous: an individual can be denied access to and use of it, but once access is granted, use of the good does not deplete it. The conundrum around personal data is that they are closest to a club good. Data are (potentially) highly excludable but non-rivalrous, as their use is virtually infinite at zero marginal cost. Data are an intangible yet durable asset, as they do not become obsolete. Their value lies in their scale. Yet, unlike traditional mass-produced goods, data are far from standardised and homogeneous; rather, each piece of data is a unique combination of nested individual characteristics.
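The typology of goods invoked above can be made explicit in a minimal sketch. The function name and category labels below are the standard textbook conventions, added here purely for illustration; they are not part of the column's argument.

```python
# Minimal sketch of the textbook typology of goods by excludability and
# rivalry, used only to locate public knowledge and personal data in it.

def classify_good(excludable: bool, rivalrous: bool) -> str:
    """Return the standard textbook category for a good."""
    if excludable and rivalrous:
        return "private good"
    if excludable and not rivalrous:
        return "club good"
    if not excludable and rivalrous:
        return "common-pool resource"
    return "public good"

# Public knowledge: neither excludable nor rivalrous.
print(classify_good(excludable=False, rivalrous=False))  # public good

# Personal data: (potentially) highly excludable, non-rivalrous.
print(classify_good(excludable=True, rivalrous=False))   # club good
```

The point of the classification is that data sit in the same cell as club goods, which is what makes the rights question below non-trivial.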
It is not straightforward to establish what kind of rights data require and for which purposes. For instance, data protection laws as enacted by the European Commission assume privacy to be a fundamental right, embedded in the European Convention on Human Rights. GDPR regulates the rights to excludability (privacy) comparatively more than the management of rivalry (e.g. use by third parties or the right to be forgotten).
Governance models for redistributing data value and implementation challenges
Taxing intangible capital (more)?
If data are an intangible asset, could a fiscal authority straightforwardly design and implement an effective system to tax intangible capital?
A seminal proposal for a taxation system in the form of a ‘bit tax’ was put forward some 20 years ago by Soete and Kamp (1997), who advocated taxing the number of ‘bits’ rather than the value added of intangibles, which was then at an embryonic stage. A similar system could be designed for today's large platforms, taxing them at source for each individual’s data collected.
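A minimal sketch of how such a levy could be computed, assuming a hypothetical flat rate per gigabit of data collected; the function, figures, and rate below are illustrative assumptions, not part of Soete and Kamp's actual proposal.

```python
# Hypothetical 'bit tax' in the spirit of Soete and Kamp (1997): the
# liability depends on the volume of data collected, not on value added.
# The rate and volumes are invented for illustration only.

def bit_tax(bits_collected: int, rate_per_gigabit: float) -> float:
    """Tax liability as a flat levy per gigabit of data collected."""
    gigabits = bits_collected / 1e9
    return gigabits * rate_per_gigabit

# e.g. a platform collecting 5 terabits at an assumed rate of 0.10 per gigabit
liability = bit_tax(bits_collected=5_000_000_000_000, rate_per_gigabit=0.10)
print(f"{liability:.2f}")  # 500.00
```

The simplicity is the point: the tax base is observable volume, sidestepping the need to value the data themselves, which is precisely the difficulty raised next.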
The idea of implementing a ‘data tax’ is, however, based on the heroic assumption that it is easily possible to track a data value chain, based on an appropriate price system to quantify costs of storage, aggregation, and treatment. Currently, there is no such price system to value data. Balance sheets at best measure data analytics in a similar manner to how software investments are estimated, that is, in terms of compensation of the work of data scientists and engineers employed in data analytics tasks (Corrado 2019).
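The cost-based measurement approach described above can be sketched as follows; the function, salary figures, and capitalisation share are hypothetical illustrations of the general method (capitalising compensation for data-analytics work), not Corrado's actual estimates.

```python
# Sketch of a cost-based valuation of data assets, analogous to how
# own-account software investment is estimated: capitalise the share of
# staff compensation devoted to data-analytics tasks. All figures are
# illustrative assumptions.

def data_asset_investment(salaries: list[float], analytics_share: float) -> float:
    """Capitalise the fraction of compensation spent on data-analytics work."""
    return sum(salaries) * analytics_share

# Three data scientists, assumed to spend 60% of their time building data assets
print(data_asset_investment([80_000.0, 95_000.0, 70_000.0], 0.6))
```

Note what such an estimate captures: the cost of producing data assets, not the (advertising-driven) equity value they generate, which is the gap the column highlights.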
In addition, the very nature of data would make supranational fiscal institutions the more appropriate actors; yet, at a time when traditional national tax bases are increasingly undermined, this would open up a Pandora’s box of implementation challenges.
Creating data labour markets or radicalising capitalism?
Posner and Weyl (2018) argue that, as the big techs exploit the lack of public understanding of how their data are collected and treated, there is a missing labour market for data generators. The notion of labour dignity becomes ‘data dignity’. This view offers the opportunity to revisit the traditional forms of collective bargaining and collective representation of workers and adapt them to the digital ecosystem. The ‘mediators of individual data’ proposed by Posner and Weyl (2018) could allow for (i) collective bargaining with big techs, and (ii) quality certification of data.
Creating a credible institutional actor that can represent and collectively bargain on behalf of data generators is a necessary, though not sufficient, condition to make this governance model work. People might (or indeed, might not) have intrinsic and extrinsic incentives to generate data as a job (Bénabou and Tirole 2006). Altruistic incentives would increase the likelihood of generating high-quality data and a sense of belonging to a community. However, perverse incentives might lead citizens to generate a mass of low-quality data to maximise financial remuneration. As convincingly argued by Pavel (2019), this might particularly be the case where more vulnerable, less-skilled workers face perverse incentives to generate data as a result of income constraints. However, if they are less educated and skilled, their low-quality data might be remunerated less, creating a vicious circle of inequality. Current labour market issues – such as technological unemployment, skill-biased technical change, and wage polarisation – would simply be reproduced in a data labour market.
Advocating for data labour markets to address data value redistribution is an endeavour whose success would depend on an adequate system of collective representation and bargaining.
Large platforms as large publishers: Recognising author rights of data generators
Here, I lay down the basics of a rationale for considering data as an intellectual creation and for recognising authorship rights for the individual who generated them.
First, personal data broadly make up the (digital) identity of an individual. Hence, the concepts of data ownership and property (i.e. an individual owning her data) can be argued to be meaningless, as the individual herself is the original intellectual creation embodied in her own data. This results from the complexity and uniqueness of individual histories, knowledge, preferences, and value systems. Even allowing for digital identity as a social intersection (Immorlica et al. 2019) – that is, based on personal data from and shared with others – it is down to the individual to consent to the use of her data. In summary, because of their nature as a club good and the uniqueness of each individual piece of data, I argue that an individual might claim authorship rights over her data. Big techs that aggregate, reproduce, and use data as original intellectual creations should be considered publishers, rather than platforms, and therefore required to protect, recognise, and remunerate individual authorship rights, for life, and regardless of job status.
Second, being entitled to authorship rights would potentially increase individuals’ agency over their own data and give content to the notion of data dignity. Data subjects could choose to be paid a use-license fee when data analytics are used for private purposes and to generate profits (e.g. marketing analytics). Alternatively, they could choose to share their data openly where personal data feed into public knowledge (e.g. research).
There are some advantages to an approach based on recognising, protecting, and remunerating authorship rights over the other governance models. For example, it could (i) reduce the infrastructural burden of administering a digital tax or changing digital ownership; (ii) ensure dismissed workers do not lose their rights to data wages once they are out of the labour contract; and (iii) ensure that large platforms keep paying authorship rights to consumers of digital services who no longer use online services but whose data continue to contribute to the intangible assets of the firm. Innovative firms would not necessarily be taxed, but profits would be redistributed directly. Finally, authorship rights could also be collectively licensed (for limited purposes).
A bold vision to design a governance model based on data rights that pursues economic fairness alongside social justice requires rethinking the relationship between centralised and decentralised data governance (Pavel 2019). A competitive system of bottom-up and purpose-specific data trusts, as recently advocated by Delacroix and Lawrence (2018), would need to be complemented by centralised public institutions that regulate them, ensure scalability, and enforce compliance by big tech. The European Commission has the comparative advantage of a first mover as a regulator of the data system. We should start from there.