Geza Sapi Publications

Publish Date
Discussion Paper
Abstract

The ability to make accurate predictions relating to consumer preferences is a key factor in a digital firm’s success. Examples include targeted advertisements and, more broadly, business models relying on capturing consumers’ attention. The prediction technologies used to learn consumer preferences rely on consumer-generated data. Despite the importance of data-driven technologies, there is a lack of knowledge about the precise role that data scale plays in prediction accuracy. From a policy perspective, a better understanding of the role of data is needed to assess the risks that “big data” might pose for competition. This article highlights potential complementarities in algorithmic learning, which suggest data-scale advantages might be substantial. We analyze our hypothesis using search engine data from Yahoo! and provide evidence consistent with locally increasing returns to scale.