Time and the Value of Data

Social Science Research Network (2020)

Abstract
This paper investigates how effectively time-dependent data improves the quality of AI-based products and services. Time dependency means that data loses its relevance to a problem over time. This loss degrades an algorithm's performance and thereby reduces the business value it creates. We model time dependency as a shift in the probability distribution and derive several counterintuitive results. We prove theoretically that even an infinite amount of data collected over time may have limited value for predicting the future, and that an algorithm trained on a current dataset of bounded size can attain similar performance. Moreover, we prove that increasing data volume by including older datasets may put a company at a disadvantage. With these results, we address how data volume creates a competitive advantage. We argue that time dependency weakens the barrier to entry that data volume creates for a business, so much so that competing firms equipped with a limited but sufficient amount of current data can attain better performance. This result, together with the fact that older datasets may degrade algorithms' performance, casts doubt on the significance of first-mover advantage in AI-based markets. We complement our theoretical results with an experiment in which we empirically measure the loss of value in text data for the next-word prediction task. The empirical measurements confirm the significance of time dependency and value depreciation in AI-based businesses. For example, after seven years, 100 MB of text data becomes as useful as 50 MB of current data for the next-word prediction task.
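
The claim that adding older data can hurt may be easier to see in a toy setting. The sketch below is not the paper's model: it assumes a Gaussian quantity whose mean drifts linearly by `drift` per period, and it estimates the current mean by pooling the last `num_periods` periods of data. All names and parameter values are hypothetical, chosen only for illustration. Pooling older periods lowers variance but adds bias, so the expected error eventually rises.

```python
def pooled_mse(num_periods, samples_per_period, drift, noise_std):
    """Expected squared error of a pooled sample mean as an estimate of the
    current-period mean, under an ASSUMED linear drift (toy model only).

    Periods have ages 0 (current) .. num_periods - 1 (oldest); the mean of a
    period of age a is assumed to be current_mean - drift * a.
    """
    # Averaging over ages 0..k-1 shifts the estimate by drift * (k - 1) / 2.
    bias = drift * (num_periods - 1) / 2.0
    # Independent noise averages out over all pooled samples.
    variance = noise_std ** 2 / (num_periods * samples_per_period)
    return bias ** 2 + variance


def best_window(max_periods, samples_per_period, drift, noise_std):
    """Number of most-recent periods that minimizes the expected error."""
    return min(
        range(1, max_periods + 1),
        key=lambda k: pooled_mse(k, samples_per_period, drift, noise_std),
    )


# Hypothetical settings: 100 samples per period, unit noise, drift of 0.1.
for k in (1, 2, 3, 10):
    print(k, pooled_mse(k, samples_per_period=100, drift=0.1, noise_std=1.0))
```

With these illustrative settings, a short window of recent periods gives a lower expected error than pooling ten periods, which mirrors (in a much simpler form) the abstract's point that a bounded amount of current data can outperform a larger, older archive.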
Keywords
data, value, time