For AI to really succeed, we need to protect private data
An American AI model based on clear property rights and data privacy will inspire more participation than a Chinese AI model with its data controlled by the CCP.
BY Matt Calkins
A few days ago, 13 AI employees wrote an open letter calling for greater transparency into AI operations. They understand this is a turning point in the evolution of AI, and the future of our industry depends on maintaining public confidence.
The AI industry is approaching a transformation in which user data will become paramount, and trust will be the most important commodity. I call this “AI 2.0,” a parallel to Web 2.0, the second wave of the internet. Back in 1999, Web 2.0 made the internet participatory, creating bilateral relationships between websites and their audiences. AI is set for a similar revolution. To be useful, AI needs to be personal; AI 2.0 is going to be about us.
Generative AI 1.0 is impersonal. The algorithms offer the same answers regardless of who asks the questions. These answers are more amazing than useful because they’re unconnected to our identity and interests.
We can ask AI to write a letter, but not to write it in our own personal style. And what good are recommendations if AI doesn’t know our preferences? What’s true for people is also true for businesses: AI that doesn’t know the user isn’t very useful.
For AI to understand us, it must have data about us, and before we allow that, we must have trust. Trust is the most important factor blocking the progress of this industry today.
All new technologies make errors, and AI, with its hallucinations, makes many. But AI appears not merely to err; it appears to breach trust. Sometimes it conveys a distorted truth, as when Google Gemini offered counter-historical images that fit the political biases of its programmers. Other times it appears to take what belongs to others, like the content of copyrighted images or articles, or the sound of a famous actress’ voice.
We can build trust with regulation that protects private data and promotes transparency. Unfortunately, this is not the regulation that Washington has proposed so far.
The White House statement of October and the Schumer report last month say almost nothing about transparency or data rights. They’re heavily influenced by the AI elite, whose priorities are different from those of the rest of the AI economy. Most executives in this business understand that AI will be a stronger industry when it stops taking liberties with others’ intellectual property. We need to convince our customers that AI will respect their private information.
I propose the following rules for transparency and data privacy:
- AI models must disclose their data sources.
- AI use of private data requires consent and compensation.
- AI use of personally identifiable information requires anonymization and permission.
- AI use of copyrighted information requires consent and compensation.
These points should be written into legislation and enforced by the government, not left to industry cooperation. Firms have too much incentive to break these rules unless they are backed by the force of law. Today it’s common for AI firms to process terabytes of private or copyrighted information without permission, disclosure, or compensation. Until regulation forbids that practice, there will be pressure on every company to do the same.
If AI firms need to train on copyrighted information, they can pay for it. Their budgets run to tens or even hundreds of billions of dollars, allocated to things like infrastructure, energy, and personnel. If required to, they’d pay for training data as well.
Some people worry that regulating data privacy rights would restrain American AI innovation and give China an edge. In fact, it would do the opposite. Once AI is trusted with personal information it will be more useful and more broadly used. An American AI model based on clear property rights and data privacy will inspire more participation than a Chinese AI model with its data controlled by the CCP.
AI 2.0 will deliver greater value and require a new relationship with the public. The data-centric race is ending; the trust-centric race is beginning. America has a distinct advantage in this race, but we need the right regulations to win.