Let’s all calm down as we discuss what ‘near-human’ really means.
Recently, four people shared with me an article by Mark Ritson in Marketing Week that declared synthetic data to be as good as the real thing. It went on to herald bad news for market research and to put the marketing strategy profession on the chopping block.
Before you gird your loins for a defensive barrage from a former strategist and current market research professional, I want to be clear that I am not an AI Luddite. In fact, I've developed and deployed AI for six years, and I'm heavily invested in AI chip stocks. If anything, I'm an evangelist. But I'd like to point out a flaw in what gets counted as 'as good as real.'
I'll start by quoting one of my AI heroes, Jensen Huang, the CEO and founder of Nvidia. When asked recently when AI would be as intelligent as humans, he answered in two ways. He stated that, if intelligence is defined by the ability to 'pass human tests' such as those we give students, we are probably five years away from human-level aptitude. However, if you broaden the definition of human intelligence beyond the passing of rote tests, the timeline to 'human-like' becomes unclear, because we don't truly know what defines us.

That answer is at the heart of my problem with flippant statements like 'near-human' or 'as good as real' that swirl around synthetic data. In his answer, Huang refers to engineers who define and carry out the tests of 'humanness.' So the entire sum of humanity is codified by incredibly intelligent but task-oriented humans who believe in simplicity, efficiency, and rationality. As a data professional, I find my 'researcher bias' radar immediately activated. Did these left-brain engineers leave room for right-brain traits? Are we accounting for chaos, weirdness, anomalies, contradiction, or creativity? What about instinct, intuition, non-verbal communication, and feelings? Sure, engineers are aware of all of these, but to what extent do they prioritize them when defining what makes us human?
I'd like to raise a similar question about synthetic data.
Let's first look at how we got here. The 'as good as real' label comes from studies that compared synthetic-data surveys with human ones and found 90-95% similarity between the results. We have been conditioned to see numbers in that realm as a slam dunk, but as a qual-minded person, I'm always suspicious of such hubris.
Firstly, the studies compare side-by-side quant studies (which makes sense when trying to produce a solid number). The truism with quant surveys is that they are so often confirming or validating something you already know. Yes, the vast majority of quant data (perhaps even 90-95%) is cliche or pretty bland information, because it's used to identify the popular position of people. So it's no surprise that AI can guess the cliches; we already know this is one of its great talents. Scraping vast databases to determine what the masses would say, write, and opine, based on what the masses have already opined countless times before, is very likely to produce a high match rate. It would be more remarkable if it didn't. So my first point is one of caution: let's not get overly excited by the high numbers, as this is exactly what we should expect when dealing in cliches.
My second challenge speaks to the engineering story above and takes it further. Quantitative methodology is an even blunter science than the testing of our intelligence. The requirement to force responses into a finite set of answers, so that results stay neat and consistent, massively limits quant's ability to be interesting or creative. The output is only as imaginative as the brain of the quant practitioner, and I'd respectfully suggest that this isn't quite as intelligent as the brain of an Nvidia engineer. So the second factor behind this impressive percentage is that the questions asked are very likely to be mind-numbingly dull and prosaic. The thing about dull questions is that they aren't particularly good at coaxing emotional humans out of their shells to reveal groundbreaking insights. Dull, robotic questions produce dull, robotic survey answers.
The final point delves deeper into this. Thanks to decades of substandard interfaces, survey takers have effectively been conditioned to unthinkingly provide the most obvious answers.
So it's absolutely no surprise, after generations of dull UX and annoying, repetitive surveys, that WE have become the robots. It's therefore even less surprising that our answers deviate so little from those of the real bots.
What this stat really tells me is not that we need to pack up our tools and find new jobs, but rather that we haven't been doing our jobs properly for a long time. What we should be doing is retraining people to be people again. Don't take the rise of the robot as evidence of the robot getting smarter; instead, see it as a call to arms to reanimate ourselves, our researchers, and our research participants. There are material ways we can engage in this reanimation and rehumanizing process: things like play, imagination, or a simple pause.
Finally, let's take one more look at that "impressive" 90-95% figure, and remind ourselves of qualitative research's remit. Aren't we the ones charged with extracting insight from the remaining 5-10%? Isn't it precisely in that 5-10% that the truly unexpected insights show up? And aren't these the insights that end up driving the real business decisions that change the world?