When Amit Jain talks about artificial intelligence, he doesn’t talk about chatbots. He talks about civilizations.
At Web Summit, the Co-Founder and CEO of Luma AI made it clear that the industry is still obsessing over the wrong milestone. Yes, language models have been “productionized and deployed at scale.” Yes, they’re writing emails, generating code, automating back-office tasks.
But for Jain, that’s the warm-up act. “The next wave is going to be models that can also understand the world and aren’t just useful for digital tasks,” he told Communicate.
“They are useful for creative tasks, are useful for world understanding tasks, they are useful for robotics, they’re useful for operation in the real physical world. That is actually a substantially bigger part of the economy than the digital economy.”
This is not an incremental improvement. It is a redefinition of what AI is for.
From words to world models
Luma AI is building multimodal systems: models trained jointly on audio, video, language and images, and designed not just to generate content, but to interpret and operate within reality.
Its flagship platform, Dream Machine, has become known for high-fidelity generative video. But Jain frames the company’s mission as broader than creative tooling. He sees three major arenas where the next trillion-dollar opportunity will emerge: media and advertising through video generation; industrial “world understanding” through large video-language models; and robotics.
The shift, he argues, is economic as much as technological. The physical economy dwarfs the digital one. If AI systems can meaningfully engage with that world, the scale of value creation expands accordingly.
Asked to explain Luma’s work “like you’re speaking to a five-year-old,” Jain resists simplification.
“As a five-year-old, I don’t think five-year-olds have enough context to really understand what this is what Luma does,” he replied. “Luma makes AI models that are like the human brain, or more like the human brain than the current language models.”
It is an audacious comparison. But it signals ambition: AI not as an assistant, but as a cognition engine.
Why the Middle East matters
Luma’s recent expansion into Riyadh is not opportunistic box-ticking. Jain sees the Middle East as one of the few regions positioned to shape AI’s infrastructure layer.
“Commercially thinking about it is probably in five years, three important markets that matter in the world, maybe four, I guess, US, China, India, and the Middle East, unless Europe gets stuck together,” he said.
His reasoning is both structural and political. A largely shared language market across MENA. Significant capital. Energy abundance. Geographic centrality. Regulatory agility.
“It’s sunny 364 out of 365 days. That is a ridiculous amount of energy,” he said, referencing the region’s solar potential. Combined with subsea cable connectivity and proximity to global population centres, he sees a strategic advantage embedded in geography.
But he is equally emphatic about leadership intent.
“I think the leadership in the region, especially in Saudi Arabia… this is the most forward-looking vision of what AI will be able to do for the world.”
For Jain, this is about leapfrogging. “They can actually skip a bunch of stages.” Just as India bypassed widespread landline infrastructure in favour of mobile connectivity, Gulf nations, he argues, can bypass bureaucratic and legacy infrastructure layers through AI-native systems.
In Qatar, discussions are at an earlier stage. “We’re just starting to think about Qatar, to be honest with you,” he said, noting conversations with local authorities and entities such as QI, while highlighting brands like Qatar Airways as examples of global impact.
AI as cultural archive
Where Jain becomes most philosophical is on language and culture, specifically Arabic.
At first glance, building Arabic-native AI models might seem like a regional commercial play. He rejects that framing.
“I think its utility towards regional media… is honestly pointless. I think because there’s a much bigger thing at stake here.”
Historically, civilizations left physical artefacts such as architecture, pottery and inscriptions. Today, culture leaves data. And increasingly, that data is synthetic.
“How would future generations know a culture today existed? Because of the Internet.”
Jain claims that in June 2024, “more than 50 percent of new text was created with LLMs.” For video and richer media, he predicts an even faster shift, adding that “by the end of 2026, we’ll be at a point where most things people watch on the internet is one way or the other generated by a generative model.”
If generative systems are trained predominantly on Western datasets, underrepresented cultures risk being diluted or disappearing.
“The models have no idea… the risk isn’t really like, ‘oh, what media can be made?’ But more like ‘will these cultures be forgotten?’”
He points to nuances invisible to non-native systems: differences in attire, terminology and regional identity. “A thawb from the Emirates looks different from the one from Saudi Arabia.”
In that framing, multimodal AI is not just a creative accelerator. It is a cultural preservation mechanism.
Hyper-personalization, finally
The advertising industry has promised hyper-personalization for decades. Jain is unconvinced that most players ever knew how to deliver it.
“I think nearly everyone you probably spoke to at the Web Summit does not know how to actually do it,” he said. “They have been saying this now for two, three decades.”
True personalization, he argues, is not demographic segmentation. It is large-scale content variation. Luma is working on projects producing “10,000 to 100,000 variants of the same ad.”
When production costs fall from between $20,000 and $100,000 per minute to potentially between $10 and $100 per minute, creative economics collapse and rebuild.
“It’s going to be faster, about 100 times faster,” he said of generative video production.
But speed is not the decisive factor. Reliability is.
“What matters to the end user is how reliable it is. They don’t really care that this model knows physics. It matters to them that it’s reliable.”
He compares it to consumer technology. Users do not obsess over chip architecture. They care that the device works.
For AI to embed into creative workflows, it must handle end-to-end tasks: not just clip generation, but narrative coherence, causality and continuity.
“Systems that are intelligent can help you with the end-to-end task, and finally, they are reliable.”
The leadership reckoning
Jain’s closing message was not technological. It was organizational.
“You need to really think about how you’re running your business.”
If production economics change by a factor of 10 or 100, agency models built on billable hours become structurally unstable.
“Selling people’s time by the hour, I don’t think that survives as an economy.”
For leaders, he frames the moment as a moral and managerial test.
“It’s your job to think about your people.”
Ignore AI, he warns, and you risk more than competitive disadvantage. “If you’re working at a company where the leadership is that oblivious, leave.”
For Jain, AI is inevitable. The only question is who shapes it and whose stories survive within it.
In his view, the next trillion-dollar opportunity will not be built solely in Silicon Valley labs or Chinese megafactories. It will be built wherever energy, capital, political will and cultural intent intersect.
That definitely includes the Gulf.






