
What next for Apple’s AI, OpenAI’s funding impetus, and an impressive Noise Tag 1
The annual Apple WWDC keynote is next week. You may have heard the rumours about a significant redesign and naming scheme changes for iOS, iPadOS and so on. My hunch goes beyond the clickbait and 140-character attention spans. Apple, trying to overcome a deficit that’s still widening every week (the upgraded Siri remains delayed), will reconfigure its Apple Intelligence plans. As consumers, you and I may only see the changes on the interface layer, but my belief is there’s more to the story. I don’t foresee any AI company acquisitions, but rather something along the lines of exclusivity-driven core partnerships. There are three names that, I feel, might play a big role. Perplexity, which already has deals in place with Samsung and Motorola to integrate its AI in upcoming phones, could have something to offer Apple as well. If that partnership is to fructify, it will certainly need to be more than just a search plug-in. Could that mean Perplexity’s agentic browser plans intersect with Safari, for instance? Very much a chance of that happening. Case in point: Eddy Cue, Apple’s senior vice president of services, said during the Google antitrust trial, “We’ve been pretty impressed with what Perplexity has done, so we’ve started some discussions with them about what they’re doing.”
Secondly, Google itself stands to lose a little if Samsung (by far the largest Android phone OEM) reduces its reliance on Gemini, which has received significant updates recently, in favour of the Perplexity partnership. The ideal recourse would be to find a way to give Gemini a home on the iPhone, beyond the app. Not to be forgotten is Apple and Google’s search partnership. And on that point, it was noted during the recent antitrust trial that traditional searches via Safari are declining for the first time. That’s because of AI, and Gemini is better positioned than anyone else to realign its placement within Safari and Apple’s broader operating systems. Could it sit behind the upgraded Siri, for that matter, alongside OpenAI’s GPT search? And finally, Anthropic. With the latest Claude models being touted for their coding skills (as are many others, increasingly so), it might find a place behind the scenes. Apple’s code-writing tool for developers, Xcode, could do well with a Claude layering. It reminds me of Microsoft’s Satya Nadella saying recently that “maybe 20 to 30 percent of the code that is inside of our repos today in some of our projects are probably all written by software.”
That may be a poignant quote, particularly for Apple, which missed the AI train the last time and wouldn’t want to do so again ahead of the next big chapter. And whichever AI company finds the best partnership alignment with Apple in 2025 stands to gain the most. With things the way they are in the AI space, that’ll be a race to the top. We’ll most certainly chat about this next week.
RANKING MODELS

With every new artificial intelligence (AI) model inevitably come the claims that it is the best ever at what it does, and better than every rival. When Anthropic released Claude 4 a week ago, the AI company said these models set “new standards for coding, advanced reasoning, and AI agents”. They cite leading scores on SWE-bench Verified, a benchmark for performance on real software engineering tasks. OpenAI also claims the o3 and o4-mini models return the best scores on certain benchmarks. As does Mistral, for its latest open-source Devstral coding model. It is a common theme, not limited to this troika of AI companies. I have talked about the need for AI benchmarks to evolve quickly, else we risk running into the same sort of scenario that unfolded in the PC and smartphone space for years, where one metric (such as processor clock speed in GHz) fallaciously defined the worth of a product.
- Benchmark obsession needs reassessment, lest we get excited every time a new model rolls out with even more robust claims (which, in this AI era, seems to be almost every day). AI companies frequently showcase high benchmark scores (on the likes of MMLU, SWE-bench and GSM8K), but there’s a definite argument that these don’t always reflect real-world intelligence or usefulness, especially in nuanced or creative tasks.
- Benchmarks have limitations and can be manipulated. Irrespective of Meta’s defence against recent allegations, Llama 4 Maverick appearing on a leaderboard as a version optimised for certain tests illustrates the questions around fairness, real-world applicability, and transparency.
- Microsoft and other institutions are developing new frameworks like ADeLe that assess AI on task difficulty and cognitive abilities, aiming to better evaluate generalisation, long-term memory, and human-aligned reasoning.
- Context matters in evaluation. Benchmark scores (often reported as percentages) lack standardised interpretation and may hide weaknesses in multi-step reasoning or creativity (see the sketch after this list), emphasising the need for contextual and culturally sensitive evaluations.
- Global and local innovations are emerging too. Efforts such as Google’s Infini-attention technique for memory extension, SG-Eval (a regional benchmark for Southeast Asia), and India’s Sarvam LLM highlight a shift towards more comprehensive, localised, and robust AI evaluation models.
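To make the point about percentages concrete, here is a minimal Python sketch, using entirely hypothetical models and numbers, of how two systems can post the same headline benchmark score while having very different capability profiles:

```python
# A minimal sketch with entirely hypothetical models and numbers: two systems
# post the same headline benchmark score despite very different strengths.
from statistics import mean

# Per-category accuracy on a made-up three-part benchmark.
scores = {
    "model_a": {"single_step_qa": 0.95, "multi_step_reasoning": 0.55, "creative_tasks": 0.60},
    "model_b": {"single_step_qa": 0.70, "multi_step_reasoning": 0.70, "creative_tasks": 0.70},
}

for name, per_category in scores.items():
    aggregate = mean(per_category.values())  # the single "benchmark score" that gets reported
    print(f"{name}: aggregate = {aggregate:.0%}, breakdown = {per_category}")

# Both models print a 70% aggregate, yet model_a is far weaker at multi-step
# reasoning: exactly the kind of gap a single percentage hides.
```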
TECH SPOTLIGHT: Noise Tag 1
The Apple AirTag has, irrespective of whatever the competition may claim, remained the most accurate tracker for all things easy to forget, and for keeping tabs on luggage during your travels. Its universal applicability is such that you really have to hide it in your luggage, because thieves know what to search for, and will throw it into the nearest bin (or canal). Many have made an Apple AirTag rival in recent years, but none have matched its accuracy or consistency. Noise, a company that I’ve often noted is on a premiumisation trajectory, recently sent something called the Tag 1 Smart Tracker. Priced at ₹1,499 at this time, it’s available in Charcoal, Midnight and Ivory colours. It is compatible with Apple’s Find My network on the iPhone and Google’s Find My Device service on Android. I was able to set it up seamlessly on both.
In the Find My app on an iPhone, for instance, the Noise Tag 1 shows up just as an AirTag would. The location accuracy and refresh window seem very similar to the AirTag’s; I had both in the same travel bag, for an even-keeled comparison. When I left that luggage behind at the hotel while heading out for briefings and the keynote, the Noise Tag 1 generated a notification on the phone that the linked piece of luggage had been left behind, along with the location details. In this instance, there was nothing to worry about, but clearly the Noise Tag 1 is on the job rather seamlessly. You can chime a lost tracker with the ring mode (it is louder than an AirTag, at least to my ears), and the tracker can also be shared with family or friends, if you need help with tracking.
It is reasonable to expect the Noise Tag 1 to deliver a year’s worth of battery life, similar to my experience over the years with an Apple AirTag; both use the same easily replaceable CR2032 3V lithium coin cell battery. The IPX4 splash resistance should help too, just in case your luggage finds itself soaked by the contents of a leaked bottle of water. The price of the Noise Tag 1 certainly adds value to this proposition, as does the compatibility with Android phones. There are choices aplenty for luggage and item trackers, but most don’t do the task as consistently as the Noise Tag 1, or simply don’t work as well with Android as with an iPhone. For once, there are no limitations to work around.
INCLUSIVITY DONE RIGHT

Ahead of WWDC, Speechify has won the 2025 Apple Design Award for Inclusivity. The developers say it is the world’s most popular app for having anything read out loud to you, with hundreds of voices and more than 50 languages available. The proposition is simple: turn any written text into audio, be it from a document, a scan, a PDF file or a web page. “The app offers an approachable UI with a variety of accessibility features – Dynamic Type and VoiceOver among them – that make it an instantly helpful tool for everyone: students, professionals, and leisure readers, as well as auditory learners and people with low vision. The design team clearly worked to reduce cognitive load all throughout the app too,” Speechify’s Founder & CEO, Cliff Weitzman, tells us. You’ll see them on the WWDC main stage.
OPENAI’S FUNDING PUSH
OpenAI has shared details on the next phase of its AI for Impact Accelerator in India. The headline numbers include expanded funding and technical support for 11 nonprofits using AI across domains including healthcare, education and agriculture, bringing the total value of technical grants to $150,000. The new cohort of solution developers includes Educate Girls, Rocket Learning, Noora Health and Digital Green. “These organisations are solving some of the country’s most complex challenges with ingenuity and empathy. The AI for Impact Accelerator – now part of OpenAI Academy – is our way of learning from them, while ensuring frontier technology is being shaped by and in service of real communities,” Pragya Misra, Policy & Partnerships Lead for OpenAI India, tells us.
Rocket Learning, for instance, is using WhatsApp and generative AI to deliver personalised early learning experiences for parents and daycare workers, in the hope of improving school readiness across underserved communities. It claims to have reached 4 million children across 11 states. Digital Green is an attempt at scaling peer-to-peer agricultural learning, with curated insights and crop recommendations for farmers. Educate Girls claims to have re-enrolled over 2 million girls and improved learning processes for as many as 2.4 million children across 30,000 villages, while using AI to identify out-of-school girls and help with faster re-enrolment.