Enterprise AI Weekly #13

Klarna slows AI driven job cuts, Manus is now available to everyone, Gemini will soon be everywhere, insight from BCG and OpenAI bring the goodies.

May 15, 2025

Welcome to Enterprise AI Weekly #13

Welcome to the Enterprise AI Weekly Substack, published by me, Paul O'Brien, Group Chief AI Officer and Global Solutions CTO at Davies.

Enterprise AI Weekly is a short-ish, accessible read, covering AI topics relevant to business of all sizes. It aims to be an AI explainer, a route into goings-on in AI in the world at large, and a way to understand the potential impacts of those developments on your business.

If you’re reading this for the first time, you can read previous posts at the Enterprise AI Weekly Substack page.

Enjoy #13. Fortunately, I’m not superstitious!

Explainer: What are embeddings?

You may have heard embeddings or embedding models referenced when discussing augmenting data for use by LLMs. But what are they? Embeddings are numerical representations - typically vectors - that capture the semantic meaning and relationships within data such as text, images, or audio. By translating complex information into a mathematical form, embeddings enable AI systems to process and compare data based on meaning rather than just surface features. For example, similar words or sentences are mapped to points that are close together in the embedding space, allowing AI models to recognise nuanced relationships and context.

These embeddings are produced by specialised neural network models, known as embedding models, which learn to represent input data in a dense, lower-dimensional space. This process is crucial for tasks like semantic search, recommendation systems, and natural language understanding. For instance, in natural language processing, embeddings make it possible for AI to understand that “cat” and “meow” are semantically related, or that “king” minus “man” plus “woman” yields a vector close to “queen,” capturing analogies and deeper context.

In Retrieval-Augmented Generation (RAG) systems, embeddings play a vital role: both the documents in a knowledge base and user queries are converted into embeddings using the same model. The system then performs a vector search to retrieve the most semantically relevant content, which is provided as context to a language model for generating accurate, context-aware responses.

For RAG applications, several embedding models are widely used. Proprietary options like OpenAI’s Ada-002 and the newer text-embedding-3-small and text-embedding-3-large models are popular for their robust performance and ease of integration. Open-source alternatives such as E5 and Cohere’s embed v3 also offer competitive accuracy and flexibility for organisations that prefer more control or need to self-host their solutions. Choosing the right embedding model depends on factors like retrieval accuracy, latency, scalability, and whether your use case requires open-source or proprietary technology.

For our business, leveraging state-of-the-art embedding models in our AI systems means we can unlock the full value of our data assets. This enables smarter search, more relevant recommendations, and better customer interactions, powered by AI that understands the context and meaning behind the information we handle.

1. Klarna slows AI driven job cuts

Klarna, the Swedish fintech giant known for its “buy now, pay later” services, has been at the forefront of AI-driven transformation in the financial sector. Over the past two years, Klarna aggressively integrated artificial intelligence across its operations, resulting in a dramatic reduction of its workforce. The company’s headcount shrank from around 5,500 employees to just over 3,400 - a cut of around 40% - as AI tools replaced hundreds of roles, particularly in customer service and marketing. Klarna’s partnership with OpenAI and the rollout of internal AI assistants like "Kiki" and ChatGPT Enterprise means that all engineers and most staff in communications and marketing now use AI daily.

This bold approach was not just about augmenting human work but directly replacing it. Unlike competitors such as Stripe and PayPal, which focused on upskilling staff to work alongside AI, Klarna opted for cost-cutting by reducing its workforce and imposing a hiring freeze for most roles except AI and machine learning specialists. The company’s strategy also included moving over five hundred employees to its sister company and using natural attrition to downsize without mass layoffs. These changes yielded significant efficiency gains, including a reported $10 million reduction in marketing costs within six months.

However, Klarna is now slowing the pace of its AI-driven job cuts and reconsidering the balance between automation and human input. In a recent interview with Bloomberg, CEO Sebastian Siemiatkowski indicated that the company is looking to bring back more real-person customer service representatives, acknowledging that a purely AI-driven model led to a drop in service quality. This shift comes as Klarna prepares for a long-anticipated IPO and responds to market and regulatory pressures, as well as concerns from unions about the broader societal impact of AI-induced job losses.

For our business, Klarna’s experience is a timely reminder of both the opportunities and challenges posed by rapid AI adoption. While automation can drive efficiency and cost savings, it is crucial to monitor its impact on service quality, employee morale, and brand reputation. As we continue to integrate AI into our own operations, Klarna’s evolving approach underscores the importance of maintaining a thoughtful balance between technological innovation and the irreplaceable value of human expertise - especially in areas where customer trust and nuanced decision-making are paramount.

2. Manus no longer has a waitlist

Previously referenced in EAW #4 and EAW #5, Manus AI, a leading autonomous AI agent, has just announced a major expansion of its access model. Effective immediately, anyone can sign up and use Manus without a waitlist. Every user now receives a one-time bonus of 1,000 credits and a free daily task (worth 300 credits), which refreshes at midnight but does not roll over. This move is designed to make Manus more accessible, flexible, and valuable for both new and returning users, allowing them to explore its advanced capabilities without upfront financial commitment.

For those requiring more intensive or frequent use, Manus offers paid subscription plans at $19, $39, and $199 per month, each providing additional credits, enhanced features, and priority support. The platform’s credit system is central to its usage: credits are consumed based on the complexity and duration of tasks, which can range from generating reports and automating workflows to complex data analysis and software development. The daily free credits enable users to handle routine tasks, while the bonus and paid plans support larger or more intricate projects.

Developed by the Chinese startup Monica and officially launched in early March 2025, Manus is touted as the world’s first fully autonomous general-purpose AI agent. Unlike traditional AI assistants, Manus AI is designed to independently plan, execute, and deliver on complex tasks-ranging from report writing and data analysis to content creation and software development-without requiring step-by-step human input. This leap in autonomy is underpinned by a multi-agent architecture, enabling Manus to break down intricate workflows into specialised components, each handled by sub-agents, and to operate seamlessly across text, image, and code modalities.

The platform’s impact has been amplified by its performance on the GAIA benchmark, where it outperformed leading models like OpenAI’s GPT-4 and Microsoft’s latest AI systems, setting a new standard for real-world problem-solving by AI agents. Its ability to integrate with external tools such as web browsers, code editors, and databases - and its adaptive learning capabilities position Manus as a serious contender in the race for practical AI automation. The buzz around Manus AI is not limited to its technical achievements; its closed beta phase has generated intense demand, with invitation codes being resold at premium prices, reflecting both the hype and the perceived value of its agentic capabilities.

However, Manus AI’s meteoric rise has also prompted scrutiny regarding its legal structure, data governance, and broader implications for the future of AI. While the company maintains a legal entity in Singapore, most development is based in China, raising questions about data privacy, regulatory compliance, and the potential for cross-border data transfers. Industry observers have also noted the importance of distinguishing genuine innovation from marketing hype, especially as the agentic AI space becomes increasingly competitive and subject to global regulatory oversight.

This development is relevant for our business as it lowers the barrier to experimenting with next-generation AI automation. Manus AI’s autonomous task execution, multi-modal capabilities, and seamless integration with existing tools can streamline our workflows, reduce manual workloads, and accelerate innovation across departments. However, this capability is offered against the backdrop of concerns around compliance. We should note that employees are likely to discover Manus and want to try the functionality themselves, and we should ensure we manage that as required.

3. Gemini will soon be everywhere

It wouldn’t be EAW without at least some Google news, right? Google has just announced a significant expansion of its Gemini AI, bringing its advanced generative models to a wider range of devices and platforms - including Android smartwatches, cars, TVs, and extended reality (XR) environments. This move aims to make Gemini a truly ubiquitous AI assistant, capable of providing context-aware help and intelligent suggestions wherever users interact with technology. On Wear OS smartwatches, Gemini will assist with tasks like quick replies and reminders; in vehicles equipped with Android Auto, it will power smarter navigation and voice controls; and on Android TV and XR devices, it will enhance content discovery and immersive experiences.

Beyond Google's own ecosystem, there is mounting evidence that Gemini may soon appear on Apple devices as well. Multiple sources, including court testimony from Google CEO Sundar Pichai and code found in iOS betas, suggest that Apple is actively working to integrate Gemini into its Apple Intelligence suite. This partnership is expected to be announced as early as June 2025 at Apple’s Worldwide Developers Conference, with Gemini offered as an opt-in, privacy-focused enhancement alongside Apple’s existing AI tools. The integration could bring more advanced reasoning, natural conversations, and context-aware suggestions to iPhone users, potentially leapfrogging the current capabilities of Siri and Apple’s in-house models.

Google have this week updated their Gemini 2.5 models, introducing implicit caching, a powerful enhancement that automatically reduces costs by reusing common parts of previous requests without requiring developers to manually manage caches. Building on their earlier explicit caching feature, implicit caching dynamically detects shared prefixes in requests and applies a 75% token discount on those cached portions. This means developers can achieve significant cost savings simply by structuring prompts with consistent beginnings and variable endings, while also benefiting from lowered minimum token thresholds for cache eligibility. The system even provides metadata showing exactly how many tokens were cached, making cost tracking transparent and straightforward.

In other exciting news this week, Google have updated their logo. You mean you didn’t spot it at the top of this post? On the left here - before, on the right - after. I like it.

For our business, these developments (except the logo) are highly relevant. The rapid proliferation of Gemini across both Google and (potentially) Apple platforms signal a new era of cross-ecosystem AI capabilities. As employees and customers increasingly expect seamless, intelligent assistance on any device, our digital products and internal tools must be designed with interoperability and AI integration in mind.

Leveraging implicit caching in Gemini 2.5 models can lead to substantial savings on AI-related operational costs, especially as we scale AI-driven features that involve repetitive or templated prompts. By optimising prompt design to maximise cache hits, we not only reduce expenses but also improve response efficiency, enabling faster and more cost-effective AI interactions.

4. Boston Consulting Group have some useful insights

This week I came across several posts from Boston Consulting Group on the AI Insights section of their website (minus points for how bad their site looks on an ultrawide monitor, though).

The first post that caught my eye is titled ‘AI Agents Can Be the New All-Stars on Your Team’. The article explains how AI agents are emerging as transformative tools for businesses, moving beyond traditional predictive and conversational AI to actively reshape business processes and deliver tangible outcomes. Of particular interest is the emphasis that to unlock their full potential, companies must strategically redesign processes, codify knowledge, ensure high-quality data and systems, and rigorously manage risks, with most effort focused on people and processes rather than just technology. Embracing AI agents requires rethinking the business for an AI-driven future, where the true competitive advantage lies in how well organisations integrate and leverage these agents within their unique operational context.

Next up is a post titled ‘Closing The AI Impact Gap’ (with an embedded deck). The article reveals that while AI remains a top strategic priority and investment is accelerating (with one in three companies planning to spend over $25 million on AI in 2025), only about a quarter of organisations are realising significant value from their AI initiatives. The main reason for this “AI impact gap” is not technology itself, but how AI is deployed: most companies focus on too many small-scale, productivity-driven projects and fail to track financial outcomes, whereas leading organizations concentrate their efforts on a few high-impact initiatives that reshape core business functions and invent new offerings, invest in upskilling their workforce, and systematically measure operational and financial returns. The report emphasizes that winning with AI is as much a sociological challenge as a technological one, requiring a disciplined, focused approach centered on organisational change, workforce development, and embedding AI into workflows - following the 10-20-70 rule (10% algorithms, 20% tech/data, 70% people/processes) - to truly unlock AI’s transformative potential.

Finally, ‘When Companies Struggle to Adopt AI, CEOs Must Step Up’ is also a worthwhile read, not because of the provocative title, but because of the focus it places on employees, and the recognition that AI adoption is a human challenge. The article emphasises that successful AI adoption requires co-creation with employees, upskilling managers to personalise the change, and focusing on a high-impact projects. Companies that empower experimentation and treat adoption as a human challenge - addressing emotional and psychological barriers - position their organisations to thrive amid accelerating technological change.

For our business, these posts are reassuring. Many of the approaches, themes and challenges discussed are well understood with our organisation and reflect actions we are already taking, and plan to accelerate. We already have strong support from our leadership (including but not limited to our CEO), which is invaluable as we navigate this era of innovation.

5. OpenAI continues to deliver

Before pulling this week’s post together, I felt like this week has been a little slower on the AI front. Opening my Pocket list revealed that this is not really the case, with no less than 4 announcements from OpenAI!

OpenAI has expanded its fine-tuning capabilities with two distinct approaches now available: reinforcement fine-tuning (RFT) for the 4o-mini model and supervised fine-tuning (SFT) for the 4o-nano model. RFT on 4o-mini allows organisations to iteratively improve model performance using a chain-of-thought reasoning process and a task-specific grading function-ideal for complex, domain-specific tasks where nuanced reasoning is critical. This method enables the model to learn from feedback and optimise for outcomes aligned with business goals, such as legal or scientific reasoning. In contrast, SFT on 4o-nano is designed for speed and cost efficiency, letting users train the model on labeled datasets for straightforward tasks like classification or content generation. While RFT is best for adaptive, reasoning-heavy scenarios, SFT offers a fast, affordable route for high-volume, structured outputs.

GPT-4.1, previously accessible only via API, is now integrated into ChatGPT for all paid users. This model brings enhanced coding and instruction-following capabilities, improved context handling, and greater reliability. Notably, GPT-4.1 mini replaces GPT-4o mini as the fallback model for free users, ensuring broader access to advanced AI features even without a subscription. This update bridges the gap between developer tools and end-user applications, making GPT-4.1’s advanced reasoning and efficiency available directly within the ChatGPT interface.

A highly requested feature, PDF export, is now live in ChatGPT. Users can seamlessly export entire conversations or deep research reports as fully formatted PDF documents, complete with tables, images, and citations. This enhancement streamlines the process of sharing AI-generated insights, archiving important discussions, and integrating ChatGPT outputs into business workflows. The PDF export feature is available to Plus, Pro, and Team subscribers, with Enterprise and Edu support on the horizon.

ChatGPT’s Deep Research tool now includes a GitHub connector, allowing users to link their repositories for real-time analysis of codebases and engineering documentation. This integration enables developers and technical teams to ask complex questions about their actual projects, generate documentation, and break down product specifications into actionable tasks-all within ChatGPT. The connector respects existing GitHub permissions and is available to Plus, Pro, and Team users, with Enterprise access coming soon. This marks a significant step toward integrating AI directly into developer workflows and knowledge management systems.

For our business, these developments collectively enhance our ability to customise, operationalise, and scale AI across the organisation, particularly as the capabilities proliferate to our preferred provider. Reinforcement and supervised fine-tuning allow us to tailor models for both complex and routine tasks, improving accuracy and efficiency in domain-specific applications. PDF export streamlines documentation and compliance, while the GitHub connector provides the potential to bridge the gap between AI and our engineering processes, should we choose to use it.

POB’s closing thoughts

There are no scary deepfake images this week, you’ll be pleased to hear!

I’ll leave you all with a TechCrunch article which suggests that legal AI startup Harvey, which provides specialised AI tools for the legal sector, is in discussions to raise $250M at a $5B valuation. With the news this week that the company would start using AI models from Anthropic and Google, adding to the models it uses from its backer, OpenAI, it leaves me pondering what the ‘secret sauce’ is that supports this valuation. In the same way I am surprised at the $3B valuation for Windsurf, the AI development startup, as covered in EAW #12, I guess.

Oh, one final thing - someone is building an AI hedge fund. 😀

Thanks for reading, enjoy the rest of your week and have a great weekend. 👍

Regular readers will know I have created a Teams channel to discuss topics mentioned in this email, and AI in general, with your fellow readers, and of course me too. To join, use this link.

I also post the things that made my Pocket list, but didn’t make it to the email - ‘EAW Extra’ - to Teams.

I’d love to hear your feedback on whether you enjoy reading the Substack, find it useful, or if you would like to see something different in a future post. What AI topics are you most interested in for future explainers? Are there any specific AI tools or developments you'd like to see covered? Remember, if you have any questions around AI at Davies, you can reply to this message to reach me directly or drop a note to the AI mailbox.

Finally, remember that while I may mention interesting new services in this email, you shouldn’t upload or enter business data into any external web service or application without ensuring it has been explicitly approved for use.

Disclaimer: The views and opinions expressed in this post are my own and do not necessarily reflect those of my employer.

Enterprise AI Weekly