Enterprise AI Weekly #9
RAG explained, OpenAI RELEASE ALL THE THINGS, Shopify lays down the AI rules, Cloudflare quietly builds its AI capability, Canva goes vibe coding, and we think about moats.
Welcome to Enterprise AI Weekly #9
Welcome to the Enterprise AI Weekly Substack, published by me, Paul O'Brien, Group Chief AI Officer and Global Solutions CTO at Davies.
Enterprise AI Weekly is a short-ish, accessible read covering AI topics relevant to businesses of all sizes. It aims to be an AI explainer, a route into goings-on in AI in the world at large, and a way to understand the potential impacts of those developments on your business.
If you’re reading this for the first time, you can read previous posts at the Enterprise AI Weekly Substack page.
This week I’d like to collect some feedback on the posts. Nothing super complicated, just a couple of questions with simple links to click to submit your answer.
How do you find the technical level of the Enterprise AI Weekly post? Do you find it a bit basic, just right, or too techie?
How do you find the length of the Enterprise AI Weekly post? Do you find it too short, just right, or too long?
Thank you for answering, I’ll report back on results in next week’s issue!
Explainer: Retrieval Augmented Generation
Retrieval-Augmented Generation (RAG) is an AI framework that combines the generative power of large language models (LLMs) with real-time information retrieval. Unlike traditional LLMs, which rely solely on their pre-trained knowledge, RAG enhances responses by incorporating up-to-date, domain-specific data from external sources. This is particularly useful for tasks requiring accuracy, such as customer support, legal document analysis, or dynamic knowledge updates.
A key feature of RAG is its use of embeddings and vector databases. Embeddings are numerical representations of text, images, or other data types that capture their semantic meaning. When a user submits a query, it is converted into an embedding using the same model that generated the database embeddings. The system then performs a similarity search in the vector database to identify the most relevant data points. By retrieving contextually relevant information and feeding it into the LLM as part of the input prompt, RAG ensures that responses are both accurate and contextually grounded.
Vector databases play a pivotal role in this process by enabling fast and efficient similarity searches. They store embeddings as dense numerical vectors and use algorithms like approximate nearest neighbour (ANN) search to find matches. For example, if a query asks about "renewable energy policies," the vector database might retrieve documents related to "solar incentives" or "wind energy regulations," even if those exact terms weren't used in the query. This capability allows RAG systems to deliver highly relevant and nuanced outputs.
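To make the mechanics concrete, here's a minimal sketch in TypeScript using OpenAI's embeddings and chat APIs. The documents, model choices, and brute-force cosine similarity scan are all illustrative - a production system would use a proper vector database with ANN indexing rather than comparing the query against every document:

```typescript
import OpenAI from "openai";

const client = new OpenAI(); // reads OPENAI_API_KEY from the environment

// Toy "vector database": three documents we want the model to answer from.
const documents = [
  "Solar incentives were extended for homeowners in 2024.",
  "Wind energy regulations now require updated grid assessments.",
  "Our refund policy allows returns within 30 days of purchase.",
];

// Cosine similarity between two embedding vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

async function answerWithRag(query: string): Promise<string> {
  // Embed the documents and the query with the same embedding model.
  const embedded = await client.embeddings.create({
    model: "text-embedding-3-small",
    input: [...documents, query],
  });
  const vectors = embedded.data.map((d) => d.embedding);
  const queryVector = vectors[vectors.length - 1];

  // Retrieve the most relevant document (top-1 here for brevity).
  const best = documents
    .map((text, i) => ({ text, score: cosine(vectors[i], queryVector) }))
    .sort((a, b) => b.score - a.score)[0];

  // Feed the retrieved context into the LLM as part of the prompt.
  const completion = await client.chat.completions.create({
    model: "gpt-4.1-mini",
    messages: [
      { role: "system", content: "Answer using only the provided context." },
      { role: "user", content: `Context: ${best.text}\n\nQuestion: ${query}` },
    ],
  });
  return completion.choices[0].message.content ?? "";
}

answerWithRag("What are the current renewable energy policies?").then(console.log);
```

The key idea is visible in the final step: the retrieved text is injected into the prompt, so the model answers from your data rather than from its training set alone.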
While RAG excels in combining retrieval with generation, other approaches may be better suited for specific scenarios. Cache-Augmented Generation (CAG), for instance, preloads frequently accessed data into the model's context window for faster responses without real-time retrieval. Knowledge-Augmented Generation (KAG) integrates structured knowledge graphs to enhance logical reasoning and accuracy, making it ideal for domains like medicine or law. Additionally, fine-tuning LLMs with domain-specific data remains an option but requires significant resources and lacks the flexibility of on-demand updates provided by RAG.
For businesses adopting AI, RAG offers significant advantages in scalability, accuracy, and cost-effectiveness. By leveraging vector databases to retrieve real-time information, companies can reduce risks associated with outdated or incorrect responses while minimising the need for extensive model retraining. This makes RAG particularly appealing for industries where information changes frequently or where precision is critical.
Moreover, RAG empowers businesses to tailor AI solutions to specific use cases by integrating proprietary knowledge bases or customer data into their workflows. For example, an e-commerce platform could use RAG to provide personalised product recommendations based on live inventory data, while a healthcare provider could retrieve the latest clinical guidelines during patient consultations. In essence, RAG transforms AI from a general-purpose tool into a domain-specific expert, driving innovation and competitive advantage across industries.
1. It’s been a busy week for OpenAI
OpenAI is back in the news this week with a series of significant updates and new products, each pushing the boundaries of AI capability, safety, and application.
First came GPT-4.1 and its smaller counterparts, GPT-4.1 mini and nano, which mark a leap forward in coding, instruction-following, and long-context comprehension. These models can process up to one million tokens, matching recent Google releases (coincidence?), potentially making them highly effective for handling extensive documents and complex tasks. GPT-4.1 outperforms previous models in coding by up to 27% and is available exclusively via OpenAI’s API. The models also feature updated knowledge up to June 2024 and are more cost-efficient, which is particularly beneficial for businesses seeking scalable AI solutions: GPT-4.1 is 26% less expensive than GPT-4o, while GPT-4.1 nano is the company’s cheapest and fastest model to date.
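Since the new family is API-only, trying it is a single call with the official Node SDK. A minimal sketch - the model identifiers come from OpenAI's announcement, and the prompt is just an example:

```typescript
import OpenAI from "openai";

const client = new OpenAI(); // reads OPENAI_API_KEY from the environment

// Swap the model string for "gpt-4.1-mini" or "gpt-4.1-nano" to trade
// capability for cost and speed; all three share the long context window.
const completion = await client.chat.completions.create({
  model: "gpt-4.1",
  messages: [
    { role: "user", content: "Summarise the key risks in this contract: ..." },
  ],
});

console.log(completion.choices[0].message.content);
```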
OpenAI’s o3 and o4-mini reasoning models also arrived this week, and are designed for advanced reasoning, excelling in coding, math, science, and visual perception. The o3 model pushes the state of the art in analytical tasks and visual understanding, while o4-mini offers fast, cost-effective performance, especially in math and coding. Both models demonstrate improved instruction-following and provide more natural, conversational responses by leveraging memory and web sources. Their multimodal abilities allow them to interpret and reason with images, sketches, and diagrams, opening new possibilities for visual data analysis and creative ideation.
Following the recent release of Claude Code, OpenAI has launched Codex CLI - an open-source command-line tool that brings OpenAI’s latest reasoning models directly to developers’ terminals. Powered by o4-mini by default, Codex CLI can autonomously read, modify, and execute code locally, supporting macOS and Linux (with experimental Windows support). This tool is a step towards agentic AI, allowing developers to automate complex coding and system tasks, manage files, and even interact with other software tools - all from the command line.
With the release of o3 and o4-mini, OpenAI has also implemented its most comprehensive safety programme to date, referencing an updated "Preparedness Framework". This includes new safeguards against biorisks and other high-risk misuse scenarios, with the company reserving the right to adjust its safety standards in response to the broader AI landscape.
OpenAI is reportedly developing an X-like social media platform, with an internal prototype that integrates ChatGPT’s image generation into a social feed. While it’s unclear if this will become a standalone app or a feature within ChatGPT, the move signals OpenAI’s ambition to compete with established platforms like X (formerly Twitter) and Meta. It would give OpenAI access to real-time data streams, enhancing its models and expanding its reach in digital content and interaction. One also wonders whether it’s a new front in the ongoing battle between Sam Altman and Elon Musk, linked to OpenAI’s non-profit status.
At TED 2025, CEO Sam Altman addressed the profound opportunities and governance challenges posed by AI. He emphasised the transformative potential of AI but also acknowledged the moral and societal dilemmas that come with unprecedented technological power. Altman’s remarks highlighted the need for robust oversight and transparency as AI becomes integral to daily life and business operations - it’s well worth a watch.
Finally, rumours are swirling that OpenAI is considering acquiring Windsurf, the AI-enabled code editor, for $3bn. This would be surprising if it happens, especially at this price, but let’s see. It could be interesting for the direction of the product and the vibe coding space.
These developments are highly relevant to us as they signal the availability of AI tools that are more powerful, flexible, and safer than ever before. Enhanced reasoning, multimodal capabilities, and tools like Codex CLI can streamline workflows, boost productivity, and unlock new avenues for innovation. At the same time, OpenAI’s focus on safety and governance aligns with our commitment to responsible AI adoption. Staying informed and engaged with these advancements ensures we remain competitive, agile, and prepared to leverage the full potential of AI in our business operations.
2. Reflexive AI use at Shopify: a new baseline
Shopify CEO Tobi Lütke’s recent tweet and internal memo have potentially set a new standard for AI adoption in the workplace.
The memo reads “Reflexive AI usage is now a baseline expectation at Shopify.” This marks a significant cultural and operational shift for the company, moving from encouraging AI experimentation to mandating its integration into daily workflows. Employees are no longer simply asked if they use AI, but rather how effectively they are leveraging it to drive results. AI tools are now as fundamental to Shopify’s operations as email or Slack, with proficiency in AI factored into performance reviews and resource allocation. Teams must demonstrate that a task cannot be accomplished by AI before requesting additional resources, underscoring the seriousness of this mandate.
Lütke frames this change as both an opportunity and a necessity, aligning it with Shopify’s core values of constant learning and thriving on change. He emphasises that AI is not just a tool for automation, but a “creative and productivity multiplier” that empowers employees to become more effective thought partners, researchers, critics, tutors, and pair programmers. The expectation is that everyone, regardless of their current familiarity with AI, will embrace these tools to augment their skills and fill gaps in their expertise. This is positioned as the most rapid transformation in work practices Lütke has witnessed in his career, and he invites all employees to experiment and learn collaboratively.
The impact of this shift extends beyond internal operations. Shopify’s approach is influencing the broader e-commerce industry, signalling to competitors, partners, and merchants that AI proficiency is now a core competency for future success. By democratising access to user-friendly AI tools, Shopify aims to lower the complexity barrier for entrepreneurs and small businesses, enabling them to achieve more with fewer resources. The integration of AI into every stage of project development - from ideation to prototyping and execution - sets a new benchmark for innovation and efficiency in digital commerce.
Shopify’s move to make reflexive AI use a baseline expectation is a clear sign of where the future of work is headed. As AI becomes more deeply embedded in everyday business processes, organisations that proactively build AI literacy and integrate these tools into their culture will gain a significant competitive edge. This shift is not just about automating routine tasks; it’s about empowering employees to think bigger, move faster, and deliver more value with less friction. For any business, embracing a reflexive AI mindset means staying relevant in a landscape where adaptability, continuous learning, and technological fluency are key drivers of growth and resilience.
3. Cloudflare is quietly ramping up its AI capability
It may not be the first name you think of when it comes to AI (although I am a long-time fan, as a customer since 2011!), but Cloudflare continues to position itself as a key platform for building, deploying, and securing AI-powered applications at scale.
With the recent launch of Cloudflare for AI, the company has unified its growing suite of AI tools and infrastructure, making it easier for organisations of all sizes to leverage advanced AI capabilities - while ensuring robust security, compliance, and operational resilience. This integrated approach is designed to accelerate AI adoption, reduce complexity, and empower businesses to innovate confidently in a rapidly evolving digital landscape. Here are a few recent announcements.
Cloudflare Workflows has reached general availability, offering a production-ready, durable execution engine for building resilient, multi-step applications on Cloudflare Workers. Workflows allow developers to break down complex processes - such as order fulfilment, payment processing, or AI agent orchestration - into discrete, retriable steps with automatic state persistence. Notably, Workflows are tightly integrated with the Agents SDK, enabling developers to build sophisticated AI agents that can coordinate tasks, wait for external events (including human-in-the-loop approvals), and recover gracefully from failures. This architecture is especially powerful for agentic AI, where agents must manage long-running, stateful interactions with both users and external systems.
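As a flavour of what this looks like in practice, here's a minimal Workflow sketch following the shape of Cloudflare's documented Workers API - the class, step names, and business logic are all illustrative:

```typescript
import { WorkflowEntrypoint, WorkflowStep, WorkflowEvent } from "cloudflare:workers";

type Env = {};
type Params = { orderId: string };

// Each step.do() runs as a discrete, retriable unit whose result is
// persisted, so the workflow survives crashes and restarts mid-flow.
export class OrderWorkflow extends WorkflowEntrypoint<Env, Params> {
  async run(event: WorkflowEvent<Params>, step: WorkflowStep) {
    const order = await step.do("fetch order", async () => {
      // Illustrative: look the order up in a database or API.
      return { id: event.payload.orderId, total: 42 };
    });

    // Durable sleep: the workflow is evicted and resumed later,
    // costing nothing while idle.
    await step.sleep("cooling-off period", "10 seconds");

    await step.do("charge payment", async () => {
      // Illustrative: call a payment provider. A transient failure
      // here is retried automatically without repeating earlier steps.
    });

    return order;
  }
}
```

Because each step's result is persisted, a crash between "charge payment" and completion resumes from the last successful step rather than re-running the whole flow.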
AutoRAG is Cloudflare’s new fully managed solution for Retrieval-Augmented Generation (RAG) pipelines, now in open beta. Traditionally, building a RAG system required stitching together vector databases, embedding models, LLMs, and custom logic - often resulting in brittle and hard-to-maintain infrastructure. AutoRAG abstracts away this complexity: just point it at your data (such as files in R2, Cloudflare’s S3 equivalent object storage), and it automatically handles ingestion, chunking, embedding (using Workers AI), vector storage (in Vectorize), semantic retrieval, and high-quality response generation. AutoRAG continuously monitors and reindexes your data sources, ensuring your AI remains up-to-date without manual intervention. This makes it ideal for powering AI search, chatbots, and internal knowledge assistants with fresh, context-aware responses.
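Cloudflare's documentation shows AutoRAG being queried from a Worker through an AI binding. A sketch of that shape - the instance name is hypothetical, and you'd first create the AutoRAG instance (pointed at an R2 bucket) in the dashboard:

```typescript
// A Worker that asks an AutoRAG instance to retrieve relevant chunks
// from indexed R2 content and generate a grounded answer.
export default {
  async fetch(request: Request, env: { AI: any }): Promise<Response> {
    const { query } = (await request.json()) as { query: string };

    // "company-kb" is a hypothetical AutoRAG instance name.
    const result = await env.AI.autorag("company-kb").aiSearch({ query });

    return Response.json(result);
  },
};
```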
Cloudflare has embraced the Model Context Protocol (MCP), the emerging open standard for enabling AI agents (like Claude or Cursor) to interact with external tools and services via MCP servers. With Cloudflare’s new remote MCP server tooling, developers can now deploy MCP servers directly to the Cloudflare platform, making them securely accessible over the Internet - complete with OAuth-based authentication and persistent context. This unlocks a new paradigm: AI agents can not only issue instructions but also execute real-world tasks (like sending emails, updating databases, or triggering workflows) on behalf of users. The integration with the Agents SDK and Workflows means these servers can orchestrate complex, multi-step operations, further blurring the line between conversational AI and autonomous digital workers.
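Cloudflare's launch materials show remote MCP servers built with its McpAgent class on top of the official MCP TypeScript SDK. A minimal sketch in that style - the tool name and logic are illustrative:

```typescript
import { McpAgent } from "agents/mcp";
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { z } from "zod";

// A remote MCP server exposing one tool that agents like Claude can
// discover and call over the Internet.
export class MyMCP extends McpAgent {
  server = new McpServer({ name: "Demo", version: "1.0.0" });

  async init() {
    // Register an "add" tool with a typed parameter schema.
    this.server.tool(
      "add",
      { a: z.number(), b: z.number() },
      async ({ a, b }) => ({
        content: [{ type: "text", text: String(a + b) }],
      })
    );
  }
}

// Serve the agent over SSE so remote MCP clients can connect.
export default MyMCP.mount("/sse");
```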
Cloudflare’s broad AI platform is built on a set of robust capabilities. Workers AI runs inference on 50+ open-source models (text, vision, embeddings, and more) on serverless GPUs, globally distributed for low latency and scalability, with pay-as-you-go pricing and no infrastructure management overhead. Vectorize is a fully managed vector database for storing and querying embeddings - essential for semantic search, recommendations, anomaly detection, and powering RAG workflows. AI Gateway is a platform to centralise, monitor, and control AI application traffic, with analytics, caching, rate limiting, request retries, and model fallback; it integrates seamlessly with Workers AI and third-party providers, offering observability and cost management.
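For completeness, invoking a model on Workers AI from inside a Worker is a single binding call. A sketch - the model identifier is one of the open models in the catalogue, and the prompt is illustrative:

```typescript
// A Worker that runs a hosted open-source model on Cloudflare's serverless GPUs.
export default {
  async fetch(request: Request, env: { AI: any }): Promise<Response> {
    const answer = await env.AI.run("@cf/meta/llama-3.1-8b-instruct", {
      prompt: "Explain what a vector database is, in one sentence.",
    });
    return Response.json(answer);
  },
};
```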
Cloudflare’s AI offerings are directly relevant for any business seeking to accelerate digital transformation, improve operational efficiency, and unlock new value from data. By providing a unified, secure, and developer-friendly platform, Cloudflare lowers the barriers to deploying advanced AI - whether you’re building customer-facing chatbots, automating internal workflows, or experimenting with agentic AI. The platform’s focus on security (including data privacy, prompt filtering, and abuse prevention), cost efficiency (serverless, pay-as-you-go), and scalability (global edge deployment) ensures that innovation does not come at the expense of risk or complexity.
4. An unexpected player enters the vibe coding space
Canva has made a surprise move by launching its own AI code generator, known as Canva Code, directly within its popular design platform. This new feature enables users to create interactive mini-apps - such as custom calculators, quizzes, or interactive maps - simply by describing what they want in plain language. The AI, powered by Anthropic’s Claude large language model, then generates the necessary code and embeds the resulting widget seamlessly into any Canva design. This marks a significant step in democratising software development, as it empowers non-programmers - designers, marketers, educators, and small business owners - to add sophisticated functionality to their projects without writing a single line of code.
Unlike traditional coding, where deep technical knowledge and manual programming are prerequisites, Canva’s AI code generator lowers the barrier to entry. Users can now prompt the AI assistant with requests like “create an interactive map that highlights each region when clicked,” and instantly receive a working, customisable widget. Canva’s integration of this technology within its familiar drag-and-drop interface makes the experience accessible and intuitive, especially for those whose primary expertise lies outside software engineering - marketing teams, for example.
The launch of Canva’s AI code generator is poised to introduce the concept of vibe coding to millions of new users. This method has already gained traction among tech enthusiasts and startups, but Canva’s massive user base means that everyday creators - many with little or no coding experience - will now have the opportunity to experiment with and benefit from this paradigm.
For businesses, the implications of Canva’s AI code generator, and the expansion of vibe coding tools from development platforms to other products, could be profound. By making software creation vastly more accessible, companies can empower a wider range of employees to contribute to digital innovation. Product managers, marketers, and other non-technical staff can now prototype tools, automate workflows, or enhance customer-facing content without waiting for engineering resources. This leads to faster product development cycles, reduced costs, and the ability to test and iterate on ideas with unprecedented speed.
Organisations that embrace these tools will be better positioned to respond to market changes, customise internal processes, and deliver richer experiences to their customers. However, it’s important to remain mindful of the limitations: while AI-generated code can accelerate development, it still requires oversight to ensure quality, security, and compliance. Ultimately, Canva’s move signals a future where creativity and technical implementation are more tightly integrated - and where the ability to “vibe code” may become a vital skill for business teams across industries.
5. What is a business moat, in the AI future?
A business moat is a sustainable competitive advantage that protects a company from competitors, much like a medieval moat protects a castle (or protects our characters above!). In the context of AI and rapid technological change, the nature of these moats is shifting. Traditionally, moats were built on economies of scale, proprietary technology, network effects, high switching costs, and strong brands. Companies like Microsoft, Google, and Meta have historically leveraged these moats to maintain dominance.
However, the AI revolution is disrupting these traditional defences. The rise of open-source AI models and cloud platforms, together with the incredible growth in capability alongside falling costs, means that access to cutting-edge technology is no longer limited to a few giants. Instead, smaller companies and startups can now build powerful applications using open-source tools and cloud infrastructure. This shift is eroding the value of traditional moats based solely on proprietary technology or scale, making it harder for any single company to maintain an unassailable lead.
According to Jerry Chen’s article "The New New Moats", the future of defensibility in business lies in building what he calls "systems of intelligence" - AI-powered applications that create value by integrating and analysing data from multiple sources. These systems go beyond simply owning data; they derive actionable insights and automate workflows, making them deeply embedded in customer operations. For example, an application that combines web analytics, customer data, and social signals to predict user behaviour creates a defensible position that’s hard for competitors to replicate.
Chen argues that while open-source and cloud have made deep technology a shallower moat, defensibility can still be achieved by focusing on customer experience, workflow integration, and solving specific industry problems. The new moats are not about having the biggest model or the most data, but about delivering unique value through intelligent applications that are hard to displace once adopted.
In essence, the new moats are a return to the fundamentals: delivering value, building trust, and embedding your solution into the core workflows of your customers.
Despite the technological upheaval, Chen notes that the fundamentals of business building remain unchanged. The most enduring companies will still be those that master go-to-market strategies, develop strong brands, and create sticky products with high switching costs. In the AI era, the ability to integrate AI into core business processes and deliver tangible outcomes will be the differentiator. Companies that focus solely on technology without solving real customer problems risk being caught "between open source and a cloud place," vulnerable to rapid commoditisation.
Understanding and building business moats in the AI era is crucial for our long-term competitiveness. As AI and open-source tools level the playing field, our defensibility will come from how we apply these technologies to solve unique customer problems, integrate into their workflows, and deliver consistent value. This means investing in customer experience, building trust, and ensuring our solutions become indispensable to our clients. By creating systems of intelligence and embedding ourselves deeply in customer operations, we can create the kind of moat that not only survives technological shifts but thrives on them.
POB’s closing thought(s)
Well done, you read to the end! As a reward I have a little treat. It’s Easter this week, of course, so to celebrate, I have hidden an Easter egg somewhere in this post! There is a hidden link to a spreadsheet with ten free codes for Perplexity Pro, worth $200 each. The first ten of you to find it get a licence (for personal use)! No codes left!
You may have noticed the icons in this post (normally fully licensed ‘Color’ icons from Icons8) have been replaced with custom-generated cartoons. I’ve been experimenting with consistent image creation using ChatGPT 4o. It’s neat, no?
Over the last few posts, I’ve talked about large context window LLMs, supporting a million tokens or more, so it’s interesting to see OpenAI going that route too. But do large context windows work? This article on VentureBeat entitled ‘Bigger isn’t always better: Examining the business case for multi-million token LLMs’ suggests not.
Finally, a colleague asked me yesterday whether I just get AI to draft this post. The answer is no, but of course AI helps me out! I am unhealthily obsessed with following what is happening in AI, so throughout the week, whenever I see something interesting, I store it away in Pocket. Then, when I have a bit of spare time, I prepare segments for those items, using AI to check my writing and suggest improvements as needed. But one day…!
That’s it from me this week. AI progress keeps coming fast; one day I won’t be sending these out every single week, but there are no signs of things slowing down yet.
Happy Easter if you celebrate it, and hope you have a great weekend. 👍
Thanks for reading and I’d love to hear your feedback on whether you enjoy reading the Substack, find it useful, or if you would like to see something different in a future post. Remember, if you have any questions around AI at Davies, you can reply to this message to reach me directly or drop a note to the AI mailbox.
If you’re reading this for the first time, you can read previous posts at the Davies AI Substack page.
I have also created a Teams channel to discuss topics mentioned in this post, and AI in general, with your fellow readers, and of course me too. To join, use this link. I’ll also post the things that made my Pocket list, but didn’t make it to the post!
Finally, remember that while I may mention interesting new services in this post, you shouldn’t put business data in any web service or application without ensuring it has been approved for use.
Disclaimer: The views and opinions expressed in this post are my own and do not necessarily reflect those of my employer.