Enterprise AI Weekly #6

State of the art is no longer SOTA, MCP gets juiced up further, automation and AI go hand in hand, AI agents explained and the latest AI audio enhancements.

Mar 27, 2025

Welcome to Enterprise AI Weekly #6

Welcome to the Enterprise AI Weekly Substack, published by me, Paul O'Brien, Group Chief AI Officer and Global Solutions CTO at Davies.

Enterprise AI Weekly is a short-ish, accessible read, covering AI topics relevant to businesses of all sizes. It aims to be an AI explainer, a route into goings-on in AI in the world at large, and a way to understand the potential impacts of those developments on your business.

If you’re reading this for the first time, you can read previous posts at the Enterprise AI Weekly Substack page.

Explainer: LLM generation parameters

As we increasingly explore and implement Generative AI tools like Large Language Models (LLMs) across our business, it's essential to understand that we have significant control over how these models generate content. Beyond crafting the perfect prompt, we can use specific settings known as 'generation parameters' to fine-tune the AI's output. Think of these as the control knobs on a complex machine, allowing us to guide the AI towards producing results that are more creative, more focused, more concise, or better aligned with specific task requirements. Understanding and utilising these parameters is key to moving from generic AI responses to tailored, valuable business solutions.

Some of the most common and impactful parameters you'll encounter include 'Temperature' and 'Top-P' (sometimes called Nucleus Sampling). 'Temperature' controls the randomness or creativity of the output. A low temperature (e.g., 0.1 or 0.2) makes the model more deterministic – it will almost always pick the highest-probability next word, leading to focused, predictable, and often more factual-sounding text. This is useful for tasks requiring consistency, like summarising reports or answering factual questions. Conversely, a higher temperature (e.g., 0.7 or 1.0) increases randomness, allowing the model to consider less likely words, resulting in more diverse, creative, and sometimes surprising outputs – potentially ideal for brainstorming marketing slogans or drafting multiple creative options.
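For the more technically minded, here is a minimal Python sketch of what temperature does under the hood. It assumes the common approach of dividing the model's raw scores ("logits") by the temperature before converting them into probabilities. The function name and the three scores are made up for illustration, and real models work across many thousands of candidate tokens.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, scaled by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate next words
logits = [2.0, 1.0, 0.5]

low = softmax_with_temperature(logits, 0.2)   # sharply favours the top word
high = softmax_with_temperature(logits, 1.0)  # spreads probability more evenly

print([round(p, 3) for p in low])
print([round(p, 3) for p in high])
```

At the low setting almost all of the probability lands on the top-scoring word, while at the higher setting the alternatives keep a meaningful share, which is where the extra variety comes from.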

'Top-P' works in a related way: it tells the model to consider only the smallest set of most probable next words whose cumulative probability reaches the 'P' value. A high 'Top-P' (like 0.9) allows many possibilities, fostering creativity, while a low 'Top-P' (like 0.2) severely restricts the choices, making the output more predictable. Developers typically adjust either Temperature or Top-P, rather than pushing both to extreme values at the same time.
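The Top-P idea can be sketched in a few lines of Python. This is an illustrative simplification, not any vendor's actual implementation: the word probabilities are invented, `top_p_filter` is a hypothetical helper, and real systems differ on details such as whether the threshold must be reached or strictly exceeded.

```python
def top_p_filter(probs, p):
    """Keep the smallest set of top words whose cumulative probability
    reaches p, then renormalise so the kept probabilities sum to 1."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    kept = []
    cumulative = 0.0
    for word, prob in ranked:
        kept.append((word, prob))
        cumulative += prob
        if cumulative >= p:
            break
    total = sum(prob for _, prob in kept)
    return {word: prob / total for word, prob in kept}

# Hypothetical next-word probabilities
probs = {"claim": 0.5, "policy": 0.3, "invoice": 0.15, "banana": 0.05}

narrow = top_p_filter(probs, 0.2)  # only the single most likely word survives
wide = top_p_filter(probs, 0.9)    # three words survive; the outlier is dropped

print(sorted(narrow))
print(sorted(wide))
```

The model then samples only from the words that survive the filter, so a low value effectively removes the long tail of unlikely choices.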

Other key parameters help manage the output's structure and content. 'Max Tokens' (or 'Max Output Tokens') sets a hard limit on the length of the generated response, which is crucial for ensuring brevity, controlling API costs (typically priced per token), and preventing excessively long outputs. 'Frequency Penalty' and 'Presence Penalty' are used to discourage repetition. Frequency Penalty reduces the chance of a word being selected proportionally to how often it has already appeared, while Presence Penalty applies a one-time penalty if a word has appeared at all, helping to introduce broader vocabulary.
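To show the difference between the two penalties, here is a rough Python sketch. It assumes the formulation commonly described for OpenAI-style APIs, where each penalty is subtracted from a token's raw score; other providers may calculate this differently. The `apply_penalties` helper, the scores and the penalty values are all hypothetical.

```python
from collections import Counter

def apply_penalties(logits, generated, frequency_penalty=0.0, presence_penalty=0.0):
    """Lower the scores of tokens that have already been generated."""
    counts = Counter(generated)
    adjusted = {}
    for token, score in logits.items():
        count = counts.get(token, 0)
        # Frequency penalty: grows with every repeat of the token
        score -= count * frequency_penalty
        # Presence penalty: a flat, one-time deduction if the token has appeared at all
        if count > 0:
            score -= presence_penalty
        adjusted[token] = score
    return adjusted

# Hypothetical scores, all equal before penalties are applied
logits = {"claim": 2.0, "policy": 2.0, "report": 2.0}
generated = ["claim", "claim", "claim", "policy"]

adjusted = apply_penalties(logits, generated, frequency_penalty=0.5, presence_penalty=0.25)
print(adjusted)
```

In this example the heavily repeated word is pushed down the most, the word used once is nudged down slightly, and the unused word is untouched, which is what steers the model towards fresher vocabulary.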

It's important to note that while these concepts are common, the exact parameter names, their ranges, default settings, and specific behaviours can vary slightly between different LLMs (like OpenAI's GPT models, Google's Gemini, Anthropic's Claude, or open-source models).

Want to see these parameters in action and try them for yourself? Google AI Studio provides the option in the interface to set Temperature and Top-P, which are often only accessible via API.

For us as a business, effectively using these parameters is fundamental to successful Generative AI deployment. Tuning these controls allows us to: align with brand voice, ensuring AI-generated content consistently reflects our desired tone, whether formal, creative, or technical; optimise for specific tasks, generating concise summaries with low temperature, brainstorming creative ideas with high temperature, or drafting varied responses using Top-P; manage costs, using 'max tokens' to keep API usage under control; and improve quality and safety, reducing repetitive or nonsensical output using penalties and careful temperature settings.

Experimentation is key. By understanding what these parameters do and testing different configurations for our specific use cases, we can unlock more reliable, efficient, and valuable results from our Generative AI initiatives.

1. Last week’s SOTA is this week’s chip paper

A week is a long, long time in AI. Like extreme dog years or something. 🐶 Remember last week in EAIW #5 when we talked about exciting advancements in the latest models? Forget last week’s ‘State Of The Art’. This is 2025 week 13 and it’s all changed again.

Chinese AI startup DeepSeek has quietly unveiled (via a release on open-source AI hub Hugging Face) its latest model, DeepSeek-V3-0324, showcasing significant improvements in reasoning and coding abilities. This update builds upon the success of their acclaimed initial V3 release in December, further solidifying DeepSeek's position as a formidable competitor in the global AI landscape. The new model demonstrates enhanced performance across various benchmarks, particularly excelling in tasks requiring advanced reasoning capabilities. You can try the new model out in the Hugging Face Playground.

In keeping with their apparent new ‘release every 7 days’ cadence, Google has introduced Gemini 2.5 Pro, touting it as their most intelligent AI model to date. This experimental version incorporates advanced reasoning capabilities directly into the base model, eliminating the need for separate "thinking" designations, and marks a point where all releases will be reasoning models going forward. Gemini 2.5 Pro has achieved state-of-the-art performance on various benchmarks, including a remarkable 18.8% score on the notoriously difficult Humanity's Last Exam. This advancement signifies a significant leap in AI's ability to handle complex problems and support more capable, context-aware applications. Most significantly, the release marks Google’s arrival at the lead of the AI pack after a decidedly slow start. You can experiment with the new model for yourself, free, in Google AI Studio.

Of course, OpenAI couldn’t let Google steal all the limelight, particularly after the release of the Gemini 2.0 Flash (Image Generation) model mentioned in EAIW #4. OpenAI has integrated a groundbreaking image generation capability into their GPT-4o model. This natively multimodal approach allows for precise, accurate, and photorealistic image outputs directly within the language model. The new feature promises to unlock useful and valuable image generation, augmented by the vast world knowledge contained within the model. This integration represents a significant step forward in combining text and visual modalities, potentially revolutionising how we interact with and create visual content using AI.

It’s good. Very good. Social media is buzzing with users of the paid tier (the feature isn’t available free just yet) converting images to the ‘Studio Ghibli’ style, which works incredibly well. In my tests so far, the model has proved to be extremely impressive (if a little slow due to demand). It excels at rendering text within images (something AI image generators have historically been bad at) and has a clear understanding of context in its image creation. For example, I was able to give it some code for a UI component and it created an image of the rendered component. Impressive.

Of course, I need to include a demo of the rendering, right? 😀

Or, from this prompt: “Create a photograph of a person outside their house. In the driveway is a white Tesla model 3, which is damaged at the front as it looks like it has been in a minor accident. The person, the owner, is talking to a man who is the claims adjustor. He is holding an iPad that he's using to record information, and wearing a black gilet which has this Davies logo on it.”:

The competition between DeepSeek and Gemini highlights the ongoing debate between open-source and closed-source AI models. DeepSeek, as an open-source model, offers greater transparency, collaboration potential, and cost-effectiveness, making it more accessible to researchers and smaller entities. This approach fosters innovation and allows for community-driven improvements. On the other hand, Gemini, being a closed-source model developed by Google, benefits from substantial financial resources and proprietary control, potentially driving faster advancements and ensuring more consistent performance across applications.

The implications of this rivalry extend beyond mere technological competition. DeepSeek's open-source nature democratises AI technology, potentially leading to more diverse applications and faster global adoption, especially in underserved regions. However, it may face challenges in terms of security and quality control. Conversely, Gemini's closed-source approach allows for stricter control over access and usage, potentially reducing risks of misuse, but at the cost of limited transparency and collaborative potential. This dichotomy reflects the broader industry debate on balancing innovation, accessibility, and responsible AI development, with each approach offering distinct advantages and trade-offs for the future of AI technology.

2. MCP continues to gain traction with OpenAI support

This week brought significant updates to the Model Context Protocol (MCP) we spoke about in EAIW #2 and a surprising adoption by OpenAI, marking a major shift in the AI landscape. The MCP, an open standard developed by Anthropic, received key upgrades to enhance the security, functionality, and interoperability of AI agents. The revised specifications now include an OAuth 2.1-based authorisation framework, streamable HTTP transport, JSON-RPC batching, and comprehensive tool annotations. These improvements aim to standardise how AI models interact with external tools and data sources, making it easier for developers to build robust AI applications. You can try out MCP servers in most AI-enabled developer IDEs (e.g. Cursor, Windsurf, Trae or Cline), or Anthropic’s own Claude desktop app.
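To make 'JSON-RPC batching' a little more concrete, here is a small Python sketch of what a batched message might look like: several JSON-RPC 2.0 requests sent together as a single array. The `tools/list` and `tools/call` method names reflect my understanding of the MCP specification, while the `make_request` helper, the `lookup_claim` tool and its arguments are purely hypothetical. Treat it as an illustration of the message shape rather than a working client.

```python
import json

def make_request(request_id, method, params=None):
    """Build a single JSON-RPC 2.0 request object."""
    request = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        request["params"] = params
    return request

# A batch is simply a list of requests sent in one message
batch = [
    make_request(1, "tools/list"),
    make_request(
        2,
        "tools/call",
        # Hypothetical tool name and arguments, for illustration only
        {"name": "lookup_claim", "arguments": {"claim_id": "ABC-123"}},
    ),
]

payload = json.dumps(batch, indent=2)
print(payload)
```

Each request carries its own `id`, so the responses can be matched back to the right request even when they arrive together.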

In an unexpected move, OpenAI announced its commitment to integrating MCP support across its products. CEO Sam Altman revealed that MCP support is now available in OpenAI's Agents SDK, with plans to extend it to the ChatGPT desktop application and the Responses API. This adoption by a major player like OpenAI signifies growing momentum for MCP within both enterprise and open-source communities, potentially establishing it as the de facto standard for AI integration.

For our business, the widespread adoption of MCP could revolutionise how we integrate AI into our workflows. This standardised protocol will allow our AI systems to seamlessly connect with various data sources and tools, eliminating the need for custom integrations for each new application. As a result, our AI assistants will become more capable, able to not just process information but also take actions across multiple platforms. This development opens up new possibilities for automation, data analysis, and decision-making processes, potentially leading to significant improvements in efficiency and innovation across our organisation.

3. n8n and Zapier add AI features

This week, the workflow automation space saw significant developments with n8n securing €55 million in Series B funding and Zapier introducing new AI features and MCP support.

n8n, the Berlin-based AI-powered workflow automation platform, raised €55 million ($60 million) in a Series B round led by Highland Europe. The company has seen impressive growth, with revenues increasing 5X and doubling in the last two months alone. n8n now boasts over 3,000 enterprise customers and around 200,000 active users. The funding will be used to further develop their AI-driven workflow automation technology and expand into new markets, particularly the U.S.

Meanwhile, Zapier has introduced AI features and Model Context Protocol (MCP) support, allowing AI assistants to perform real-world actions across more than 8,000 integrated apps without custom API development. This new capability transforms AI assistants from conversational agents into fully functional digital operators, capable of sending messages, scheduling meetings, and managing data across various platforms.

In related news, a recent report highlights the surge in AI and automation adoption across enterprises. The 2025 Automation Benchmark Report reveals that 99% of organisations recognise the critical need for seamless integration and automation, though 71% still lack unified platforms to achieve this. This underscores the growing importance of AI-powered workflow solutions in addressing the complex integration needs of modern businesses.

For our business, we know the AI-enablement of workflow platforms represents a significant opportunity to enhance operational efficiency and innovation. By leveraging these advanced tools, we can automate complex processes, reduce manual errors, and free up our teams to focus on higher-value tasks. This shift towards AI-driven automation should lead to substantial productivity gains, cost savings, and improved decision-making across our organisation. As we move forward, it will be crucial to evaluate how we can best integrate these AI-powered workflow solutions into our existing systems to maximise their potential benefits.

4. Agentic patterns, clarified

Earlier this week, I came across Simon Taylor’s post on LinkedIn, discussing agentic AI practices and presenting an excellent visual representation of key patterns. [Edit: this appears to be the initial source].

The diagram provides a clear visual representation of various AI workflows, ranging from automated rule-based systems to advanced agentic practices. At its core, agentic workflows emphasise a dynamic, iterative approach where AI systems plan, execute tasks using tools, and reflect on outcomes to refine their responses. This contrasts with simpler automated or non-agentic AI workflows that follow predefined steps or act directly on user queries without deeper reasoning.

Key patterns highlighted include Tool Use, Planning, and Reflection. The Tool Use Pattern showcases how AI systems leverage external tools (e.g., databases or APIs) to enhance task execution and generate more informed responses. The Planning Pattern emphasises breaking down complex queries into smaller tasks, executing them systematically, and ensuring the final output aligns with user needs. Lastly, the Reflection Pattern introduces an iterative feedback loop where the system evaluates its results and adjusts its approach if necessary. The Agentic Retrieval-Augmented Generation (RAG) Workflow further integrates external knowledge sources to refine responses, making it ideal for handling nuanced or data-intensive queries.
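As a rough illustration of the Reflection Pattern, the Python sketch below loops between producing a draft and critiquing it until the critique has nothing more to say. The `generate` and `critique` functions are hypothetical stand-ins for LLM calls (here they are hard-coded so the example runs on its own), and the loop structure is my own simplification rather than something taken from the diagram.

```python
def run_with_reflection(task, generate, critique, max_rounds=3):
    """Draft an answer, then revise it until the critique passes
    or the round limit is reached."""
    draft = generate(task, None)
    for _ in range(max_rounds):
        feedback = critique(draft)
        if feedback is None:  # nothing left to fix
            break
        draft = generate(task, feedback)
    return draft

def generate(task, feedback):
    # Stand-in for an LLM call; a real agent would send the task
    # and any feedback to a model and return its response
    if feedback is None:
        return "Summary of call"
    return "Summary of call, reference 42"

def critique(draft):
    # Stand-in for a second LLM call that reviews the draft
    if "reference" not in draft:
        return "Include the reference number"
    return None

result = run_with_reflection("Summarise the customer call", generate, critique)
print(result)
```

The `max_rounds` limit matters in practice: without it, an agent that never satisfies its own critique would loop (and incur cost) indefinitely.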

Adopting agentic AI practices will significantly elevate our business operations. These workflows enable us to tackle complex challenges with precision, leveraging AI's ability to plan, adapt, and improve iteratively. Whether it's optimising customer interactions, streamlining decision-making processes, or enhancing data analysis capabilities, agentic approaches ensure that our AI systems deliver tailored solutions that evolve with our needs. By embedding these practices into our operations, we position ourselves at the forefront of innovation, driving efficiency and fostering deeper insights across all areas of the business.

5. AI audio is advancing too

OpenAI has launched OpenAI.fm, a platform showcasing their latest text-to-speech capabilities. This release is part of a broader rollout of next-generation audio models, including advanced speech-to-text systems that outperform previous benchmarks. The new gpt-4o-mini-tts model offers unprecedented control over tone and timing, allowing developers to instruct the AI on how to speak. These innovations are now available through OpenAI's API, enabling more sophisticated voice-powered applications. You can try it out for yourself at the demo site. The ‘Fable’ voice is my favourite!

In a controversial move, audio startup Krisp has introduced an AI feature that converts Indian English accents to American English in real time. The tool, which works with popular video conferencing platforms, aims to enhance mutual understanding in business communications. While Krisp reports improved sales conversion rates in trials, the technology raises ethical questions about cultural identity and accent bias.

Other notable audio AI developments include DeepVA's new Audio Enhancement Module, which uses AI-driven noise reduction and frequency restoration to improve speech clarity in recordings. This technology has potential applications in transcription accuracy and content moderation.

These advancements in AI audio technology will have significant implications for our claims third-party administrator business. The improved speech-to-text capabilities could streamline claims processing, while audio processing advancements will enable faster and more accurate transcription of customer calls. Accent conversion technology, while controversial, might enhance communication between claimants and adjusters from different regions. However, we must carefully consider the ethical implications and potential biases of such tools. Overall, these AI audio innovations present opportunities to improve efficiency and customer service in claims handling, but their implementation should be approached thoughtfully and responsibly.

POB’s closing thought(s)

Once again there’s been far too much happening in AI to be able to talk about everything, so remember to head on over to our Teams channel to get the full list of interesting topics.

This week I had a cool submission (thanks Nathan!), demonstrating the use of AI to provide scores on the ‘Split the G’ drinking game. Brilliant!

I was excited this week to see the Flutterflow team leaning into the vibe coding trend with Dreamflow, a Bolt / Lovable style development platform for Flutter.

Finally, you might be interested to check out Napkin, ‘the AI for business storytelling’, that ‘turns your text into visuals so sharing your ideas is quick and effective’. It’s neat, and free right now, so give it a look!

Thanks again for reading and remember to forward the post to your colleagues if you think they’d be interested in signing up. 👍

I’d love to hear your feedback on whether you enjoy reading the Substack, find it useful, or if you would like to see something different in a future post. Remember, if you have any questions around AI at Davies, you can reply to this message to reach me directly or drop a note to the AI mailbox.

I have also created a Teams channel to discuss topics mentioned in this post, and AI in general, with your fellow readers, and of course me too. To join, use this link. I’ll also post the things that made my Pocket list, but didn’t make it to the post!

Finally, remember that while I may mention interesting new services in this post, you shouldn’t put business data in any web service or application without ensuring it has been approved for use.

Disclaimer: The views and opinions expressed in this post are my own and do not necessarily reflect those of my employer.
