Enterprise AI Weekly #39
NotebookLM updated, McKinsey's 'State of AI 2025' survey, GPT-5.1 and ChatGPT group chats, Google's 5-day agents course, AI self-awareness, a developer study, celebrity AI voices and cheap model training
Welcome to Enterprise AI Weekly #39
You’re reading the Enterprise AI Weekly Substack, published by me, Paul O'Brien, Group Chief AI Officer and Global Solutions CTO at Davies.
Enterprise AI Weekly is a short-ish, accessible read, covering AI topics relevant to businesses of all sizes. It aims to be an AI explainer, a route into goings-on in AI in the world at large, and a way to understand the potential impacts of those developments on your business.
Alongside the main newsletter, I’m also exploring the capabilities of AI-Enhanced Development (AIED) and Vibe Coding. I set aside a bit of time in my week for keeping up with tech and doing this sort of thing - usually a Sunday morning - so I’m reserving an hour each week to create some interesting things with AI. Our first project was Boring Expenses, created to demonstrate the Vibe / AIED process and to show what can be achieved in only one virtual working day. If you haven’t seen the finished product yet, head on over to our “Boring demo”. 😊
If you’re reading this for the first time, you can read previous posts at the Enterprise AI Weekly Substack page. Enterprise AI Weekly is now available for anyone to sign up at https://enterpriseaiweekly.com! Please share the link and encourage others who might find it interesting to sign up.
There’s a cool new NotebookLM update out!
I normally talk about NotebookLM, Google’s AI-powered research and writing assistant, in the final paragraph of an issue, but there’s been a big update this week… and it’s worthy of being here up top!
The update adds a new, much-requested Deep Research feature along with expanded support for a variety of common file types. The Deep Research tool enhances the research process by autonomously exploring and analysing sources to produce detailed reports. This allows users to assemble a rich, evidence-based knowledge base without interrupting their workflow. Deep Research works by reading uploaded documents, cross-checking public web data, and generating multi-layered reports with summaries, tables, and citations.
Alongside Deep Research, NotebookLM now supports more file types to better integrate into users’ diverse work environments. These newly supported file types include Google Sheets, PDFs stored in Google Drive, Microsoft Word (.docx) files, images such as photos of handwritten notes or brochures, and Drive files via URLs. This broader file support allows users to directly add spreadsheets for structured data analysis, upload research papers or reports without downloading them locally, and include images as part of their notebooks. Previously supported source types like Google Docs, Slides, website URLs, and YouTube videos remain available, but these new additions significantly enhance convenience and functionality.
Google emphasises that this update is about helping users build a systematic, usable knowledge base without leaving their existing workflow by combining tools like Deep Research, data tables, and artefacts within the ecosystem of Google Drive, Docs, and Slides. The update reflects an effort to make NotebookLM a more comprehensive assistant for deep research, synthesis, and content creation in enterprise or academic settings. If you’ve not given it a try yet, check it out!
Now, onto the rest of the news. Enjoy EAIW #39!
1. McKinsey release their ‘state of AI in 2025’ survey report
Nearly nine out of ten large organisations now use artificial intelligence in at least one function, but only about a third have succeeded in scaling AI across their enterprise, according to McKinsey’s 2025 State of AI survey. Although interest in agentic AI - systems that can plan and execute multi-step tasks - is high, deployment is mostly limited to IT and knowledge management, and the overall adoption landscape remains marked by experimentation and pilots rather than full operational integration.
Key Findings:
Widespread AI experimentation: 88% of companies report using AI in at least one area, up from 78% last year, with the majority still in the pilot phase. Only a third have achieved full-scale deployment, and these are more likely to be larger businesses - nearly half of companies with revenues above $5 billion have reached scaling, compared to just 29% of those under $100 million.
Agentic AI on the rise: 62% of respondents say their organisations are at least experimenting with agentic AI, but only 23% are scaling such systems, primarily in IT and knowledge management roles. For most business functions, agent deployment is still rare, with fewer than 10% of companies scaling agentic AI beyond isolated use cases.
High performers lead: A small segment (about 6%) reports attributing more than 5% of their organisation’s operating profit (EBIT) to AI. These companies pursue ambitions beyond efficiency, using AI for innovation, growth, and competitive differentiation - and are three times more likely to anticipate transformative change. High performers also invest heavily, with over a third allocating more than 20% of their digital budgets to AI, and routinely redesign individual workflows to maximise business impact.
Workforce impact mixed: 32% expect workforce reductions from AI over the next year, while 43% forecast no change, and 13% anticipate hiring for new roles. Larger organisations and high performers are the most likely to expect a significant shift in workforce size, sometimes both reducing headcount in automated functions and hiring for AI-specific skills such as software and data engineering.
Rising risk mitigation: Companies are managing more AI-related risks, addressing concerns about inaccuracy, privacy, explainability, and compliance as experience grows. High performers see more issues - especially with intellectual property and regulation - because of their broader, mission-critical AI integration.
The findings highlight the importance of not just experimenting with AI, but developing solid strategies for scaling these technologies. For enterprises, the real value emerges when leadership actively sponsors AI initiatives, invests in skills and infrastructure, and prioritises workflow redesign. The survey is a timely reminder that the competitive gap is widening - not just between those who use AI and those who don’t, but between those who manage to weave it deeply across the fabric of their business and those still testing the waters. As we continue our AI adoption journey, it’s crucial that we focus on integration, risk management, and measurable impact rather than chasing buzzwords. If you notice an AI capability working well in one business function, perhaps it’s time to ask - “shouldn’t we consider scaling that across the business?”
2. OpenAI releases GPT-5.1 and group chats for ChatGPT
OpenAI has just released GPT‑5.1 for ChatGPT, introducing improvements to both intelligence and ease of use. The upgrade sees the main model split into two flavours: GPT‑5.1 Instant, focused on speed and conversational warmth, and GPT‑5.1 Thinking, which brings extra rigour and clarity to complex queries. The headline change is a more user-friendly experience, with responses that now feel friendlier and more natural in tone, all while maintaining sharpness in reasoning where it matters (hopefully avoiding the sycophancy issues we discussed in EAIW #11)!
GPT‑5.1 introduces new “personality presets”, letting users select how ChatGPT comes across in conversation - Professional, Candid, and Quirky, alongside refreshed versions of the Default, Friendly, and Efficient modes. Users can not only tailor the tone to suit business or banter, but also directly control traits such as warmth, brevity, and even emoji use. These controls take effect instantly across chats, making it less of a faff to tune responses - ideal for enterprise settings where adaptable communication style matters. The model also shows marked improvement in following instructions to the letter, addressing an issue that sometimes tripped up previous versions.
Another major upgrade is improved adaptive reasoning - GPT‑5.1 dynamically judges whether a task is simple or complex and adjusts its processing accordingly. For quick questions, you get rapid, snappy answers, while requests that require deeper thinking receive more thorough, measured responses. This helps maintain productivity in workplace chats and ensures no topic gets short-changed through underthinking. The update also sidesteps the detached, “cold” feel of earlier versions, making interactions with ChatGPT less robotic and more engaging.
Alongside GPT‑5.1, OpenAI is piloting group chats for ChatGPT, now available in select regions. This allows teams of up to twenty to collaborate with the AI in a shared chat space, ideal for brainstorming or decision-making sessions. You can start a group via invite link, set custom instructions for each chat, and rely on ChatGPT to chime in only when contextually relevant - rather than hijacking the chat thread like a disruptive colleague. Notably, group chat responses are powered by GPT‑5.1 Auto, which automatically picks the best model for each participant based on their plan, and privacy is maintained as the bot doesn’t use individual memory within groups.
For enterprise, both upgrades mean a smarter, more adaptable AI at your fingertips. The ability to set communication styles ensures we can rely on ChatGPT for both sensitive internal messaging and external-facing articles without a rewriting marathon. Group chat enables robust cross-team collaboration with AI support, helping keep projects on track and integrating diverse perspectives.
3. Google launches a 5-Day AI agents course
A new Kaggle-hosted 5-Day AI Agents Intensive course, delivered in partnership with Google, aims to equip participants with practical expertise in building and deploying AI agents - from basic structures to multi-agent systems fit for enterprise environments. Each day introduces a pillar of the agent ecosystem, blending theory and hands-on code labs to demystify how agents work and how they can be evaluated and scaled.
Course Structure and Highlights:
Daily Pillars: Each of the five days is dedicated to a major theme: agent architectures, embedded tools, memory and evaluation, agent workflows, and productionisation. Attendees interact with white papers and code labs, supporting their learning with live discussion on Discord and YouTube.
Practical Assignments: Early modules get hands-on with building agents using Python, integrating Google’s Gemini tools via the ADK, and creating agents that collaborate or operate in parallel. By mid-week, participants tackle real-world connections like function calling - for instance, plugging SQL tools into a chatbot and orchestrating workflows in LangGraph.
Capstone Project: The course culminates with a project where skills are combined to build a fully functioning AI agent, employing MLOps techniques such as Vertex AI’s foundation model management and AgentOps for robust deployment and monitoring.
Features and Enterprise Relevance:
Tool Integration: The course introduces practical agent toolkits, showing how custom Python functions can be transformed into interactive tools for agents - crucial for integrating bespoke business logic into automated workflows.
Agent Evaluation: Rigorous emphasis is placed on evaluating agents using real-world test sets, enabling participants to understand regression, develop test suites, and use evaluation files for reliable agent benchmarking.
Collaboration and Production: Techniques for coordinating multiple agents (sequential and parallel workflows) are covered, along with guidance on bringing solutions to production safely and effectively - a vital link for scaling projects in any large organisation.
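As a rough illustration of the three ideas above (a plain Python function exposed as a tool, a tiny evaluation set, and sequential versus parallel steps), here is a minimal sketch using only the Python standard library. It does not use the ADK or any course material: `tool`, `TOOLS`, `call_tool`, `run_sequential`, `run_parallel`, `lookup_policy`, `word_count` and the test cases are all hypothetical names invented for this example.

```python
import inspect
from concurrent.futures import ThreadPoolExecutor

# Hypothetical registry: tool name -> (function, description, parameter names).
TOOLS = {}


def tool(func):
    """Register a plain Python function so an agent could discover and call it."""
    params = list(inspect.signature(func).parameters)
    TOOLS[func.__name__] = (func, inspect.getdoc(func) or "", params)
    return func


@tool
def lookup_policy(policy_id: str) -> str:
    """Return the status of a policy (stubbed business logic)."""
    return "active" if policy_id.startswith("P") else "unknown"


@tool
def word_count(text: str) -> int:
    """Count the words in a piece of text."""
    return len(text.split())


def call_tool(name, **kwargs):
    """What an agent runtime would do once the model has picked a tool and arguments."""
    func, _description, _params = TOOLS[name]
    return func(**kwargs)


def run_sequential(steps, value):
    """Feed the output of each step into the next one."""
    for step in steps:
        value = step(value)
    return value


def run_parallel(steps, value):
    """Run independent steps on the same input at the same time."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(lambda step: step(value), steps))


# A tiny evaluation set: (tool name, arguments, expected result).
EVAL_CASES = [
    ("lookup_policy", {"policy_id": "P123"}, "active"),
    ("lookup_policy", {"policy_id": "X999"}, "unknown"),
    ("word_count", {"text": "agents need tests"}, 3),
]


def evaluate(cases):
    """Return the fraction of cases where the tool output matches the expectation."""
    passed = sum(call_tool(name, **args) == expected for name, args, expected in cases)
    return passed / len(cases)


if __name__ == "__main__":
    print(sorted(TOOLS))
    print(run_sequential([str.strip, str.upper], "  hello "))
    print(run_parallel([len, str.upper], "abc"))
    print(evaluate(EVAL_CASES))
```

Re-running `evaluate` after every change to a tool is the simplest form of regression check: if the score drops, something that used to work has broken. Real agent frameworks add a model that chooses which tool to call, which this sketch deliberately leaves out.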
For enterprise AI teams, courses like this represent an accessible route to upskilling on agent technology - even if you’re not using the Google stack - which is increasingly necessary as business and workflow automation migrate to agent-powered platforms. The rigour and practicality of Google’s curriculum can accelerate internal developments, whether creating customer service agents, automating knowledge work, or industrialising internal tools. Moreover, with direct applications for monitoring and evaluation already embedded, this approach aligns well with our priorities for safe, scalable AI that’s measurable and secure.
4. Signs of self-awareness in the AI machine
Anthropic’s latest research delves into the mysterious world of AI self-awareness, asking a rather human question: can large language models actually introspect, or are they simply bluffing with plausible-sounding answers? Using both creative methodology and digital neuroscience tricks, the team has unearthed evidence that contemporary models, especially Claude Opus 4 and 4.1, may possess a rudimentary form of introspection - a limited but intriguing ability to observe and report on their own internal state. Don’t worry, the machines aren’t about to upstage Descartes just yet, but as enterprise leaders, understanding where this could lead is vital.
Introspection, in the AI context, is the capacity for a model to “notice” what’s happening in its own neural circuits and communicate this when prompted. Anthropic’s experiments - using a clever technique called concept injection - involved subtly manipulating a model’s neural state and then quizzing it about the change. For instance, after inserting a neural fingerprint corresponding to the idea of “all caps”, the model sometimes identified this intruder before it ever affected its output. However, this ability is far from reliable: Claude Opus 4.1 only spotted these injections about 20% of the time, often failing or, when pushed too hard, hallucinating in spectacularly odd ways.
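To make the idea of concept injection a little more concrete, here is a toy sketch in plain Python. It is not Anthropic’s code, and it simplifies the method heavily: the four-number “activations”, the `concept_vector`, `inject` and `projection` helpers, and the strength value are all assumptions made up for illustration. The real experiments work on a model’s internal activations and then ask the model in words whether it notices anything unusual; the projection below is only a stand-in to show that the internal state has shifted.

```python
def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def concept_vector(with_concept, without_concept):
    """Difference of mean activations: a crude 'fingerprint' for a concept."""
    a = mean_vector(with_concept)
    b = mean_vector(without_concept)
    return [x - y for x, y in zip(a, b)]


def inject(activation, concept, strength):
    """Add a scaled copy of the concept fingerprint to an activation."""
    return [h + strength * c for h, c in zip(activation, concept)]


def projection(activation, concept):
    """How strongly an activation points along the concept direction."""
    dot = sum(h * c for h, c in zip(activation, concept))
    norm = sum(c * c for c in concept) ** 0.5
    return dot / norm


# Made-up activations recorded with and without an "all caps" prompt.
caps = [[1.0, 0.0, 2.0, 0.0], [1.0, 0.0, 4.0, 0.0]]
plain = [[1.0, 0.0, 0.0, 0.0], [1.0, 0.0, 0.0, 0.0]]

fingerprint = concept_vector(caps, plain)   # [0.0, 0.0, 3.0, 0.0]
neutral = [0.5, 0.5, 0.0, 0.5]              # activation for an unrelated prompt
steered = inject(neutral, fingerprint, strength=2.0)

print(projection(neutral, fingerprint))  # 0.0
print(projection(steered, fingerprint))  # 6.0
```

The unrelated prompt starts with no component along the “all caps” direction and ends up with a large one after injection. The interesting research question is whether the model itself can report that change before it shows up in its output.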
Even with its inconsistencies, this level of self-awareness is novel in commercial AI. The research suggests that as models become larger and better-trained, their introspective faculties may improve, offering potential for trustworthy, debuggable AI in enterprise settings. Yet the ability is context-sensitive and patchy today, with models sometimes offering almost comedic justifications when things go awry. The mechanisms behind these glimpses of introspection remain speculative, hinting at a patchwork of narrow circuits rather than a grand, unified ‘self-monitoring’ system.
If AIs develop reliable self-reporting, this could revolutionise model transparency and error tracking, a win for teams seeking assurance before rolling new systems into production. The research also reveals that models can modulate their own representations if incentivised, suggesting future potential for AI to not only “know its mind” but also deliberately steer in response to operational, compliance, or risk requirements. However, there’s an important warning: just as people are not always honest or self-aware, neither are these models, and their “thoughts” can be manipulated.
For enterprises, these developments sit at an important crossroads: AI may be moving from an unpredictable black box to a system that might, one day, explain “what went wrong” in language we can audit. This enhances confidence in deploying agents for critical roles and speeds up compliance checks where traceable AI decision-making is mandated. Anthropic’s openness about both the potential and limits of model introspection is refreshing - and a useful reminder for us as we consider how, when, and where to deploy next-generation AI solutions. As always, it’s best approached with professional curiosity and a hint of healthy scepticism.
5. The SPACE of AI for developers: a Microsoft study
Microsoft Research’s paper, “The SPACE of AI: Real-World Lessons on AI’s Impact on Developers”, dives into the true effects of artificial intelligence on software engineering - moving past tired debates about robots taking jobs and zeroing in on actual developer experience. The research goes big, drawing on both surveys and interviews from over 500 developers (mainly inside Microsoft but with voices from other major tech firms) and framing its findings through the SPACE lens: Satisfaction, Performance, Activity, Collaboration, and Efficiency. It’s a refreshingly human take on a topic too often discussed without speaking to those who live it day-to-day.
Developers are overwhelmingly integrating AI into their daily routines, with 75% using these tools regularly. While there’s talk in some corners about skill decline or AI “gimmicks”, most survey respondents are not worried about being replaced any time soon. Instead, developers report that AI is doing what it should - it’s augmenting them, cutting through the repetitive bits, and freeing up time for problems that require genuine thought. The boosts are most obvious for efficiency and throughput, with job satisfaction also getting a lift. However, when it comes to tackling complex or creative work, the machines still defer to the humans.
According to the study, whether AI benefits an individual developer often depends less on their seniority and more on company culture. Developers in organisations actively advocating for AI adoption are seven times more likely to use AI tools daily compared to less supportive places. Peer learning, clear team norms, and local champions help unlock these gains across the board. Team-wide adoption not only raises the bar for productivity but may gently push hesitant adopters to join in, ensuring more consistent outcomes and shared best practices.
Unlike other productivity dimensions, fewer than half of AI adopters felt AI had directly improved collaboration, though few disagreed outright. The nuance is interesting: interruptions for simple coding questions have shrunk, and team chats now veer more towards brainstorming and architecture discussions. As AI streamlines routine communication, the quality of collaborative effort actually improves, challenging old assumptions about technology and teamwork.
To get the most out of AI, organisations must move from isolated pilots to widespread team support, structured training, and open sharing of what’s working in practice. Our developers can use this as a springboard for their own best practices, knowing that peer learning and clear company direction multiply the value gained. AI’s real impact is a lift in effectiveness, satisfaction, and meaningful output - not just faster code. As we aim to build a robust, forward-thinking AI culture, the SPACE framework shows where to focus our efforts to keep developer teams thriving.
POB’s closing thoughts
I previously talked in Vibe with POB about how I’d used ElevenLabs to create a voice-driven AI agent experience for Strictly Come Dancing. This week the company announced a new partnership that brings the unforgettable voices of legends such as Sir Michael Caine, Dr Maya Angelou, and Judy Garland to narrated audio experiences. This fusion of exclusive licensing with advanced text-to-speech technology gives literature and other content a distinctly personal touch, transforming how audiences connect with stories and characters - whether for education, entertainment, or business communication. The ElevenReader app now makes it possible for listeners across the globe to experience iconic narrations, blending nostalgia with innovation. I’ll be interested to see how many famous voices sign up to be featured on the platform in the future (the list is a little light on “alive people” at the moment).
Finally this week, Weibo’s new open-source AI model, VibeThinker-1.5B, has caused quite a stir by outperforming the much larger DeepSeek-R1 model on several maths reasoning benchmarks, all on a shoestring post-training budget of just $7,800. With only 1.5 billion parameters under the bonnet, it’s proof that clever architecture and efficient training can deliver top-tier results without racking up hefty hardware bills, a fact that should grab the attention of any enterprise (or finance director) keeping an eye on the bottom line.
This achievement exemplifies the direction modern AI is heading - smarter and more streamlined, rather than simply bigger for the sake of it. It signals an era where open-source solutions and optimised models are becoming increasingly viable for our use cases, offering competitive performance without the cost typically associated with leading AI technologies. As budgets tighten and expectations rise, this sort of innovation gives us breathing room to experiment with advanced capabilities without ballooning costs.
Thanks for reading - I hope you have a great weekend! 👍
I’d love to hear your feedback on whether you enjoy reading the Substack, find it useful, or if you would like to see something different in a future post. What AI topics are you most interested in for future explainers? Are there any specific AI tools or developments you'd like to see covered? Remember, if you have any questions around this Substack, AI or how Davies can help your business, you can reply to this message to reach me directly.
Finally, remember that while I may mention interesting new services in this post, you shouldn’t upload or enter business data into any external web service or application without ensuring it has been explicitly approved for use.
Disclaimer: The views and opinions expressed in this post are my own and do not necessarily reflect those of my employer.