Enterprise AI Weekly #1

Chain of Thought explained, welcome Grok 3, the future of programming revealed, scary voice cloning, the return of Google and a graph for your consumption.

Feb 20, 2025

Welcome to Enterprise AI Weekly #1

Welcome to the Enterprise AI Weekly Substack, published by me, Paul O'Brien, Group Chief AI Officer and Global Solutions CTO at Davies.

Enterprise AI Weekly is a short-ish, accessible read, covering AI topics relevant to business of all sizes. It aims to be an AI explainer, a route into goings-on in AI in the world at large, and a way to understand the potential impacts of those developments on your business.

If you’re reading this for the first time, you can read previous posts at the Enterprise AI Weekly Substack page.

1. Explainer: What is ‘Chain of Thought’?

If you’ve been using the latest AI tools, you might have noticed something has changed.

Previously, when you asked an AI tool a question, you would receive an answer very quickly, but just that: an answer. Recently, with the advent of new models like OpenAI o1 and DeepSeek R1, answers are taking longer to arrive and the outputs include the steps the model took to reach that answer. This is called ‘Chain of Thought’ (CoT), a technique that enhances the reasoning capabilities of language models.

Unlike traditional approaches, CoT breaks down complex problems into a series of steps, mimicking human thought processes. Here’s how it works:

Prompt: The AI model receives a prompt containing a question, task etc.
Step-by-step reasoning: Instead of immediately producing an answer, the model generates a series of steps to be taken to lead to the solution.
Intermediate outputs: Each step in the reasoning process is executed and articulated to the prompter.
Result: After working through the intermediate steps, the model provides its answer.
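To make the four stages above concrete, here is a minimal Python sketch. It is an illustration only: reasoning models such as o1 and R1 produce their steps internally, whereas this imitates the same flow at the prompt level. The function names (`build_cot_prompt`, `split_steps_and_answer`) and the sample reply are hypothetical and hand-written, not output from any real model.

```python
# Illustrative sketch only: imitates the prompt -> steps -> result flow
# at the prompt level. All names here are hypothetical.

def build_cot_prompt(question: str) -> str:
    """Wrap a question so the model is asked to show its steps before answering."""
    return (
        f"Question: {question}\n"
        "Work through the problem step by step, one step per line.\n"
        "Finish with a line that starts with 'Answer:'."
    )


def split_steps_and_answer(response: str) -> tuple[list[str], str]:
    """Separate the intermediate steps from the final answer in a reply."""
    steps: list[str] = []
    answer = ""
    for line in response.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith("Answer:"):
            answer = line[len("Answer:"):].strip()
        else:
            steps.append(line)
    return steps, answer


# A hand-written reply standing in for real model output.
sample_reply = """Step 1: 12 claims per day over 5 days is 12 * 5 = 60 claims.
Step 2: 60 claims shared between 3 handlers is 60 / 3 = 20 claims each.
Answer: 20"""

steps, answer = split_steps_and_answer(sample_reply)
```

Keeping the steps separate from the answer is what makes the reasoning reviewable: the steps can be logged or shown to a person, while only the answer is passed on to the next system.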

Clearly, this allows LLMs to solve more complex tasks but, more importantly, it allows humans to better understand how the AI arrived at the answer provided.

This is interesting, but it’s also important for businesses in understanding the future direction of AI systems. A key consideration in regulatory terms is the ability to provide ‘interpretable AI’, i.e. the ability to understand how the system arrived at a specific decision. Although CoT isn’t the complete answer, and much of this needs to be tested from a regulatory perspective, it’s a step in the right direction.

2. xAI released Grok 3

Grok 3, the latest AI model from Elon Musk's xAI, was released this week, positioning itself as the latest powerful competitor in the AI landscape. It boasts significant improvements over its predecessor, with xAI claiming it is 10-15 times more powerful than Grok 2. That is no doubt enabled in no small part by xAI’s training on ~100K Nvidia H100 GPUs, a capability unsurpassed to date by competitors. For context, an H100 GPU costs about $25k, so that’s a $2.5bn setup.
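The back-of-envelope sum behind that figure, as a short Python sketch. Both inputs are the rough numbers quoted above rather than confirmed prices, and the variable names are just for illustration.

```python
# Rough figures from the text above, not confirmed pricing.
gpu_count = 100_000          # "~100K" Nvidia H100 GPUs
price_per_gpu_usd = 25_000   # "about $25k" per GPU

total_usd = gpu_count * price_per_gpu_usd
total_bn = total_usd / 1e9

print(f"${total_bn:.1f}bn")  # prints "$2.5bn"
```

This counts the GPUs alone; networking, power, cooling and the buildings to house them would be on top.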

Grok 3 functions as both a generalist AI and a reasoning (‘Chain of Thought’) model, featuring a "Think mode" that transforms it into a step-by-step problem solver, particularly effective for tasks in maths, science, and coding.

The new model is said to outperform other leading AI systems in various benchmarks, although I do wonder if we are approaching a point where, as we saw in the mobile device space, companies optimise and ‘game’ benchmarks. Initial impressions suggest it may be comparable to models like OpenAI's o3 and DeepSeek's R1, but it may still lag behind top-tier models like GPT-4o and Claude 3.5 Sonnet. Of course, Elon Musk has positioned Grok 3 as a "maximally truth-seeking AI," emphasising its ability to challenge political correctness and censorship.

There’s not yet an API for Grok 3, so it can’t be meaningfully tested in a business context. In the short term, Grok is unlikely to challenge the bigger players for supremacy in the business space, but it’s hard to count out Elon in the long game. The biggest benefit to businesses is likely to be, as always, that competition drives the market forward and improves capabilities across the board. xAI have also promised to open-source the previous-generation (‘current minus one’) model.

3. The end of programming as we know it?

Tim O'Reilly, respected programmer, author, and founder of O'Reilly Media, posted a blog article this week entitled ‘The end of programming as we know it’. In it he set out his views on where software development is headed: programmers won’t be replaced, they’ll just work differently. As we’ve discussed in some of our internal conversations, the winners in the job market will be those who are most adept at using AI, and the very concept of being a programmer will change. Spot on.

One of the things referenced in the article that I particularly like is the term CHOP, or “chat-oriented programming”, albeit Tim calls it a buzzword, hah. This describes the use of natural language to carry out programming tasks, through a chatbot-style interface. Wishful thinking or near-future reality? Very much the latter.

In the AI team we have already started investigating, testing and developing with CHOP tools and similar products, and the results have been astounding. We have spent time with Bolt.new, Lovable, Cursor and Cline (amongst others) and they have the ability to revolutionise development as they mature. Are they completely ready for prime time today? No, but there is already huge potential for POCs (Proofs of Concept / Proofs of Value) or MVPs (Minimum Viable Products). In under a week we delivered ‘Davies AI Accelerator’, a system that we are ready to show to our Consulting clients and package up as part of a broader offering.

4. Hey, is that really you?

Whenever I talk about AI, I usually end up talking about a ‘post-truth’ world, where we can’t trust what we read, hear or see anymore. Maybe a year ago this seemed a bit of a bold prediction, but I can firmly say we have arrived. Anything you read could well be AI generated (I wrote this myself, honest!), images and videos can be generated in a flash (and video is going to get unbelievably good), but the capability that astounds me most now is voice cloning.

I met with the City of London Police about six months ago, when they highlighted that cyber criminals were starting to exploit this sort of technology and that, with only 15 seconds of audio of someone speaking, good-enough clones could be created. Well, now you can try it for yourself.

I came across fish.audio recently, which is an Open Source project to enable voice cloning. I recorded myself speaking and asked it to generate a clone, which it did, quickly. It can now generate me saying absolutely anything, almost instantly. Scary stuff.

While we immediately think of the potential impact of this on fraud, perhaps there are business applications too. Imagine voice messages from assessors visiting clients, feeling more personal than text or email, yet computer-generated? The same but with video in the future?

5. Google is back!

Around a year ago, I had a conversation with some folks about key players in the AI space, covering Microsoft’s partnership with OpenAI and AWS’ open approach with Bedrock, and we collectively pondered how Google had been somewhat ‘left behind’. I pointed out that although this was true at the time, we should absolutely not bet against Google and I was sure they would catch up. And here we are.

Earlier in February, Google unveiled a series of significant updates to its Gemini AI model family. Google first made Gemini 2.0 Flash generally available, so developers can access it through the Gemini API in Google AI Studio and Vertex AI. The updated version offers enhanced performance, a class-leading 1 million token context window (this relates to how much data you can send in the prompt and, for context, the whole Bible is around 750k tokens!), and multimodal input capabilities, with plans for image and audio output in the near future. Alongside Gemini 2.0 Flash, Google introduced Gemini 2.0 Flash-Lite, a cost-efficient model designed to maintain the speed and pricing of its predecessor while improving quality.

Google has also released the experimental version of Gemini 2.0 Pro, which it touts as its "best model yet for coding performance and complex prompts". This model boasts a massive 2 million token context window (!), enabling comprehensive analysis of vast amounts of information. It also features the ability to call external tools like Google Search and code execution.

These are genuinely leading models amongst their peers.

We are living in a world where AI providers are all advancing their capabilities continuously, and at a breakneck pace. It’s important that we avoid aligning solely with one provider, but build systems that have the capability to integrate with different models on demand based on capability and requirements.
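One way to picture that kind of provider-agnostic design is the small Python sketch below. It is a sketch under stated assumptions, not a reference architecture: `ModelRouter`, the capability labels and the two `fake_*` functions are all hypothetical, and the fakes simply stand in for whatever vendor SDK calls a real system would wrap.

```python
from typing import Callable, Dict

# Assumption: every model, whoever provides it, is wrapped as a function
# that takes a prompt and returns text.
ModelFn = Callable[[str], str]


class ModelRouter:
    """Sends each request to the model registered for the required capability."""

    def __init__(self, default: str) -> None:
        self._models: Dict[str, ModelFn] = {}
        self._routes: Dict[str, str] = {}
        self._default = default

    def register(self, name: str, fn: ModelFn) -> None:
        """Make a model available under a provider-neutral name."""
        self._models[name] = fn

    def route(self, capability: str, model_name: str) -> None:
        """Declare which registered model should serve a capability."""
        self._routes[capability] = model_name

    def complete(self, prompt: str, capability: str = "general") -> str:
        """Run the prompt on the routed model, or the default if none is routed."""
        name = self._routes.get(capability, self._default)
        return self._models[name](prompt)


# Hypothetical stand-ins; a real adapter would call a vendor client here.
def fake_general_model(prompt: str) -> str:
    return f"[general] {prompt}"


def fake_long_context_model(prompt: str) -> str:
    return f"[long-context] {prompt}"


router = ModelRouter(default="general-model")
router.register("general-model", fake_general_model)
router.register("long-context-model", fake_long_context_model)
router.route("long_document", "long-context-model")

summary = router.complete("Summarise this policy", capability="long_document")
reply = router.complete("Draft an email")
```

Because calling code only ever names a capability, swapping the model behind it becomes a one-line change to the routing table rather than a rewrite.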

POB’s closing thought

Earlier this week I was talking to the team about how it feels like we’ve entered an amazing period of technology advancement, with capabilities growing at an incredible pace while cost decreases at the same time.

I asked Copilot (which you can use free of charge at https://copilot.microsoft.com by the way!) to create me an image articulating capability vs cost vs opportunity and it produced this, which I thought was quite good (albeit perhaps it understates the steepening of the curves we are now experiencing).

What a time to be alive and in technology, eh! Until next time…

Disclaimer: The views and opinions expressed in this post are my own and do not necessarily reflect those of my employer.
