Enterprise AI Weekly #29

Grok learns to code, OpenAI updates Codex's agentic capabilities, Claude comes to Chrome - with a warning, Dots levels up OCR, Taco Bell's AI goes awry and Google takes on Duolingo

Sep 05, 2025

Welcome to Enterprise AI Weekly #29

You’re reading the Enterprise AI Weekly Substack, published by me, Paul O'Brien, Group Chief AI Officer and Global Solutions CTO at Davies.

Enterprise AI Weekly is a short-ish, accessible read, covering AI topics relevant to businesses of all sizes. It aims to be an AI explainer, a route into goings-on in AI in the world at large, and a way to understand the potential impacts of those developments on your business.

We’re also working on something together. I’m building an app, Boring Expenses, in a Vibe Coding style, to demonstrate the process and to provide a test bed for technologies we talk about in future issues. I previously mentioned that I set aside a bit of time in my week for keeping up with tech and doing this sort of thing - usually a Sunday morning - so I’m setting aside an hour each week to progress our experiment.

If you’re reading this for the first time, you can read previous posts at the Enterprise AI Weekly Substack page. Enterprise AI Weekly is now available for anyone to sign up at https://enterpriseaiweekly.com! Please share the link and encourage others who might find it interesting to sign up.

Last week in EAIW #28 I spoke about nano banana, Google’s new image model (aka Gemini 2.5 Flash Image), and this week ElevenLabs have powered up the capabilities with their audio product and Runway’s video generation. Check this out.

As always, thanks for reading, I appreciate you being here. Enjoy EAIW #29!

1. xAI moves into the coding space with Grok Code Fast 1

Grok Code Fast 1 from xAI has officially landed, bringing a notably rapid and affordable agentic coding model to the AI programming landscape. Touted by xAI as purpose-built for agentic development, Grok Code Fast 1 is designed to integrate seamlessly with common developer workflows, enabling everything from shell commands and file editing to multi-step coding operations without fuss. Initial reactions from developers have focused on its remarkable speed and low costs, suggesting a strong challenge to more established players in the market.

Feedback from early adopters has been broadly positive. Developers using partner tools such as Cline and Cursor have described Grok Code Fast 1 as so fast they felt compelled to rethink their entire workflow, reporting that coding sessions have gone from days or weeks to mere hours. Notable comments include “feels 10x better and faster than Claude” and “changed my existing working methods”. The model's architecture enables performance in excess of 92 tokens per second, and its massive context window handles large projects without breaking a sweat. While some users have noted reliability issues with harder tasks, many have found it trustworthy for everyday development, especially for rapidly iterating code and bug fixes. In line with competing AI coding players, as mentioned in EAIW #27, xAI have published a Prompt Engineering guide for the model.

Grok Code Fast 1 sports a 314-billion-parameter Mixture-of-Experts design, allowing for both speed and intelligent routing of expert capabilities. Its context window stretches to 256,000 tokens, enabling coherent handling of sprawling codebases and long logs. The model has been integrated with popular platforms like GitHub Copilot and Kilo Code, and initial pricing is vastly lower than rival models - just $0.20 per million input tokens, outpacing GPT-5 High and Claude Sonnet 4 by wide margins. Live benchmarking reports cache hit rates above 90 percent and throughput up to 190 tokens per second in optimised deployments, further highlighting its utility for continuous integration and day-to-day coding tasks.

While Grok Code Fast 1 sets new speed standards, discussion persists about xAI’s approach to AI safety guardrails. Previous Grok models have been criticised for insufficient transparency and limited documentation of safeguards, prompting industry experts to call for clearer communication and better safety systems. Despite technical capability, some users and researchers remain uneasy about the reliability and ethical boundaries of the system, encouraging open debate about achieving the right balance between performance and responsible deployment.

For an enterprise leaning into developer productivity and tech modernisation, Grok Code Fast 1 signals a potential leap forward in tool-assisted software engineering. Its speed and value-for-money alignment could reduce development bottlenecks, improve code quality, and accelerate innovation without disproportionately inflating costs. However, as with any bleeding-edge tool, leaders should keep a close eye on the evolving conversation around safety, transparency, and the appropriateness of deployment for sensitive projects. It’s an exciting, pragmatic option for boosting output - but not one to use with the brakes off.

2. As agentic development grows, OpenAI updates Codex

Agentic development is gaining significant traction - and not just with me, and to stay at the forefront of the space, OpenAI has rolled out a substantial upgrade to its Codex AI coding assistant, aimed at significantly enhancing developer productivity through tighter integration and more autonomous capabilities. Central to the update is the introduction of a dedicated IDE extension that brings Codex directly into popular development environments such as Visual Studio Code, Cursor, and other VS Code forks. This enables developers to manage, preview, and edit code changes within their familiar editor while benefiting from Codex’s AI assistance without switching context. Importantly, the new extension allows seamless movement of coding tasks between local environments and Codex’s cloud, preserving workflow state and enabling asynchronous delegation of heavier or long-running tasks to the cloud.

Another major change is the simplification of access through ChatGPT account integration, eliminating the need for separate API key setups for Codex in both the IDE and command line interface (CLI). This unified approach connects all workspaces via the user’s ChatGPT subscription, making the transition between Cloud and local development smoother and more coherent. The upgraded CLI boasts an improved interface, additional commands, image input support, message queuing, approval modes, and an enhanced task list feature. These upgrades tap into GPT-5’s agentic capabilities, enabling Codex to act more proactively - from reading files and running commands to editing code independently within defined permission boundaries.

On the collaboration side, Codex now supports automated code review on GitHub. Developers can configure Codex to automatically review new pull requests (PRs), with the AI analysing PR intent, cross-referencing across the codebase, and running test code to validate changes behaviorally. Alternatively, Codex can be summoned on demand within a PR by mentioning @codex for detailed review suggestions and potential fixes. This feature reflects a shift towards AI-driven asynchronous collaboration, augmenting traditional static code analysis with contextual and reasoning capabilities. It positions Codex not just as a coding helper, but as an active participant throughout the development lifecycle, and aligns Codex capabilities with those of Github Copilot Coding Agents.

This release is part of OpenAI’s broader vision for agentic programming, where AI tools evolve from simple assistants into autonomous coding agents capable of multi-tasking and managing complex workflows. With the Codex update, developers can interact with AI in real-time or assign tasks for asynchronous completion, blending these approaches into unified workflows. The company is also expanding integrations across developer tools, including issue trackers and continuous integration pipelines, to embed Codex deeper into enterprise software development practices. This enhancement is a notable stride in elevating AI-assisted coding from an optional aid to a standard industry practice, fostering faster cycles, higher code quality, and greater developer focus on innovation rather than routine tasks.

For enterprise businesses, embedding advanced AI coding assistants directly within developers’ preferred environments accelerates productivity and streamlines collaboration across dispersed teams. The GitHub pull request automation adds a valuable layer of quality assurance, reducing human review burden and expediting deployment processes. Moreover, the flexible local-to-cloud task handoff improves resource management, allowing teams to optimise both local computing power and cloud scalability. Choosing to adopt and integrate such AI-enhanced workflows will be instrumental in maintaining competitiveness and improving delivery speeds in our software projects.

3. Anthropic launches Claude for Chrome pilot

Anthropic have launched their Claude for Chrome preview (via a waitlist), which is designed to let the Claude AI assistant operate directly in the browser. Building on months of integration into calendars, documents, and other software, this Chrome extension allows trusted users to instruct Claude to interact with web pages by reading the content, clicking buttons, filling forms, and handling tasks such as managing calendars, drafting emails, and processing routine expense reports. As much work happens in the browser, this approach promises a more seamless and efficient way to bring AI assistance to everyday digital workflows. The rollout begins as a controlled pilot with 1,000 users, focusing on gathering real-world feedback to fine-tune safety and performance before wider release.

Despite the exciting possibilities, browser-based AI introduces novel security and safety challenges, particularly around “prompt injection” attacks. These attacks involve malicious actors hiding dangerous instructions within websites, emails, or documents that the AI may inadvertently follow. For example, corrupted content might instruct the AI to delete files, steal data, or carry out unauthorised transactions. In testing, Anthropic discovered a 23.6% success rate for such attacks without defences, including a case where Claude was tricked into deleting emails without user confirmation. To mitigate this, Anthropic implemented layered protections including careful permission controls, action confirmations for sensitive tasks, blocking of access to high-risk categories like financial services, and advanced classifiers that detect suspicious patterns - even in legitimately appearing contexts. These measures have substantially cut the attack rate, with some browser-specific attack scenarios dropping to zero success thanks to enhanced mitigations.

A particularly insidious technique with prompt injection is embedding harmful instructions hidden deep within the webpage data structure, especially in the Document Object Model (DOM). These instructions can be camouflaged in invisible form fields, page metadata such as URL text or tab titles, or other parts only accessible to browser agents rather than visible to human readers. Such hidden data poses a serious risk because it bypasses typical user scrutiny and can deceive AI agents directly. This kind of attack has also been observed in various browser extensions, where malicious code or instructions are concealed to execute harmful actions under the guise of legitimate extension behaviour. Anthropic’s ongoing red-teaming tests include these complex scenarios to improve the robustness of Claude’s defenses before the extension is generally available.

For enterprise businesses, this is particularly relevant as embracing AI tools in browsers introduces a new dimension of operational efficiency alongside novel cybersecurity concerns. Understanding and preparing for prompt injection risks, especially those exploiting hidden webpage data, is essential for IT and security teams. Anthropic’s pilot with Claude for Chrome demonstrates a thoughtful, iterative approach to balancing innovation with safety, emphasising careful permissions and advanced attack detection. As browser-based AI integration grows, businesses will need to stay vigilant and informed to harness these new capabilities securely.

4. Dot(s) OCR powers up image to markdown processing

Dots OCR is a new, cutting-edge open-source optical character recognition (OCR) and document parsing tool powered by an AI vision-language model with 1.7 billion parameters. It unifies layout detection and content extraction within a single architecture, making it highly efficient and versatile for processing documents in over one hundred languages. The tool is designed to handle a wide range of document types including PDFs and images, extracting text, tables, mathematical formulas, and maintaining accurate reading order and document structure. Its compact but powerful model achieves state-of-the-art accuracy while delivering performance that is faster than many traditional OCR tools, making it ideal for researchers, developers, and enterprise users who require reliable and high-quality data extraction from complex documents.

Dots OCR stands out with its combination of speed, accuracy, and multilingual support. Its unified AI engine eliminates the need for separate pipelines for layout and text recognition, simplifying deployment and usage. The model supports extraction of structured data such as tables and formulas in addition to unstructured text, preserving the original format and layout fidelity in outputs, which can be rendered as JSON, markdown, or HTML formats. It accommodates a diverse global user base and various document complexities, including multi-page files and scanned documents. The tool offers fast processing, often delivering output in seconds, and supports multiple image and document formats such as JPEG, PNG, and PDF.

Since its release on Hugging Face, Dots OCR has garnered a positive reception from users experimenting with the model. Early adopters highlight impressive accuracy, especially with handwritten notes and table extraction from images. On a Reddit discussion forum, users shared experiences praising its recognition capabilities for challenging handwriting and table structures, noting the model’s performance on tables as particularly commendable, though a few mentioned some issues with column alignment in complex tables. Comments also reflect appreciation for the model’s ability to handle multilingual documents and deliver detailed layout parsing, which surpasses many open-source and proprietary alternatives.

That said, some users note that while the model excels in many areas, table and formula extraction still demand refinement, and handling of very high-resolution images requires down sampling. Occasional bugs related to repeated characters in unusual cases are being acknowledged by developers.

In an enterprise context, integrating a robust and scalable OCR solution like Dots OCR can streamline handling of vast document volumes and multilingual data sources, enhancing data workflows and analytics capabilities. Its high accuracy and speed mean reduced turnaround times for document-heavy processes, unlocking productivity gains. Furthermore, the open-source nature aligns with tech innovation strategies by providing flexibility for customisation and integration without extensive licensing costs.

5. Taco Bell is rethinking its use of AI after early issues

Taco Bell is reconsidering its extensive use of AI for drive-through ordering after a wave of glitches and viral videos highlighted the technology’s shortcomings. The fast-food chain deployed AI voice ordering across more than 500 US drive-throughs with goals of reducing wait times, improving order accuracy, and easing staff workload. However, customers have experienced repeated errors, from the AI getting stuck in loops to unintentional additional items appearing on orders. Perhaps most notably, a prank video showing a customer ordering 18,000 cups of water to bypass the AI went viral, exposing vulnerabilities in the system and prompting public scrutiny. The company acknowledges these issues and the mixed experiences the technology has delivered so far.

Dane Matthews, Taco Bell’s Chief Digital and Technology Officer, has stated that the company is now in an “active conversation” about when to deploy AI and when human team members should step in, especially during busy periods. Instead of a full withdrawal, Taco Bell plans a more nuanced approach where voice AI is monitored closely and humans can override or assist when needed. Training will be provided to franchise teams to ensure that voice AI is used where it adds the most value without negatively impacting customer experience. This cautious stance follows similar steps taken by competitors like McDonald’s, which ended its own AI-driven ordering tests amid comparable technical glitches.

While the drive-through AI rollout has had its flaws, Taco Bell emphasises that millions of orders have been processed successfully, and the company sees AI as a useful tool rather than a complete replacement for human interaction. The current reassessment reflects a broader industry learning curve around AI’s practical limitations in customer-facing roles. By adopting a hybrid model where AI is an aid, not a replacement, Taco Bell hopes to strike a balance between innovation and reliability. This measured reassessment is important as businesses of all types weigh the best ways to integrate AI technologies without compromising service quality or risking public missteps.

Taco Bell’s experience offers a timely reminder that while AI deployment promises efficiencies, real-world applications often require careful calibration and ongoing human oversight. The challenges faced by their voice AI demonstrate the importance of mixed models and adaptable strategies when introducing AI into customer interactions. For enterprise organisations exploring AI at scale, it reinforces the value of active monitoring, staff training, and maintaining fallback options to safeguard customer experience and brand reputation. This case exemplifies the need for pragmatic approaches to AI adoption rather than wholesale implementation without adequate safeguards.

POB’s closing thoughts

This week I spotted that Google are adding more AI features to their language translation products. This clearly makes sense, as AI has a big part to play both in live translation and in language learning. As a 1500+ day Duolingo streak owner in French, who can’t really speak French very well, I am intrigued to see how this evolves over time.

And with that, thanks for reading, I hope you have a great weekend! 👍

I’d love to hear your feedback on whether you enjoy reading the Substack, find it useful, or if you would like to see something different in a future post. What AI topics are you most interested in for future explainers? Are there any specific AI tools or developments you'd like to see covered? Remember, if you have any questions around this Substack, AI or how Davies can help your business, you can reply to this message to reach me directly.

Finally, remember that while I may mention interesting new services in this post, you shouldn’t upload or enter business data into any external web service or application without ensuring it has been explicitly approved for use.

Disclaimer: The views and opinions expressed in this post are my own and do not necessarily reflect those of my employer.

Enterprise AI Weekly