AI, Determinism and Control (Part 1)
Taming the First Layers of Indeterminism
What do you think of when the topic of AI comes up? I think there are some common answers here. Most of those answers are incomplete. I hope I can provide a deeper understanding by looking at the concept of control, and patterns of application. This will be a two-part series: the first part describes a framework and the foundational layer of AI uses, and the second describes more advanced applications.
To the earlier question, you wouldn’t be alone if your first answer was a user-facing chat application—ChatGPT, Gemini, or Claude. Hundreds of millions, maybe billions of users have tried one of these[1]. It’s a great starting point to understand the current state of AI. You can probe with questions, get answers, and evaluate the output. 30 minutes of that will demonstrate more than any other 30-minute investment. That said, forming your entire impression of AI based solely on chatbots misses deeper shifts.
A second common answer is a robot from a science fiction movie—a human-like, but fundamentally alien, being. While creative, this vision represents what AI might be, not what it is today. Thinking about Sci-Fi AI often does more to help us understand human nature than actual machine learning. Take any part of it too literally, and it will make you less informed.
Understanding AI’s true impact requires closing the gap between simple chatbots and science fiction robots. We can do that by looking at AI through the lens of software engineering and control. Specifically, we can treat the control mechanism of software as a system of planning, governed by two critical axes: Determinism and Scope.
The Mechanics of Control: Determinism and Scope
Historically, software has been built on predictability. When a human developer writes traditional code, they are creating a deterministic plan. Once it passes through quality checks, validations, and testing, it becomes a solid, rigid set of instructions. While bugs exist, the system is fundamentally designed to execute the same way every time.
Humans, on the other hand, are inherently indeterministic; you never know exactly how a user will approach a problem, what strategies they will employ, or how they might adapt their plans on the fly. Generative AI models[2]—the underlying engines powering the familiar chat tools like ChatGPT and Gemini mentioned earlier—share this indeterminism. When a GenAI model creates a plan or a response, it is probabilistic. It will not necessarily produce the same output twice. That said, if there is a single correct answer that has been heavily reinforced during its training, the model will predominantly provide that specific answer. This consistency occurs because the statistical weight leans overwhelmingly in that direction, not because the system’s inherent flexibility has been mechanically removed.
The second axis of control is the execution environment, or its scope. An environment can be strictly bounded, meaning the actor (human or AI) has very limited tools and access. A user in a simple data-entry application cannot do much other than enter data. Conversely, an environment can have broad scope, featuring wide-ranging access to tools with compounding effects, such as command-line execution, file system access, or the ability to write and deploy new code outside of a sandbox.
By analyzing AI through these axes—how predictable a plan is, and how bounded its environment is—a clear narrative of evolution emerges.
What is Indeterminism?
What you identify as the “AI” in these applications is called a foundation model. This model generates responses to your inputs based on its “training.” Training is a process where the model is incrementally updated to make its responses match an ideal. You can think of the first training pass as an attempt to create a model that could rewrite the entire internet with as little inconsistency as possible. Initially, the model might predict a word because it looks like a piece of content it just saw. But as it processes more information, it encounters conflicts. By forcing the model to resolve millions of overlapping conflicts, it learns the underlying rules of how concepts connect. Later, a second training pass is applied to be much more specific about what is considered a “good” or “bad” way to respond to a human user.
Output from AI models is indeterminate. There are two factors that cause this. The first, unconquerable aspect is that the relationship between input and output is too complex to reason through. You actually can get a model to produce the exact same output for the exact same input if you adjust its “temperature” to zero. However, this doesn’t mean the model is truly predictable, because even minute changes in the input can put the model on completely different paths, even at zero temperature. The second factor is that when you see a model used in practice, the temperature is generally set greater than zero. Zero temperature tends to be boring, less creative, and less insightful without necessarily being more accurate—it just enforces a stronger consistency between input and output. But since the complexity of that relationship makes strict prediction impossible anyway, the value of zero temperature is limited, and the output remains, in all practical senses, indeterminate.
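To make temperature concrete, here is a minimal, illustrative sketch of how a model picks its next token from raw scores (logits). The three-token vocabulary is purely hypothetical—real models sample over tens of thousands of tokens—but the mechanics are the same: zero temperature collapses to the single highest-scoring choice, while higher temperatures flatten the distribution and let less likely tokens through.

```python
import math
import random

def sample_next_token(logits, temperature):
    """Pick the next token from raw model scores (logits).

    temperature == 0 -> always the highest-scoring token (greedy);
    higher temperatures flatten the distribution, admitting less
    likely tokens more often.
    """
    if temperature == 0:
        # Greedy: the same input always yields the same token, but tiny
        # input changes can still reorder the logits entirely.
        return max(logits, key=logits.get)
    # Softmax with temperature scaling.
    scaled = {tok: s / temperature for tok, s in logits.items()}
    m = max(scaled.values())
    exp = {tok: math.exp(s - m) for tok, s in scaled.items()}
    total = sum(exp.values())
    probs = {tok: e / total for tok, e in exp.items()}
    return random.choices(list(probs), weights=list(probs.values()))[0]

logits = {"the": 5.0, "a": 4.5, "banana": 1.0}
print(sample_next_token(logits, 0))    # always "the"
print(sample_next_token(logits, 1.5))  # usually "the" or "a", occasionally "banana"
```

Even at temperature zero the determinism is only skin-deep, for the reason given above: the mapping from input to logits is too complex to reason through, so a one-word change to the prompt can land you on a completely different path.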
But indeterminate isn’t the same as random; it has a direction. With evaluation you can find a probability, and those probabilities can be high. But indeterminate also entails the chance for novelty, including surprise. To some degree, we might say indeterminism reflects a limitation of the user or designer’s ability to predict outcomes. But it’s not a limitation reflective of a lazy user or designer; it reflects a level of complexity no amount of attention can fully address. You can fight it a bit, gain some control and understanding, but if your expectation is full control, you’re using the wrong toolbox, wasting your time, and will ultimately fail.
Indeterminism is something you tame, not control. A tame agent is something that works with you. Tame things have many benefits, but also bear caution. A tame horse has advantages a car does not. If you fall asleep on a horse (and don’t fall off), the horse is very unlikely to jump off a cliff. A traditional, non-autonomous car doesn’t behave that way—it’s very deterministic at driving straight, whether that straight path leads down the road, into a wall, or over a cliff.
But a tame horse can still kick you. It’s far less likely than a wild horse, but if you approach it wrong or scare it, there’s no horse so well-trained that a kick becomes impossible. That’s part of the tradeoff of working with something indeterminate. A car with its engine off is going to behave like any other 2,000+ lb. hunk of metal on wheels, governed entirely by physics. Even in motion, while there are a few exceptions like engine or brake failures, it’s all just physics in the end.
Most software applications are designed to be determinate. A developer reasoned out what output a particular input should create, and planned this carefully. The plan of these applications is encoded in a programming language and translated into machine code.
Understanding this shift to a probabilistic nature is crucial, because many of the choices ahead rest on finding a balance between dynamism and trust, and that balance intersects with this fundamental property of models.
The Thin Layer: AI as an Application
Most users have started to understand what GenAI is, and what its capabilities are, by using it as an application. What’s interesting about this is that these first applications started as very thin layers over the core internal generative AI model, so users have experienced the technology at nearly its most basic. It’s been a while since a novel computing technique has been exposed with so few extra layers.
In a formal sense, “AI as an application” means the primary interface is directly to a GenAI model. There are a few surrounding elements—identifying who you are, moving data back and forth, and presenting the returned data—but mostly, it’s a thin wrapper. You send inputs, it sends outputs, and you directly converse back and forth.
In our framework, this is an indeterminate system operating within a strictly bounded scope. The user and the model are primarily in control. The text, images, or files you provide get fed to the AI, and its probabilistic responses are safely constrained by the application’s sandbox. Safeguards exist that neither can override, but within those boundaries, the direction of the interaction is controlled linearly by the human and the AI model.
Embedded Intelligence: AI within Applications
As software evolves, we are seeing a shift toward embedding AI directly into applications. While technically a chatbot is an example of this, it is highly useful to differentiate the two.
When a developer embeds GenAI into an existing application—for example, a GenAI-powered A/B testing engine that automatically generates and tests multiple variations of marketing copy to identify the best performer—the control dynamics shift. The overarching application remains a rigid, deterministic plan, but the AI represents a small, contained pocket of indeterminism.
With AI embedded in an application, the application is primarily still in control. When it uses an AI for a specific function, it cedes a small amount of control, but it has strict boundaries. The traditional software dictates exactly when the AI is called and where its output goes, strictly bounding its influence to specific micro-outcomes. There is immense potential left in this domain that the general public doesn’t recognize, simply because its precise use is entirely dependent on the creativity of application developers.
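The A/B testing example can be sketched roughly like this. All names here are hypothetical, and `call_model` is a stub standing in for whatever GenAI API the application actually uses; the point is the shape of the control dynamic, with the deterministic shell deciding when the model is invoked, what it sees, and which outputs are allowed through.

```python
def call_model(prompt: str) -> str:
    # Stand-in for a real GenAI API call; returns canned copy here.
    return "Try our new espresso blend today"

def generate_ad_variants(product: str, n: int = 3) -> list[str]:
    """Deterministic shell around an indeterminate pocket: the
    application dictates when the AI is called and where its
    output goes."""
    variants = []
    for i in range(n):
        prompt = f"Write one line of ad copy (max 60 chars) for {product}, variation {i + 1}."
        text = call_model(prompt).strip()
        # Strict boundary: reject any output that violates the contract.
        if 0 < len(text) <= 60:
            variants.append(text)
    return variants

print(generate_ad_variants("espresso blend"))
```

Note how the indeterminate part is confined to a single function call; everything around it is ordinary, predictable application code.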
Prompt Engineering
An interesting detail about embedding is that prompt engineering becomes a critical function. Embedded GenAI needs a goal. With chatbots, the user can provide the goal. With embedded GenAI, user input arrives indirectly, through things like filled-in forms or uploaded documents. The embedded function should reliably reach a result that allows the application workflow to proceed.
Prompt engineering is the process of creating inputs to a GenAI model that perform better at achieving a goal than other inputs. Some parts of this are intuitive to anyone fluent in a language, like if you were instructing an actual assistant. Some parts are more particular to GenAI models, or even particular to specific GenAI models.
Technically, you can use prompt engineering when you use a chatbot, and you’ll get better results if you do. But there’s also an overhead in doing so, as you’re no longer expressing your simple intent, but working to make it fit a pattern. Model builders try to make prompt engineering less necessary for general user interactions, so some tricks are less important than they were in 2023. The most obvious parts will probably always be useful, like avoiding ambiguity when you have a clear intent in mind.
For embedding, the overhead of prompt engineering has a higher payback, so it makes sense to engage with it more deeply, and developers do. Prompt engineering is also used when creating custom chatbots[3].
Another goal of prompt engineering is to constrain the output. Stronger, more consistent instructions produce more consistent outputs.
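As an illustrative sketch of that idea—the wording here is hypothetical, not a proven prompt—an embedded prompt typically fixes the role, the task, and an exact output format, so the application can validate and parse the result deterministically before letting the workflow proceed:

```python
import json

# Hypothetical embedded prompt: role, task, and output format are all pinned down.
PROMPT_TEMPLATE = """You are a product-copy assistant.
Task: write ad copy for: {product}
Rules:
- Respond with ONLY a JSON object, no prose.
- Keys: "headline" (max 60 chars) and "body" (max 200 chars).
"""

def build_prompt(product: str) -> str:
    return PROMPT_TEMPLATE.format(product=product)

def parse_response(raw: str) -> dict:
    """The deterministic side of the contract: if the model drifts
    from the requested format, the application rejects the output
    instead of passing it downstream."""
    data = json.loads(raw)
    assert set(data) == {"headline", "body"}
    assert len(data["headline"]) <= 60 and len(data["body"]) <= 200
    return data
```

The stronger and more consistent the instructions, the higher the fraction of responses that survive the parser, which is exactly the consistency an embedded workflow depends on.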
Embedding and Value
You might have noted that the A/B testing example earlier was a marketing example, and thus falls into the adversarial category of use I’ve talked about before. Marketing is often an early adopter of these embedded systems because the industry is driven by adversarial motives—constantly competing against others for user attention and clicks. But the true potential is much more inspiring. Consider an adaptive educational platform. The overarching application rigidly tracks a student’s progress, curriculum, and test scores (the deterministic plan). However, when the system detects a student struggling with a specific concept, it calls upon an embedded GenAI model to instantly generate a custom, interactive story explaining that exact concept using the student’s favorite hobbies as an analogy. The application remains fully in control, but it uses the AI’s indeterminate flexibility to provide a deeply personalized learning experience that hardcoded software never could.
Building Systems with a Life of Their Own
The control dynamic fundamentally fractures with the next pattern: using AI as a tool to build applications. So far, experiences with this have revolved around coding assistants and “vibe coding.” In some cases, it’s immediately clear this is different from the chatbot model because the AI is embedded within complex Integrated Development Environments (IDEs).
But what truly distinguishes this pattern isn’t the interface. Rather, it’s the output and how it is used. Business-focused tools like Claude Cowork or Amazon Quick are increasingly managing different inputs and outputs to help end-users pursue task-oriented goals, generating artifacts like documents, summaries, and presentations. But if that output is ephemeral—a static artifact used only to accomplish a quick, singular objective—it’s not building an application.
Building applications means building something that has a life of its own. It is the indeterministic generation of deterministic plans. The AI indeterministically generates a code script, which is then refined through human review, automated reasoning, and testing. Once committed, it becomes a static, deterministic plan.
The topic of control highlights the profound nature of this shift. When an application is built this way, there are two distinct phases of control. In the first phase, the developer is in control, sharing control with the generative AI model by delegating, authorizing, and reviewing. But the developer does not retain persistent control during the second phase. Once deployed, the built application itself takes control. If it runs on a server, the developer can turn it off or replace it, but that is supervisory control at best.
By building something with a life of its own, we take a foundational recursive step in software: using indeterminate AI to architect and generate the very deterministic logic that will govern computing moving forward.
From Chatbot to Agent
The user-facing AI chat application you engage with has already evolved to be more than a simple interface. Designers try to make this seamless for you, so you shouldn’t feel bad if you missed the change, but to understand the systems of tomorrow, we need to distinguish between a passive chatbot and an active agent.
When you interact with a standard chatbot, the control dynamic is strictly conversational and reactive. You provide a prompt, the underlying generative AI model probabilistically calculates a response, and then it stops. It relies entirely on you to drive the interaction forward step-by-step.
An agent, on the other hand, is an AI system designed to pursue a broader goal autonomously. Instead of just answering a single prompt, an agent takes an objective, indeterministically breaks it down into a multi-step plan, and uses available tools to execute that plan. It can observe the results of its own actions, correct its course, and continue working until the goal is met.
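That loop—plan, act, observe, repeat until the goal is met—can be sketched minimally as follows. The planner and the tool are stubs with hypothetical names; a real agent would send the goal and its observation history to a GenAI model at each step, and its tool set would be far richer.

```python
def plan_next_step(goal: str, history: list) -> dict:
    # Stand-in for the model's probabilistic planning; a real agent
    # would call a GenAI model with the goal and history here.
    if any(obs == "42" for _, obs in history):
        return {"tool": "finish", "arg": "42"}
    return {"tool": "calculator", "arg": "6 * 7"}

TOOLS = {
    "calculator": lambda expr: str(eval(expr)),  # toy tool; never eval untrusted input
}

def run_agent(goal: str, max_steps: int = 5) -> str:
    history = []
    for _ in range(max_steps):                        # bounded scope: a step budget
        step = plan_next_step(goal, history)          # plan
        if step["tool"] == "finish":
            return step["arg"]
        observation = TOOLS[step["tool"]](step["arg"])  # act
        history.append((step, observation))             # observe, then repeat
    return "gave up"

print(run_agent("What is 6 times 7?"))  # → 42
```

The step budget and the fixed tool table are the scope axis in miniature: even an autonomous loop runs inside limits the developer set.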
Understanding this shift from passive generation to active execution is crucial. By combining the probabilistic reasoning of foundation models with the ability to take independent action, we are actively moving away from simple chatbots and into the era of agents.
We have traced the evolution of AI from simple chat interfaces to embedded intelligence, and finally to the threshold of these autonomous agents. But recognizing this shift is only the first step. To truly understand where software is heading, we must examine the wild frontier of the agent ecosystem itself—how these agents use tools, how they interact with each other, and the cybersecurity implications of granting them broad scope. In Part 2 of this essay, we will dive into this ecosystem, exploring the deep recursion of agents building agents, and what this shift means for the ultimate goal of automation.
[1] 900 million ChatGPT users and 750 million Gemini users globally.
[2] There are two terms you might hear that are almost always used incorrectly. Large Language Model (LLM) originally described a model with text inputs and text outputs, trained on a large amount of text and having high complexity. Technically you rarely use an LLM anymore, as most models are multimodal (supporting text and graphics), but the term has enough weight that people use it anyhow, even though it is technically incorrect. The term Foundation Model has a broader scope: while this lets it encompass large multimodal models, it technically also includes many earlier types of models not advanced enough to perform the actions associated with “AI,” more commonly described as Machine Learning (ML). If a better term were popular, I’d use it, but in general you should think of them as all the same and, if necessary, use context to refine the intent. Generative AI Model is the best term, but if you see LLM or foundation model in anything other than an academic context, you can assume it refers to generative AI models.
[3] I want to be clear, though. When I talk about embedded GenAI, I’m not referring to custom chatbots, like the one that answers your company’s HR questions. Those were never going to be particularly transformative. They’ve been made fun of quite a lot, and for good reason. While of some utility, they were really just cheap upgrades to search capabilities, and often underperformed general-purpose chatbots. We don’t need a special category for those until they become full-fledged agents.

