In our journey of integrating Large Language Models (LLMs) with traditional APIs, we’ve seen how prompts become the new “API docs,” describing what a tool can do. It all sounds wonderfully simple at first. You give the LLM access to some tools, describe them, and voilà, you have an intelligent agent! But as many of us are discovering, there’s a crucial difference between an LLM that can use a tool and an LLM that uses a tool wisely and effectively in a broader context. This gap is exactly the challenge we ran into with our simple Calendar Assistant.
This is where we often encounter what I call the “enthusiastic intern” syndrome. Your LLM agent, much like a keen but inexperienced intern, diligently follows the explicit instructions for its tools. However, it might lack the seasoned judgment of an experienced employee who intuitively considers unstated rules, subtle human factors, or the “best overall outcome.” Current LLMs are exceptionally good at understanding and executing based on the text they’re given, but they don’t inherently grasp the unspoken context or the nuanced “why” behind a task that often dictates true success.
This means that as developers, our role evolves. We’re not just describing tool functions anymore; we’re meticulously crafting rulebooks within our prompts, guiding the LLM through complex decision-making processes.
Case Study: The “Considerate” Calendar Assistant
Let’s revisit our “Calendar Assistant” tool from Part 2. Imagine our main LLM agent has access to the two primary functions we’ve defined:
- CreateMeeting(attendees, subject, startTime, duration, location): Schedules a meeting.
- FindFreeTime(attendees, duration, timeWindow): Finds available slots.
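Concretely, these tool descriptions might be handed to the model as JSON-schema-style function definitions, as most function-calling APIs expect. A minimal sketch in Python (the field names and descriptions are illustrative, not tied to any particular provider's API):

```python
# Hypothetical tool definitions in the JSON-schema style used by most
# function-calling APIs; the descriptions double as the "API docs" the LLM reads.
CALENDAR_TOOLS = [
    {
        "name": "CreateMeeting",
        "description": "Schedules a meeting for the given attendees.",
        "parameters": {
            "type": "object",
            "properties": {
                "attendees": {"type": "array", "items": {"type": "string"}},
                "subject": {"type": "string"},
                "startTime": {"type": "string", "description": "ISO 8601 start time"},
                "duration": {"type": "string", "description": "e.g. '1 hour'"},
                "location": {"type": "string"},
            },
            "required": ["attendees", "subject", "startTime", "duration", "location"],
        },
    },
    {
        "name": "FindFreeTime",
        "description": "Finds slots where all attendees are free within a window.",
        "parameters": {
            "type": "object",
            "properties": {
                "attendees": {"type": "array", "items": {"type": "string"}},
                "duration": {"type": "string"},
                "timeWindow": {"type": "string", "description": "e.g. 'next week'"},
            },
            "required": ["attendees", "duration", "timeWindow"],
        },
    },
]
```

Note that nothing in these schemas says anything about travel, preparation, or courtesy: the model only knows what the descriptions tell it.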
Our initial prompts for these tools handle basic interpretation and are configured to respect standard company working hours, say Monday-Friday, 8:00 AM – 6:00 PM local time for each employee.
Now, a user, who is a Team Lead based in London, makes this request: “We need to have a critical 1-hour strategy session for Project Atlas early next week. Participants will be myself, Maria (also London-based), and Ben (based in Edinburgh). This requires significant preparation from all attendees. Please schedule this at our London HQ.”
The “enthusiastic intern” agent gets to work. It queries calendars using FindFreeTime and finds that Monday at 8:30 AM GMT is technically “free” for everyone within their defined standard working hours.
It proceeds to call:

```
CreateMeeting(
    attendees=["lead@example.com", "maria@example.com", "ben@example.com"],
    subject="Project Atlas Strategy Session",
    startTime="<Next Monday @ 08:30 GMT>",
    duration="1 hour",
    location="London HQ"
)
```
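Reduced to code, the intern’s logic amounts to “take the first technically free slot and book it.” A sketch with stubbed-out helpers (`find_free_time` and `create_meeting` are hypothetical stand-ins for the real tools):

```python
def find_free_time(attendees, duration, time_window):
    # Stub: pretend the calendar service returned these mutually free slots,
    # earliest first, within the standard 8:00-18:00 working hours.
    return ["Mon 08:30 GMT", "Mon 10:30 GMT", "Tue 09:00 GMT"]

def create_meeting(attendees, subject, start_time, duration, location):
    # Stub: a real implementation would call the calendar API.
    return {"subject": subject, "startTime": start_time, "location": location}

def naive_schedule(attendees, subject, duration, time_window, location):
    # The "enthusiastic intern": the earliest technically-free slot wins,
    # with no judgment about travel, preparation, or human factors.
    slots = find_free_time(attendees, duration, time_window)
    return create_meeting(attendees, subject, slots[0], duration, location)

booking = naive_schedule(
    ["lead@example.com", "maria@example.com", "ben@example.com"],
    "Project Atlas Strategy Session", "1 hour", "next week", "London HQ",
)
# booking["startTime"] is "Mon 08:30 GMT": technically free, practically poor
```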
Technically, a meeting is scheduled. But is it a good outcome? Consider the implications:
- Travel Consideration Ignored: For Ben, based in Edinburgh, an 8:30 AM in-person meeting in London is highly problematic. It would necessitate extremely early travel on Monday morning or, more likely, require him to travel on Sunday evening. The agent, by only looking at “available 8-6 working hours” on the calendar, completely failed to consider the logistical impracticalities and personal impact of an early morning meeting requiring significant travel. It didn’t differentiate an 8:30 AM slot for a local attendee versus one for a remote attendee needing to be physically present.
- Preparation Time Overlooked: The user explicitly stated this is a “critical strategy session” requiring “significant preparation.” Scheduling it first thing on a Monday morning (8:30 AM) implicitly pressures attendees to use their personal weekend time to prepare if they want to be ready for such an early start.
This is the moment where, as a developer, you realize the agent performed its task based on explicit calendar availability but missed crucial implicit human factors and practical considerations. It lacked the judgment to ensure a genuinely effective and considerate outcome.
Encoding “Experience” and “Consideration” into Prompts
To elevate our “intern” to an “experienced assistant,” we need to embed more sophisticated logic into the main agent’s guiding prompt (or the overarching instructions it follows when orchestrating tools). This goes beyond just describing the FindFreeTime or CreateMeeting tools themselves. We need to tell the agent how to behave more considerately.
This often involves instructing the agent to seek out and use what we can think of as a “third tool” or an additional layer of information and logic. In this case, it’s attendee location, travel context, and the nature of the meeting. Our agent might need to implicitly (or explicitly via another tool like GetUserProfile(email)) access or infer:
- Each attendee’s primary work location.
- The meeting type (e.g., in-person vs. virtual).
- The implications of meeting importance and timing (e.g., “critical Monday morning meeting”).
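One way to sketch this context-gathering layer, assuming a hypothetical `GetUserProfile`-style lookup backed here by a hard-coded profile table:

```python
# Hypothetical profile store; in practice this would sit behind a
# GetUserProfile(email) tool the agent can call.
USER_PROFILES = {
    "lead@example.com":  {"location": "London"},
    "maria@example.com": {"location": "London"},
    "ben@example.com":   {"location": "Edinburgh"},
}

def get_user_profile(email):
    return USER_PROFILES[email]

def gather_context(attendees, meeting_location, is_in_person):
    # Derive the facts the agent's rulebook needs: who would have to travel?
    remote = [
        a for a in attendees
        if is_in_person and get_user_profile(a)["location"] != meeting_location
    ]
    return {"in_person": is_in_person, "remote_attendees": remote}
```

For the Project Atlas request, this layer is what surfaces the fact that Ben, alone among the attendees, would need to travel for an in-person London meeting.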
With access to this “contextual data,” we then need to add a list of “if…then…” style instructions to the agent’s master prompt:
Revised Agent Prompt Logic Snippet (Conceptual):
“…When tasked with scheduling a meeting:
- Gather Enhanced Context: For each attendee, determine their likely location (e.g., using GetUserProfile(attendee_email)). Note if the meeting is specified as ‘in-person’ and if attendees are remote.
- Factor in Travel for In-Person Meetings:
- If the meeting is in-person, and an attendee is remote, and a proposed early morning slot (e.g., before 10:00 AM local time of the meeting location) is identified:
- Then, flag this to the requester. For example: ‘Ben will be traveling from Edinburgh for this in-person meeting. To allow for comfortable travel on Monday, would you prefer to start the meeting at 10:30 AM or later, or should I explore virtual options?’
- Consider Preparation Time for Critical Meetings:
- If a meeting is described as ‘critical,’ ‘important,’ or requiring ‘significant preparation,’ and it’s being scheduled for an early slot on a Monday (or immediately after a holiday):
- Then, prompt the requester for confirmation. For example: ‘For this critical Monday session requiring preparation, would a start time of 8:30 AM allow everyone enough time on Monday morning itself, or would a slot around 10:00 AM be more suitable to allow for on-the-day readiness?’
- Handle General Timezone/Working Hour Conflicts (as discussed previously):
- (Incorporate logic for prioritizing ‘core social windows,’ handling slots outside these, and asking for confirmation for inconvenient times across different time zones if applicable, even for virtual meetings).
- Respect Explicit User Overrides:
- If the user explicitly confirms an inconvenient arrangement (e.g., “Yes, Ben is aware and will travel Sunday”), then proceed, perhaps with a final confirmation: ‘Understood. Scheduling as confirmed.’“
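Although these rules live in the agent’s natural-language prompt, it can help to think of them as pre-flight checks that run before `CreateMeeting` is committed. A rough sketch of that idea (the 10:00 AM threshold, parameter names, and question wording are all illustrative):

```python
def preflight_checks(slot_hour, is_monday, is_in_person, remote_attendees,
                     is_critical, user_confirmed_override=False):
    """Return clarifying questions to ask the requester; empty means book it."""
    if user_confirmed_override:
        return []  # explicit user override: proceed as confirmed
    questions = []
    # Travel rule: early in-person slots are a problem for remote attendees.
    if is_in_person and remote_attendees and slot_hour < 10:
        names = ", ".join(remote_attendees)
        questions.append(
            f"{names} would need early travel for this in-person meeting. "
            "Start at 10:30 or later, or explore virtual options?")
    # Preparation rule: critical early-Monday slots eat into the weekend.
    if is_critical and is_monday and slot_hour < 10:
        questions.append(
            "This critical Monday session requires preparation. Would a slot "
            "around 10:00 allow for on-the-day readiness instead?")
    return questions
```

In an LLM agent these checks are not literal code; they are the behavior we hope the model reproduces from the prompt above. Writing them out this way makes gaps and contradictions in the rulebook much easier to spot.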
Prompts: From Simple Descriptions to Complex Algorithms
As you can see, the prompt for our main agent has evolved significantly. It’s no longer just a list of available tools. It’s a detailed, conditional decision tree. It’s a miniature algorithm expressed in natural language. Those “if…then…” statements become the backbone of the agent’s “reasoning” process, guiding it to not just a solution, but a better solution.
The challenge – and the emerging skill for developers – is foreseeing these potential pitfalls and pre-emptively encoding the “experience,” “company policy,” or “common courtesy” directly into the LLM’s instructions. It proves that while LLMs bring incredible power, a developer’s work in guiding that power towards truly useful and considerate applications is more crucial than ever. It’s about transforming that “enthusiastic intern” into a reliable, experienced digital colleague, one carefully crafted prompt at a time.
Looking ahead, the aspiration is for these agents to evolve beyond even the most detailed prompt-based rulebooks by enabling them to truly ‘learn’ from the outcomes of their actions. While today we meticulously encode experience, the next frontier involves agents that can autonomously refine their strategies. Imagine an agent having access not just to tools and data to perform its job, but also to continuous feedback on the success or failure of those actions in achieving desired business outcomes. Our Calendar Assistant, for example, if it consistently observes meeting invitations for 12:00 PM being rejected by a particular user with the reason ‘out for lunch,’ could learn over time to deprioritize or even avoid suggesting this slot for that individual, without a developer having to explicitly program a ‘no meetings at noon for User X’ rule.
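A minimal sketch of that feedback loop, assuming rejections arrive as (user, slot, reason) events and using a simple in-memory counter as storage:

```python
from collections import Counter

class SlotFeedback:
    """Track per-user slot rejections and rank candidate slots accordingly."""

    def __init__(self):
        self.rejections = Counter()  # (user, slot) -> rejection count

    def record_rejection(self, user, slot, reason):
        # The reason ("out for lunch") could feed richer analysis later;
        # here we only count occurrences.
        self.rejections[(user, slot)] += 1

    def rank_slots(self, user, candidate_slots):
        # Stable sort: frequently rejected slots for this user sink to the end.
        return sorted(candidate_slots, key=lambda s: self.rejections[(user, s)])

fb = SlotFeedback()
for _ in range(3):
    fb.record_rejection("user_x@example.com", "12:00", "out for lunch")
ranked = fb.rank_slots("user_x@example.com", ["12:00", "14:00"])
# After three rejections, "14:00" now ranks ahead of "12:00" for this user
```

No developer wrote a “no meetings at noon for User X” rule; the preference emerges from observed outcomes, which is exactly the kind of adaptation described above.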
Now, extrapolate this learning capability to more intricate business processes. An agent initially programmed with a set of procedures for, say, customer onboarding or supply chain logistics, could, by measuring key performance indicators and observing real-world results, begin to identify inefficiencies or discover more optimal paths. Over time, it might subtly tune its approach, reprioritize steps, or even suggest modifications to the foundational processes developers first designed. The future, therefore, might see developers not just as the initial architects of these agent behaviors, but also as cultivators of systems where agents progressively refine and optimize their own operational ‘algorithms,’ truly becoming dynamic and learning partners in achieving business goals.