# Agent Basics

The core abstraction of Sentinel AI is the `Agent` class. `Agent` is a generic class that takes three type parameters:

- `R`: The type of the request object that the agent receives. This can be a string or any other complex Java type.
- `T`: The type of the response object that the agent returns. This can also be a string or any other complex Java type.
- `A`: The agent subtype, i.e. the class you are currently implementing.
## Request and Response Type Parameters

Sentinel supports both text and objects as input and output. The type parameters `R` and `T` can be any Java type, including strings, lists, maps, or custom objects. The only requirement is that the types must be serializable to JSON.

Sentinel generates a JSON schema for the type parameters and passes it to the model to ensure that requests are interpreted correctly and responses are generated properly.
> **Tip: Use `@JsonClassDescription` and `@JsonPropertyDescription` liberally**
>
> Use the `@JsonClassDescription` and `@JsonPropertyDescription` annotations to provide copious documentation on classes and their members wherever they are used, be it as a request type, response type, tool parameter, and so on. These descriptions are added to the generated schema. The accuracy of the agent ultimately depends on the amount of information you provide to the model: the more information you provide, the better the model can interpret the request and generate a correct, relevant response.

A sample request type would look like the following:
```java
@JsonClassDescription("Information about the book to be summarized")
public record BookInfo(
        @JsonPropertyDescription("Unique ID for the book") String isbn,
        @JsonPropertyDescription("Title of the book") String title
) {
}
```
Similarly, the response type can be a complex object. For example, if you are implementing a book summarizer, a sample response type could look like the following:
```java
@JsonClassDescription("Summary of the book")
public record BookSummary(
        @JsonPropertyDescription("Unique ID for the book") String isbn,
        @JsonPropertyDescription("Summary of the book") String summary,
        @JsonPropertyDescription("Topics discussed in the book") List<String> topics
) {
}
```
## Instantiating a Model

The `Model` class is a generic abstraction for an LLM used by an agent. A concrete subclass of `Model` needs to be instantiated for use in the agent.

Currently, only OpenAI API compliant model endpoints are supported. The corresponding implementation of `Model` is the `SimpleOpenAIModel` class, available in the `sentinel-ai-models-simple-openai` module.

Add the module to your project dependencies as follows:
```xml
<dependency>
    <groupId>com.phonepe.sentinel-ai</groupId>
    <artifactId>sentinel-ai-models-simple-openai</artifactId>
</dependency>
```
This adds the required dependencies to instantiate the model with the SimpleOpenAI client library. The library itself is very flexible; read its documentation to understand how to use it. The model can be instantiated as follows:
```java
final var model = new SimpleOpenAIModel<>(
        "gpt-4o",
        SimpleOpenAI.builder()
                .baseUrl(EnvLoader.readEnv("OPENAI_ENDPOINT"))
                .apiKey(EnvLoader.readEnv("OPENAI_API_KEY"))
                .objectMapper(objectMapper)
                .clientAdapter(new OkHttpClientAdapter(httpClient))
                .build(),
        objectMapper
);
```
> **Note: Type parameter for `SimpleOpenAIModel`**
>
> `SimpleOpenAIModel` is a generic class. The type parameter is inferred from the model argument, so the diamond operator `<>` is sufficient.

> **Note: Endpoint and API key**
>
> `OPENAI_ENDPOINT` and `OPENAI_API_KEY` are environment variables that need to be set on the system. The `EnvLoader` class is a utility that loads environment variables; you can use any other method to load them as well.
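If you prefer not to depend on a helper like `EnvLoader`, plain `System.getenv` works just as well. Below is a hypothetical minimal stand-in (the real `EnvLoader` shipped with Sentinel AI may behave differently) that fails fast when a variable is missing:

```java
// Hypothetical minimal equivalent of EnvLoader.readEnv: reads an
// environment variable and throws if it is not set. Illustrative only;
// the real EnvLoader class may behave differently.
final class SimpleEnvLoader {
    static String readEnv(String name) {
        final var value = System.getenv(name);
        if (value == null || value.isBlank()) {
            throw new IllegalStateException("Missing required environment variable: " + name);
        }
        return value;
    }
}
```

Failing fast at startup is usually preferable to discovering a missing API key on the first model call.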
## Agent Setup

The `AgentSetup` class is used to configure the agent. It is available in the core library and can be used to set a variety of settings. The class provides a builder so that only the required parameters need to be set; defaults are applied wherever possible.

An `AgentSetup` object needs to be passed at startup. However, if some parameters are not known up front or need to be dynamic, a setup object can also be passed as a parameter to the `execute*` methods.
### Available Settings

The following settings are available on the `AgentSetup` class:

| Setting | Type | Description |
|---|---|---|
| `mapper` | `ObjectMapper` | The object mapper to use for serialization/deserialization. If not provided, a default one will be created. |
| `model` | `Model` | The LLM to be used by the agent. This can be provided at runtime. If not provided, an error will be thrown. |
| `modelSettings` | `ModelSettings` | The settings for the model. This can be provided at runtime. If not provided, an error will be thrown. |
| `executorService` | `ExecutorService` | The executor service to use for running the agent. If not provided, a default cached thread pool will be created. |
| `eventBus` | `EventBus` | The event bus to be used by the agent. If not provided, a default event bus will be created. |
> **Note: Required parameters**
>
> - `model` and `modelSettings` are required. It is possible, however, that the model is not known during agent creation; in that case, these parameters can be provided as part of the `execute*` calls instead. If neither is available, an exception is thrown at runtime.
> - `mapper`, `executorService` and `eventBus` are optional. If not provided, defaults will be created.
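For example, when the model is only chosen at request time, the setup can be supplied per call via the `agentSetup` property of `AgentInput` (described later in this document). This is an illustrative sketch based only on the builders shown in this document; the surrounding `agent`, `model` and `bookInfo` objects are assumed to exist:

```java
// Illustrative: supply model and settings at execution time instead of
// at agent creation. Fields provided here take precedence over the
// setup the agent was created with.
final var response = agent.execute(
        AgentInput.<BookInfo>builder()
                .request(bookInfo)
                .agentSetup(AgentSetup.builder()
                        .model(model) // model chosen per request
                        .modelSettings(ModelSettings.builder()
                                .temperature(0.0f)
                                .build())
                        .build())
                .build());
```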
## Model Settings

A variety of settings can be set for the model. The `ModelSettings` class is used to configure the model. It is available in the core library and provides a builder.
| Setting | Type | Description |
|---|---|---|
| `maxTokens` | `Integer` | Maximum number of tokens to generate. |
| `temperature` | `Float` | Amount of randomness to inject into the output. Lower values make the output more predictable. |
| `topP` | `Float` | Probabilistic sum of tokens to consider for each subsequent token. Range: 0-1. |
| `timeout` | `Float` | Timeout for model calls, in seconds. |
| `parallelToolCalls` | `Boolean` | Whether to call tools in parallel or not. |
| `seed` | `Integer` | Seed for the random number generator, to make output more predictable. |
| `presencePenalty` | `Float` | Penalty for adding new tokens based on their presence in the output so far. |
| `frequencyPenalty` | `Float` | Penalty for adding new tokens based on how many times they have appeared in the output so far. |
| `logitBias` | `Map<String, Integer>` | Controls the likelihood of specific tokens being generated. |
## Sample Setup

Sample code for creating the settings for an agent:
```java
final var agentSetup = AgentSetup.builder()
        .model(model)
        .mapper(objectMapper)
        .modelSettings(ModelSettings.builder()
                .temperature(0.1f)
                .seed(1)
                .build())
        .build();
```
## Creating an Agent

To create an agent, you need to do the following:

- Extend the `Agent` class with the appropriate request and response type parameters
- Provide a system prompt
- Pass a setup object to the agent
- There are other parameters we shall explore in subsequent sections
Continuing with the example, we want to create an agent that can summarize books. Code for such an agent would look something like this:
```java
public class BookSummarizingAgent extends Agent<BookInfo, BookSummary, BookSummarizingAgent> {

    public BookSummarizingAgent(AgentSetup setup) {
        super(BookSummary.class,
              "You are an expert in summarizing books. You will be provided with the title and ISBN of a book." +
              " You need to summarize the book and provide the topics discussed in the book.",
              setup,
              List.of(),
              Map.of());
    }

    @Override
    public String name() {
        return "book-summarizer";
    }
}
```
## The `AgentInput` Class

Sentinel agent `execute*` requests can take multiple parameters along with the core request (the user prompt). The `AgentInput` class wraps all of these parameters and provides a builder that allows users to easily send or skip the additional ones.
| Property | Type | Description |
|---|---|---|
| `request` | `R` | The request object. This is a required parameter. |
| `facts` | `List<FactList>` | List of facts to be passed to the agent. This is passed to the LLM as 'knowledge' in the system prompt. |
| `requestMetadata` | `AgentRequestMetadata` | Metadata for the request. |
| `oldMessages` | `List<AgentMessage>` | List of old messages to be sent to the LLM for this run. If set to `null`, messages are generated and consumed by the agent in this session. |
| `agentSetup` | `AgentSetup` | Setup for the agent, overriding the setup provided during agent creation. If set to `null`, the setup provided during agent creation is used. Fields provided at runtime take precedence. |
## The `AgentOutput` Class

The return type of all `execute*` methods is `AgentOutput`, a generic class typed with the response type `T`. The class contains the following fields:
| Member | Type | Description |
|---|---|---|
| `data` | `T` | The output of the agent, typed to the required response type. Null in case of errors. |
| `newMessages` | `List<AgentMessage>` | New messages generated by the agent. Empty in case of errors. |
| `allMessages` | `List<AgentMessage>` | All messages generated by the agent, including the new messages. |
| `usage` | `ModelUsageStats` | Usage statistics for the model. |
| `error` | `SentinelError` | The error in case of failure, or a success object otherwise. |
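A typical pattern is to check for data before using it. The sketch below assumes Lombok-style getters (`getData()`, `getError()`, `getUsage()`) on `AgentOutput`, consistent with the `getData()` call shown later in this document:

```java
final var output = agent.execute(
        AgentInput.<BookInfo>builder()
                .request(new BookInfo("978-0393096729", "War and Peace"))
                .build());
if (output.getData() != null) {
    // Success: data is typed to the response type (BookSummary here)
    System.out.println(output.getData().summary());
}
else {
    // Failure: data is null and error carries the failure details
    System.err.println("Agent call failed: " + output.getError());
}
// Usage stats are available regardless of success or failure
System.out.printf("Tokens used: %d%n", output.getUsage().getTotalTokens());
```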
## Model Usage Statistics

The `ModelUsageStats` class tracks usage statistics for a model, including token usage and request details.
| Member | Type | Description |
|---|---|---|
| `requestsForRun` | `int` | Number of requests made in this run. |
| `toolCallsForRun` | `int` | Number of tool calls made in this run. |
| `requestTokens` | `int` | Number of request/prompt tokens used in this run. Equivalent to the `prompt_tokens` parameter in OpenAI usage. |
| `responseTokens` | `int` | Number of completion/response tokens used in this run. Equivalent to the `completion_tokens` parameter in OpenAI usage. |
| `totalTokens` | `int` | Total tokens used in the whole run. Should generally equal `requestTokens` + `responseTokens`. |
| `requestTokenDetails` | `PromptTokenDetails` | Token usage details for prompts. |
| `responseTokenDetails` | `ResponseTokenDetails` | Token usage details for responses. |
| `details` | `Map<String, Integer>` | Additional details about token usage. |
### The `PromptTokenDetails` Class

The `PromptTokenDetails` class provides detailed information about tokens used in prompts.

| Member | Type | Description |
|---|---|---|
| `cachedTokens` | `int` | Number of cached tokens present in the prompt. |
| `audioTokens` | `int` | Number of audio input tokens present in the prompt. |
### The `ResponseTokenDetails` Class

The `ResponseTokenDetails` class provides detailed information about tokens used in responses.

| Member | Type | Description |
|---|---|---|
| `reasoningTokens` | `int` | Number of tokens generated by the model for reasoning. |
| `acceptedPredictionTokens` | `int` | Number of tokens in the prediction that appeared in the completion, when using predicted outputs. |
| `rejectedPredictionTokens` | `int` | Number of tokens in the prediction that did not appear in the completion, when using predicted outputs. |
| `audioTokens` | `int` | Number of audio tokens generated by the model. |
## Using the Agent

The agent can be invoked by calling any of the provided `execute*` methods:

- The `executeAsync()` method and its overloads invoke the LLM asynchronously (as the name suggests). They return a `CompletableFuture` that can be used to get the result when it is available.
- The `execute()` method and its overloads invoke the LLM synchronously and return the result directly.

In either case, the agent is invoked with the provided request object, and the response object is returned along with errors and usage information.
```java
final var agent = new BookSummarizingAgent(agentSetup);
final var response = agent.execute(
        AgentInput.<BookInfo>builder()
                .request(new BookInfo("978-0393096729", "War and Peace"))
                .build());
System.out.println(objectMapper.writerWithDefaultPrettyPrinter()
                           .writeValueAsString(response.getData()));
```
Output from the above would be something like:
```json
{
  "isbn" : "978-0393096729",
  "summary" : "\"War and Peace\" is a historical novel by Leo Tolstoy that intertwines the lives of several families during the Napoleonic Wars in the early 19th century. The narrative explores themes of love, fate, and the impact of war on society. It follows characters such as Pierre Bezukhov, Prince Andrei Bolkonsky, and Natasha Rostova as they navigate personal struggles and the broader historical events that shape their lives. The novel delves into the philosophical questions of history and the nature of power, ultimately portraying the complexity of human experience amidst the chaos of war.",
  "topics" : [ "Historical fiction", "Napoleonic Wars", "Russian society", "Philosophy of history", "Love and relationships", "Fate and free will", "Family dynamics", "War and its consequences" ]
}
```
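The same call can be made asynchronously with `executeAsync()`, which returns a `CompletableFuture`. A minimal sketch (the `getData()` accessor matches the synchronous example above):

```java
// Asynchronous invocation: the future completes when the LLM responds
agent.executeAsync(
                AgentInput.<BookInfo>builder()
                        .request(new BookInfo("978-0393096729", "War and Peace"))
                        .build())
        .thenAccept(output -> System.out.println(output.getData()))
        .join(); // block here only for the sake of the example
```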
## Request Metadata

Sometimes it is important to maintain the context of a conversation. For example, if you are building a chat agent, you may want to keep track of the session or the user the conversation is happening with. SentinelAI provides the `AgentRequestMetadata` class for this purpose. The metadata class also provides the option to send back the `ModelUsageStats` object from previous calls. This can be used to track usage of the model and the agent across calls; if provided, the agent merges the usage from the current execution into the provided usage stats object.

> **Note: Request metadata is passed to the model**
>
> The request metadata passed to execute calls is serialized and passed to the LLM as part of the structured system prompt.
| Property | Type | Description |
|---|---|---|
| `sessionId` | `String` | Session ID for the current conversation. This is passed to the LLM as additional data in the system prompt. |
| `userId` | `String` | User ID for the user the agent is currently conversing with. This is passed to the LLM as additional data in the system prompt. |
| `customParams` | `Map<String, Object>` | Any other custom parameters that need to be passed to the agent or the tools it invokes. This is passed to the LLM as additional data in the system prompt. |
| `usageStats` | `ModelUsageStats` | Global usage stats object that can be used to track usage of the model across execute calls. |
> **Note**
>
> Request metadata is optional; passing `null` for this parameter is acceptable.
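Putting this together, a chat-style call might attach session and user identifiers plus a shared usage-stats object. This sketch assumes `AgentRequestMetadata` exposes a Lombok-style builder like the other configuration classes in this document, and that `ModelUsageStats` has a no-argument constructor:

```java
// Shared stats object, reused across calls so usage accumulates.
// (Assumption: ModelUsageStats has a no-arg constructor.)
final var usageStats = new ModelUsageStats();

final var response = agent.execute(
        AgentInput.<BookInfo>builder()
                .request(new BookInfo("978-0393096729", "War and Peace"))
                .requestMetadata(AgentRequestMetadata.builder()
                        .sessionId("session-1")   // hypothetical session ID
                        .userId("user-42")        // hypothetical user ID
                        .usageStats(usageStats)   // merged into after the run
                        .build())
                .build());
```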
## Prompts

Sentinel AI has special handling to improve LLM performance for agentic use cases, for both system and user prompts.

### System Prompts

SentinelAI converts the system prompt into an XML format for easy parsing by the LLM. The string or object passed as the system prompt to the agent is placed in a tag called `<role>`. Other information, such as tools, properties from the request metadata, and facts and additional tasks from registered extensions, is added to the prompt as well.
> **Note: Serializability requirements**
>
> The system prompt needs to be serializable to XML. If it is not, an error will be thrown.

> **Tip: Structured prompts**
>
> We recommend passing the system prompt as a structured object. This helps the LLM understand the context better and speeds up processing as well. Check the `SystemPrompt` class for tips on how to use different Jackson annotations to make system prompts serialize correctly.
```xml
<?xml version='1.1' encoding='UTF-8'?>
<SystemPrompt>
  <coreInstructions>Your main job is to answer the user query as provided in user prompt in the `user_input` tag.
    Perform the provided secondary tasks as well and populate the output in designated output field for the task.
    Use the provided knowledge and facts to enrich your responses and avoid unnecessary tool calls.
  </coreInstructions>
  <primaryTask>
    <role>greet the user</role> <!--(1)!-->
    <tools> <!--(2)!-->
      <tool>
        <name>test_tool_box_get_location_for_user</name>
        <description>Get location for user</description>
      </tool>
      <tool>
        <name>simple_agent_get_name</name>
        <description>Get name of user</description>
      </tool>
      <tool>
        <name>test_tool_box_get_weather_today</name>
        <description>Get weather today</description>
      </tool>
      <tool>
        <name>simple_agent_get_salutation</name>
        <description>Get salutation for user</description>
      </tool>
    </tools>
  </primaryTask>
  <secondaryTasks> <!--(3)!-->
    <secondaryTask>
      <instructions>
        <tasks>
          <task>
            <objective>EXTRACT MEMORY FROM MESSAGES AND POPULATE `memoryOutput` FIELD</objective>
            <outputField>memoryOutput</outputField>
            <instructions>How to extract different memory types:
              - SEMANTIC: Extract fact about the session or user or any other subject
              - EPISODIC: Extract a specific event or episode from the conversation
              - PROCEDURAL: Extract a procedure as a list of steps or a sequence of actions that you can
                use later
            </instructions>
            <additionalInstructions>IMPORTANT INSTRUCTION FOR MEMORY EXTRACTION:
              - Do not include non-reusable information as memories.
              - Extract as many useful memories as possible
            </additionalInstructions>
            <tools>
              <tool>
                <name>agent_memory_extension_find_procedural_memory</name>
                <description>Find procedural memory about any topic from the store</description>
              </tool>
            </tools>
          </task>
        </tasks>
      </instructions>
    </secondaryTask>
  </secondaryTasks>
  <additionalData> <!--(4)!-->
    <sessionId>s1</sessionId>
    <userId>ss</userId>
  </additionalData>
  <knowledge> <!--(5)!-->
    <facts>
      <description>Memories about current session</description>
      <fact>
        <name>UserName</name>
        <content>The user's name is Santanu.</content>
      </fact>
      <fact>
        <name>UserLocation</name>
        <content>The user is located in Bangalore.</content>
      </fact>
      <fact>
        <name>WeatherToday</name>
        <content>The weather in Bangalore today is sunny.</content>
      </fact>
    </facts>
  </knowledge>
</SystemPrompt>
```
1. System prompt provided to the `Agent` class constructor
2. Tools registered with and discovered by the agent
3. Secondary tasks provided by extensions
4. Request metadata passed to the agent
5. Facts provided by extensions and the client
### User Prompts

The mandatory `request` property passed in the `AgentInput<R>` parameter to the `execute*` methods is converted to a structured XML object and wrapped in a `<user_input>` tag.

For example, for the book summarizer agent, the provided `BookInfo` object is sent to the LLM as follows:
```xml
<user_input>
    <isbn>978-0393096729</isbn>
    <title>War and Peace</title>
</user_input>
```
The original input request for the above would be something like:
```java
agent.execute(
        AgentInput.<BookInfo>builder()
                .request(new BookInfo("978-0393096729", "War and Peace"))
                .build());
```
## Extensions

Extensions are a way to add functionality to your agent. They are exposed as modules and can be used to extend the agent's behavior. Agents can be configured to use extensions by adding them while creating the agent. Extensions are loaded in the order they are added.

Extensions can be used to:

- Add facts to the knowledge passed to the agent in the system prompt
- Add custom tools to the agent
- Get the agent to perform additional tasks
- Generate extra information from the agent

To create an extension, derive from and implement the `AgentExtension` interface.