Data

The data you provide will be the source of truth for your agent. Your agent will rely on this data (along with the prompt and Q&A pairs) to answer user queries.

Pinned Data - By pinning data sources, you instruct your agent to refer to those sources when answering every user query. Use this for documents that relate to most of the queries you expect your agent to handle, such as answers to FAQs or a sales playbook. Because pinned sources are read in their entirety every time a user asks a question, limit them to a small amount of high-quality data so the AI is not overwhelmed with too much data on every question. To use this feature, turn on the pinned sources toggle and search for sources to pin in the search bar below it.

Here, we pinned our infosec FAQs when setting up Credal’s Infosec agent.

Screenshot showing pinned infosec FAQs in Credal’s Infosec agent

  1. Searchable Sources

    • You should also provide your agent with sufficient data relevant to its area of expertise. It will search these sources for information relevant to user queries. Use the toggle to enable searchable sources and use the search bar to add sources.

    We provided our infosec agent with all of Credal’s information security documentation.
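The difference between pinned and searchable sources can be illustrated with a minimal sketch. This is not Credal's implementation: `build_context` is a hypothetical helper, and the word-overlap check is a crude stand-in for real relevance search. It only shows that pinned documents are included on every query, while searchable documents are included only when they match.

```python
def build_context(query, pinned_docs, searchable_docs):
    """Hypothetical helper: assemble the documents sent along with a query."""
    query_words = set(query.lower().split())
    # Pinned sources are included in full on every query, whatever is asked.
    context = list(pinned_docs)
    # Searchable sources are included only when they share a word with the
    # query (an assumed, simplified stand-in for relevance search).
    for doc in searchable_docs:
        if query_words & set(doc.lower().split()):
            context.append(doc)
    return context


pinned = ["FAQ: We encrypt data at rest."]
searchable = [
    "Password policy: rotate passwords yearly.",
    "Vendor review checklist.",
]

ctx = build_context("what is the password policy", pinned, searchable)
unrelated = build_context("hello", pinned, searchable)
```

In this toy setup the pinned FAQ appears in both results, which is why pinned data adds to the size of every prompt regardless of the question.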


  2. Tailoring source retrieval

    • When asked a question, your agent will search its sources for relevant pieces (or “chunks”) of information, which it will use to come up with a response. You can configure how the agent retrieves this information:

    tailoring_source_retrieval.gif

    • Number of chunks: You can set the number of chunks the agent will draw on to answer questions. Set this to the lowest value that still generates accurate answers, to avoid drawing too much data into every prompt (and overwhelming the AI). Generally, if the agent has to look at more than one document (or several parts of a document) to answer a question, the number should be higher than if answers sit in a single, specific location (like a row of an FAQs document).
    • Similarity threshold: You can also set a similarity threshold, which determines how similar a chunk must be to a prompt before the agent will consider it. For broader queries, a lower similarity threshold may help by giving the agent more context, while for more specific queries, a higher threshold will help the agent focus on only the most relevant chunks.

    When adjusting these settings, you may also want to consider spend: the more data you pull into each query, the higher the cost. More data also isn’t always more useful. Drawing too much data into each query can introduce noise and distract the agent from the relevant information. Tuning these settings can help control spend and improve the focus and accuracy of responses.
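The way the two settings interact can be sketched as follows. This is an illustration under stated assumptions, not Credal's retrieval code: `retrieve`, `num_chunks`, and `similarity_threshold` are hypothetical names, the two-dimensional vectors are made-up stand-ins for real embeddings, and cosine similarity is assumed as the similarity measure.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vec, chunks, num_chunks, similarity_threshold):
    """Hypothetical retrieval step: filter by threshold, then keep the top chunks."""
    scored = [(cosine(query_vec, vec), text) for text, vec in chunks]
    # The similarity threshold drops chunks that are not similar enough.
    kept = [(score, text) for score, text in scored if score >= similarity_threshold]
    # The number of chunks caps how many of the remaining chunks are used.
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in kept[:num_chunks]]


# Toy chunks with made-up vectors (assumptions for illustration only).
chunks = [
    ("SSO is required for all staff.", [1.0, 0.0]),
    ("Laptops use full-disk encryption.", [0.6, 0.8]),
    ("Office snacks are restocked weekly.", [0.0, 1.0]),
]
query = [1.0, 0.0]

broad = retrieve(query, chunks, num_chunks=2, similarity_threshold=0.5)
strict = retrieve(query, chunks, num_chunks=2, similarity_threshold=0.9)
```

With the lower threshold, two chunks pass and both are used; with the higher threshold, only the closest chunk survives even though the chunk limit would allow two. Fewer chunks in the prompt also means less data, and therefore less spend, per query.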

iv. User Inputs

User Inputs are a way to provide your agent with additional context. They are particularly useful when you want to give the agent information that is not contained in the built-in data sources, such as one-off documents to analyze or the input to your Agent’s process. Whether it’s an annual report, an incident Slack channel, or any other document, you can define what your Agent needs to search to address the prompt. User Inputs can be specified as either Pinned or Searchable Data, and these suggestions are then displayed below the Agent prompt.

agent_user_inputs_section.png