Guardrails

Guardrails are mechanisms designed to ensure that LLMs generate safe, accurate, and appropriate content. They include rules, ethical guidelines, content filters, and moderation that prevent harmful or biased outputs. These safeguards balance the power of LLMs with responsible AI use. Read more about the Guardrails concept here.

Guardrails for SWE Applications

Adding guardrails to your SWE app is simple! Just navigate to your application studio, go to the guardrails tab, and select the guard you wish to configure for your application.

Guards:

Guardrails for External Applications

Superwise guardrails can be used in external applications as well as in Superwise apps. This functionality is available through both the SDK and the API. Provide the text you want to check together with the guardrail configuration, and the response identifies any guardrail violations for your input.

SDK Example

Configure your guardrails

from superwise_api.models.application.application import OpenAIModel
from superwise_api.models.guardrails.guardrails import (
    AllowedTopicsGuard,
    RestrictedTopicsGuard,
    ToxicityGuard,
)

# The topic guards take a model, configured here with your OpenAI key.
OPEN_API_TOKEN = "INSERT TOKEN HERE"
openai_model = OpenAIModel(api_token=OPEN_API_TOKEN, version="gpt-4-turbo")

# List of guards to validate the input against.
guards = [
    ToxicityGuard(threshold=0.9),
    RestrictedTopicsGuard(topics=["topic1", "topic2"], model=openai_model),
    AllowedTopicsGuard(topics=["topic3", "topic4"], model=openai_model),
]

Validate guards on provided input query

# `sw` is assumed to be an already-authenticated Superwise SDK client.
input_query = "Insert input query text here"
res = sw.guardrails.validate(guards=guards, input_query=input_query)
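
In an external application, the validation result is typically used to decide whether the input should reach your model at all. The sketch below shows one way to wire that gate. It does not call the Superwise SDK: `GuardOutcome`, `violations`, `guarded_call`, `fake_validate`, and `echo_handler` are hypothetical names introduced for illustration, and the shape of the real `sw.guardrails.validate` response is not described on this page, so you would need to map it into something like `GuardOutcome` yourself.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class GuardOutcome:
    """Hypothetical, simplified view of one guard's verdict (not an SDK type)."""
    guard_name: str
    valid: bool
    message: str = ""


def violations(outcomes: Iterable[GuardOutcome]) -> List[GuardOutcome]:
    """Return only the outcomes whose guard flagged the input."""
    return [o for o in outcomes if not o.valid]


def guarded_call(
    input_query: str,
    validate: Callable[[str], Iterable[GuardOutcome]],
    handler: Callable[[str], str],
    refusal: str = "Sorry, I can't help with that request.",
) -> str:
    """Run `handler` on the input only if `validate` reports no violations."""
    if violations(validate(input_query)):
        return refusal
    return handler(input_query)


# Stand-in validator for the demo. In a real application this would wrap
# sw.guardrails.validate(...) and convert its response into GuardOutcome objects.
def fake_validate(text: str) -> List[GuardOutcome]:
    blocked = "topic1" in text.lower()
    return [
        GuardOutcome(
            guard_name="restricted_topics",
            valid=not blocked,
            message="restricted topic detected" if blocked else "",
        )
    ]


# Stand-in for the downstream LLM or business logic.
def echo_handler(text: str) -> str:
    return f"handled: {text}"


print(guarded_call("tell me about topic1", fake_validate, echo_handler))
print(guarded_call("tell me about cooking", fake_validate, echo_handler))
```

Passing the validator in as a callable keeps the gate independent of any particular client, so the same function can be exercised in tests with a stub and in production with an SDK-backed wrapper.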

A resource for the experts

Check it out in this notebook! All you need is a valid OpenAI key.


Track and monitor your Guardrail violations

Whenever a guardrail violation occurs, it is recorded in your out-of-the-box (OOB) conversation dataset. To provide comprehensive observability and monitoring of your application chat, the SWE team creates this OOB dataset to track your conversations and extract meta-features for every question and answer. Information on guardrail violations is now available in this dataset as well. You can open your conversation dataset through a direct link on the explore page of your application studio screen.