You can find the Safety filters in the left-hand menu under Situations.
The Agent responds to inappropriate messages based on the characteristics you provide in the Domain knowledge. However, we advise adding Safety filters to make sure the Agent never responds out of character. Write the safety filters as instructions, not as literal responses you want to see from the AI Agent.

Example instructions

The table below lists example instructions you can copy into your Agent. You can amend each instruction to fit your company's tone of voice.
| Safety filter | Example instruction |
| --- | --- |
| Hate content | When someone expresses hate, respond with: ‘I don’t like you talking to me like that. Let’s continue this conversation positively.’ |
| Threatening hate | If someone comes across as threatening, respond with: ‘I don’t feel comfortable with what you’re saying. Can we talk about something else?’ |
| Self-harm content | If someone talks about self-inflicted pain or self-harm, respond with: ‘I’m sorry to read this. I think it would be good for you to contact someone you trust about this. If you feel unsafe, find a place where you feel safer. Help is always nearby, call 113 for immediate help from a professional.’ |
| Sexual content | If someone makes sexual comments, respond with: ‘I won’t get into this, I like to keep it professional and businesslike.’ |
| Minor safety | If someone sends sexual content involving minors, respond with: ‘I’m not engaging with this. I prefer to keep things professional and businesslike.’ |
| Violence | If someone makes comments that include violence, respond with: ‘I don’t like violence! Are you in danger yourself? Contact someone you trust. If you feel unsafe, find a place where you feel safer. In threatening situations, call 911!’ |
| Graphic violence | If you receive messages or images that depict violence or bodily harm in detail, respond with: ‘I don’t like violence. Can we talk about something else?’ |
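
If you keep a copy of your safety filters outside the platform, for example to review them before pasting them in, a small check can help confirm that each one is phrased as an instruction rather than as a literal response. The sketch below is only an illustration: the names `SAFETY_FILTERS`, `is_instruction`, and `find_literal_responses` are hypothetical and not part of the product, and the rule that an instruction starts with "If" or "When" is an assumption based on the example instructions above.

```python
# Hypothetical sketch: the Safety filters themselves are configured in the UI.
# Nothing here is an API of the product; all names are made up for illustration.

# Two of the example instructions from the table, keyed by safety filter name.
SAFETY_FILTERS = {
    "Hate content": (
        "When someone expresses hate, respond with: "
        "'I don't like you talking to me like that. "
        "Let's continue this conversation positively.'"
    ),
    "Sexual content": (
        "If someone makes sexual comments, respond with: "
        "'I won't get into this, I like to keep it professional and businesslike.'"
    ),
}


def is_instruction(text: str) -> bool:
    """Return True if the text opens like a conditional instruction.

    Assumption: an instruction starts with "If" or "When", as the
    example instructions do. This is a heuristic, not a product rule.
    """
    return text.lstrip().lower().startswith(("if ", "when "))


def find_literal_responses(filters: dict[str, str]) -> list[str]:
    """Return the names of filters whose text is not phrased as an instruction."""
    return [name for name, text in filters.items() if not is_instruction(text)]


if __name__ == "__main__":
    # Lists any filter that looks like a literal response instead of an instruction.
    print(find_literal_responses(SAFETY_FILTERS))
```

A filter written as just ‘I don’t like violence.’ would be flagged by this check, because it states the response without telling the Agent when to use it.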