Handling User-Flagged Content
Learn how to handle user-flagged content with the Moderation API
Introduction
In this guide, we’ll demonstrate how to enable users to report other profiles within a fictional dating application named Wizard Dating.
Goals
- View an overview of reported profiles
- Allow users to report profiles
- Review and remove reported profiles from the application
- Sort reported profiles to prioritize the most problematic ones
When users report a profile, we’ll add it to a review queue so that an admin can assess it. If the admin decides the profile violates our guidelines, we’ll remove it from our application.
Setting Up the Dashboard
First, let’s set up the dashboard by creating the necessary components. We’ll need to create:
- A project for analyzing profiles
- An action for users to report profiles
- A review queue to display reported profiles
- An action for moderators to remove profiles
Create a Project
We’ll start by creating a new project called Wizard Profiles and adding several models to it: toxicity, nsfw, sentiment, pii, and spam.
We will keep the flagging thresholds at their default values for now.
Create a Reporting Action
Navigate to your actions to create a new action.
- Create an action named Report Profile that will be used to report user profiles.
- Since this action will only be invoked by our application code, select “Hide action from dashboard” to prevent it from appearing in any queues.
- Additionally, check “Allow text input for value” so users can add comments to their reports.
- Note the action key report_profile for later use.
- Select the queue behavior: choose “Action unresolves item (re-add to queue)” to always re-add the profile to the queue, even if a moderator has resolved it before. Otherwise, leave it at “Action does not resolve item” so that the profile is reviewed only once.
Create a Review Queue
Next, we’ll create a queue to review content submitted to our project. We’ll name it Reported Profiles.
Configure the queue to display all items instead of just flagged items. This ensures that we can see all reported profiles, even if they haven’t been flagged by our models.
To ensure that only reported profiles appear in this queue:
- Set a filter for the Report Profile action that we created earlier.
- Configure the queue to display items from the last month to maintain focus.
Create a Removal Action
Now that the review queue is set up to display reported profiles, we’ll add an action to remove profiles from our application.
- Create an action named Remove Profile.
- Configure it to appear only in the newly created Reported Profiles queue.
- Enable “Action resolves items” so that the profile is removed from the queue once the action is executed.
- Set up a webhook to call our application servers at https://wizard-dating.com/webhooks to handle the removal of the profile from the application.
Application Code
Next, we’ll implement the necessary code in our application to handle user reports and profile removals.
We’ll use the Moderation API’s Node SDK to interact with the API from our application.
Prerequisites
Environment Variables
Create a .env file in your project root that contains your project API key.
Dependencies
Install the Moderation API’s Node SDK using your preferred package manager.
Instantiating the SDK
To use the Moderation API in our application, we’ll need to instantiate it with our project’s API key.
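As a minimal sketch of this setup step, the helpers below load the key and build the request headers. They assume the key is stored in an environment variable named MODERATION_API_KEY (a hypothetical name; use whatever your .env file defines) and that the API accepts the key as a Bearer token. The sketch deliberately avoids the SDK’s constructor, whose exact options are not shown in this guide.

```typescript
// Assumption: the key lives in MODERATION_API_KEY (hypothetical variable name).
type Env = Record<string, string | undefined>;

// Read and validate the API key so a missing key fails at startup,
// not on the first moderation request.
function loadApiKey(env: Env): string {
  const key = env["MODERATION_API_KEY"]?.trim();
  if (!key) {
    throw new Error("MODERATION_API_KEY is not set");
  }
  return key;
}

// Assumption: the API authenticates requests with a Bearer token.
function authHeaders(apiKey: string): Record<string, string> {
  return {
    Authorization: `Bearer ${apiKey}`,
    "Content-Type": "application/json",
  };
}
```

In a Node process, calling authHeaders(loadApiKey(process.env)) once at startup surfaces a missing key immediately.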
Submitting Profiles for Analysis
When a user creates or updates a profile, we’ll submit it for analysis using the /moderate/object endpoint. This allows us to detect any issues with the profile content.
In this example, we are not acting on the analysis results, but you could use the data to hide flagged profiles or return an error to the user.
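The submission step could be sketched as follows. This version calls the /moderate/object endpoint over plain HTTP instead of through the SDK. The base URL, the WizardProfile shape, and the payload field names (contentId, authorId, value) are assumptions made for illustration, so check them against the API reference before relying on them.

```typescript
// Hypothetical profile shape for Wizard Dating.
interface WizardProfile {
  id: string;
  userId: string;
  name: string;
  bio: string;
}

interface ApiRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Minimal fetch-like signature so the sender can be exercised without a network.
type FetchLike = (
  url: string,
  init: ApiRequest["init"],
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Assumption: illustrative base URL; confirm it in the API reference.
const MODERATE_URL = "https://moderationapi.com/api/v1/moderate/object";

// Build the request separately from sending it, which keeps the payload easy to test.
function buildModerateRequest(profile: WizardProfile, apiKey: string): ApiRequest {
  return {
    url: MODERATE_URL,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        contentId: profile.id, // lets later reports reference the same item
        authorId: profile.userId,
        value: { name: profile.name, bio: profile.bio }, // assumed field layout
      }),
    },
  };
}

// Send the profile for analysis and return the parsed response body.
async function submitProfile(
  profile: WizardProfile,
  apiKey: string,
  fetchFn: FetchLike,
): Promise<unknown> {
  const { url, init } = buildModerateRequest(profile, apiKey);
  const response = await fetchFn(url, init);
  if (!response.ok) {
    throw new Error(`Moderation request failed with status ${response.status}`);
  }
  return response.json();
}
```

On Node 18 or later, the global fetch can be passed as fetchFn. Reusing the profile’s own ID as the content ID means a later report can point at the same queue item.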
Adding Report Functionality to Our Application
We’ll add a function to call the /actions/execute endpoint to report a profile. This function should be exposed to users through your application’s UI.
Handling the Webhook
We’ll implement a webhook handler to process the Remove Profile action. In this example, we’re using Next.js and the Moderation API’s Node SDK to verify the webhook signature.
For more information, refer to the webhook documentation.
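The sketch below illustrates the general idea of a signed webhook handler. It is not the SDK’s verification helper: it assumes a hex-encoded HMAC-SHA256 signature over the raw request body, a hypothetical remove_profile action key, and a hypothetical payload shape with actionKey and contentId fields. The real signature scheme and payload are described in the webhook documentation.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical payload shape for the "Remove Profile" webhook.
interface RemoveProfileEvent {
  actionKey: string;
  contentId: string;
}

// Recompute the HMAC over the raw body and compare it in constant time.
function verifySignature(rawBody: string, signatureHex: string, secret: string): boolean {
  const expected = createHmac("sha256", secret).update(rawBody).digest();
  const received = Buffer.from(signatureHex, "hex");
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return received.length === expected.length && timingSafeEqual(received, expected);
}

// Return the HTTP status to send back: 401 for a bad signature, otherwise 200.
function handleWebhook(
  rawBody: string,
  signatureHex: string,
  secret: string,
  removeProfile: (profileId: string) => void,
): number {
  if (!verifySignature(rawBody, signatureHex, secret)) {
    return 401;
  }
  const event = JSON.parse(rawBody) as RemoveProfileEvent;
  if (event.actionKey === "remove_profile") {
    removeProfile(event.contentId);
  }
  return 200;
}
```

In a Next.js route handler, the raw body would typically come from await request.text() and the signature from a request header. The signature must be checked against the unparsed body, because re-serializing the JSON can change the bytes.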
Using Our Review Queue
With everything set up, we can start using our review queue to manage reported profiles.
Upon opening the review queue, we’ll see reports submitted by users on our Wizard Dating app. We’ll review each report to decide whether to remove the profile or keep it.
Focusing on the Worst Offenders
To prioritize profiles that have also been flagged by our model analysis, we can apply a queue filter.
- Open the filter and select the labels you want to focus on, such as UNSAFE.
Tip: You can also click on the labels in the chart to apply the filter.
- After setting the filter, you’ll see only the items labeled UNSAFE.
Removing a Profile
- Click on a queue item to open its detail view, where you can see the profile content and metadata.
- Review the flags and activity history, including when the profile was submitted, reported, and the reasons provided by users.
- If the profile violates guidelines, click the “Remove Profile” action. This will trigger the webhook to remove the profile from the application and resolve the item in the queue.
Keeping a Profile
- Reset the filter to view all remaining items in the queue.
- Select a profile that has been reported but not flagged by the models.
Example: “I’m actually a muggle but I’m looking for something magical.”
- If the profile appears appropriate, click the “Resolve” button to remove it from the queue without taking further action.
- Repeat the process for other profiles as needed.
All Done!
The review queue is now empty, and we’ve successfully handled all reported profiles.
Accomplishments:
- Enabled users to report profiles
- Set up a review queue to manage reports
- Implemented a webhook to remove profiles from the application
Next Steps
- Invite moderators to your queue: Expand your moderation team to handle more reports efficiently.
- Use the data to train a model: Enhance your models to better recognize inappropriate profiles.
- Implement an automated policy: Automatically remove profiles that receive multiple flags to streamline moderation.
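As one possible shape for the automated policy mentioned above, the sketch below selects profiles that have been reported by at least a given number of distinct users. The Report type, the default threshold of 3, and the decision to count each reporter only once are all assumptions chosen for illustration.

```typescript
// Hypothetical record of a single user report.
interface Report {
  profileId: string;
  reporterId: string;
}

// Return the IDs of profiles reported by at least `threshold` distinct users.
function profilesToAutoRemove(reports: Report[], threshold = 3): string[] {
  const reportersByProfile = new Map<string, Set<string>>();
  for (const { profileId, reporterId } of reports) {
    const reporters = reportersByProfile.get(profileId) ?? new Set<string>();
    reporters.add(reporterId); // a Set ignores repeat reports from the same user
    reportersByProfile.set(profileId, reporters);
  }
  return Array.from(reportersByProfile.entries())
    .filter(([, reporters]) => reporters.size >= threshold)
    .map(([profileId]) => profileId);
}
```

Counting distinct reporters rather than raw reports keeps one persistent user from triggering a removal alone.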