Creating Rego Check Plugins for Opsbox
Rego check plugins are vital for gathering data from a provider plugin, applying a defined policy, and formatting the raw result into a readable format. Let's explore how to create one!
Before You Start
If you haven't read the Plugin Basics and Providers documents, please take a moment to do so. They provide essential information for successfully gathering data, including:
- Setting up a proper hookimpl marker to register hook implementations
- Gathering required arguments for various plugins
- Information on setting up your plugin info document
- Details on providers to help you get started
While Rego checks don't typically use activate or configuration models, feel free to include them if needed!
Requirements
Rego check plugins require three things:
- A manifest file detailing the location of the Rego file, the provider to gather data from, and the meaning of the check results.
- A Rego file with a policy that returns meaningful results from the provider data.
- A Python class with a hook implementation that converts the Rego check results into a human-readable format.
Manifest File
In addition to the [info] section of your manifest TOML file (see development basics), you also need to include a [rego] header:
[info]
name = "Plugin Name"
module = "plugin_module"
class_name = "PluginClassInModule"
type = "rego"
uses = ["rego"]
[rego]
rego_file = "path/to/regofile.rego"
description = "Description of policy results"
- type must be "rego".
- uses list must contain "rego".
- rego_file points to the Rego file.
- description describes the Rego check itself.
These fields tell Opsbox what it needs to know to accurately manage Rego checks.
Defining Hook Implementations
Rego checks implement the hooks from RegoSpec.
The key method to implement is report_findings(self, data: "Result") -> "Result". This hook takes in a Result with rego-processed details from the specified rego check, formats it into an LLM-usable text format, and returns the result.
If you want user-settable parameters, you must also implement the inject_data(self, data: "Result") -> "Result" method. This method intercepts data from the provider and modifies its Result object.
Note
More info about the result object can be found here.
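Put together, the two hooks have roughly this shape. The sketch below is not taken from the Opsbox source: `Result` is a stand-in dataclass carrying only the fields used elsewhere in this document, `SkeletonCheck`, `max_items`, and the `findings` key are hypothetical, and the `@hookimpl` decorator (covered in Plugin Basics) is omitted so the snippet runs on its own.

```python
from dataclasses import dataclass, field


@dataclass
class Result:
    """Stand-in for the Opsbox Result object (assumed fields only)."""

    relates_to: str
    result_name: str
    result_description: str
    details: dict = field(default_factory=dict)
    formatted: str = ""


class SkeletonCheck:
    """Hypothetical Rego check showing the shape of both hooks."""

    def inject_data(self, data: Result) -> Result:
        # Runs after the provider and before the Rego policy:
        # add values the policy can read as input.<key>.
        data.details.setdefault("input", {})["max_items"] = 10
        return data

    def report_findings(self, data: Result) -> Result:
        # Runs after the Rego policy: turn its details into readable text.
        findings = data.details.get("findings", [])
        text = "\n".join(f"- {item}" for item in findings) or "No findings."
        return Result(
            relates_to=data.relates_to,
            result_name=data.result_name,
            result_description=data.result_description,
            details=data.details,
            formatted=text,
        )
```

Only report_findings is needed for a basic check; inject_data matters once the policy depends on user-supplied values.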
What to include in your pyproject file
One key thing to note when building installable distributions is that your Rego files must be included in your package metadata, in addition to adding an entry point in your pyproject.toml file for the module and class of your plugin (see development basics for more!).
Set the following section in your pyproject.toml file to include both your manifest and Rego files:
[tool.setuptools.package-data]
"plugin_name" = ["manifest.toml", "*.rego"] # include "*.rego" if you are making a rego plugin.
Example non-parameterized rego code
To get a feel for how this works, let's take a look at our route53 check, empty_zones.
Input Data
This check uses the route53 provider, which generates input data that looks like this:
{
    "hosted_zones": [
        {
            "id": "/hostedzone/7ZSTMBU7PTBT23B",
            "name": "example2.com.",
            "record_count": 0,
            "private_zone": false
        },
        ...
    ],
    "records": [
        {
            "zone_id": "/hostedzone/7Z4TMBU7PTBT23B",
            "name": "api.example.com.",
            "type": "CNAME",
            "ttl": 300,
            "records": [
                {
                    "Value": "www.example.com."
                }
            ]
        },
        ...
    ],
    "health_checks": [
        {
            "id": "ee156676-1aef-4696-94c6-f85ff628abae",
            "type": "HTTP",
            "ip_address": "192.0.2.1",
            "port": 80,
            "resource_path": "/",
            "failure_threshold": 3
        }
    ]
}
Note
Develop your rego code against the results the provider actually returns in your environment.
Your setup may produce different parameters than our testing data.
Rego Policy
Here's what the rego policy for this check looks like:
package aws_rego.r53_checks.empty_zones.empty_zones
import rego.v1
# Find hosted zones with no records
empty_hosted_zones contains zone if {
    some zone in input.hosted_zones
    not records_exist_for_zone(zone.id)
}
# Check if records exist for a given zone
records_exist_for_zone(zone_id) if {
    some record in input.records
    record.zone_id == zone_id
}
allow if {
    some zone in input.hosted_zones
    not records_exist_for_zone(zone.id)
}
# Generate details for empty hosted zones
details := {"empty_hosted_zones": [zone | zone := empty_hosted_zones[_]]}
We can see that we:
- Use input.<key> to gather information retrieved from the provider
- Return a details dictionary that contains the results of the rego evaluation.
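When working on a policy it can help to prototype the same logic in plain Python and compare the two outputs. The following is a rough equivalent of the policy above, written only as a cross-check: `find_empty_hosted_zones` is a hypothetical name, the sample zones are made-up data, and nothing here is part of Opsbox or OPA.

```python
def find_empty_hosted_zones(provider_input: dict) -> dict:
    """Mirror the Rego policy: keep zones whose id never appears in records."""
    zone_ids_with_records = {
        record["zone_id"] for record in provider_input.get("records", [])
    }
    empty = [
        zone
        for zone in provider_input.get("hosted_zones", [])
        if zone["id"] not in zone_ids_with_records
    ]
    return {"empty_hosted_zones": empty}


# Made-up input in the same shape as the route53 provider data.
sample_input = {
    "hosted_zones": [
        {"id": "/hostedzone/A", "name": "empty.example.", "record_count": 0, "private_zone": False},
        {"id": "/hostedzone/B", "name": "used.example.", "record_count": 1, "private_zone": False},
    ],
    "records": [
        {"zone_id": "/hostedzone/B", "name": "api.used.example.", "type": "CNAME", "ttl": 300},
    ],
}

details = find_empty_hosted_zones(sample_input)
print([zone["name"] for zone in details["empty_hosted_zones"]])  # ['empty.example.']
```

Note that the Python version keeps the input order, while the Rego rule builds a set, so the order of zones may differ between the two.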
Designing rego code
It is important to note that there are a series of "best practices" for rego.
These can mostly be found by simply using Regal to lint your rego code.
Be careful what you ignore: some versions of OPA might be more or less strict on conformity to guidelines!
Example output data
When we take our rego check and run it straight against our testing data, we get this:
{
    "details": {
        "empty_hosted_zones": [
            {
                "id": "/hostedzone/7ZSTMBU7PTBT23B",
                "name": "example2.com.",
                "private_zone": false,
                "record_count": 0
            }
        ]
    }
}
This is the data we will use to generate a nice, formatted version of our result, and it is what normally goes into our details attribute (minus the details key, of course).
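One way to reproduce this kind of output locally is the `opa eval` command-line tool. The sketch below only assembles the command and, optionally, runs it. It assumes the `opa` binary is on your PATH and that the policy and input were saved as `empty_zones.rego` and `input.json` (both file names are placeholders); the helper names are hypothetical, and Opsbox itself may evaluate policies differently.

```python
import json
import subprocess

PACKAGE = "aws_rego.r53_checks.empty_zones.empty_zones"


def build_opa_eval_command(policy_path: str, input_path: str, package: str = PACKAGE) -> list[str]:
    """Build an `opa eval` invocation that queries the policy's details rule."""
    return [
        "opa", "eval",
        "--format", "json",
        "--data", policy_path,
        "--input", input_path,
        f"data.{package}.details",
    ]


def run_policy(policy_path: str, input_path: str) -> dict:
    """Run the policy and return the value of `details` (requires the opa binary)."""
    command = build_opa_eval_command(policy_path, input_path)
    completed = subprocess.run(command, capture_output=True, text=True, check=True)
    output = json.loads(completed.stdout)
    # `opa eval --format json` nests the value under result[0].expressions[0].value
    return output["result"][0]["expressions"][0]["value"]


if __name__ == "__main__":
    print(" ".join(build_opa_eval_command("empty_zones.rego", "input.json")))
```

Because the query targets the details rule directly, run_policy returns the inner dictionary, that is, the content under the details key.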
Example formatting
Here is how we format the rego check's result into a readable format:
class EmptyZones:
    """Plugin for identifying Route 53 hosted zones with no DNS records."""
    def report_findings(self, data: "Result"):
        """Report the findings of the plugin.
        Attributes:
            data (Result): The result of the check.
        Returns:
            Result: The result containing the findings.
        """
        details = data.details
        logger.info(f"Details: {details}")
        # Directly get empty hosted zones from the Rego result
        empty_zones = details.get("empty_hosted_zones", [])
        # Format the empty zones list into YAML for better readability
        try:
            empty_zones_yaml = yaml.dump(empty_zones, default_flow_style=False)
        except Exception as e:
            logger.error(f"Error formatting empty hosted zones: {e}")
            empty_zones_yaml = ""
        # Template for the output message
        template = """The following Route 53 hosted zones have no DNS records:
{empty_zones}"""
        logger.info(empty_zones_yaml)
        # Generate the result with formatted output
        if empty_zones:
            return Result(
                relates_to="r53",
                result_name="route53_empty_zones",
                result_description="Route 53 Hosted Zones with No Records",
                details=data.details,
                formatted=template.format(empty_zones=empty_zones_yaml),
            )
        else:
            return Result(
                relates_to="r53",
                result_name="route53_empty_zones",
                result_description="Route 53 Hosted Zones with No Records",
                details=data.details,
                formatted="No empty Route 53 hosted zones found.",
            )
You can see we:
- Take in a rego check's Result object and mainly use its details attribute to format and build a result
- Return a new Result object with the formatted template.
From here, your check is complete!
If you want to package it for wider distribution from PyPI, start with reading this portion of development basics, then read this portion again after initializing your project in the format from development basics.
Parameterizing your rego code (Advanced, Optional)
You can add parameters to your rego code by implementing the inject_data(self, data: "Result") -> "Result" method.
This method intercepts the data after the provider returns it but before it is sent to the rego check.
The sole argument to this method is the provider data Result object. You modify the Result.details attribute of this object to add parameters, which the rego code then accesses with the input.<key> syntax, like all other input data.
This allows for a bunch of fun, interesting dynamic attributes to be added to your rego checks to make them more powerful.
One of the more common things to do with this input injection method is to parameterize your rego functions by injecting user-specified arguments into the input object.
Defining the arguments
The first step toward accessible arguments is to add a config model that is returned from grab_config, alongside a matching set_data setter.
Let's take a look at another plugin, this time overdue_api_keys from the AWS IAM checks plugin, to see how it does it:
from datetime import datetime, timedelta
from typing import Annotated
from pydantic import BaseModel, Field
class OverdueAPIKeysConfig(BaseModel):
    iam_overdue_key_date_threshold: Annotated[
        datetime,
        Field(
            default=(datetime.now() - timedelta(days=90)),
            description="How long ago a key was last used for it to be considered overdue. Default is 90 days.",
        ),
    ]
class OverdueAPIKeysIAM:
    """Plugin for identifying IAM keys that are overdue for rotation."""
    @hookimpl
    def grab_config(self) -> type[BaseModel]:
        """Return the plugin's configuration pydantic model.
        These should be things your plugin needs/wants to function."""
        return OverdueAPIKeysConfig
    @hookimpl
    def set_data(self, model: BaseModel) -> None:
        """Set the data for the plugin based on the model.
        Args:
            model (BaseModel): The model containing the data for the plugin."""
        # model is an instance of the config class, so model_dump() is available
        self.conf = model.model_dump()
    ...
This is very much like any other plugin in how it defines its data.
Tip
For more info on how to specify a configuration for a plugin that's stored, check out here
Inject into input
Next, let's see its inject_data function:
class OverdueAPIKeysIAM:
    ...
    @hookimpl
    def inject_data(self, data: "Result") -> "Result":
        """Inject data into the plugin.
        Args:
            data (Result): The data to inject into the plugin.
        Returns:
            Result: The data with the injected values.
        """
        timestamp = int(self.conf["iam_overdue_key_date_threshold"].timestamp() * 1e9)
        data.details["input"]["iam_overdue_key_date_threshold"] = timestamp
        return data
You can see we:
- Grab the argument and format it for input
- Inject this data into the data.details["input"] dictionary to be recognized as an input
Tip
Normally for data injection you want to inject into the same data.details["input"] dictionary. This allows you to specify input.<key> in the rego file, which is the default way to access this data in OPA.
This is the basic formula you'll want to follow for writing parameterized rego checks.
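To make the data flow concrete, here is a self-contained sketch of the same injection with a fixed threshold. The `Result` dataclass is a stand-in with only the field used here, `ThresholdCheck` and the `credentials` key are made up for illustration, and `@hookimpl` is omitted so the snippet runs without Opsbox installed.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Result:
    """Stand-in for the Opsbox Result object (only the field used here)."""

    details: dict = field(default_factory=dict)


class ThresholdCheck:
    """Hypothetical check that injects a user-set threshold."""

    def __init__(self, threshold: datetime) -> None:
        # In a real plugin this would be filled in by set_data from the config model.
        self.conf = {"iam_overdue_key_date_threshold": threshold}

    def inject_data(self, data: Result) -> Result:
        # Convert to nanoseconds since the epoch, as in the example above.
        timestamp = int(self.conf["iam_overdue_key_date_threshold"].timestamp() * 1e9)
        data.details["input"]["iam_overdue_key_date_threshold"] = timestamp
        return data


provider_result = Result(details={"input": {"credentials": []}})  # made-up provider data
check = ThresholdCheck(datetime(2024, 1, 1, tzinfo=timezone.utc))
injected = check.inject_data(provider_result)
print(injected.details["input"])
```

After injection, a policy could read the value as input.iam_overdue_key_date_threshold. The nanosecond unit is presumably chosen because OPA time built-ins such as time.parse_rfc3339_ns also work in nanoseconds, which keeps comparisons inside the policy simple.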
By following these guidelines, you can create effective Rego check plugins for Opsbox, making your cloud infrastructure management more insightful and automated. Happy coding! 🚀