Gate: PII Detection
This plugin supports all intercept modes (request, response, request_response).
This plugin is not compatible with the AWS Lambda Authorizer deployment type.
This plugin detects and redacts personally identifiable information (PII) contained in an HTTP request or response. Detection combines named entity recognition with regular expressions.
Configuring Gate
- Environment variables
- HCL
- JSON
- TOML
- YAML
GATE_PLUGINS_<PLUGIN NUMBER>_TYPE=anonymizer
GATE_PLUGINS_<PLUGIN NUMBER>_PARAMETERS_ANONYMIZERS_DEFAULT_TYPE=<action>
GATE_PLUGINS_<PLUGIN NUMBER>_PARAMETERS_ANONYMIZERS_PHONE_NUMBER_TYPE=<action>
In the environment variables configuration, <PLUGIN NUMBER> defines the plugin execution order.
gate = {
    plugins = [
        // ...
        {
            type       = "anonymizer"
            parameters = {
                anonymizers = {
                    DEFAULT = {
                        type = "keep"
                    }
                    PHONE_NUMBER = {
                        type = "mask"
                        masking_char = "*"
                        chars_to_mask = 4
                        from_end = true
                    }
                }
            }
        }
        // ...
    ]
}
{
  "gate": {
    "plugins": [
      // ...
      {
        "type": "anonymizer",
        "parameters": {
            "anonymizers": {
                "DEFAULT": {
                    "type": "keep"
                },
                "PHONE_NUMBER": {
                    "type": "mask",
                    "masking_char": "*",
                    "chars_to_mask": 4,
                    "from_end": true
                }
            }
        }
      }
      // ...
    ]
  }
}
[[gate.plugins]]
type = "anonymizer"
parameters.anonymizers.DEFAULT.type = "keep"
parameters.anonymizers.PHONE_NUMBER.type = "mask"
parameters.anonymizers.PHONE_NUMBER.masking_char = "*"
parameters.anonymizers.PHONE_NUMBER.chars_to_mask = 4
parameters.anonymizers.PHONE_NUMBER.from_end = true
gate:
  plugins:
    # ...
    - type: anonymizer
      parameters:
        anonymizers: |
          DEFAULT:
            type: keep
          PHONE_NUMBER:  # (123) 456-7890 ➔ (123) 456-****
            type: mask
            masking_char: "*"
            chars_to_mask: 4
            from_end: true
    # ...
Gate offers a rich set of options to customize detection behavior. In particular, for each PII type you can:
- customize the detection with regular expressions to increase accuracy
- choose the action to apply: replace, mask, hash (tokenize), encrypt, or simply keep the value and log the presence of sensitive data
Currently we support:
- email addresses (EMAIL_ADDRESS)
- phone numbers (PHONE_NUMBER)
- URLs (URL)
- credit card numbers (CREDIT_CARD)
- IBANs (IBAN)
- US Social Security numbers (US_SSN)
- UK NHS numbers (UK_NHS)
- US ITINs (ITIN)
- US driver's licenses (US_DRIVER_LICENSE)
Once sensitive data is detected, one of the following actions can be applied:
- replace
- mask
- hash
- encrypt
- keep
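To make the mask action concrete, here is a minimal Python sketch of what the masking_char, chars_to_mask and from_end parameters from the configuration examples appear to do. The function name mask and the choice to count every character (including punctuation and spaces) are assumptions made for illustration; this is not Gate's implementation.

```python
def mask(value: str, masking_char: str = "*", chars_to_mask: int = 4,
         from_end: bool = True) -> str:
    """Replace chars_to_mask characters of value with masking_char.

    Hypothetical helper: counts raw characters, so punctuation and
    spaces are masked like any other character.
    """
    # Never mask more characters than the value contains.
    n = max(0, min(chars_to_mask, len(value)))
    if n == 0:
        return value
    if from_end:
        # Keep the head, mask the last n characters.
        return value[:len(value) - n] + masking_char * n
    # Mask the first n characters, keep the tail.
    return masking_char * n + value[n:]


print(mask("(123) 456-7890"))                  # (123) 456-****
print(mask("(123) 456-7890", from_end=False))  # ****) 456-7890
```

With from_end set to true, the sketch reproduces the phone number example used in the configuration comments.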
PII anonymization policies
By default, Gate runs in keep mode, meaning PII is detected but not modified.
As an example, the following configuration keeps (monitors) all PII by default, masks phone numbers, hashes email addresses, encrypts URLs, and replaces credit card numbers.
    - id: pii_anonymizer
      type: anonymizer
      enabled: false
      parameters:
        anonymizers: |
          DEFAULT:
            type: keep
          PHONE_NUMBER:  # (123) 456-7890 ➔ (123) 456-****
            type: mask
            masking_char: "*"
            chars_to_mask: 4
            from_end: true
          EMAIL_ADDRESS:  # john.smith@example.com ➔ 8e621e3d0368631d263d07a351fa8d34fba0d17c15fbcdec11a5f58008d022a0
            type: hash
            hash_type: sha256  # either sha256 (default), sha512 or md5
          URL:  # https://john.smith.me ➔ KshULhOqJrLmuHOsQ/ArGHQK8Wrjg0BKdypb/77PYf+64v/FqB3zufbVlnGD4sn4
            type: encrypt  # Replaces value with Base64-encoded AES-CBC encrypted value with PKCS#7 padding and a prepended IV
            key: abcdefghijklmnop  # AES key, must be 128, 192 or 256 bits
          CREDIT_CARD:  # 4111111111111111 ➔ <CREDIT_CARD>
            type: replace
            new_value: <CREDIT_CARD>  # Defaults to <TYPE>
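The hash and replace actions in the policy above can be sketched in a few lines of Python. The helper names hash_value and replace_value are hypothetical, and the UTF-8 encoding and hexadecimal output are assumptions based on the 64-character digest shown in the example comment; Gate may normalize or encode values differently. The encrypt action is left out because it needs a third-party AES library.

```python
import hashlib
from typing import Optional

# Hash algorithms listed in the policy comment above.
SUPPORTED_HASHES = {"sha256", "sha512", "md5"}


def hash_value(value: str, hash_type: str = "sha256") -> str:
    """Return the hex digest of value (assumes UTF-8 input encoding)."""
    if hash_type not in SUPPORTED_HASHES:
        raise ValueError(f"unsupported hash_type: {hash_type}")
    return hashlib.new(hash_type, value.encode("utf-8")).hexdigest()


def replace_value(entity_type: str, new_value: Optional[str] = None) -> str:
    """Return new_value, or <ENTITY_TYPE> when no new_value is configured."""
    return new_value if new_value is not None else f"<{entity_type}>"


print(len(hash_value("john.smith@example.com")))  # 64 hex characters for sha256
print(replace_value("CREDIT_CARD"))               # <CREDIT_CARD>
```

Hashing is deterministic, so the same email address always maps to the same token, which keeps records joinable without exposing the original value.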
Custom detections
You can specify your own entity types using one or more regex patterns, as shown below, to improve the detection accuracy.
    - id: pii_anonymizer
      type: anonymizer
      enabled: false
      parameters:
        analyzer_ad_hoc_recognizers: |
          - name: Zipcode regex
            supported_language: en
            supported_entity: ZIP
            patterns:
            - name: zipcode
              regex: "(\\b\\d{5}(?:\\-\\d{4})?\\b)"
              score: 0.01
In this example we are specifying the following data:
- supported_language: the language this regex should be applied to
- supported_entity: the entity type the regex detects (ZIP in this example)
- patterns: one or more patterns, each with a name, a regex, and a score that represents the confidence in the PII detection.
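Before deploying a custom pattern, it can help to check what the regex actually matches. The sketch below tests the zip code pattern from the example with Python's re module; the function name find_zip_codes is hypothetical. It assumes the doubled backslashes in the configuration are string escaping, so the effective pattern uses single backslashes, and that Gate's regex engine treats \b and \d the same way Python does.

```python
import re

# Effective pattern from the example configuration, written as a raw string.
ZIP_PATTERN = re.compile(r"(\b\d{5}(?:\-\d{4})?\b)")


def find_zip_codes(text: str) -> list:
    """Return every 5-digit or ZIP+4 code found in text."""
    return [match.group(1) for match in ZIP_PATTERN.finditer(text)]


print(find_zip_codes("Ship to 94105-1234 or 10001."))  # ['94105-1234', '10001']
print(find_zip_codes("Order 123456 is ready"))         # []
```

The word boundaries stop the pattern from matching inside longer digit runs, but any standalone 5-digit number still matches, which is consistent with the low score assigned to the pattern in the example.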