Payload Keywords

Payload keywords inspect the content of a packet or a specific buffer.

content

The content keyword is the basis for all signatures. Its value must be a string between double quotes:

content: "lorem ipsum";

It is possible to use several contents in a signature.

Contents are matched on bytes. Printable characters can be written directly in the content keyword. For non-printable characters, use their hexadecimal notation between pipes.

For example:

content: "GET /|69 6E 64 65 78 2E 68 74 6D 6C| HTTP/1.0";
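To make the pipe notation concrete, the following Python sketch decodes a content value into the raw bytes it stands for. The helper name parse_content is hypothetical and not part of cognitix Threat Defender; it only models the notation described above.

```python
def parse_content(value: str) -> bytes:
    """Decode a content value: text outside pipes is literal, text inside pipes is hex."""
    # Splitting on "|" alternates between literal parts (even index)
    # and hexadecimal parts (odd index).
    parts = value.split("|")
    if len(parts) % 2 == 0:
        raise ValueError("unbalanced pipes in content value")
    out = bytearray()
    for index, part in enumerate(parts):
        if index % 2 == 0:
            out += part.encode("latin-1")
        else:
            out += bytes.fromhex(part)  # whitespace between hex pairs is ignored
    return bytes(out)


decoded = parse_content("GET /|69 6E 64 65 78 2E 68 74 6D 6C| HTTP/1.0")
print(decoded)
```

The hexadecimal part of the example spells index.html, so the signature matches the same bytes as a plain "GET /index.html HTTP/1.0".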

Note that there are reserved characters you cannot use in the content keyword because they are meaningful in the signature. To match on these characters, you have to use hexadecimal notation. It is a convention to write the hexadecimal notation in upper case characters. Reserved characters are:

Character	Hexadecimal Notation
"	|22|
;	|3B|
:	|3A|
|	|7C|

Refer to the ASCII table for a comprehensive list of characters and their hexadecimal values.
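As an illustration, a literal string can be converted into a valid content value by substituting the reserved characters listed above. This is a minimal sketch; escape_content is a hypothetical helper name, not a function of the product.

```python
# Reserved characters and their hexadecimal notation, taken from the table above.
RESERVED = {'"': "|22|", ";": "|3B|", ":": "|3A|", "|": "|7C|"}


def escape_content(text: str) -> str:
    """Replace each reserved character with its hexadecimal notation."""
    return "".join(RESERVED.get(ch, ch) for ch in text)


print(escape_content("User-Agent: x;y"))
```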

Furthermore, you can negate a content match by placing ! before the opening quote.

For example:

alert http $HOME_NET any -> $EXTERNAL_NET any (msg:"Outdated Firefox on Windows"; \
content:"User-Agent|3A| Mozilla/5.0 |28|Windows|3B| "; \
content:"Firefox/3."; distance:0; content:!"Firefox/3.6.13"; \
distance:-10; sid:9000000; rev:1;)

Here, content:!"Firefox/3.6.13"; means that an alert is generated if the Firefox version in use is not 3.6.13.

By default, the pattern matching is case sensitive.

nocase

If you do not want to make a distinction between uppercase and lowercase characters, you can use the nocase content modifier.

Place it after the content you want to modify, for example:

content: "abc"; nocase;

depth

The depth content modifier comes with a mandatory numeric value, for example:

depth:12;

The number after depth designates how many bytes from the beginning of the payload will be checked.

offset

The offset keyword designates the byte position in the payload from which a match is searched. For example, offset:3; skips the first three bytes and checks from the fourth byte onward.

The keywords offset and depth can be combined and are often used together.

For example:

content:"def"; offset:3; depth:3;

In this example, the first three bytes of the payload are skipped and the following three bytes (the fourth through the sixth) are checked.
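The interaction of offset and depth can be modeled with a slice. The sketch below assumes that depth is counted from offset, which is what the example above implies; content_match is a hypothetical name used only for illustration.

```python
def content_match(payload: bytes, pattern: bytes, offset: int = 0, depth=None) -> bool:
    """Search for pattern in the window selected by offset and depth."""
    # Assumption: depth is counted from offset, so the window is
    # payload[offset : offset + depth].
    end = len(payload) if depth is None else offset + depth
    return pattern in payload[offset:end]


payload = b"abcdefghij"
print(content_match(payload, b"def", offset=3, depth=3))  # window is b"def"
print(content_match(payload, b"def", offset=4, depth=3))  # window is b"efg"
```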

distance

The distance keyword is a relative content modifier. This means it indicates a relation between the current content keyword and the content preceding it. distance takes effect after the preceding match.

The distance keyword comes with a mandatory signed value. This value determines the byte from which the payload will be checked for a match relative to the previous match.

distance determines where cognitix Threat Defender will start looking for a pattern. For example, distance:5; means the pattern can be anywhere after the previous match plus 5 bytes. To limit how far after the last match cognitix Threat Defender needs to look, use within.

within

The within keyword is relative to the preceding match. The keyword within comes with a mandatory positive value. Using within makes sure there will only be a match if the content matches the payload within the set number of bytes.
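The following sketch models how distance and within narrow the search window after a previous match. It assumes Suricata-style semantics in which the within window starts where distance ends and the whole pattern must fit inside it; the function name and parameters are hypothetical.

```python
def relative_match(payload: bytes, pattern: bytes, prev_end: int,
                   distance: int = 0, within=None):
    """Return the end position of the first relative match, or None."""
    # The search starts `distance` bytes after the end of the previous match.
    start = max(prev_end + distance, 0)
    # Assumption: within limits the window to `within` bytes counted from
    # the start position, and the whole pattern must fit inside it.
    end = len(payload) if within is None else min(start + within, len(payload))
    pos = payload.find(pattern, start, end)
    return None if pos == -1 else pos + len(pattern)


payload = b"abcXXXXXdef"
prev_end = 3  # position right after the previous match on b"abc"
print(relative_match(payload, b"def", prev_end, distance=5))
print(relative_match(payload, b"def", prev_end, distance=0, within=3))
```

In this model, the first call finds the pattern because it begins exactly five bytes after the previous match. The second call finds nothing because only the three bytes directly after the previous match are inspected.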

startswith

The startswith keyword is similar to depth:<length of pattern>;. It takes no arguments and must follow a content keyword. It modifies the content to match exactly at the start of a buffer.

Example:

content:"GET|20|"; startswith;

startswith is a shorthand notation for:

content:"GET|20|"; depth:4; offset:0;

startswith cannot be mixed with depth, offset, within or distance for the same pattern.

endswith

The endswith keyword is similar to isdataat:!1,relative;. It takes no arguments and must follow a content keyword. It modifies the content to match exactly at the end of a buffer.

Example:

content:".php"; endswith;

endswith is a shorthand notation for:

content:".php"; isdataat:!1,relative;

Note

You can combine startswith and endswith to check whether a pattern fills the complete buffer. Example:

http.uri; content:"/index.html"; startswith; endswith;
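A rough Python model of the two anchors is shown below. The function names are hypothetical. The combined check assumes that both modifiers apply to the same single match, so the pattern has to span the whole buffer.

```python
def startswith_match(buffer: bytes, pattern: bytes) -> bool:
    # Models content:"<pattern>"; startswith;
    return buffer.startswith(pattern)


def endswith_match(buffer: bytes, pattern: bytes) -> bool:
    # Models content:"<pattern>"; endswith;
    return buffer.endswith(pattern)


def fills_buffer(buffer: bytes, pattern: bytes) -> bool:
    # Models content:"<pattern>"; startswith; endswith;
    # The one match must begin at the start and end at the end of the buffer.
    return buffer.startswith(pattern) and len(pattern) == len(buffer)


print(fills_buffer(b"/index.html", b"/index.html"))
print(fills_buffer(b"/index.html?x=1", b"/index.html"))
```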

isdataat

The isdataat keyword checks whether there is still data at a specific position of the payload. It takes a positive number (the position), optionally followed by a comma and relative. With relative, the position is counted from the last match instead of from the start of the payload.

The following example illustrates a signature which searches for byte 512 of the payload:

isdataat:512;

The second example illustrates a signature searching for byte 50 after the last match:

isdataat:50, relative;

The option rawbytes is not supported.

bsize

With the bsize keyword, you can match on the length of the buffer to add precision to the content match. Previously this could be done with isdataat.

dsize

With the dsize keyword, you can match on the size of the packet payload. For example, you can use the keyword to look for abnormal sizes of payloads. This may be convenient in detecting buffer overflows.

byte_test

The byte_test keyword extracts bytes_to_convert bytes at offset, converts them to a number, and compares the result with value using operator.

Format:

byte_test:<bytes_to_convert>, [!]<operator>, <value>, <offset> \
         [, relative][, <endian>][, string, <num>][, dce]  \
         [, bitmask <bitmask_value>];

bytes_to_convert

The number of bytes selected from the packet to be converted; between 1 and 8.

value

Value to test the converted value against; between 0 - 4294967295.

offset

Number of bytes into the payload.

operator

! negation, can prefix other operators
< less than
> greater than
= equal to
<= less than or equal to
>= greater than or equal to
& bitwise AND
^ bitwise OR

relative

Offset relative to last content match.

endian

big (most significant byte at lowest address)
little (most significant byte at the highest address)

string

not supported

dce

not supported

bitmask

not supported

Example:

alert tcp any any -> any any (msg:"Matches User-Agent 'genua'"; \
http.header; content:"User-Agent|3A| "; \
byte_test:5, =, 444083369313, 0, relative, big; sid:1; rev:1;)
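The conversion and comparison steps can be sketched in Python as follows. This is a simplified model under stated assumptions: byte_test here is a hypothetical function, only the numeric comparison operators and bitwise AND are modeled, and relative_to stands for the end position of the last content match. Read as a big-endian number, the five ASCII bytes of genua are 0x67656E7561, which is 444083369313.

```python
import operator

# Only a subset of the operators is modeled in this sketch.
OPERATORS = {
    "<": operator.lt,
    ">": operator.gt,
    "=": operator.eq,
    "<=": operator.le,
    ">=": operator.ge,
    "&": lambda a, b: (a & b) != 0,
}


def byte_test(payload: bytes, bytes_to_convert: int, op: str, value: int,
              offset: int, relative_to: int = 0, endian: str = "big") -> bool:
    """Convert bytes at the given position to an integer and compare it with value."""
    negate = op.startswith("!")
    if negate:
        op = op[1:]
    start = relative_to + offset
    chunk = payload[start:start + bytes_to_convert]
    if len(chunk) < bytes_to_convert:
        return False  # not enough data to convert
    number = int.from_bytes(chunk, endian)
    result = OPERATORS[op](number, value)
    return not result if negate else result


header = b"User-Agent: genua\r\n"
after_match = len(b"User-Agent: ")  # detection pointer after the content match
print(int.from_bytes(b"genua", "big"))
print(byte_test(header, 5, "=", 444083369313, 0, relative_to=after_match))
```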

byte_jump

The byte_jump keyword converts the bytes_to_convert bytes found at offset into a number and moves the detection pointer forward by that many bytes. Subsequent content matches are then relative to the new position.

Format:

byte_jump:<bytes_to_convert>, <offset> [, relative][, multiplier <mult_value>] \
  [, <endian>][, string, <number_type>][, align][, from_beginning][, from_end] \
  [, post_offset <adjustment value>][, dce][, bitmask <bitmask_value>];

bytes_to_convert

The number of bytes selected from the packet to be converted; between 1 and 8.

offset

Number of bytes into the payload.

relative

Offset relative to last content match.

multiplier

Multiplies the converted value by mult_value.

endian

big (most significant byte at lowest address)
little (most significant byte at the highest address)

string

not supported

align

Rounds the number up to the next 32-bit boundary.

from_beginning

Jumps forward from the beginning of the packet, instead of where the detection pointer is set.

from_end

not supported

post_offset

After the jump operation has been performed, the detection pointer is moved by the specified number of additional bytes.

dce

not supported

bitmask

not supported

Example:

alert tcp any any -> any any (msg:"Jump on binary newline 10 bytes forward"; \
content:"/index.html HTTP/1.0"; byte_jump:1,1,relative,big; \
content:"genua"; distance:0; sid:1; rev:1;)
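The pointer arithmetic behind this example can be modeled as shown below. The sketch assumes Suricata-style behavior in which the jump starts right after the converted bytes unless from_beginning is set; byte_jump here is a hypothetical Python function, and relative_to stands for the end position of the last content match.

```python
def byte_jump(payload: bytes, bytes_to_convert: int, offset: int,
              relative_to: int = 0, multiplier: int = 1, endian: str = "big",
              align: bool = False, from_beginning: bool = False,
              post_offset: int = 0):
    """Return the new detection pointer, or None if the jump is not possible."""
    start = relative_to + offset
    chunk = payload[start:start + bytes_to_convert]
    if len(chunk) < bytes_to_convert:
        return None
    value = int.from_bytes(chunk, endian) * multiplier
    if align:
        value = (value + 3) // 4 * 4  # round up to a multiple of 4 bytes (32 bits)
    # Assumption: without from_beginning, the jump starts right after
    # the converted bytes.
    base = 0 if from_beginning else start + bytes_to_convert
    target = base + value + post_offset
    return target if 0 <= target <= len(payload) else None


payload = b"GET /index.html HTTP/1.0\r\n" + b"0123456789" + b"genua"
pattern = b"/index.html HTTP/1.0"
match_end = payload.find(pattern) + len(pattern)
pointer = byte_jump(payload, 1, 1, relative_to=match_end)
print(pointer, payload[pointer:])
```

In this model, the offset of 1 skips the carriage return and the converted byte is the newline (value 10). The pointer therefore moves 10 bytes past the newline and lands on genua.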

byte_extract

Note

cognitix Threat Defender does not support this keyword.

pcre (Perl Compatible Regular Expressions)

The pcre keyword matches specifically on regular expressions.

Matching regular expressions causes a lot of processing overhead and is often combined with the content keyword. This way, the regular expression is only run if the content matches first.

Format of pcre:

pcre:"/<regex>/<modifiers>";

In the following example, the signature will match if the payload contains six consecutive numbers:

pcre:"/[0-9]{6}/";
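The idea of guarding an expensive regular expression with a cheap content check can be sketched in Python. Python's re module is not PCRE, but this simple pattern behaves the same way in both. The function name and the sample payloads are illustrative assumptions.

```python
import re

SIX_DIGITS = re.compile(rb"[0-9]{6}")


def signature_matches(payload: bytes, content: bytes) -> bool:
    """Run the regular expression only if the literal content is present."""
    # Cheap literal check first; the regular expression only runs when it succeeds.
    if content not in payload:
        return False
    return SIX_DIGITS.search(payload) is not None


print(signature_matches(b"order=123456", b"order="))
print(signature_matches(b"order=12345", b"order="))
```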

Note

The following characters must be escaped inside the regular expression: ; \ "

Perl/PCRE-compatible Modifiers

The matching behavior and the syntax interpretation of PCREs can be altered by several flags. The ones supported are listed here, with a short description and their internal PCRE name in parentheses.

i (PCRE_CASELESS)
If this modifier is set, letters in the pattern match both upper- and lowercase letters.
m (PCRE_MULTILINE)
By default, PCRE treats the subject string as consisting of a single “line” of characters (even if it actually contains several newlines). The “start of line” metacharacter (^) matches only at the start of the string, while the “end of line” metacharacter ($) matches only at the end of the string, or before a terminating newline (unless the E modifier is set). When this modifier is set:
- ^ additionally matches after every newline character (and the start of the string)
- $ additionally matches before every newline character (and the end of the string)
If there are no \n characters in a subject string, or no occurrences of ^ or $ in a pattern, setting this modifier has no effect.
s (PCRE_DOTALL)
If this modifier is set, a dot metacharacter in the pattern matches all characters, including newlines. Without it, newlines are excluded. A negative class such as [^a] always matches a newline character, independent of the setting of this modifier.
x (PCRE_EXTENDED)
If this modifier is set, whitespace characters in the pattern are ignored except when escaped or inside a character class. In this mode, # also introduces a line comment that extends up to and including the next \n character. # characters that are escaped or inside a character class are treated as normal. Whitespace characters may never appear within special character sequences in a pattern, for example within the sequence (?( which introduces a conditional subpattern.
A (PCRE_ANCHORED)
If this modifier is set, the pattern is forced to be “anchored”, that is, it is constrained to match only at the start of the string which is being searched (the “subject string”). This effect can also be achieved by appropriate constructs (^) in the pattern itself.
E (PCRE_DOLLAR_ENDONLY)
If this modifier is set, a dollar metacharacter in the pattern matches only at the end of the subject string. Without this modifier, a dollar also matches immediately before the final character if it is a newline (but not before any other newlines). This modifier is ignored if the m modifier is set.

Note

This modifier is sometimes represented with the letter D (e.g. PHP).

G (PCRE_UNGREEDY)
This modifier inverts the “greediness” of the quantifiers so that they are not greedy by default, but become greedy if followed by ?. It can also be set by a (?U) modifier setting within the pattern. This can also be achieved by inverting the greediness of every quantifier individually by appending a question mark (e.g. .* -> .*?, .+ -> .+?, x? -> x??).

Note

This modifier is usually represented with the letter U.
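The effect of several of these modifiers can be observed with Python's re module, which offers comparable flags under different names (re.IGNORECASE, re.MULTILINE, re.DOTALL, re.VERBOSE). This is only an analogy for illustration: Python's engine is not PCRE and has no flag corresponding to G, so the ungreedy behavior is shown with the per-quantifier question mark instead.

```python
import re

subject = b"first\nSecond"

# i (PCRE_CASELESS): letters match regardless of case.
caseless = re.search(rb"second", subject, re.IGNORECASE) is not None

# m (PCRE_MULTILINE): ^ also matches right after a newline.
anchored_default = re.search(rb"^Second", subject) is not None
anchored_multiline = re.search(rb"^Second", subject, re.MULTILINE) is not None

# s (PCRE_DOTALL): the dot also matches a newline.
dot_default = re.search(rb"first.Second", subject) is not None
dot_dotall = re.search(rb"first.Second", subject, re.DOTALL) is not None

# x (PCRE_EXTENDED): unescaped whitespace in the pattern is ignored.
extended = re.search(rb"first \n Second", subject, re.VERBOSE) is not None

# Greedy versus ungreedy quantifiers (the default that G inverts).
greedy = re.search(rb"<.*>", b"<a><b>").group()
lazy = re.search(rb"<.*?>", b"<a><b>").group()

print(caseless, anchored_default, anchored_multiline)
print(dot_default, dot_dotall, extended)
print(greedy, lazy)
```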

Custom Modifiers

The following custom modifiers are available:

R: Match relative to the last pattern match. It is similar to distance:0;.
B: Compatibility pcre modifier. This modifier has no effect.
O: not supported

Custom Target Modifiers

There are several custom modifiers available to specify the buffer the pattern should match on. These stem mostly from the time when modifier keywords were used to specify the target buffer, instead of the now prevalent sticky buffer keywords. They should be considered deprecated; prefer sticky buffer keywords.

C: Matches on the same buffer as http.cookie.
D: not supported
H: Matches on the same buffer as http.header.
I: Matches on the same buffer as http.uri.raw.
M: Matches on the same buffer as http.method.
P: Matches on the same buffer as http.request_body.
Q: Matches on the same buffer as http.response_body.
S: Matches on the same buffer as http.stat_code.
U: Synonym of I.
V: Matches on the same buffer as http.user_agent.
W: Matches on the same buffer as http.host.
Y: Matches on the same buffer as http.stat_msg.
Z: Matches on the same buffer as http.host.raw.