Documentation for "expecto"

4. Template Syntax

4.3. More String Flags: s, w and i

In the previous chapter you learned about the string flags "b" (match at the beginning) and "e" (match at the end). This chapter will introduce three more flags that modify the way strings are handled.

The "w" Flag: White Space Compression

Status reports from cron jobs sometimes contain output from shell commands that produce data in columns, such as "df", "ls -l" or "netstat -i". For example, the report of a backup script might contain output like one of the following variants:

Filesystem  1K-blocks  Used   Avail Capacity  Mounted on
/dev/ad5p6   10154158 71970 9269856     1%    /bak-spool

Filesystem  1K-blocks    Used  Avail Capacity  Mounted on
/dev/ad5p6   10154158 8476634 865192    91%    /bak-spool

Filesystem  1K-blocks    Used   Avail Capacity  Mounted on
/dev/ad5p6   10154158 1476464 7865362    16%    /bak-spool

If you look closely, you will notice that the spacing in these variants is different. Depending on the actual size of the numbers, tools like "df" adapt the column widths dynamically. How do you match such a line in your template?

Of course, the simplest approach would be to only match the first word, like "Filesystem"{b}. But that's not a satisfying solution because we want the match to be as exact as possible. This is a typical use case for the "w" string flag.

"Filesystem 1K-blocks Used Avail Capacity Mounted on"{w}

The "w" flag tells expecto to treat all white space as equivalent during comparison: any sequence of spaces and/or tab characters is treated like a single space. Additionally, any white space at the beginning and at the end of the line is ignored completely.
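
The comparison rule of the "w" flag can be illustrated outside of expecto. The following Python sketch is only a model of the behavior described above, not expecto's implementation; the function names squeeze and w_match are made up for this example.

```python
import re

def squeeze(text):
    """Collapse each run of spaces/tabs into one space and drop
    white space at both ends (assumed model of the "w" flag)."""
    return re.sub(r"[ \t]+", " ", text).strip(" ")

def w_match(pattern, line):
    """True if pattern and line are equal after white space compression."""
    return squeeze(pattern) == squeeze(line)

pattern = "Filesystem 1K-blocks Used Avail Capacity Mounted on"
headers = [
    "Filesystem  1K-blocks  Used   Avail Capacity  Mounted on",
    "Filesystem  1K-blocks    Used  Avail Capacity  Mounted on",
    "\tFilesystem 1K-blocks\tUsed Avail Capacity Mounted on  ",
]
for header in headers:
    print(w_match(pattern, header))  # True for each header
```

A plain string comparison would reject all three headers, because each one differs from the pattern in its spacing.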

The "s" Flag: Substring Matching

System status reports often contain excerpts from free-form log files. For example, FreeBSD's "daily run output" contains entries from the kernel's so-called dmesg buffer that have been generated in the past 24 hours. These are very difficult to match because they are free-form. Here are some sample lines:

fxp0: link state changed to UP
ad1: FAILURE - WRITE_DMA48 status=51 error=10 LBA=270282959
GEOM_MIRROR: Request failed (error=5). ad1[WRITE(offset=138384875008, length=16384)]
GEOM_MIRROR: Device gm0: rebuilding provider ad0 finished.
WARNING: /backup was not properly dismounted
pid 1482 (script) is using legacy pty devices - not logging anymore
MCA: CPU 0 COR GCACHE L2 EVICT error
swap_pager_getswapspace(1): failed

Some of those lines are just informative and can be ignored, while others indicate serious situations that require attention. Basically, we would like to get a message if there's any line that contains one of the words "error", "fail" or "warn". Since the template describes the output we consider harmless, that means we have to match the lines that don't contain any of those words.

In order to do that, we use the "s" string flag to match substrings. A string tagged with that flag doesn't have to match the whole line, but it can match a part of it (i.e. a substring). This part can be anywhere in the line: at the beginning, at the end, or anywhere in between.

* "" || !("error"{s} || "fail"{s} || "warn"{s})

Phew! That looks complicated, but it is not that difficult to understand. This template expression matches any number of lines that are either empty or that don't contain any of the words "error", "fail" or "warn". Remember that "!" means "not" and "||" means "or". The asterisk at the beginning indicates that this expression is applied to as many lines as possible.
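
To make the logic of this expression concrete, here is a small Python sketch that models it. It is an illustration under assumptions, not expecto's implementation: the names WORDS, is_harmless and consume are hypothetical, and the leading asterisk is modeled as "consume matching lines until the first line that does not match".

```python
WORDS = ("error", "fail", "warn")

def is_harmless(line):
    """Model of: "" || !("error"{s} || "fail"{s} || "warn"{s})"""
    return line == "" or not any(word in line for word in WORDS)

def consume(lines):
    """Model of the leading "*": count how many lines at the start
    of the input satisfy the expression (assumed greedy behavior)."""
    count = 0
    for line in lines:
        if not is_harmless(line):
            break
        count += 1
    return count

sample = [
    "fxp0: link state changed to UP",
    "GEOM_MIRROR: Device gm0: rebuilding provider ad0 finished.",
    "",
    "swap_pager_getswapspace(1): failed",
]
print(consume(sample))  # 3: the fourth line contains "fail"
```

The first three lines are covered by the expression, so only the last line is left over as unexpected output.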

The "i" Flag: Case-Insensitive Matching

There is one problem with the template expression in the previous section: the log entries might contain the words in various combinations of lower-case and upper-case characters, for example "fail", "Fail" or "FAIL". Of course we could add more "or" parts to the expression, but this would get tedious and lead to very long and inefficient template lines.

Thankfully there is another string flag that can be used in this situation: The "i" flag causes the case to be ignored, i.e. the matching is case-insensitive, like the -i option of the grep command. So, now our template line looks like this:

* "" || !("error"{si} || "fail"{si} || "warn"{si})

As you can see, multiple string flags can be combined inside the curly braces, as long as the combination makes sense. So, {si} or {is} instructs expecto to perform a case-insensitive substring match.
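
The effect of adding "i" to a substring match can be shown with one of the sample log lines. The Python sketch below only models the described behavior; the function contains and its ignore_case parameter are hypothetical names, and lower-casing both strings is an assumed way of ignoring case.

```python
def contains(needle, line, ignore_case=False):
    """Substring test; ignore_case=True models {si}, otherwise {s}."""
    if ignore_case:
        needle, line = needle.lower(), line.lower()
    return needle in line

line = "WARNING: /backup was not properly dismounted"
print(contains("warn", line))                    # False
print(contains("warn", line, ignore_case=True))  # True
```

Without the "i" flag, the upper-case "WARNING" would slip through as a harmless line.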


The following table summarizes the supported string flags:

Flag  Meaning
b     match at beginning of line
e     match at end of line
w     collapse white space
s     match a substring of the line
i     case-insensitive matching
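
As a rough summary, the flags in the table can be modeled by a single Python function. This is a sketch under assumptions, not expecto's code: the name flag_match is hypothetical, a string without "b", "e" or "s" is assumed to require a match of the whole line, and combinations of "b", "e" and "s" with each other are deliberately not modeled.

```python
import re

def flag_match(pattern, flags, line):
    """Hypothetical model of one template string with flags."""
    if set(flags) - set("bewsi"):
        raise ValueError("unknown flag in %r" % flags)
    if len(set(flags) & set("bes")) > 1:
        raise ValueError("combining b, e and s is not modeled here")
    if "w" in flags:  # collapse white space in pattern and line
        pattern = re.sub(r"[ \t]+", " ", pattern).strip(" ")
        line = re.sub(r"[ \t]+", " ", line).strip(" ")
    if "i" in flags:  # ignore case
        pattern, line = pattern.lower(), line.lower()
    if "s" in flags:
        return pattern in line
    if "b" in flags:
        return line.startswith(pattern)
    if "e" in flags:
        return line.endswith(pattern)
    return pattern == line  # no position flag: whole line must match

print(flag_match("Filesystem", "b", "Filesystem  1K-blocks    Used"))  # True
print(flag_match("mounted on", "ewi", "Capacity   Mounted on  "))     # True
print(flag_match("fail", "si", "ad1: FAILURE - WRITE_DMA48"))         # True
```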
