Documentation for "expecto"


4. Template Syntax

4.4. Regular Expressions

IMPORTANT: If you're not familiar with regular expressions yet, please stop reading here. Use the internet search engine of your choice to look for a "regular expression tutorial" (there are a lot of them). The concept of regular expressions is essential for using expecto, so it is important that you know it well. As an exercise, play with the "egrep" command. Once you have made yourself familiar with regular expressions, return here and continue reading.

The expecto utility uses "extended regular expressions" (sometimes called "modern regular expressions"), i.e. the kind of expressions that support "?", "+" and "|" (basic regular expressions don't support those). To be more exact, expecto uses Python's regular expression library, which for practical purposes implements a superset of extended regular expressions (one notable exception: Python does not support POSIX bracket classes such as [[:digit:]]). However, for most purposes it is sufficient to know that expecto supports the same expressions that can be used with the egrep utility (same as "grep -E").
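Because the matching is done by Python's regular expression library, the three operators mentioned above can be tried out directly in a Python interpreter. The following is only a small sketch; the patterns and sample strings are made up for illustration and are not taken from expecto itself.

```python
import re

optional = re.compile(r"backups?")         # "?": the final "s" may be missing
repeated = re.compile(r"[0-9]+ files")     # "+": one or more digits
either = re.compile(r"Full|Incremental")   # "|": one of two alternatives

samples = [
    "one backup",
    "two backups",
    "copied 42 files",
    "copied no files",
    "Incremental run",
]

# For every sample, list the patterns that match somewhere in the text.
for text in samples:
    hits = [
        name
        for name, rx in (
            ("optional", optional),
            ("repeated", repeated),
            ("either", either),
        )
        if rx.search(text)
    ]
    print(f"{text!r}: {hits}")
```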

Regular expressions are enclosed in slashes. Basically, they can be used everywhere a string can be used. That means you can also combine them with logical operators. If you have to use a literal slash character inside a regular expression, you have to double it.
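To illustrate the slash rule, here is a rough sketch of how a slash-enclosed expression relates to the pattern that finally reaches the regular expression library. The helper body_of_literal is hypothetical and exists only for this example; how expecto actually parses its templates is not shown here.

```python
import re


def body_of_literal(literal: str) -> str:
    """Hypothetical helper: strip the enclosing slashes from a regular
    expression literal and turn every doubled slash into a single one."""
    if len(literal) < 2 or not (literal.startswith("/") and literal.endswith("/")):
        raise ValueError("a regular expression must be enclosed in slashes")
    return literal[1:-1].replace("//", "/")


# The literal slash in "/var" has to be written as "//" inside the template.
pattern = body_of_literal("/^Backup //var: /")
print(pattern)
print(bool(re.search(pattern, "Backup /var: Full backup (level 0)")))
```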

It is important to know that regular expressions are not anchored by default. That is, they match a substring of the line, just like the grep command. If you want to anchor the expression at the beginning or end of the input line, you have to use the "^" or "$" characters, respectively.
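The effect of anchoring can be reproduced with Python's re.search function, which also looks for a match anywhere in the string. This is a sketch under the assumption that expecto's unanchored matching corresponds to re.search; the sample line is taken from the tar output discussed in this section.

```python
import re

line = "tar: Error exit delayed from previous errors."

unanchored = re.search(r"Error", line)       # may match anywhere in the line
at_start_miss = re.search(r"^Error", line)   # line does not begin with "Error"
at_start_hit = re.search(r"^tar:", line)     # line begins with "tar:"
at_end_hit = re.search(r"errors\.$", line)   # line ends with "errors."

for name, result in [
    ("Error", unanchored),
    ("^Error", at_start_miss),
    ("^tar:", at_start_hit),
    ("errors\\.$", at_end_hit),
]:
    print(f"{name:12} {'match' if result else 'no match'}")
```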

Now let's look at some examples. Assume we have a cron job that performs a backup of the /var file system. We get the following output on two different days:

    Backup /var: Full backup (level 0)
    tar: log/messages.log: Truncated write; file may have grown while being archived.
    tar: log/auth.log: Truncated write; file may have grown while being archived.
    Backup finished.

    Backup /var: Incremental backup (level 1)
    tar: spool/mqueue/d9Tn68oS021471: Cannot stat: No such file or directory
    tar: spool/mqueue/q9Tn68oS021471: Cannot stat: No such file or directory
    tar: Error exit delayed from previous errors.
    Backup finished.

As you can see, the tar command sometimes reports certain errors. The kind of errors shown above should be treated as normal and not cause an email message to be sent, because it is normal that log files can grow during the backup (first case), and it is also normal that sendmail's temporary queue files can disappear during the backup (second case). So we have to match those in our template.

Another detail that we have to pay attention to is the first line. Its contents change depending on the backup level. We have to account for that, too. This is the template that can be used:

    /^Backup //var: (Full|Incremental) backup \(level [01]\)$/
    * /^tar: log//[a-z]+\.log: Truncated write; file may have grown while being archived\.$/
    * /^tar: spool//mqueue//[dq][A-Za-z0-9]+: Cannot stat: No such file or directory$/
    ? "tar: Error exit delayed from previous errors."
    "Backup finished."
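If you want to convince yourself that the three regular expressions in this template really cover the sample output, you can test them in Python. The sketch below checks single lines only and does not emulate expecto's template processing; the doubled slashes are written as single slashes, and the names PATTERNS and classify are made up for this example.

```python
import re

# The three regular expressions of the template, slashes un-doubled.
PATTERNS = {
    "header": re.compile(
        r"^Backup /var: (Full|Incremental) backup \(level [01]\)$"
    ),
    "log grown": re.compile(
        r"^tar: log/[a-z]+\.log: Truncated write; "
        r"file may have grown while being archived\.$"
    ),
    "queue file gone": re.compile(
        r"^tar: spool/mqueue/[dq][A-Za-z0-9]+: "
        r"Cannot stat: No such file or directory$"
    ),
}

sample_lines = [
    "Backup /var: Full backup (level 0)",
    "Backup /var: Incremental backup (level 1)",
    "tar: log/messages.log: Truncated write; file may have grown while being archived.",
    "tar: log/auth.log: Truncated write; file may have grown while being archived.",
    "tar: spool/mqueue/d9Tn68oS021471: Cannot stat: No such file or directory",
    "tar: spool/mqueue/q9Tn68oS021471: Cannot stat: No such file or directory",
]


def classify(line):
    """Return the name of the first pattern that matches, or None."""
    for name, rx in PATTERNS.items():
        if rx.search(line):
            return name
    return None


for line in sample_lines:
    print(f"{classify(line)}: {line}")
```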

Note that it is also possible to write a similar template without regular expressions, using string flags and operators:

    "Backup /var: Full backup (level 0)" || "Backup /var: Incremental backup (level 1)"
    * "tar: log/"{b} && ".log: Truncated write; file may have grown while being archived."{e}
    * "tar: spool/mqueue/"{b} && "Cannot stat: No such file or directory"{e}
    ? "tar: Error exit delayed from previous errors."
    "Backup finished."

In this case it doesn't make much of a difference whether you use regular expressions or not. However, in many cases they allow the template to be written in a more terse and readable way, and they allow matching to be more precise and strict. For example, observe that the matching in the first template above is more restrictive, because it covers only files in the spool/mqueue directory that begin with "d" or "q" and consist only of letters and digits, while the second template accepts any file name.
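The difference in strictness can be made visible with a small Python sketch. It assumes that the {b} and {e} flags behave like a "begins with" and "ends with" test, which is imitated here with str.startswith and str.endswith; the file name "core.1234" is invented for the example.

```python
import re

QUEUE_RE = re.compile(
    r"^tar: spool/mqueue/[dq][A-Za-z0-9]+: "
    r"Cannot stat: No such file or directory$"
)


def loose_match(line: str) -> bool:
    # Stand-in for: "tar: spool/mqueue/"{b} && "Cannot stat: ..."{e}
    return line.startswith("tar: spool/mqueue/") and line.endswith(
        "Cannot stat: No such file or directory"
    )


def strict_match(line: str) -> bool:
    # The regular expression also constrains the file name itself.
    return QUEUE_RE.search(line) is not None


queue_file = "tar: spool/mqueue/d9Tn68oS021471: Cannot stat: No such file or directory"
other_file = "tar: spool/mqueue/core.1234: Cannot stat: No such file or directory"

for line in (queue_file, other_file):
    print(f"loose={loose_match(line)} strict={strict_match(line)} {line}")
```

Both variants accept the sendmail queue file, but only the string-based variant also accepts the unrelated file, so an unexpected error of that kind would go unnoticed.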

The "i" Flag: case-insensitive matching

Regular expressions support the "i" flag, just like strings. If you append {i} to a regular expression (after the closing slash), the matching is case-insensitive. For example, the following template expression matches a line containing any of the words "color", "colour", "Color", "COLOUR" etc.:

/colou?r/{i}
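In plain Python, the closest counterpart is the re.IGNORECASE option. The sketch below assumes that {i} corresponds to this option, which is not stated explicitly here; the word list is only for illustration.

```python
import re

# Assumed equivalent of /colou?r/{i}
colour = re.compile(r"colou?r", re.IGNORECASE)

words = ["color", "colour", "Color", "COLOUR", "colr"]
results = {word: colour.search(word) is not None for word in words}

for word, matched in results.items():
    print(f"{word:8} {'match' if matched else 'no match'}")
```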

The other string flags ("{b}", "{e}", "{w}" and "{s}") are not supported for regular expressions. However, you don't need them, because regular expressions are much more powerful anyway. Hint: Inside a regular expression you can use \s+ to match any amount of white space between words, and \s* can be used to ignore any white space at the beginning or at the end of a line.
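The hint about white space can be tried out as follows. This is a sketch in plain Python with an invented pattern for the "Backup finished." line; inside a template the same expression would be written between slashes.

```python
import re

# \s* ignores white space at both ends of the line,
# \s+ accepts any amount of white space between the two words.
finished = re.compile(r"^\s*Backup\s+finished\.\s*$")

candidates = [
    "Backup finished.",
    "   Backup    finished.  ",
    "\tBackup\tfinished.",
    "Backupfinished.",
    "Backup finished. Really.",
]

for text in candidates:
    verdict = "match" if finished.search(text) else "no match"
    print(f"{text!r}: {verdict}")
```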
