# Regular Expressions Tutorial - part 1 - Basics of Regular Expressions

## What is a Regular Expression

A regular expression (regex or regexp, sometimes called a rational expression) is a sequence of characters that defines a search pattern, or mask, for text strings. Regular expressions are used in search engines, in the search-and-replace dialogs of word processors and text editors, in text-processing utilities such as sed and AWK, and in lexical analysis. Many programming languages provide regex capabilities, either built in or through libraries.

## Patterns

The phrase regular expressions, and consequently, regexes, is often used to mean the specific, standard textual syntax for representing patterns for matching text. Each character in a regular expression (that is, each character in the string describing its pattern) is either:

• a metacharacter, which has a special meaning, or

• a regular character, which has a literal meaning.

Every regular expression is built from these two kinds of characters.

For example, in the regex a., a is a literal character that matches just ‘a’, and . is a metacharacter that matches any character except a newline. Therefore, this regex matches, for example, the text strings ‘a ’, ‘ax’, and ‘a0’.
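The a. example can be tried out with a short sketch. It assumes Python's standard re module (the tutorial itself does not name a language), and the sample strings are made up for illustration. re.fullmatch is used so that the whole string has to fit the pattern.

```python
import re

# Literal 'a' followed by '.', which matches any character except a newline.
pattern = re.compile(r"a.")

# Hypothetical sample strings chosen for this illustration.
samples = ["a ", "ax", "a0", "a\n", "a"]

# fullmatch succeeds only if the entire string fits the pattern.
results = {s: bool(pattern.fullmatch(s)) for s in samples}

for s, ok in results.items():
    print(repr(s), "matches" if ok else "does not match")
```

The last two samples are rejected: in 'a\n' the dot refuses the newline, and in 'a' there is no second character for the dot to consume.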

## Simple regular expressions

The simplest regular expression is a single ordinary letter, e.g. r. When text is searched with this regular expression, the search simply looks for the letter “r”. By default, as is usual on Unix, matching is case-sensitive. However, most utilities let you turn this off.

Since even in the simplest cases a person usually seeks a word and not a single letter, regular expressions can be chained. If you use the regular expression think, it actually represents the chaining of five elementary single-letter regular expressions. The result is the behavior you would expect - the word “think” will be searched for. Simple word search is the most primitive but also the most common application of regular expressions.
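As a rough illustration of literal search and case sensitivity, here is a minimal sketch that again assumes Python's re module. The sample sentence is invented, and re.IGNORECASE is Python's way of switching case sensitivity off. Other tools use their own options for this.

```python
import re

# Invented sample text for this illustration.
text = "I think, therefore I am. Think twice."

# Case-sensitive by default: only the lowercase word is found.
sensitive = re.findall(r"think", text)

# The re.IGNORECASE flag turns case sensitivity off.
insensitive = re.findall(r"think", text, flags=re.IGNORECASE)

# A single-letter pattern simply finds each occurrence of that letter.
letter_r = re.findall(r"r", text)

print(sensitive)
print(insensitive)
print(len(letter_r))
```

Here the pattern think behaves exactly like five single-letter patterns chained together: it matches only where t, h, i, n and k appear in that order.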

## Examples

• .at matches any three-character string ending with “at”, including “hat”, “cat”, and “bat”.
• [hc]at matches “hat” and “cat”.
• [^b]at matches all strings matched by .at except “bat”.
• [^hc]at matches all strings matched by .at other than “hat” and “cat”.
• ^[hc]at matches “hat” and “cat”, but only at the beginning of the string or line.
• [hc]at$ matches “hat” and “cat”, but only at the end of the string or line.
• \[.\] matches any single character surrounded by “[” and “]” since the brackets are escaped, for example: “[a]” and “[b]”.

## Summary

| regular expression | match |
| --- | --- |
| \ | Escape the special meaning of metacharacters |
| ^ | Start of string or line |
| $ | End of string or line |
| . | Match any single character (except a newline) |
| [] | Match one item in this character set |
| [abc] | Match a single character that is a or b or c |
| [^abc] | Negated set (not a or b or c) |
| [a-z] | Match a single lowercase letter from a to z |
| [A-Z] | Match a single uppercase letter from A to Z |
| [0-7] | Match a single digit from 0 to 7 |
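
The example patterns and the summary entries above can be checked with a small sketch. It assumes Python's re module. The helper name matching and the word list are hypothetical and exist only for this illustration. re.search is used because it looks for the pattern anywhere in the string, which makes the effect of the ^ and $ anchors visible.

```python
import re

def matching(pattern, candidates):
    """Return the candidates in which re.search finds the pattern anywhere."""
    return [s for s in candidates if re.search(pattern, s)]

# Hypothetical test words; "chat" contains "hat" but does not start with it.
words = ["hat", "cat", "bat", "rat", "chat", "[a]"]

patterns = [r".at", r"[hc]at", r"[^b]at", r"[^hc]at",
            r"^[hc]at", r"[hc]at$", r"\[.\]"]
results = {p: matching(p, words) for p in patterns}

for p, found in results.items():
    print(f"{p:10} -> {found}")

# Ranges from the summary: each call lists the single characters that match.
octal_digits = re.findall(r"[0-7]", "2019")   # '9' is outside 0-7
lowercase = re.findall(r"[a-z]", "Regex")
uppercase = re.findall(r"[A-Z]", "Regex")
print(octal_digits, lowercase, uppercase)
```

The word "chat" shows the difference the anchors make: the unanchored [hc]at finds "hat" inside it, while ^[hc]at rejects it because the match would have to begin at the start of the string.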