Updated April 3, 2023
Introduction to Regular Expression in C#
Pattern matching in C# is done with regular expressions, which are created using the Regex class. A regular expression (regex for short) is a standard notation for matching and replacing patterns in strings: it tells the computer what pattern to look for in a string and what to do when that pattern is found. Overall, regular expressions in C# are a powerful way to identify and replace text that follows a particular format.
Syntax
The basic syntax elements used for regular expressions in C# are as follows:
1. Quantifiers
The important quantifiers are as follows:
- *: The preceding character is matched zero or more times. Consider the regular expression c*d. This expression matches d, cd, ccd, cccd, and so on, that is, any number of c characters followed by d.
- +: The preceding character is matched one or more times. Consider the regular expression c+d. This expression matches cd, ccd, cccd, and so on, but not d on its own.
- ?: The preceding character is matched zero or one time. Consider the regular expression c?d. This expression matches d and cd.
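The three quantifiers above can be compared side by side with a small program. This is only a minimal sketch: the class name QuantifierDemo and the helper FullMatch are hypothetical names chosen for illustration, and the helper wraps each pattern in ^ and $ so that only whole-string matches count.

```csharp
using System;
using System.Text.RegularExpressions;

public static class QuantifierDemo
{
    // Hypothetical helper: true only when the whole input matches the pattern.
    // The pattern is wrapped in ^(?: ... )$ so that partial matches do not count.
    public static bool FullMatch(string pattern, string input)
    {
        return Regex.IsMatch(input, "^(?:" + pattern + ")$");
    }

    public static void Main()
    {
        string[] patterns = { "c*d", "c+d", "c?d" };
        string[] inputs = { "d", "cd", "ccd" };

        foreach (string pattern in patterns)
        {
            foreach (string input in inputs)
            {
                Console.WriteLine("{0} matches \"{1}\": {2}",
                    pattern, input, FullMatch(pattern, input));
            }
        }
    }
}
```

With these inputs, c*d accepts all three strings, c+d rejects d, and c?d rejects ccd.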
2. Special Characters
The important special characters are as follows:
- ^: The beginning of the string is matched using this special character. Consider the example ^Karnataka. This expression matches the word Karnataka in “Karnataka is our state” because it occurs at the start of the string.
- $: The end of the string is matched using this special character. Consider the example Karnataka$. This expression matches the word Karnataka in “Our state is Karnataka” because it occurs at the end of the string.
- Dot (.): Any single character other than a newline is matched using this special character. Consider the example l.t (length = 3). This expression matches lit, lot, let.
- \d: A digit character is matched using this special character. For ASCII text it behaves like the class [0-9]. Consider the example \d. This expression matches each of the digits in 123, 456, 254, etc.
- \D: Any non-digit character is matched using this special character. For ASCII text it behaves like the class [^0-9]. Consider the example \D. This expression matches every character except the digits 0 to 9.
- \w: A word character, that is, an alphanumeric character or “_” (underscore), is matched using this special character. For ASCII text it behaves like the class [A-Za-z0-9_]. Consider the example \w. This expression matches I, R, B, 2 and 8 in “IR B2.8”
- \W: Any non-word character is matched using this special character. Consider the example \W. This expression matches “ ” and “.” in “IR B2.8”
- \s: White space characters are matched using this special character. Consider the example \w\s. This expression matches “C ” in “IC B1.5”
- \S: Non-white space characters are matched using this special character. Consider the example \s\S. This expression matches “ B” in “IC B1.5”
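The anchors and character escapes listed above can be tried out with a short program. This is a sketch under stated assumptions: SpecialCharacterDemo and AllMatches are hypothetical names, and the helper simply collects the text of every match that Regex.Matches finds.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class SpecialCharacterDemo
{
    // Hypothetical helper: returns the text of every match of pattern in input.
    public static List<string> AllMatches(string pattern, string input)
    {
        var results = new List<string>();
        foreach (Match m in Regex.Matches(input, pattern))
        {
            results.Add(m.Value);
        }
        return results;
    }

    public static void Main()
    {
        // Anchors: ^ ties the match to the start, $ to the end of the string.
        Console.WriteLine(Regex.IsMatch("Karnataka is our state", "^Karnataka")); // True
        Console.WriteLine(Regex.IsMatch("Our state is Karnataka", "Karnataka$")); // True

        // The dot stands for any single character.
        Console.WriteLine(string.Join("|", AllMatches("l.t", "lit lot let")));    // lit|lot|let

        // Character escapes; matches are joined with | to make spaces visible.
        Console.WriteLine(string.Join("|", AllMatches(@"\d", "IR B2.8")));        // 2|8
        Console.WriteLine(string.Join("|", AllMatches(@"\W", "IR B2.8")));        // a space, then "."
        Console.WriteLine(string.Join("|", AllMatches(@"\w\s", "IC B1.5")));      // "C "
        Console.WriteLine(string.Join("|", AllMatches(@"\s\S", "IC B1.5")));      // " B"
    }
}
```

The patterns are written as verbatim strings (@"...") so that the backslashes reach the regex engine unchanged.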
3. Character Classes
Characters can be grouped into a class by putting them between square brackets. A class matches a single character of the input that is any one of the characters in the class.
- []: A set or range of characters can be matched using []. Consider the example [xyz]. This expression matches any one of x, y, and z. Consider the example [c-r]. This expression matches any one character from c to r, inclusive.
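A character class can be checked against sample text as follows. This is a minimal sketch: CharacterClassDemo and MatchedCharacters are hypothetical names, and the helper concatenates every character that the class matches.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

public static class CharacterClassDemo
{
    // Hypothetical helper: concatenates every match of pattern found in input.
    public static string MatchedCharacters(string pattern, string input)
    {
        var sb = new StringBuilder();
        foreach (Match m in Regex.Matches(input, pattern))
        {
            sb.Append(m.Value);
        }
        return sb.ToString();
    }

    public static void Main()
    {
        // Matching is case-sensitive by default, so the capital X is skipped.
        Console.WriteLine(MatchedCharacters("[xyz]", "Xylophone zone")); // yz

        // A range includes both of its end points.
        Console.WriteLine(MatchedCharacters("[c-r]", "abcdrs"));         // cdr

        // A leading ^ inside the brackets negates the class.
        Console.WriteLine(MatchedCharacters("[^c-r]", "abcdrs"));        // abs
    }
}
```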
4. Grouping and Alternatives
Parts of an expression can be grouped together using the parentheses ( and ), and a repetition count can be given using the braces { and }.
- (): Expressions can be grouped using (). Consider the example (ab)+. This expression matches ab and abab in full. It does not match aabb as a whole; it finds only the ab in the middle.
- {}: The preceding element is matched a specific number of times. The number of times can be specified using the following:
- {n}: The previous element is matched exactly n times. Consider the example “,\d{3}”. This expression matches ,123 in 1,123.40
- {n,m}: The previous element is matched at least n times but not more than m times. Consider the example “,\d{2,3}”. This expression matches ,123 in 1,123.40 (as many digits as possible are taken, up to three) and ,12 in 1,12.40
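The grouping and counting examples above can be verified with a short sketch. GroupingDemo and FirstMatch are hypothetical names; the helper returns the text of the first match, or an empty string when nothing matches.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupingDemo
{
    // Hypothetical helper: text of the first match, or "" when there is none.
    public static string FirstMatch(string pattern, string input)
    {
        Match m = Regex.Match(input, pattern);
        return m.Success ? m.Value : "";
    }

    public static void Main()
    {
        // The group (ab) is repeated as a unit.
        Console.WriteLine(Regex.IsMatch("abab", "^(ab)+$"));  // True
        Console.WriteLine(Regex.IsMatch("aabb", "^(ab)+$"));  // False
        Console.WriteLine(FirstMatch("(ab)+", "aabb"));       // ab

        // {n} requires exactly n repetitions.
        Console.WriteLine(FirstMatch(@",\d{3}", "1,123.40")); // ,123

        // {n,m} is greedy: it takes up to m repetitions when they are available.
        Console.WriteLine(FirstMatch(@",\d{2,3}", "1,123.40")); // ,123
        Console.WriteLine(FirstMatch(@",\d{2,3}", "1,12.40"));  // ,12
    }
}
```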
Working of Regular Expressions in C#
There are two types of regular expression engines: text-directed engines and regex-directed engines. A regex-directed engine walks through the regular expression, trying to match the next token in the expression to the next character of the string. If a match is found, the engine advances; otherwise, it goes back to an earlier position in both the expression and the string, from where it can try a different path through the expression. This is known as backtracking, and it is how the .NET Regex class works by default. A text-directed engine walks through the string, trying all the permutations of the regular expression at the current character before moving to the next character in the string. There is no backtracking in a text-directed engine.
In both cases, the engine always returns the leftmost match, even if a better match could be found later in the string. Whenever a regex is applied to a string, the engine begins with the first character of the string. All the possible permutations of the regex are tried at the first character; if they all fail, they are tried again starting at the second character, and this process goes on until the engine finds a match or reaches the end of the string.
Consider the sentence “Check the water in the bathtub before going to the bath.” and suppose the regex engine is asked to find the word bath in it. The engine first tries to match the first character, C, against b, and this fails. It then tries the next character, h, against b, and this fails as well. This goes on until the engine reaches the 24th character, the b of bathtub, which matches. The engine then matches a, t and h against the characters that follow and reports the bath inside bathtub as the match. It does not go further in the sentence to see if there are other matches unless it is asked for the next match. This is how the regex engine works internally.
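The leftmost-match behavior described above can be observed directly. In this sketch, LeftmostMatchDemo and FirstIndex are hypothetical names; Regex.Match returns only the first (leftmost) match, while Regex.Matches keeps searching after each match.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LeftmostMatchDemo
{
    public const string Sentence =
        "Check the water in the bathtub before going to the bath.";

    // Hypothetical helper: zero-based index of the leftmost match.
    public static int FirstIndex(string input, string pattern)
    {
        return Regex.Match(input, pattern).Index;
    }

    public static void Main()
    {
        // Index is zero-based, so 23 corresponds to the 24th character.
        Console.WriteLine(FirstIndex(Sentence, "bath")); // 23

        // Matches continues past the first hit and finds both occurrences.
        foreach (Match m in Regex.Matches(Sentence, "bath"))
        {
            Console.WriteLine("Found \"{0}\" at index {1}", m.Value, m.Index);
        }
    }
}
```

The loop reports matches at indexes 23 (inside bathtub) and 51 (the final word bath).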
Methods of Regular Expression in C#
The Regex class in C# provides the following commonly used methods:
- public bool IsMatch(string input): Indicates whether the regular expression specified in the Regex constructor finds a match in the specified input string.
- public bool IsMatch(string input, int startat): Indicates whether the regular expression specified in the Regex constructor finds a match in the specified input string, beginning the search at the specified starting position.
- public static bool IsMatch(string input, string pattern): Indicates whether the specified regular expression finds a match in the specified input string.
- public MatchCollection Matches(string input): Searches the specified input string for all the occurrences of the regular expression.
- public string Replace(string input, string replacement): Replaces all the substrings of the input string that match the regular expression with the replacement string.
- public string[] Split(string input): Splits the input string into an array of substrings at the positions matched by the regular expression.
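Each of the methods listed above can be called as shown in the following sketch. The class RegexMethodsDemo, its helpers MaskNumbers and SplitOnNumbers, and the sample strings are hypothetical and chosen only for illustration; the pattern \d+ matches a run of one or more digits.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexMethodsDemo
{
    // One Regex instance is reused by the instance-method calls below.
    private static readonly Regex Digits = new Regex(@"\d+");

    // Hypothetical helper: replaces every run of digits with "#".
    public static string MaskNumbers(string input)
    {
        return Digits.Replace(input, "#");
    }

    // Hypothetical helper: splits the input wherever a run of digits occurs.
    public static string[] SplitOnNumbers(string input)
    {
        return Digits.Split(input);
    }

    public static void Main()
    {
        string input = "Order 12 ships 7 boxes";

        Console.WriteLine(Digits.IsMatch(input));          // True
        Console.WriteLine(Digits.IsMatch(input, 16));      // False: no digits from index 16 on
        Console.WriteLine(Regex.IsMatch(input, @"\d+"));   // True (static overload)

        foreach (Match m in Digits.Matches(input))
        {
            Console.WriteLine(m.Value);                    // 12, then 7
        }

        Console.WriteLine(MaskNumbers(input));             // Order # ships # boxes
        Console.WriteLine(string.Join(",", SplitOnNumbers("a1b22c333d"))); // a,b,c,d
    }
}
```

The instance methods suit a pattern that is used repeatedly, while the static IsMatch overload is convenient for a one-off check.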
Example on Regular Expression in C#
C# program to demonstrate the use of regular expressions for the verification of mobile numbers.
Code:
using System;
using System.Text.RegularExpressions;

class Check
{
    static void Main(string[] args)
    {
        // Mobile numbers are given as input in an array of strings
        string[] nos = { "9902147368", "9611967273", "63661820954" };
        foreach (string s in nos)
        {
            Console.WriteLine("The mobile number {0} {1} a valid number.", s,
                checkvalid(s) ? "is" : "is not");
        }
        Console.ReadKey();
    }

    // Returns true if the number has one of the accepted formats:
    // 10 digits, "+" and 2 digits followed by white space and 10 digits,
    // or groups of 3, 4 and 4 digits separated by hyphens
    public static bool checkvalid(string Number)
    {
        // The pattern is kept on a single line: a line break inside a
        // verbatim string would become part of the pattern itself
        string cRegex = @"(^[0-9]{10}$)|(^\+[0-9]{2}\s+[0-9]{2}[0-9]{8}$)|(^[0-9]{3}-[0-9]{4}-[0-9]{4}$)";
        Regex res = new Regex(cRegex);
        return res.IsMatch(Number);
    }
}
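Output:
The mobile number 9902147368 is a valid number.
The mobile number 9611967273 is a valid number.
The mobile number 63661820954 is not a valid number.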