How can you add regular expressions to C++?
Here are three small examples.
Pattern matching
This example shows how to search a string for a regular expression.
// Created by Flavio Castelli <flavio.castelli_AT_gmail.com>
// distributed under GPL v2 license
#include <boost/regex.hpp>
#include <cstdio>
#include <string>
int main()
{
    // Case-insensitive, Perl-syntax pattern: matches "bg" or "olug"
    boost::regex pattern ("bg|olug", boost::regex_constants::icase|boost::regex_constants::perl);
    std::string stringa ("Searching for BsLug");
    // "BsLug" contains neither alternative, so this prints "not found"
    if (boost::regex_search (stringa, pattern))
        std::printf ("found\n");
    else
        std::printf ("not found\n");
    return 0;
}
Substitutions
This example shows how to replace the part of a string that matches a pattern.
// Created by Flavio Castelli <flavio.castelli@gmail.com>
// distributed under GPL v2 license
#include <boost/regex.hpp>
#include <cstdio>
#include <string>
int main()
{
    // "." matches any single character, so "b.lug" matches "bolug"
    boost::regex pattern ("b.lug", boost::regex_constants::icase|boost::regex_constants::perl);
    std::string stringa ("Searching for bolug");
    std::string replace ("BgLug");
    // regex_replace returns a copy of the input with every match replaced
    std::string newString = boost::regex_replace (stringa, pattern, replace);
    std::printf ("The new string is: |%s|\n", newString.c_str());
    return 0;
}
Split
This example shows how to split a string into tokens, using a pattern as the delimiter.
// Created by Flavio Castelli <flavio.castelli@gmail.com>
// distributed under GPL v2 license
#include <boost/regex.hpp>
#include <cstdio>
#include <string>
int main()
{
    // "\D" matches any non-digit character; it acts as the delimiter
    boost::regex pattern ("\\D", boost::regex_constants::icase|boost::regex_constants::perl);
    std::string stringa ("26/11/2005 17:30");
    std::string temp;
    // -1 asks the iterator for the text between matches, not the matches
    boost::sregex_token_iterator i(stringa.begin(), stringa.end(), pattern, -1);
    boost::sregex_token_iterator j; // default-constructed end iterator
    unsigned int counter = 0;
    while (i != j)
    {
        temp = *i;
        std::printf ("token %u = |%s|\n", ++counter, temp.c_str());
        ++i;
    }
    return 0;
}
Requirements
In order to build these examples you’ll need:
- a C++ compiler (like g++)
- the Boost.Regex library

Is there a way to accomplish parentheses matching, like (((catelli)))?
For example, this one is not correct: (((catelli), since there is only one on the right. Is it possible to detect this with regex_match?
Yes, there is!
You should use a conditional to check the group under test, like this:
If you have “castelli” or “(castelli)” it’s OK, but it should not match “(castelli” or “castelli)”. To do so, you just make an expression like this:
(\()?castelli(?(1)\))
The (1) refers to the first group: if group 1, that is (\()?, took part in the match, then the closing \) is required too. I think it can help with taking care of parentheses in this case.
Maybe count the ‘(’ and ‘)’ chars: if count_of_open == count_of_closed, then the syntax is correct.
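The counting idea could be sketched as below, in plain C++ with no regex at all. Comparing the two totals alone would also accept a string like “)(”, so this hypothetical helper, parens_balanced, tracks the running depth instead and rejects a closing parenthesis that has no opener before it.

```cpp
#include <string>

// Hypothetical helper: true when every "(" is closed by a later ")".
bool parens_balanced(const std::string& text)
{
    int depth = 0;  // number of "(" still waiting for a ")"
    for (char c : text) {
        if (c == '(') {
            ++depth;
        } else if (c == ')') {
            if (depth == 0)
                return false;  // ")" with no earlier "(" to close
            --depth;
        }
    }
    return depth == 0;  // true only if nothing is left open
}
```

With this, “(((catelli)))” passes and “(((catelli)” fails, because two parentheses are still open at the end of the string.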
But if I have a string with a lot of useless blanks and tabs, like mystr = “ 1 4 999 3 67 ”, is there a way to reproduce awk’s behaviour (like “cat mystr | awk ‘{ print $2 }’”, obtaining 4) with the boost/regex libraries?
Sure, but you need to keep in mind that a regex can only be applied to text whose structure can be described by an expression. In case you need to find the second parameter from any string like:
“aa 00 b0 aaaaa a 123455”
I just need a regex like:
^[0-9a-f]{1,59}\s([0-9a-f]{1,59})
It will match any line starting with a string of digits or the letters a to f (at least 1 and a maximum of 59 characters), followed by a whitespace character and a second such string. That second string is the group you need, so you just surround it with ‘(’ and ‘)’.