Regex Use Fundamentals. Pleased with results, but Builds with Warnings?

Started by EdPellesC99, March 01, 2011, 06:45:53 PM


EdPellesC99

  I am trying to get to first base using Regex "el-simple-ino".

 Thanks in Advance for any input, Ed

Quote

// // ~ •  •  •  •  •  •  •  •  •  •  •  •  •  •  •  •  •  •  •  •  •  •  •  •  •  •  •  •  •  •  •  •  •  •  •  •  •  •  •  •  •  •  •  •  •  •  •  •  •


#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <regex.h>

int main(int argC, char *argV[])
{
char buf[100];
_regex_t    regex;

//strcpy(buf , "C:\Folder");
// Above produces a match as it is a string literal.

//strcpy(buf , "C:\Foldr");
// Above produces no match.

//strcpy(buf , "[A-C]:\Folder");
// Above is Match

//strcpy(buf , "[A-C]:\Foldes");
// Above is No Match

//strcpy(buf , "[A-C]:\Folde.");
// Above is Match

strcpy(buf , "[A-C]:\.");  //...................... The dot means any character(s).
// Above is Match

int ret;

_regcomp(&regex , buf , _REG_EXTENDED |_REG_NOSUB );

char EdString[]="C:\Folder";
ret=_regexec(&regex , EdString , 1 , NULL , 0);
   if (ret==0)
   {
      printf("\n\nSince you returned %i\n\n     *You have Match*\n\n", ret);
   }
   else if (ret==1)
   {
      printf("\n\nYou returned %i\n\n    You have NO Match\n\n", ret);
   }
return 0;
}

// // ~ •  •  •  •  •  •  •  •  •  •  •  •  •  •  •  •  •  •  •  •  •  •  •  •  •  •  •  •  •  •  •  •  •  •  •  •  •  •  •  •  •  •  •  •  •  •  •  •  •


Although the above file works, I get warnings when building.
Should I ignore the warnings?
Or am I doing something wrong, and that is why I have warnings?

Output:
Building Gold In Console.obj.
C:\SGV1\C _Regex Dev\Gold In Console\Gold In Console.c(26): warning #2176: Unrecognized character escape sequence '\.'.
C:\SGV1\C _Regex Dev\Gold In Console\Gold In Console.c(37): warning #2176: Unrecognized character escape sequence '\F'.
Building Gold In Console.exe.
Done.

CommonTater

Your char EdString should be "C:\\Folder" ... the doubled backslash (\\) prevents it from being treated as an escape sequence.

EdPellesC99

   It is because escaping backslashes did not cure my problem that I posted. [Doing this in both places may allow me to compile without warnings, but the Regex action fails.]  Let me explain more.

If I escape as you suggest, my build output is:

Building Gold In Console.obj.
C:\SGV1\C _Regex Dev\Gold In Console\Gold In Console.c(26): warning #2176: Unrecognized character escape sequence '\.'.
Building Gold In Console.exe.
Done.

I had tried this, trying to see what it took to get rid of the warnings.

Quote

In my "real-life use" I will of course be checking user input. If the user enters "C:\Folder" correctly, there will be only one backslash there, and that is what I will be verifying with Regex. But this is tangential to this issue.

Also I knew that if I escaped the backslash here also:
strcpy(buf , "[A-C]:\\.");  //...................... The dot means any character(s).

It would build without warnings..... HOWEVER:

It no longer matched, even though I expected it to.

I do not know if it is my computer, a corrupt install, or if other people will see the same thing.

That is why I posted. I do not understand why I see what I see.

Thanks for the reply.

If you knew that your suggestion would correct things (by building) then I assume .....you would have said so.

Did you try it?

Because if it works for you, i.e. you can build without warnings yet have the Regex work properly, then I have a problem of another sort here.

---Ed



EdPellesC99

Ok, I figured it out.

In the above example file:

Adding the escape to:
char EdString[]="C:\\Folder";

gets rid of the one build warning.

To be able to build without warnings, AND the regex to work properly, you need to correct the Regex from:
strcpy(buf , "^[A-C]:\\.");  //...................... The dot means any character(s).

to the line:
strcpy(buf , "^[A-C:\\.]");  //...................... The dot means any character(s).

This regex will work even if the string is:
char EdString[]="C:\Folder";
but of course you would see a build warning on this line.

After looking over the Pelles C help on using regex.h, I went looking for more info to scan, to help things sink in a bit.

On the subject of Posix RegEx, the GNU C library:
http://www.cs.ui.ac.id/WebKuliah/IKI10100/resources/contest/OnlineJudge/gnudoc/libc/POSIX_Regexp_Compilation.html

is something to look at.

Then NICE info on Regular expressions for Posix RegEx I found here:
http://www.zytrax.com/tech/web/regex.htm

Thanks,
and until I am stuck again .......Ed

(Thanks to Pelle and Henry Spencer I am in "Regex business".)

CommonTater

You only need the double backslash in string literals.

If you were to print out the string you would discover that the double slash is converted to a single when it's stored in memory.  You need the double because the \ is the escape character for tabs, line feeds, etc.  A double slash "escapes" to a single slash in the stored string.

TimoVJL

May the source be with you

EdPellesC99

Yes,

 In getting my win32 going for the edit control, I had just found I had to bracket the drives only.

EDIT: In the final event this regex was best: "^[C-M]:\\\\.*"
Of course this does not eliminate punctuation marks in the path, so a better regex can be written. I am trying to do that currently.


  In my win32 project, I am able to eliminate most of this block of code I had yesterday!


      
Quote

if ( !( (ECEntry2[0]=='C') || (ECEntry2[0]=='D') || (ECEntry2[0]=='E') || (ECEntry2[0]=='F') || (ECEntry2[0]=='G') || (ECEntry2[0]=='H') || (ECEntry2[0]=='I') ) )
{
    SetWindowText(hwnd1, "   Invalid Targ Folder");
    SetWindowText(hwnd2, ".....Please Re-Enter.");
    goto LabelB;
}

if ( !(ECEntry2[1]==':') )
{
    SetWindowText(hwnd1, "   Invalid Targ Folder");
    SetWindowText(hwnd2, ".....Please Re-Enter.");
    goto LabelB;
}

if ( !(ECEntry2[2]=='\\') )
// if (  !( (ECEntry2[1]==':') )   ) || if (  !( (ECEntry2[2]=='\\') )   )
{
    SetWindowText(hwnd1, "   Invalid Targ Folder");
    SetWindowText(hwnd2, ".....Please Re-Enter.");
    goto LabelB;
}

if (*ECEntry1==*ECEntry2)
// if (  !( (ECEntry2[1]==':') )   ) || if (  !( (ECEntry2[2]=='\\') )   )
{
    SetWindowText(hwnd1, "   Invalid Targ Folder");
    SetWindowText(hwnd2, ".....Please Re-Enter.");
    goto LabelB;
}



Oh I love this ability to use Regex.h !  It is so interesting, and VERY cool !!!!!

--------Your Regex tester site is Great !

However for offline use especially ..... in another thread on this site Frankie recommended:

http://www.weitz.de/regex-coach/

It is an install app, and large / sophisticated, but seems to work very nicely.

Thanks Timo

EdPellesC99


   I noticed the use of Regex is pricey in overhead !

I reduced a 29-line block of code to 15 lines in my win32.exe project,
but the compiled exe file size went from 58 KB to 85 KB! The only change was adding the use of regex.

This really surprised me !