Pelles C forum

Pelles C => Feature requests => Topic started by: oforshell on February 14, 2011, 11:35:45 PM

Title: Some intrinsics from the days when intel PL/M-86 was the language of choice!
Post by: oforshell on February 14, 2011, 11:35:45 PM
When I programmed in PL/M-86 during the eighties to mid-nineties there were some built-in functions that I liked a lot, especially for text searches.

int findb (void *,char,int);
int findrb (void *,char,int);
int skipb (void *,char,int);
int skiprb (void *,char,int);
int cmpb (void *,void *,int);
int cmprb (void *,void *,int);

findb/findrb scanned a buffer (void *) of x (int) length for the first/last match for a byte (char). Upon completion the returned value would contain the index of the first/last match. If there was no match 0xffff would be returned.

skipb/skiprb scanned a buffer (void *) of x (int) length for the first/last non-match for a byte (char). Upon completion the returned value would be the index of the first/last non-match. If there was no non-match 0xffff would be returned.

cmpb/cmprb compared two buffers of x (int) length byte by byte and reported the position of the first/last non-match. If the buffers were equal 0xffff would be returned.
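
For readers who don't have PL/M-86 at hand, the semantics described above can be sketched in plain C roughly as follows. This is only a reference sketch, not the original built-ins: the NOT_FOUND macro is an assumption that stands in for the 16-bit 0xffff (it is -1 here, which is also what the 32-bit assembly below ends up returning), and the pointers are taken as const void * and converted to char pointers internally so they can be dereferenced.

```c
#define NOT_FOUND (-1)  /* assumption: plays the role of PL/M-86's 0xffff */

/* Index of the first byte equal to c, or NOT_FOUND. */
int findb(const void *buf, char c, int len)
{
    const char *p = buf;
    for (int i = 0; i < len; i++)
        if (p[i] == c) return i;
    return NOT_FOUND;
}

/* Index of the last byte equal to c, or NOT_FOUND. */
int findrb(const void *buf, char c, int len)
{
    const char *p = buf;
    for (int i = len - 1; i >= 0; i--)
        if (p[i] == c) return i;
    return NOT_FOUND;
}

/* Index of the first byte NOT equal to c, or NOT_FOUND. */
int skipb(const void *buf, char c, int len)
{
    const char *p = buf;
    for (int i = 0; i < len; i++)
        if (p[i] != c) return i;
    return NOT_FOUND;
}

/* Index of the last byte NOT equal to c, or NOT_FOUND. */
int skiprb(const void *buf, char c, int len)
{
    const char *p = buf;
    for (int i = len - 1; i >= 0; i--)
        if (p[i] != c) return i;
    return NOT_FOUND;
}

/* Index of the first position where a and b differ, or NOT_FOUND. */
int cmpb(const void *a, const void *b, int len)
{
    const char *p = a, *q = b;
    for (int i = 0; i < len; i++)
        if (p[i] != q[i]) return i;
    return NOT_FOUND;
}

/* Index of the last position where a and b differ, or NOT_FOUND. */
int cmprb(const void *a, const void *b, int len)
{
    const char *p = a, *q = b;
    for (int i = len - 1; i >= 0; i--)
        if (p[i] != q[i]) return i;
    return NOT_FOUND;
}
```

For example, findb("hello", 'l', 5) gives 2, findrb("hello", 'l', 5) gives 3, and skipb("aaab", 'a', 4) gives 3.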


I've included the OpenWatcom inline assembly definitions for the 32-bit x86:

#pragma aux cmpb = \
"all_same_0:"\
       "  mov edx,ecx"\
       "  jecxz all_same_1"\
       "    repe cmpsb"\
       "    je short all_same_0"\
       "      sub edx,ecx"\
"all_same_1:"\
       "dec edx"\
        parm [esi][edi][ecx] modify exact [ecx edx esi edi] value [edx];
#pragma aux cmprb = \
       "jecxz all_same"\
       "  lea esi,[esi+ecx-1]"\
       "  lea edi,[edi+ecx-1]"\
       "  std"\
       "  repe cmpsb"\
       "  cld"\
       "  jne short not_same"\
"all_same:"\
       "dec ecx"\
"not_same:"\
        parm [esi][edi][ecx] modify exact [ecx esi edi] value [ecx];
#pragma aux findb = \
"not_found_0:"\
       "mov edx,ecx"\
       "jecxz not_found_1"\
       "  repne scasb"\
       "  jne short not_found_0"\
       "    sub edx,ecx"\
"not_found_1:"\
       "dec edx"\
       parm [edi][al][ecx] modify exact [ecx edx edi] value [edx];
#pragma aux skipb = \
"all_same_0:"\
       "mov edx,ecx"\
       "jecxz all_same_1"\
       "  repe scasb"\
       "  je short all_same_0"\
       "    sub edx,ecx"\
"all_same_1:"\
       "dec edx"\
       parm [edi][al][ecx] modify exact [ecx edx edi] value [edx];
#pragma aux findrb = \
       "jecxz not_found"\
       "  lea edi,[edi+ecx-1]"\
       "  std"\
       "  repne scasb"\
       "  cld"\
       "  je short found"\
"not_found:"\
       "dec ecx"\
"found:"\
       parm [edi][al][ecx] modify exact [ecx edi] value [ecx];
#pragma aux skiprb = \
       "jecxz not_found"\
       "  lea edi,[edi+ecx-1]"\
       "  std"\
       "  repe scasb"\
       "  cld"\
       "  jne short found"\
"not_found:"\
       "dec ecx"\
"found:"\
       parm [edi][al][ecx] modify exact [ecx edi] value [ecx];
Title: Re: Some intrinsics from the days when intel PL/M-86 was the language of choice!
Post by: Nobody_1707 on March 14, 2011, 10:12:33 PM
Wouldn't those have to be char pointers since you're dereferencing them?
Title: Re: Some intrinsics from the days when intel PL/M-86 was the language of choice!
Post by: AlexN on March 17, 2011, 08:24:12 AM
Since Pelles C supports the inline function specifier (look up its usage in the help file), you could try to implement these yourself and offer them under "User contributions" in the forum. ;)
Title: Re: Some intrinsics from the days when intel PL/M-86 was the language of choice!
Post by: Pelle on April 17, 2011, 04:39:09 PM
Not for 6.50, but maybe later add __cmpsb/__cmpsw/etc., and __scasb/__scasw/etc.

The thing is, intrinsics like these are usually added for speed and the Intel optimization guide says "... Using a REP prefix with string move instructions can provide high performance in the situations described above. However, using a REP prefix with string scan instructions (SCASB, SCASW, SCASD, SCASQ) or compare instructions (CMPSB, CMPSW, CMPSD, CMPSQ) is not recommended for high performance. Consider using SIMD instructions instead."
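
To make the "use SIMD instead" suggestion concrete, here is a rough sketch of a forward byte search using SSE2 intrinsics. The name findb_simd and the -1 "not found" value are made up for illustration, and it assumes a compiler that ships <emmintrin.h>; on anything without SSE2 the guarded part drops out and only the scalar loop remains. No speed claim is implied, it only shows the shape of the technique.

```c
#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h>
#define HAVE_SSE2 1
#endif

/* Hypothetical helper: index of the first byte equal to c, or -1 if none. */
int findb_simd(const void *buf, char c, int len)
{
    const unsigned char *p = buf;
    int i = 0;
#ifdef HAVE_SSE2
    const __m128i needle = _mm_set1_epi8(c);   /* c replicated into 16 lanes */
    for (; i + 16 <= len; i += 16) {
        /* Unaligned load of 16 bytes, then a lane-wise equality compare. */
        __m128i chunk = _mm_loadu_si128((const __m128i *)(p + i));
        unsigned mask = (unsigned)_mm_movemask_epi8(_mm_cmpeq_epi8(chunk, needle));
        if (mask != 0) {
            /* Bit k of mask is set when byte k matched; find the lowest one. */
            int bit = 0;
            while ((mask & 1u) == 0) {
                mask >>= 1;
                bit++;
            }
            return i + bit;
        }
    }
#endif
    /* Scalar tail (or the whole buffer when SSE2 is not available). */
    for (; i < len; i++)
        if (p[i] == (unsigned char)c)
            return i;
    return -1;
}
```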
Title: Re: Some intrinsics from the days when intel PL/M-86 was the language of choice!
Post by: oforshell on April 18, 2011, 05:55:14 PM
The efficiency of the rep block instructions has varied for quite some time. What is true for one manufacturer, processor generation or processor family may not be true for the next. I've experienced some big variations.

Introducing __cmps* and __scas* intrinsics misses the point to a certain degree, since their behaviour is governed by both the direction flag (cld/std) and the repeat prefix (repe/repne), which tests the zero flag. To me it isn't obvious which combinations should be implemented and which omitted.
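
To illustrate the four combinations in C terms: a single scan routine would need two extra parameters, one for the direction and one for the stop condition. The names scan_bytes, scan_dir and scan_stop below are hypothetical and not a proposed Pelles C interface; the sketch only shows that findb, findrb, skipb and skiprb are the four corners of the same operation.

```c
enum scan_dir  { SCAN_FORWARD, SCAN_REVERSE };       /* corresponds to cld vs std    */
enum scan_stop { STOP_ON_MATCH, STOP_ON_MISMATCH };  /* corresponds to repne vs repe */

/* Hypothetical generic scan: returns the index where the scan stopped,
   or -1 if the whole buffer was traversed without meeting the stop condition. */
int scan_bytes(const void *buf, char c, int len,
               enum scan_dir dir, enum scan_stop stop)
{
    const char *p = buf;
    int step = (dir == SCAN_FORWARD) ? 1 : -1;
    int i = (dir == SCAN_FORWARD) ? 0 : len - 1;

    for (int n = 0; n < len; n++, i += step) {
        int equal = (p[i] == c);
        /* Stop on equality when searching, on inequality when skipping. */
        if (equal == (stop == STOP_ON_MATCH))
            return i;
    }
    return -1;
}
```

With this, scan_bytes(buf, c, len, SCAN_REVERSE, STOP_ON_MISMATCH) behaves like skiprb, and the other three combinations map to findb, findrb and skipb in the same way.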

In the same manner __movs* may also be regulated by the direction flag. One favorite of mine was using it for overlapping moves: when preparing an empty space in a buffer I would use the reverse direction move (i.e. beginning at the numerically highest address and working my way towards numerically lower addresses). When I had a space that had been vacated I'd use a normal move (moving from numerically lower to higher addresses).
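
In C the same idea looks roughly like the sketch below. The helper names (move_backward, move_forward, open_gap, close_gap) are invented for illustration, and the caller is assumed to guarantee that the buffer has room for the gap. The standard memmove already picks a safe direction for overlapping regions, so this is only meant to show why the direction matters.

```c
#include <stddef.h>

/* Copy n bytes starting at the highest address and working downwards
   (the std; rep movsb pattern). Safe when dst lies above src and they overlap. */
void move_backward(char *dst, const char *src, size_t n)
{
    while (n--)
        dst[n] = src[n];
}

/* Copy n bytes from the lowest address upwards (the cld; rep movsb pattern).
   Safe when dst lies below src and they overlap. */
void move_forward(char *dst, const char *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];
}

/* Open a gap of `gap` bytes at offset `pos` in a buffer holding `used` bytes.
   Assumption: the buffer has capacity for at least used + gap bytes. */
void open_gap(char *buf, size_t used, size_t pos, size_t gap)
{
    move_backward(buf + pos + gap, buf + pos, used - pos);
}

/* Close a vacated gap of `gap` bytes at offset `pos`; `used` counts the gap. */
void close_gap(char *buf, size_t used, size_t pos, size_t gap)
{
    move_forward(buf + pos, buf + pos + gap, used - pos - gap);
}
```

Doing open_gap with a forward copy would overwrite the tail of the data before it had been moved, which is exactly the case the reverse move avoids.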

Testing forward byte scasb (find and skip) on one byte vector on a C2D 6400 @ 2.13 GHz results in 6 clks/byte data scanned. Forward cmpsb on two vectors results in 5 clks/byte.