
Author Topic: SCASD  (Read 5291 times)

Jokaste

  • Guest
SCASD
« on: September 28, 2017, 01:40:45 PM »
I want to do this:

Quote
STEP 1: Search for <img
STEP 2: If found => search for rc="
STEP 3: If found => search for .jpg

I use SCASD. STEP 1 is OK.
The second SCASD goes too far.

I also tried another approach: search for .jpg first, then look backward for rc=". Always the same problem. It seems that only the first SCASD instruction works well.

I downloaded the AMD64 instruction set manual and read about SCAS.

Quote
Compares the AL, AX, EAX, or RAX register with the byte, word, doubleword, or quadword pointed
to by ES:rDI, sets the status flags in the rFLAGS register according to the results, and then increments
or decrements the rDI register according to the state of the DF flag in the rFLAGS register.
If the DF flag is 0, the instruction increments the rDI register; otherwise, it decrements it. The
instruction increments or decrements the rDI register by 1, 2, 4, or 8, depending on the size of the
operands.
The forms of the SCASx instruction with an explicit operand address the operand at ES:rDI. The
explicit operand serves only to specify the size of the values being compared.
The no-operands forms of the instruction use the ES:rDI registers to point to the value to be compared.
The mnemonic determines the size of the operands and the specific register containing the other
comparison value.
For block comparisons, the SCASx instructions support the REPE or REPZ prefixes (they are
synonyms) and the REPNE or REPNZ prefixes (they are synonyms). For details about the REP
prefixes, see “Repeat Prefixes” on page 12. A SCASx instruction can also operate inside a loop
controlled by the LOOPcc instruction.

Code: [Select]
;   http://www.alumnicheerleaders.com/             (URL FOR TEST)

;   __________________________________________________________________________________
;   _______________________ ParseJpgFile ________________________________________________
;   __________________________________________________________________________________

ParseJpgFile         PROC   USES RDI RSI PARMAREA=4*QWORD
                     LOCAL   _CurrentCount:QWORD

                     mov      rax,OFFSET lpCurrentBuffer      ; The html file
                     mov      rdi,[rax]
                     mov      rsi,rdi

                     mov      rax,OFFSET BufferSize         ; number of bytes in html buffer
                     mov      rcx,[rax]
                     shr      rcx,2                         ; divided by four because I search 4 bytes
                     mov     _CurrentCount,rcx              ; save RCX

;   =========================================================================
;   =========================================================================

; Examples of bytes I can find, not always xxxx.jpg"

;   <img src="//monsite.woopic.com/383/f/300x/p/imgtools/img/702daf7fa5fcd2ecd8ce3efffef17d96.jpg"
;   <img  src="http://api.ning.com/files/mCZ8T-S0x0y9t7RLp0ig6gEugQ1wv5aUW9HI0D80zOi*lwRSkSZVZ9d3IeAiZYkKYHAsJLdVVpj46F2pzoFgtUkDJ7jPbhru/Cheerboots.jpg?crop=1%3A1&amp;width=40"
;       <img  class="photo photo" src="http://api.ning.com:80/files/LvNfiSNLzicE9UiR73X5lR4kgNr7OgFQzJRmYC23BDbRBzK5LtrgUXZdUkQkogajJ4ea6GCEPbP7bSMttklSreHCaQDsmQg2/1039001571.png?xgip=0%3A4%3A299%3A299%3B%3B&amp;width=48&amp;height=48&amp;crop=1%3A1" alt="" />

@Loop :
                     cld

                     mov      rcx,_CurrentCount
                     jrcxz    @Finished
                     mov      eax,'gmi<'            ; '<img'
                     repne    scasd
                     jrcxz    @Finished             ; OK

                     shr      rdi,3                 ; Test: round RDI down to an 8-byte boundary
                     shl      rdi,3

                     mov      eax,'"=cr'            ; 'rc="' (SRC=")
                     repne   scasd                  ; TOO FAR !
                     jrcxz   @Finished

                     mov      rsi,rdi

                     mov      eax,'gpj.'            ; '.jpg'
                     repne   scasd
                     jrcxz   @Finished

                     mov      BYTE PTR [rdi],0      ; End of jpeg file name
                     mov      _CurrentCount,rcx


                     mov      rax,OFFSET lpszFileNameFromHtml
                     mov      [rax],rsi
                     call     DownloadThisFile
                     jmp      @Loop

;   =========================================================================
;   =========================================================================

@Finished :

                     ret
ParseJpgFile         ENDP

Could someone help me?
Thanks

Philippe RIO
« Last Edit: September 28, 2017, 01:47:43 PM by Jokaste »

Jokaste

  • Guest
Re: SCASD
« Reply #1 on: September 28, 2017, 08:47:49 PM »
Now the problem is solved: I removed the SCASD.
I would like to have some advice from JJ2007 and Vortex (my masters) ::)
Here is a new version.
« Last Edit: September 29, 2017, 11:32:50 AM by Jokaste »

Jokaste

  • Guest
Re: SCASD
« Reply #2 on: September 29, 2017, 11:03:24 AM »
Better alignment of code and data.
Played with processor L1 cache.
Seems quicker.

Offline jj2007

  • Member
  • *
  • Posts: 536
Re: SCASD
« Reply #3 on: September 29, 2017, 02:25:21 PM »
Quote
Now the problem is solved: I removed the SCASD.
I would like to have some advice from JJ2007 and Vortex (my masters) ::)

I feel honoured, Philippe ;-)

scasd advances in dword steps. So if you have a string like "where is the rc=?", then mov eax, "?=cr" is a good start but not sufficient... because it will find the match only if it happens to be at pos 0, 4, 8, ...

You probably need a combi of scasb and cmp [edi-1], eax.

Jokaste

  • Guest
Re: SCASD
« Reply #4 on: September 29, 2017, 03:22:52 PM »
I did not know that the first character had to be in the first position of a dword step. I was right when I said that you were my masters.
I suppose it is the same with CMPSx. In the end it is not a useful instruction. It can easily be replaced by the good old CMP.


I have modified my software to get better alignment, and I also added the MFENCE and PREFETCHNT1 instructions. Is that a good idea? Since I loop over a buffer that I scan repeatedly, keeping it in the processor cache seemed to me a way to speed up the loop.


Is it possible to write the same loop using XMM registers?


Thanks GOD

Offline TimoVJL

  • Global Moderator
  • Member
  • *****
  • Posts: 2091
Re: SCASD
« Reply #5 on: September 29, 2017, 04:27:47 PM »
Something with C too
Code: [Select]
#include <stdio.h>

// https://www.w3.org/MarkUp/html3/img.html

char *FindPicUrl(char *str, char **pc1, char **pc2)
{
char *pch = str;
*pc1 = *pc2 = 0; // set both empty
do { // find "<img"
while (*pch && *pch != '<') pch++; // first char to test
if (!*pch || !*(pch+1) || !*(pch+2) || !*(pch+3)) break;
if (*(long*)pch == *(long*)"<img") {
pch += 4;
while (*pch && *pch == ' ') pch++; // remove spaces
while (*pch && *(long*)pch != *(long*)"src=") pch++; // remove attributes
if (*(long*)pch == *(long*)"src=") {
*pc1 = pch+4;
break;
}
}
if (*pch) pch++; // don't go past the end
} while (*pch);
if (pch) { // find end " or ?
char *p2;
p2 = pch+5;
while (*p2 && *p2 != '\"' && *p2 != '?') p2++;
*pc2 = p2;
}
return *pc1;
}

int __cdecl main(void)
{
char str1[] = "   <img src=\"//monsite.woopic.com/383/f/300x/p/imgtools/img/702daf7fa5fcd2ecd8ce3efffef17d96.jpg\" junk";
// char str2[] = "   <imp src=\"//monsite.woopic.com/383/f/300x/p/imgtools/img/702daf7fa5fcd2ecd8ce3efffef17d96.jpg\" junk";
char str3[] = "       <img  class=\"photo\" src=\"http://api.foo.com:80/files/L/1039001571.png?xgip=0;width=48&amp;height=48&amp;crop=1%3A1\" alt=\"\" />";
// char str4[] = " <img src=\"' + content.mediaDocuments[i].thumbUrl + '\" class=\"imgClss' + i + '\">";

char *pt1, *pt2;
// pt1 = pt2 = 0;
pt1 = FindPicUrl(str3, &pt1, &pt2);
if (pt2 && *pt2) *(pt2+1) = 0; // cut string
printf("%s\n", pt1);
return 0;
}
Any comments?
May the source be with you

Jokaste

  • Guest
Re: SCASD
« Reply #6 on: September 29, 2017, 07:15:41 PM »
Excellent (in French)




This site, http://www.alumnicheerleaders.com/, was a good test.

If that is OK with you, I will include it in my program. As a DLL?


Bravo

Jokaste

  • Guest
Re: SCASD
« Reply #7 on: September 29, 2017, 07:44:47 PM »
Here is my version (copyright TimoVJL)

Code: [Select]
#include <tchar.h>
#include <ctype.h>

// https://www.w3.org/MarkUp/html3/img.html

char *FindPicUrl(char *str, char **pc1, char **pc2)
{
   char *pch = str;
   *pc1 = *pc2 = 0;   // set both empty
   do {   // find "<img"
      while (*pch && *pch != '<') pch++;   // first char to test
      if (!*pch || !*(pch+1) || !*(pch+2) || !*(pch+3)) break;
      if (*(long*)pch == *(long*)"<img") {
         pch += 4;
         while (*pch && *pch == ' ') pch++;   // remove spaces
         while (*pch && *(long*)pch != *(long*)"src=") pch++;   // remove attributes
         if (*(long*)pch == *(long*)"src=") {
            *pc1 = pch+4;
            break;
         }
      }
      if (*pch) pch++;   // don't go past the end
   } while (*pch);
   if (pch) {   // find end " or ?
      char *p2;
      p2 = pch+5;
      while (*p2 && *p2 != '\"' && *p2 != '?') p2++;
      *pc2 = p2;
   }
   return (*pc1) ;
}

char *UrlStringCopy(char *__lpszDestination,char *__lpszSource)
{
   int      _iLength ;
   char   *_lpDest ;
   char   _c ;

   _iLength = 0 ;
   _lpDest = __lpszDestination ;

   do
   {
      _c = *__lpszSource++ ;

      if(!((_c && (_iLength < 2048))))
         break ;

      *_lpDest++ = _c ;
      _iLength++ ;

   } while (_iLength < 2048) ;               // Maximum length of an URL

   *_lpDest = '\0' ;

   return (__lpszDestination) ;
}

char *FindPictureUrl(char *__lpszStringToSearch,char *__lpszResult)
{
   char   *_pt1, *_pt2 ;

   _pt1 = _pt2 = NULL ;

   _pt1 = FindPicUrl(__lpszStringToSearch,&_pt1,&_pt2);
   if(_pt2 && *_pt2)
   {
      *(_pt2 + 1) = 0;   // cut string

      return (UrlStringCopy(__lpszResult,_pt1)) ;
   }

   return ((char *) NULL) ;
}
« Last Edit: September 29, 2017, 07:46:21 PM by Jokaste »

Jokaste

  • Guest
Re: SCASD
« Reply #8 on: September 29, 2017, 10:06:02 PM »
Thanks a lot, TimoVJL, for your functions.
Here is the last version.
« Last Edit: October 01, 2017, 10:38:39 AM by Jokaste »

Offline jj2007

  • Member
  • *
  • Posts: 536
Re: SCASD
« Reply #9 on: September 30, 2017, 03:36:52 AM »
Quote
Is it possible to write the same loop using XMM registers?

Sure. Search e.g. for
Code: [Select]
pcmpeqb xmm0, [edi+16] ; compare packed bytes in [m128] and xmm0 for equality
pmovmskb eax, xmm0 ; set byte mask in eax for second 16 byte chunk

Offline TimoVJL

  • Global Moderator
  • Member
  • *****
  • Posts: 2091
Re: SCASD
« Reply #10 on: September 30, 2017, 10:20:18 AM »
For those who want to tweak it with poasm:
Code: [Select]
.code

FindPicUrl PROC
        mov     qword ptr [r8], 0
        mov     qword ptr [rdx], 0
        jmp     ?_002

?_001:  add     rcx, 1
?_002:  cmp     byte ptr [rcx], 0
        jz      ?_009
?_003:  cmp     byte ptr [rcx], 60 ; <
        jnz     ?_001
        cmp     byte ptr [rcx], 0
        jz      ?_009
        cmp     byte ptr [rcx+1H], 0
        jz      ?_009
        cmp     byte ptr [rcx+2H], 0
        jz      ?_009
        cmp     byte ptr [rcx+3H], 0
        jz      ?_009
        cmp     dword ptr [rcx], 676D693Ch ;"<img"
        jnz     ?_008
        add     rcx, 4
        jmp     ?_005

?_004:  add     rcx, 1
?_005:  cmp     byte ptr [rcx], 0
        jz      ?_007
        cmp     byte ptr [rcx], 32  ; ' '
        jz      ?_004
?_006:  cmp     byte ptr [rcx], 0
        jz      ?_007
        cmp     dword ptr [rcx], 3D637273h ;"src="
        jz      ?_014
        add     rcx, 1
        jmp     ?_006

?_007:  cmp     dword ptr [rcx], 3D637273h ;"src="
        jz      ?_014
?_008:  cmp     byte ptr [rcx], 0
        jz      ?_009
        add     rcx, 1
        cmp     byte ptr [rcx], 0
        jnz     ?_003
?_009:  test    rcx, rcx
        jz      ?_013
        lea     rax, qword ptr [rcx+5H]
        jmp     ?_011

?_010:  add     rax, 1
?_011:  cmp     byte ptr [rax], 0
        jz      ?_012
        cmp     byte ptr [rax], 34 ; "
        jz      ?_012
        cmp     byte ptr [rax], 63 ; ?
        jnz     ?_010
?_012:  mov     qword ptr [r8], rax
?_013:  mov     rax, qword ptr [rdx]
        ret

?_014:  lea     rax, qword ptr [rcx+4H]
        mov     qword ptr [rdx], rax
        jmp     ?_009

        ret
FindPicUrl ENDP

END
source code generated with objconv.exe and edited for poasm.
May the source be with you

Jokaste

  • Guest
Re: SCASD
« Reply #11 on: October 01, 2017, 10:48:18 AM »
Some bugs corrected.
Added some SSE/SSE2 instructions.
Changed alignment for some variables.
Replaced the strlen function with one written by Agner Fog.
I would like to use the entire library, but it seems to be written for another assembler (gas, nasm?),
so I can't.
« Last Edit: October 03, 2017, 05:22:22 AM by Jokaste »

Offline Vortex

  • Member
  • *
  • Posts: 797
    • http://www.vortex.masmcode.com
Re: SCASD
« Reply #12 on: October 04, 2017, 09:11:17 PM »
Hi Jokaste,

Sorry, I could not be of help. Here is my image library converted to Poasm. Maybe you could use it for your projects.
Quote
LoadImageFromFile

    LoadImageFromFile PROC pImageFileName:DWORD

        This function loads a BMP, JPG, GIF or WMF image from disc and returns the handle to the image.

        pImageFileName is a pointer to the FULL path name of the image file to be displayed

LoadImageFromMem

    LoadImageFromMem PROC pImageAddr:DWORD,ImageLen:DWORD

        This function returns the handle of an image stored in memory. Valid image formats are
        BMP, JPG, GIF and WMF

        pImageAddr is a pointer to the location of the image in memory.

        ImageLen is the size of the image.

        In case of error, both of the functions will return NULL.
« Last Edit: October 04, 2017, 09:14:51 PM by Vortex »
Code it... That's all...

Jokaste

  • Guest
Re: SCASD
« Reply #13 on: October 04, 2017, 10:11:43 PM »
Thanks, Vortex. I'll take it. :)