Is there any algorithm to search binary strings?

Started by bitcoin, June 11, 2021, 10:57:37 PM

bitcoin

Hello,

Is there any public algorithm to find a binary pattern (string)? For example, I want to find the string "\x55\x8B\xEC\x90\x90" in a PE file; strstr doesn't work.


Vortex

Hi bitcoin,

Grincheux is right, the Boyer Moore algorithms can do the job. Here are some procedures provided by the Masm32 package:

Quote
Boyer Moore Algorithms

BMBinSearch proc startpos:DWORD,
                 lpSource:DWORD,srcLngth:DWORD,
                 lpSubStr:DWORD,subLngth:DWORD

BMHBinsearch proc startpos:DWORD,
                  lpSource:DWORD,srcLngth:DWORD,
                  lpSubStr:DWORD,subLngth:DWORD

SBMBinSearch proc startpos:DWORD,
                  lpSource:DWORD,srcLngth:DWORD,
                  lpSubStr:DWORD,subLngth:DWORD


Description

The three algorithms are variations of a Boyer Moore binary search design that is well suited for very high speed searching of binary or text data for a subpattern.


Parameters

1. startpos Zero-based offset in the source data at which to start searching.
2. lpSource The address of the data to be searched.
3. srcLngth The length of the search data.
4. lpSubStr The address of the pattern to search for.
5. subLngth The length of the pattern to search for.

Code it... That's all...
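
For readers without the Masm32 library, here is a minimal C sketch of the Horspool variant of Boyer Moore. It is not the Masm32 code: the name bmh_search and the -1 "not found" return value are assumptions made for illustration, and only the parameter order mirrors the list above. Because it uses size_t and ptrdiff_t, it should build for 32-bit and 64-bit targets alike.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper, not the Masm32 routine: Boyer-Moore-Horspool search
   over raw bytes. Returns the zero-based offset of the first match at or
   after `start`, or -1 if there is none (the -1 is an assumption). */
ptrdiff_t bmh_search(size_t start,
                     const unsigned char *src, size_t src_len,
                     const unsigned char *pat, size_t pat_len)
{
    size_t shift[UCHAR_MAX + 1];
    size_t i, pos;

    if (pat_len == 0 || pat_len > src_len || start > src_len - pat_len)
        return -1;

    /* Default shift: a byte that is not in the pattern skips its full length. */
    for (i = 0; i <= UCHAR_MAX; ++i)
        shift[i] = pat_len;
    /* Bytes in the pattern (its last byte excluded) allow a shorter shift. */
    for (i = 0; i + 1 < pat_len; ++i)
        shift[pat[i]] = pat_len - 1 - i;

    pos = start;
    while (pos <= src_len - pat_len) {
        /* Compare the window against the pattern from right to left. */
        i = pat_len;
        while (i > 0 && src[pos + i - 1] == pat[i - 1])
            --i;
        if (i == 0)
            return (ptrdiff_t)pos;
        /* Shift by the table entry for the byte under the pattern's last position. */
        pos += shift[src[pos + pat_len - 1]];
    }
    return -1;
}
```

Lengths are passed explicitly, so zero bytes in the data or in the pattern are treated like any other byte.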

Vortex

Hi bitcoin,

Here is a quick example for you:

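#include <stdio.h>

/* Boyer Moore search routine from the Masm32 library (stdcall). */
extern int __stdcall BMBinSearch(int, char *, int, char *, int);

int main(void)
{
    char s[] = {0x1, 0x2, 0x4, 0x10, 0x3, 0x0, 0x5, 0x6, 0x12}; /* data to search */
    char t[] = {0x3, 0x0, 0x5};                                 /* pattern with a zero byte */
    int u;

    /* Start searching at offset 2; both lengths are passed explicitly. */
    u = BMBinSearch(2, s, (int)sizeof s, t, (int)sizeof t);
    printf("Zero based offset of the pattern 0x3,0x0,0x5 = %d\n", u);
    return 0;
}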

Pelle

If you just want a "KISS" C solution, something like this should work too...

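#include <stddef.h>
#include <string.h>

/* Find the NUL-terminated pattern strptr in the memlen bytes at memptr. */
void *memstr(const void *memptr, size_t memlen, const char *strptr)
{
    if (*strptr == '\0')
        return (void *)memptr;

    size_t patlen = strlen(strptr), len;
    if (patlen > memlen)    /* pattern cannot fit; also keeps 'end' in range */
        return NULL;

    const unsigned char *pat = (const unsigned char *)strptr;
    const unsigned char *ptr = memptr;
    const unsigned char *end = ptr + memlen - patlen + 1;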

    for (;;)
    {
        for ( ; ptr < end && *ptr != *pat; ++ptr)
            ;

        if (ptr == end)
            break;

        for (len = 1; len < patlen && ptr[len] == pat[len]; ++len)
            ;
        if (len == patlen)
            return (void *)ptr;

        ++ptr;
    }

    return NULL;
}
/Pelle
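
The memstr above measures the pattern with strlen, so a pattern containing a zero byte would be cut short. A possible variant that takes the pattern length explicitly is sketched below; the name mem_find and its exact behavior are assumptions made for illustration, not part of Pelle's code or of any library.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: find pat (pat_len bytes) inside mem (mem_len bytes).
   Both lengths are explicit, so zero bytes are ordinary data.
   Returns a pointer to the first match, or NULL if there is none. */
void *mem_find(const void *mem, size_t mem_len,
               const void *pat, size_t pat_len)
{
    const unsigned char *p = mem;
    const unsigned char *first = pat;
    size_t remaining;

    if (pat_len == 0)
        return (void *)mem;
    if (pat_len > mem_len)
        return NULL;

    remaining = mem_len - pat_len + 1;  /* number of candidate start positions */
    while (remaining > 0) {
        /* Jump to the next occurrence of the pattern's first byte. */
        const unsigned char *hit = memchr(p, first[0], remaining);
        if (hit == NULL)
            return NULL;
        if (memcmp(hit, pat, pat_len) == 0)
            return (void *)hit;
        /* Continue one byte past the failed candidate. */
        remaining -= (size_t)(hit - p) + 1;
        p = hit + 1;
    }
    return NULL;
}
```

For the pattern from the first post, a call could look like mem_find(buf, buflen, "\x55\x8B\xEC\x90\x90", 5), where buf and buflen stand for the loaded PE file and its size.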

bitcoin


Vortex

With thanks to HellOfMice who reported the assembly syntax issue, I rebuilt the example with Pelles C V12.

HellOfMice

The same BMBinSearch in 64 bits, please.
---------------------------------------------------------
I don't search big strings, only dialog box resources.
The biggest one I made for my tests is 3,449 bytes.

Quote
DLG_ALL_CLASSES DIALOGEX DISCARDABLE 6, 18, 550, 358, 103
STYLE WS_POPUP|DS_MODALFRAME|DS_SYSMODAL|DS_ABSALIGN|DS_CONTEXTHELP|DS_SETFOREGROUND|DS_3DLOOK|DS_NOFAILCREATE|DS_NOIDLEMSG|DS_CONTROL|DS_CENTER|DS_CENTERMOUSE|DS_LOCALEDIT|WS_THICKFRAME|WS_CAPTION|

WS_SYSMENU|WS_MINIMIZEBOX|WS_MAXIMIZEBOX|WS_CLIPSIBLINGS|WS_CLIPCHILDREN|WS_HSCROLL|WS_VSCROLL|WS_VISIBLE
EXSTYLE WS_EX_APPWINDOW|WS_EX_TOOLWINDOW|WS_EX_CLIENTEDGE|WS_EX_STATICEDGE|WS_EX_TRANSPARENT|WS_EX_ACCEPTFILES|WS_EX_CONTROLPARENT|WS_EX_CONTEXTHELP|WS_EX_NOPARENTNOTIFY|WS_EX_RIGHT|WS_EX_LEFTSCROLLBAR|WS_EX_TOPMOST|WS_EX_NOACTIVATE
CAPTION "Dialogue de Test"
MENU IDM_MAIN_MENU
CLASS "MYDLGCLASS"
FONT 8, "Tahoma", 0, 1, 1
{
  CONTROL "OK", IDOK, "Button", WS_TABSTOP, 160, 5, 45, 15
  CONTROL "Cancel", IDCANCEL, "Button", WS_TABSTOP, 160, 23, 45, 15
  CONTROL "This is a label:", 4001, "Static", WS_GROUP, 344, 196, 108, 8
  CONTROL "Edit", 4002, "Edit", ES_AUTOHSCROLL|WS_BORDER|WS_TABSTOP, 344, 160, 108, 12
  CONTROL "Group-box", 4003, "Button", BS_GROUPBOX, 344, 216, 108, 40
  CONTROL "Button", 4004, "Button", WS_TABSTOP, 344, 144, 108, 12
  CONTROL "", 4005, "ComboBox", WS_BORDER|CBS_DROPDOWNLIST|CBS_SORT|WS_VSCROLL|WS_TABSTOP, 344, 176, 108, 40
  CONTROL "", 4006, "ListBox", LBS_SORT|LBS_NOTIFY|WS_VSCROLL|WS_BORDER|WS_TABSTOP, 344, 4, 108, 60
  CONTROL "", 4007, "ScrollBar", SBS_HORZ, 344, 128, 108, 8
  CONTROL "", 4008, "ScrollBar", SBS_VERT, 212, 16, 10, 48
  CONTROL "", 4009, "msctls_updown32", UDS_SETBUDDYINT|UDS_ARROWKEYS, 264, 32, 20, 40
  CONTROL "", 4010, "msctls_progress32", 0x00000000, 344, 72, 108, 8
  CONTROL "", 4011, "msctls_trackbar32", 0x00000000, 344, 88, 108, 12
  CONTROL "", 4012, "msctls_hotkey32", 0x00000000, 344, 112, 108, 12
  CONTROL "", 4013, "SysListView32", LVS_REPORT|LVS_SINGLESEL|LVS_AUTOARRANGE|WS_BORDER|WS_TABSTOP, 8, 8, 80, 80
  CONTROL "", 4014, "SysTreeView32", TVS_HASBUTTONS|TVS_HASLINES|TVS_LINESATROOT|WS_BORDER|WS_TABSTOP, 8, 96, 80, 80
  CONTROL "", 4015, "SysTabControl32", 0x00000000, 100, 48, 100, 100
  CONTROL "", 4016, "SysAnimate32", ACS_CENTER|ACS_AUTOPLAY, 284, 176, 48, 48
  CONTROL "Rich-edit", 4017, "RICHEDIT", WS_BORDER|WS_TABSTOP, 212, 76, 48, 40
  CONTROL "Rich-edit", 4018, "RichEdit20W", WS_BORDER|WS_TABSTOP, 212, 128, 48, 40
  CONTROL "", 4019, "ComboBoxEx32", CBS_DROPDOWN|WS_TABSTOP, 212, 172, 48, 80
  CONTROL "", 4020, "SysMonthCal32", 0x00000000, 8, 196, 130, 100
  CONTROL "", 4021, "SysDateTimePick32", WS_TABSTOP, 212, 192, 128, 14
  CONTROL "", 4022, "ComboBoxEx32", CBS_DROPDOWN|WS_TABSTOP, 212, 212, 96, 80
  CONTROL "", 4023, "ReBarWindow32", CCS_NORESIZE, 336, 260, 100, 40
  CONTROL "", 4024, "SysPager", 0x00000000, 212, 228, 100, 40
  CONTROL "", 4025, "ToolbarWindow32", CCS_NORESIZE|CCS_NODIVIDER, 8, 308, 150, 16
  CONTROL "", 4026, "msctls_statusbar32", 0x00000003, 0, 334, 539, 14
  CONTROL "<a>Click me!</a>", 4027, "SysLink", WS_TABSTOP, 456, 4, 80, 16
  CONTROL "", 4028, "SysIPAddress32", WS_TABSTOP, 336, 304, 80, 12
  CONTROL "Net-address", 4029, "msctls_netaddress", ES_AUTOHSCROLL|WS_BORDER|WS_TABSTOP, 156, 304, 80, 12
  CONTROL "Button", 4030, "Button", BS_SPLITBUTTON|WS_TABSTOP, 452, 272, 80, 16
  CONTROL "Command link", 4031, "Button", BS_COMMANDLINK|WS_TABSTOP, 148, 272, 160, 25
  CONTROL "", 4032, "NativeFontCtl", NOT WS_VISIBLE, 488, 176, 48, 28
  CONTROL "", 4033, "NativeFontCtl", NOT WS_VISIBLE, 472, 144, 8, 8
}

I am working on Vortex's one, but I am not as good as he is. A bug is possible if WS_SYSMENU is at the end of the line.
I don't check for the '|' before it.
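
In the quoted dialog, the STYLE line ends with '|' and the flag list continues with WS_SYSMENU further down. One way a parser might detect that case is sketched here; the helper name ends_with_pipe is hypothetical and is not taken from the program described in this post.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the last non-blank character of line
   is '|', meaning the flag list continues on a following line. */
int ends_with_pipe(const char *line)
{
    size_t n = strlen(line);

    /* Ignore trailing spaces, tabs, and line endings. */
    while (n > 0 && isspace((unsigned char)line[n - 1]))
        --n;
    return n > 0 && line[n - 1] == '|';
}
```

When it returns 1, the caller can keep reading lines and append them to the current flag list before splitting on '|'.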

"NativeFontCtl" will be ignored.

For the moment, I have just finished the header.
The program will generate sources in C and in assembler.
After resolving the possible bug, I will do the controls.

I already have 2452 lines in my source file.
It is not finished.

I ignore the BEGIN and END delimiters.

Have a good evening

Philippe / HellofMice