
Recent Posts

1
Tips & tricks / KAST macro by MrBcx
« Last post by Vortex on Today at 09:28:17 PM »
A nice macro by MrBcx used in the BCX language.

Traditional type casting:

Code:
#include <windows.h>
#include <stdio.h>

int main(void)
{
    HMODULE hMod;
    PIMAGE_NT_HEADERS pNThdr;
    PIMAGE_EXPORT_DIRECTORY pDesc;
    DWORD NumbOfNames, i;
    DWORD *AddrOfNames;

    hMod = LoadLibrary("kernel32.dll");
    if (hMod == NULL)
        return 1;

    /* The module handle is the image base; e_lfanew is the offset of the NT headers. */
    pNThdr = (PIMAGE_NT_HEADERS)((LPBYTE)hMod + ((PIMAGE_DOS_HEADER)hMod)->e_lfanew);

    /* Locate the export directory through its RVA. */
    pDesc = (PIMAGE_EXPORT_DIRECTORY)((LPBYTE)hMod + pNThdr->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_EXPORT].VirtualAddress);

    NumbOfNames = pDesc->NumberOfNames;

    /* AddressOfNames is an array of RVAs, one per exported name. */
    AddrOfNames = (DWORD *)((LPBYTE)hMod + pDesc->AddressOfNames);

    for (i = 0; i < NumbOfNames; i++)
    {
        printf("%s\n", (*AddrOfNames + (LPBYTE)hMod));

        AddrOfNames = AddrOfNames + 1;
    }

    FreeLibrary(hMod);
    return 0;
}

The KAST macro for type casting:

Code:
#include <windows.h>
#include <stdio.h>

/* KAST macro by MrBcx */

#define KAST(to_type,old_obj) ((to_type)(old_obj))

int main(void)
{
    HMODULE hMod;
    PIMAGE_NT_HEADERS pNThdr;
    PIMAGE_EXPORT_DIRECTORY pDesc;
    DWORD NumbOfNames, i;
    DWORD *AddrOfNames;

    hMod = LoadLibrary("kernel32.dll");
    if (hMod == NULL)
        return 1;

    /* The module handle is the image base; e_lfanew is the offset of the NT headers. */
    pNThdr = KAST(PIMAGE_NT_HEADERS, KAST(LPBYTE, hMod) + KAST(PIMAGE_DOS_HEADER, hMod)->e_lfanew);

    /* Locate the export directory through its RVA. */
    pDesc = KAST(PIMAGE_EXPORT_DIRECTORY, KAST(LPBYTE, hMod) + pNThdr->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_EXPORT].VirtualAddress);

    NumbOfNames = pDesc->NumberOfNames;

    /* AddressOfNames is an array of RVAs, one per exported name. */
    AddrOfNames = KAST(DWORD *, KAST(LPBYTE, hMod) + pDesc->AddressOfNames);

    for (i = 0; i < NumbOfNames; i++)
    {
        printf("%s\n", *AddrOfNames + KAST(LPBYTE, hMod));

        AddrOfNames = AddrOfNames + 1;
    }

    FreeLibrary(hMod);
    return 0;
}
2
Expert questions / Re: TikToken
« Last post by HellOfMice on Today at 01:21:19 PM »
I have used the web interface and created a database with more than 174,000 English words and their tokens.
Thank you, everybody, for your help.

3
Expert questions / Re: TikToken
« Last post by John Z on Today at 10:45:25 AM »
Quote from: HellOfMice on May 16, 2024, 12:48:26 PM
For OpenAI a token is a group of three or four characters. The solution I have made is to divide the length of each word by three and add 1 if the word length is greater than three.

OpenAI tokenizer https://platform.openai.com/tokenizer

One way to get better performance (at least for English) is to look for common prefixes and common suffixes and break the word there first. As I understand it, this produces more efficient tokens.

To that end, the following link might be useful for the most common of each:

https://www.scholastic.com/content/dam/teachers/lesson-plans/migrated-files-in-body/prefixes_suffixes.pdf
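
The prefix/suffix idea could be sketched roughly like this. This is only an illustration, not how the OpenAI tokenizer works: the function name split_affixes and the two short lists are hypothetical, and a real list (such as the one in the PDF above) would be much longer.

```c
#include <stddef.h>
#include <string.h>

/* Illustrative samples only; a real list would be much longer. */
static const char *const PREFIXES[] = { "un", "re", "pre", "dis" };
static const char *const SUFFIXES[] = { "ness", "ing", "ed", "ly" };

/* Hypothetical helper: report how many leading characters form a known
   prefix and how many trailing characters form a known suffix.
   The stem between them is never allowed to become empty. */
static void split_affixes(const char *word, size_t *prefix_len, size_t *suffix_len)
{
    size_t len = strlen(word);
    size_t i;

    *prefix_len = 0;
    *suffix_len = 0;

    /* First matching prefix wins. */
    for (i = 0; i < sizeof PREFIXES / sizeof PREFIXES[0]; i++)
    {
        size_t n = strlen(PREFIXES[i]);
        if (n < len && strncmp(word, PREFIXES[i], n) == 0)
        {
            *prefix_len = n;
            break;
        }
    }

    /* First matching suffix wins, as long as some stem remains. */
    for (i = 0; i < sizeof SUFFIXES / sizeof SUFFIXES[0]; i++)
    {
        size_t n = strlen(SUFFIXES[i]);
        if (*prefix_len + n < len && strcmp(word + len - n, SUFFIXES[i]) == 0)
        {
            *suffix_len = n;
            break;
        }
    }
}
```

For "unhappiness" this reports a prefix length of 2 ("un") and a suffix length of 4 ("ness"), leaving "happi" as the stem. Taking the first match keeps the sketch short; preferring the longest match would probably be a better choice with a full list.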

John Z
4
Work in progress / Re: zlib 1.2.8 source converted for __stdcall
« Last post by John Z on Yesterday at 09:53:49 PM »
zlib 1.2.8 source converted for __stdcall.

Thanks Timo!

John Z
5
Work in progress / Re: zlib 1.2.8 source converted for __stdcall
« Last post by WiiLF23 on May 16, 2024, 10:22:27 PM »
Keep in mind that zlib1 (note the 1 at the end of the filename) is a mainstream effort from the authors to maintain a corrected build of the library and its codebase in the wild. The documentation explains this very thoroughly, as well as the mistake that was overlooked before the mass release of zlib (without the 1) at the original release date. The naming convention is important. What you name the file doesn't matter much, as long as it is a corrected build (as opposed to regular zlib.xx).

People who are aware of the lib's history need to know this.

So I recommend you indicate this, to make clear that it is the corrected code.
6
Windows questions / Re: WIN32 ComboBox control
« Last post by WiiLF23 on May 16, 2024, 10:09:53 PM »
The combobox control has a buggy history. Developers often try to alter the control's appearance, only to find that the Win32 shell team implemented the glyph as a group of pixels that is tied to the rest of the control, so you either have to draw it all yourself or leave the default style alone.

There are also undocumented API messages, which stump many. It is not my favourite control to work with. Often I just use an Edit control, draw a non-client area region, and paint my own dropdown with highlight events (WM_PAINT, WM_MOUSEMOVE), with a hit test on each item's rect, and I go from there for the label items.
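
The hit test part of that approach might look roughly like this. It is a portable sketch, not the actual code: ItemRect is a hypothetical stand-in for the Win32 RECT, and point_in_rect follows the same edge convention as PtInRect (left and top inside, right and bottom outside). In a real window procedure the coordinates would come from WM_MOUSEMOVE, and the highlighted item would be repainted in WM_PAINT.

```c
#include <stddef.h>

/* Hypothetical stand-in for a Win32 RECT, so the sketch compiles anywhere. */
typedef struct ItemRect { int left, top, right, bottom; } ItemRect;

/* Same convention as Win32 PtInRect: left/top edges are inside,
   right/bottom edges are outside. */
static int point_in_rect(const ItemRect *r, int x, int y)
{
    return x >= r->left && x < r->right && y >= r->top && y < r->bottom;
}

/* Return the index of the item under (x, y), or -1 when none is hit. */
static int hit_test_items(const ItemRect *items, size_t count, int x, int y)
{
    size_t i;
    for (i = 0; i < count; i++)
        if (point_in_rect(&items[i], x, y))
            return (int)i;
    return -1;
}
```

Keeping the previous index around and invalidating only when hit_test_items returns a different value avoids repainting the dropdown on every mouse move.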

I probably won’t ever use a traditional combobox anytime soon lol
7
Expert questions / Re: TikToken
« Last post by WiiLF23 on May 16, 2024, 09:52:22 PM »
I would love to convert this, just to stick it to Python.

I’m not a fan of it. However, given the use of vectors and a range of “modules”, I would just grab the bindings and cave in.

A pure rewrite would use AVX/AVX2 or the SSE instructions (with CPU vendor detection, of course), so that alone is worth considering if you want a from-scratch implementation in C. Pelles has vector support; you will find it in the project settings.
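
A rough sketch of the runtime check such a rewrite would need. Two things here are assumptions and not from the post: it tests for the AVX2 feature directly instead of checking the CPU vendor, and it uses the GCC/Clang builtin __builtin_cpu_supports as a stand-in, since the equivalent in Pelles C is not shown here. On any other compiler or target it simply reports the scalar path. The names cpu_has_avx2 and pick_code_path are hypothetical.

```c
/* Returns 1 when the running CPU reports AVX2, 0 otherwise.
   Assumption: GCC or Clang on x86, where __builtin_cpu_supports exists.
   Other compilers and targets fall through to the scalar answer. */
static int cpu_has_avx2(void)
{
#if (defined(__GNUC__) || defined(__clang__)) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    return __builtin_cpu_supports("avx2") ? 1 : 0;
#else
    return 0;
#endif
}

/* Hypothetical dispatcher: name the code path a tokenizer would take. */
static const char *pick_code_path(void)
{
    return cpu_has_avx2() ? "avx2" : "scalar";
}
```

Doing the check once at startup and storing a function pointer to the chosen implementation keeps the detection cost out of the hot loop.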

Basically, you would need the API documentation and the rest is up to the C programming to align with the OpenAI API documentation.

It looks like some work outside of the Python C bindings.
8
Expert questions / Re: TikToken
« Last post by HellOfMice on May 16, 2024, 12:48:26 PM »
For OpenAI a token is a group of three or four characters. The solution I have made is to divide the length of each word by three and add 1 if the word length is greater than three.


OpenAI tokenizer https://platform.openai.com/tokenizer
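
As a sketch, the length rule above could be written like this. The function names are hypothetical, and one detail is an assumption that the post does not state: a word of one or two characters would come out as 0 under the plain rule, so it is counted as 1 token here. This is only a rough estimate and will not match the real tokenizer output.

```c
#include <stddef.h>
#include <ctype.h>

/* Hypothetical helper: estimate tokens for one word from its length.
   Rule from the post: length / 3, plus 1 when the length exceeds 3.
   Assumption (not in the post): a non-empty word costs at least 1 token. */
static size_t estimate_word_tokens(size_t len)
{
    size_t tokens;

    if (len == 0)
        return 0;
    tokens = len / 3;
    if (len > 3)
        tokens += 1;
    return tokens > 0 ? tokens : 1;
}

/* Hypothetical helper: sum the estimate over whitespace-separated words. */
static size_t estimate_text_tokens(const char *text)
{
    size_t total = 0;

    while (*text != '\0')
    {
        size_t len = 0;

        /* Skip the whitespace before the next word. */
        while (*text != '\0' && isspace((unsigned char)*text))
            text++;
        /* Measure the word itself. */
        while (text[len] != '\0' && !isspace((unsigned char)text[len]))
            len++;
        total += estimate_word_tokens(len);
        text += len;
    }
    return total;
}
```

With this rule "token" (5 characters) counts as 2 and "tokenizer" (9 characters) counts as 4.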

9
Expert questions / Re: TikToken
« Last post by HellOfMice on May 16, 2024, 12:36:27 PM »
Yes. It computes the tokens.
10
Expert questions / Re: TikToken
« Last post by Vortex on May 14, 2024, 10:11:23 PM »
Hello,

Are you referring to this project?

https://github.com/openai/tiktoken