C language > Work in progress
Word .doc file, extract text
TimoVJL:
--- Quote from: jj2007 on March 28, 2019, 12:54:19 PM ---I have MS Word installed; would it work without?
--- End quote ---
It doesn't depend on MS Word, only on the OLE2 API.
Notepad2 shows some special chars for formatting.
bitcoin:
--- Quote from: jj2007 on March 28, 2019, 12:54:19 PM ---I have MS Word installed; would it work without?
--- End quote ---
On my home computer I don't have MS Word.
But it works well. :)
jj2007:
Good to know. So you are basically looking for some magic numbers, right? Are they documented somewhere?
--- Code: ---if (*(WORD*)szTmp == 0xA5EC || *(WORD*)szTmp == 0xA5DC)
--- End code ---
I saw something here at MIT, but it doesn't look very official. This looks better 8)
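For reference, the magic-number check quoted above can also be written without windows.h and without the pointer cast. This is only a minimal sketch, not the code from Timo's tool: the helper names (is_cfb_header, is_word_fib, read_le16) are made up for illustration, and it assumes the first buffer holds the start of the .doc file and the second holds the start of its WordDocument stream. The 8-byte signature is the compound file header signature from MS-CFB, 0xA5EC is the wIdent value given in MS-DOC, and 0xA5DC is simply carried over from the snippet above (presumably older Word versions).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Compound File Binary (OLE2) header signature, as listed in MS-CFB. */
static const unsigned char CFB_SIGNATURE[8] = {
    0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1
};

/* Read a little-endian 16-bit value byte by byte, which avoids the
   alignment and aliasing questions of a *(WORD*) cast. */
static uint16_t read_le16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Returns 1 if buf starts with the compound file signature, else 0. */
int is_cfb_header(const unsigned char *buf, size_t len)
{
    return len >= sizeof CFB_SIGNATURE &&
           memcmp(buf, CFB_SIGNATURE, sizeof CFB_SIGNATURE) == 0;
}

/* Returns 1 if the first two bytes of the WordDocument stream match one
   of the two identifiers from the quoted check, else 0. */
int is_word_fib(const unsigned char *stream, size_t len)
{
    uint16_t wIdent;

    if (len < 2)
        return 0;
    wIdent = read_le16(stream);
    return wIdent == 0xA5EC || wIdent == 0xA5DC;
}
```

Checking the file signature first and the stream identifier second keeps the two tests separate, so a compound file that is not a Word document (an .xls, for example) fails only the second one.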
TimoVJL:
https://docs.microsoft.com/en-us/openspecs/windows_protocols/MS-CFB/53989ce4-7b05-4f8d-829b-d08d6148375b
https://docs.microsoft.com/en-us/openspecs/office_file_formats/ms-doc/ccd7b486-7881-484c-a137-51170af7cc22
http://www.opennet.ru/docs/formats/wword8.html
https://www.decalage.info/file_formats_security/office
Vortex:
Hi Timo,
Nice work. Thanks for the new tool.