This tool finds 'http://' strings in any kind of file and writes the complete links into an HTML file.
I played around with http://www.yacy.net/yacy/.
It is a P2P search engine for the web, a crawler, indexer, proxy, DNS, web server...
Because I saved all my links as shortcut files (for a better overview), it was not possible to pass them to the YaCy crawler directly. So I decided to code a little tool that crawls all my shortcuts and saves the 'http://' links in an .html file. That way it worked.
How to use (in the console):
link_crawler [-s directory] [-d html_file] [-f filter]
-s 'directory': this directory and all of its subdirectories will be crawled
-d 'html_file': path and name of the result file
-f 'filter': the file type the program will search in ('.txt' for example, without the quotes)
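For example (the paths and file type here are just placeholders, not part of the tool):

link_crawler -s C:\Favorites -d C:\links.html -f .url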
Known limitations:
- only the first 1 MB of a file will be scanned (should be enough)
- the program only searches one file type at a time; it is not possible to search multiple file types in one run.
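For anyone curious how such a scan works in principle, here is a minimal sketch in Python. This is not the actual tool; the link pattern, the 1 MB limit, and the HTML output format are my own assumptions for illustration.

import os
import re
import sys

# Only the first 1 MB of each file is scanned, matching the limitation above.
MAX_BYTES = 1024 * 1024
# Very rough pattern for 'http://' links; the real tool may match differently.
LINK_RE = re.compile(rb"http://[^\s\"'<>]+")

def crawl(source_dir, html_file, ext):
    links = set()
    for root, _dirs, files in os.walk(source_dir):
        for name in files:
            if not name.lower().endswith(ext.lower()):
                continue  # only one file type per run
            try:
                with open(os.path.join(root, name), "rb") as f:
                    data = f.read(MAX_BYTES)
            except OSError:
                continue  # skip unreadable files
            for match in LINK_RE.finditer(data):
                links.add(match.group().decode("ascii", "ignore"))
    # Write every found link as a clickable line in a simple HTML page.
    with open(html_file, "w", encoding="utf-8") as out:
        out.write("<html><body>\n")
        for link in sorted(links):
            out.write('<a href="%s">%s</a><br>\n' % (link, link))
        out.write("</body></html>\n")

if __name__ == "__main__":
    # e.g. python link_crawler.py C:\Favorites links.html .url
    crawl(sys.argv[1], sys.argv[2], sys.argv[3])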
It is still alpha, but it works for me. If you find any bugs, you can post them here.