My problem is that the spider that actually crawls pages needs to connect through SOCKS port 9050. It has to tunnel its connection through Tor so that it can resolve .onion domains, which is what I'm indexing. I call this script from the command line using `php crawl.php`, and I add the appropriate parameters to crawl the page.

Can I force my ENTIRE MACHINE to tunnel things through Tor, and how? Or, if I set up global proxy settings, would PHP respect them? If any of these solutions work, how would I do it? (Step-by-step instructions please, I am a noob.)

I just want to create my own Tor search engine. (Don't recommend p2p search engines; that's not what I want for this. I know they exist, I did my homework.)

Here is the crawler source if you are interested in taking a look:

Perhaps someone with a kind heart can modify it to use 127.0.0.1:9050 for all crawling requests?

Unless I'm missing something, the answer is yes, and here is some documentation on the Tor site. Though I've not set Tor up as a proxy myself, it's something I've considered, and this is the place I would start. It is dead simple to set up Tor on Linux and use it as a proxy, as the documentation suggests.

`netstat -ant | grep 9050  # verify Tor is running`

Now, after looking through the OP's code, we see calls to `file_get_contents`.
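Since the crawler fetches pages with `file_get_contents`, one possible way to send those requests through 127.0.0.1:9050 is to replace those calls with PHP's cURL extension, which can talk to a SOCKS5 proxy directly. The `proxy` option of PHP's HTTP stream context expects an HTTP proxy rather than a SOCKS one, so it is unlikely to work against Tor's SOCKS port as-is. The sketch below is not taken from the OP's crawler: the helper names `tor_curl_options` and `fetch_via_tor` and the timeout values are assumptions for illustration, and it assumes the cURL extension is installed and Tor is listening on the port mentioned in the question.

```php
<?php
// Minimal sketch: fetch a page through Tor's local SOCKS5 proxy with cURL.
// Assumes Tor listens on 127.0.0.1:9050 (the address given in the question).

/**
 * Build the cURL options that route one request through the SOCKS proxy.
 * (tor_curl_options is a hypothetical helper name, not part of the crawler.)
 */
function tor_curl_options(string $url, string $proxy = '127.0.0.1:9050'): array
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_PROXY          => $proxy,
        // SOCKS5_HOSTNAME hands the host name to the proxy for resolution,
        // which .onion addresses require (local DNS cannot resolve them).
        CURLOPT_PROXYTYPE      => CURLPROXY_SOCKS5_HOSTNAME,
        CURLOPT_RETURNTRANSFER => true,   // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,   // follow HTTP redirects
        CURLOPT_CONNECTTIMEOUT => 30,     // assumed values; Tor circuits can be slow
        CURLOPT_TIMEOUT        => 120,
    ];
}

/**
 * Stand-in for file_get_contents($url): returns the body as a string,
 * or false on failure (the same failure value file_get_contents uses).
 */
function fetch_via_tor(string $url)
{
    $ch = curl_init();
    curl_setopt_array($ch, tor_curl_options($url));
    $body = curl_exec($ch);
    if ($body === false) {
        error_log('Fetch through Tor failed: ' . curl_error($ch));
    }
    return $body;
}
```

With something like this in place, each `file_get_contents($url)` call in the crawler could be swapped for `fetch_via_tor($url)`. Because both return `false` on failure, existing error checks in the crawler should keep working, though that depends on how the crawler uses the result.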