{"id":9,"date":"2019-08-30T03:25:05","date_gmt":"2019-08-30T02:25:05","guid":{"rendered":"https:\/\/openbullet.dev\/?p=9"},"modified":"2019-08-30T04:57:45","modified_gmt":"2019-08-30T03:57:45","slug":"evading-ip-lock-of-proxy-services","status":"publish","type":"post","link":"https:\/\/openbullet.dev\/?p=9","title":{"rendered":"Evading IP lock of proxy services"},"content":{"rendered":"\n<p>There are many services that provide a large quantity of proxies that you can use when scraping websites to avoid getting timed out. But what if you want to buy a single proxy package and use it from multiple computers on different networks?<\/p>\n\n\n\n<h2>Types of security<\/h2>\n\n\n\n<p>The main types of security that proxy services use in order to avoid abuse of their system are:<\/p>\n\n\n\n<ul><li>Username and password authentication<\/li><li>IP whitelisting (1 or more)<\/li><li>Maximum number of threads<\/li><\/ul>\n\n\n\n<p>or any combination of these three.<\/p>\n\n\n\n<p>While the first is easy to bypass (just use the same username and password on all your devices), the other two can be complex to work around. Although for the latter there is probably no solution, we can find a way to evade the IP whitelisting security. I will not reveal which services are vulnerable to the workarounds I&#8217;m going to illustrate for obvious reasons.<\/p>\n\n\n\n<h2>The &#8220;easy&#8221; way<\/h2>\n\n\n\n<p>The easiest way is to use a VPN. 
The VPN will change the IP address that the proxy service sees when you send an HTTP request to one of their proxies, so you can simply bind the service to the VPN host IP and connect all your devices to that same host with a VPN tunnel.<\/p>\n\n\n\n<p>Doing this free of charge, though, can be difficult.<\/p>\n\n\n\n<ul><li>OpenVPN Access Server, which can be self-hosted, only allows up to 2 simultaneous connections to your VPN if you don&#8217;t buy a license<\/li><li>Paid VPN services hosted on 3rd party servers will give you a variety of IPs to choose from, but you need a static IP so you need to connect to the same server every time<\/li><li>Finally you could host an <a href=\"https:\/\/opensource.com\/article\/18\/6\/vpn-alternatives\">open source VPN<\/a><\/li><\/ul>\n\n\n\n<h2>The hacky way<\/h2>\n\n\n\n<p>Since I wasn&#8217;t happy with the solution above, I started investigating how to make a two-step proxy that takes all incoming traffic from all my devices and sends it to one of the proxies of the provider, making it look like it came from that IP alone.<\/p>\n\n\n\n<p>I first looked for a Node.js solution and found the <a href=\"https:\/\/github.com\/apifytech\/proxy-chain\">proxy-chain<\/a> module, which sounded very appealing at first, but when I tested it with around 1000 proxies it performed very badly.<\/p>\n\n\n\n<p>Finally I realized that I could simply use nginx and make it redirect all incoming traffic to the addresses I specify. This system works pretty well, and after testing it for some days I feel confident saying that it&#8217;s a decent zero-cost solution to the problem (provided that you can host a Linux server of course).<\/p>\n\n\n\n<p>First of all we want to install nginx. I prefer to use the <strong>mainline <\/strong>branch instead of the stable one because it offers more features. You can find instructions <a href=\"http:\/\/nginx.org\/en\/linux_packages.html#Ubuntu\">here<\/a>.
Then add the nginx user<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>useradd nginx<\/code><\/pre>\n\n\n\n<p>After that we need to edit the configuration file (check where it&#8217;s located on your installation)<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>vi \/etc\/nginx\/nginx.conf<\/code><\/pre>\n\n\n\n<p>then delete all the content using<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>:1,$d<\/code><\/pre>\n\n\n\n<p>paste this configuration<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>load_module \/usr\/lib\/nginx\/modules\/ngx_stream_module.so;\nworker_rlimit_nofile    65536;\nuser nginx;\nworker_processes 1;\nerror_log \/var\/log\/nginx\/error.log warn;\npid \/var\/run\/nginx.pid;\nevents { worker_connections     65536; }\nhttp { }\ninclude \/etc\/nginx\/tcpconf.d\/*;<\/code><\/pre>\n\n\n\n<p>and finally save and quit.<\/p>\n\n\n\n<p>This configuration will load the stream module, which is what we need to take the incoming TCP stream on a local port and redirect it to another host on its port. We also set high limits for the number of open files (worker_rlimit_nofile) and for the connections per worker (worker_connections), since we need to make sure nginx can keep enough file descriptors open at the same time. It&#8217;s also important to note that <strong>the worker limit MUST be a power of 2<\/strong> or it will be disregarded! Yeah, that cost me almost an hour of debugging\u2026<\/p>\n\n\n\n<p>We should check if the file limits for the nginx user are high enough or if we should increase them.
To check them, you can type<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>su - nginx\nulimit -Hn\nulimit -Sn<\/code><\/pre>\n\n\n\n<p>If the limits are too low, you can edit this file<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>vi \/etc\/sysctl.conf<\/code><\/pre>\n\n\n\n<p>and append this line<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>fs.file-max = 70000<\/code><\/pre>\n\n\n\n<p>After saving and closing, edit this other file<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>vi \/etc\/security\/limits.conf<\/code><\/pre>\n\n\n\n<p>and add these two lines<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>nginx       soft    nofile   10000\nnginx       hard    nofile   30000<\/code><\/pre>\n\n\n\n<p>Make sure to use tabs instead of spaces, as the latter can cause issues for some people. Finally, edit the file<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>vi \/etc\/pam.d\/common-session<\/code><\/pre>\n\n\n\n<p>and append this line<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>session required pam_limits.so<\/code><\/pre>\n\n\n\n<p>Now that we have edited all the required files, we can reload the sysctl settings and nginx (although killing nginx would be a better way to ensure it reloads entirely)<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>sysctl -p\nnginx -s reload<\/code><\/pre>\n\n\n\n<p>We&#8217;re almost good to go! We just need a few more steps.
First of all, create a folder called tcpconf.d inside your nginx root folder (where the nginx.conf file is)<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>mkdir tcpconf.d\ncd tcpconf.d<\/code><\/pre>\n\n\n\n<p>Inside this folder we need to create a file with this syntax (you can give it any name you want)<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>stream {\nserver { listen 30000; proxy_pass 1.2.3.4:8080; }\nserver { listen 30001; proxy_pass 5.6.7.8:8080; }\n}<\/code><\/pre>\n\n\n\n<p>You should be able to generate this file pretty easily by using this Python script that I wrote for the occasion<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code># Read the proxy list, one host:port per line, skipping blank lines\nwith open('proxies.txt', 'r') as p:\n\tproxies = [line.strip() for line in p if line.strip()]\n\n# Write one stream server block per proxy, starting from port 30000\nwith open('output.txt', 'w') as output:\n\toutput.write('stream {\\n')\n\tfor port, proxy in enumerate(proxies, start=30000):\n\t\toutput.write('server { listen %d; proxy_pass %s; }\\n' % (port, proxy))\n\toutput.write('}\\n')<\/code><\/pre>\n\n\n\n<p>This script will take your proxy list in the format of one proxy per line (e.g. 12.34.56.78:8080) and generate a file called output.txt that is ready to be placed inside the tcpconf.d folder. By default this script will use the ports from 30000 onwards, so if you want to use other ports you can edit that value but <strong>do not choose a low value<\/strong> since many of the low ports are reserved for other processes (like SSH on port 22) and will give you errors when you start nginx.<\/p>\n\n\n\n<p>Finally we need to generate a proxy list that we can use in our favourite scraping software (like OpenBullet for example).
Run this Python script<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>ip = 'YOUR_SERVER_IP'\n\n# One ip:port entry per middle proxy (ports 30000 to 54999)\nwith open('output.txt', 'w') as output:\n\tfor port in range(30000, 55000):\n\t\toutput.write('%s:%d\\n' % (ip, port))<\/code><\/pre>\n\n\n\n<p>You have to <strong>replace <\/strong>YOUR_SERVER_IP with your own IP and <strong>define <\/strong>your port range based on how many proxies you used (I had 25000 so I needed to output 25000 ports starting from port 30000, which is the beginning of the default port range of the previous script).<\/p>\n\n\n\n<p>Finally, it&#8217;s very important that you <strong>import these proxies as SOCKS5<\/strong> or they will not work, since we&#8217;re redirecting TCP streams and not HTTP requests (they are on two different layers of the protocol stack).<\/p>\n\n\n\n<p>Congratulations, we are officially done and we can finally start nginx<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>service nginx start<\/code><\/pre>\n\n\n\n<p>If you have a large number of proxies this might take some time, but if you wait patiently you should have your middle proxies visible on all the incoming ports, ready to accept some traffic!<\/p>\n\n\n\n<h2>Conclusion<\/h2>\n\n\n\n<p>This was a great learning experience as I tried to use nginx for something I didn&#8217;t know it could do, by using the stream module which is optional and needs to be activated with a directive in the configuration file. I also learned how Linux handles the maximum open files allowed for each process and messed around with various configuration files in order to get it to work. Next time I will try one of the open source VPN solutions and report my findings on this blog.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>There are many services that provide a large quantity of proxies that you can use when scraping websites to avoid getting timed out. But what if you want to buy a single proxy package and use it from multiple computers on different networks?
Types of security The main types of security that proxy services use&#8230; <\/p>\n<div class=\"read-more navbutton\"><a href=\"https:\/\/openbullet.dev\/?p=9\">Read More<i class=\"fa fa-angle-double-right\"><\/i><\/a><\/div>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_markdown_editor_remember":false},"categories":[1],"tags":[],"_links":{"self":[{"href":"https:\/\/openbullet.dev\/index.php?rest_route=\/wp\/v2\/posts\/9"}],"collection":[{"href":"https:\/\/openbullet.dev\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/openbullet.dev\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/openbullet.dev\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/openbullet.dev\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=9"}],"version-history":[{"count":11,"href":"https:\/\/openbullet.dev\/index.php?rest_route=\/wp\/v2\/posts\/9\/revisions"}],"predecessor-version":[{"id":21,"href":"https:\/\/openbullet.dev\/index.php?rest_route=\/wp\/v2\/posts\/9\/revisions\/21"}],"wp:attachment":[{"href":"https:\/\/openbullet.dev\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=9"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/openbullet.dev\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=9"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/openbullet.dev\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=9"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}