
Capturing or Scraping Intranet and localhost websites

How the IntraProxy enables intranet screenshots

Capturing an intranet or localhost website is more complicated than taking a screenshot of a public website, because GrabzIt's servers cannot reach it directly.

The simplest way to do this is to use GrabzIt's IntraProxy, which makes all of your internal websites available to GrabzIt's servers only. The IntraProxy then handles the routing of requests to and from your internal websites for you, as shown in the diagram.

First install the IntraProxy, using either the Windows installer or the Linux download. To install the Linux version, follow the instructions in the ReadMe.txt file. Once it is installed, check that the IntraProxy is running. Then, on your router, forward port 10000 to the IP address of the machine the GrabzIt IntraProxy is installed on.

Visit http://localhost:10000/grabzit://dashboard.html for further information on how to configure and use the IntraProxy.
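As a rough way to confirm that something is listening on the IntraProxy port before you configure the router, a minimal sketch like the one below can help. It assumes the IntraProxy listens on port 10000 of the local machine, as in the dashboard address above; the function name is_port_open is purely illustrative and not part of GrabzIt. An open port only shows that some program accepts connections there, not that it is the IntraProxy.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection raises OSError (refused, timed out, unreachable) on failure
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Assumption: the IntraProxy is installed on this machine and uses port 10000
    if is_port_open("localhost", 10000):
        print("Something is listening on port 10000")
    else:
        print("Nothing is listening on port 10000")
```

Running the same check from outside your network against the router IP address and port 10000 would, under the same assumptions, show whether the port forwarding rule is working.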

Once this is configured, it can be used by all of our tools, including our API, Screenshot Tool and Web Scraper, as all requests to the router IP address and port will now resolve to the correct internal website. For instance, if your website is located at http://localhost/mywebsite/index.html and your router IP address is 123.123.123.123, then to resolve your website externally you can pass http://123.123.123.123:10000/http://localhost/mywebsite/index.html to GrabzIt's API or tools.

Similarly, if you have the GrabzItDemo installed locally and want to call its callback handler at http://localhost/GrabzItDemo/handler.php, you could pass http://123.123.123.123:10000/http://localhost/GrabzItDemo/handler.php as a callback handler URL.

Remember to remove this URL prefix if you make your website publicly available on the internet!
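The URL prefixing described above can be sketched in a few lines of Python. The helper names to_intraproxy_url and strip_intraproxy_prefix are hypothetical and not part of any GrabzIt library; the router IP address and port are the example values used in this article.

```python
def to_intraproxy_url(internal_url: str, router_ip: str, port: int = 10000) -> str:
    """Prefix an internal URL with the router address that forwards to the IntraProxy."""
    return f"http://{router_ip}:{port}/{internal_url}"


def strip_intraproxy_prefix(url: str, router_ip: str, port: int = 10000) -> str:
    """Remove the prefix again, e.g. once the website is publicly available."""
    prefix = f"http://{router_ip}:{port}/"
    return url[len(prefix):] if url.startswith(prefix) else url


# Example values taken from the text above
router_ip = "123.123.123.123"
capture_url = to_intraproxy_url("http://localhost/mywebsite/index.html", router_ip)
callback_url = to_intraproxy_url("http://localhost/GrabzItDemo/handler.php", router_ip)

print(capture_url)   # http://123.123.123.123:10000/http://localhost/mywebsite/index.html
print(callback_url)  # http://123.123.123.123:10000/http://localhost/GrabzItDemo/handler.php
```

Keeping the prefix in one helper means it can be switched off in a single place when the website moves to a public address.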

Requirements

  • Only allows access from GrabzIt's servers
  • Requires Java 1.6+

An Alternative Method

For intranet or localhost websites that don't have absolute URLs pointing to resources, such as CSS, image and JavaScript files, that are not accessible on the Internet, the simplest option would be to set up port forwarding to your internal website. However, you should only do this for websites that you don't mind opening up to the Internet. Furthermore, it probably wouldn't be suitable if you have a large number of internal websites to capture.

You would need to log in to your router and add a port forwarding rule that forwards all requests arriving at the router's IP address and port to the computer that hosts your website. You then need to configure your web server to accept requests on the port you are forwarding.

For instance, if your router IP address is 222.222.222.222, you could add a port forwarding rule for port 12345 to the computer that hosts the website and add this port to your web server configuration as one of the ports it listens on.

Further information on how to configure your web server and router should be available on the internet. Once this is done, calling an address like http://222.222.222.222:12345/mypage.html should load your website.
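To illustrate how an internal address maps to its port-forwarded equivalent, here is a minimal sketch using Python's standard urllib.parse module. The function name to_forwarded_url is hypothetical, and the IP address and port are the example values from this section.

```python
from urllib.parse import urlsplit, urlunsplit


def to_forwarded_url(internal_url: str, router_ip: str, forwarded_port: int) -> str:
    """Swap the host of an internal URL for the router IP address and forwarded port."""
    parts = urlsplit(internal_url)
    # Keep the scheme, path, query and fragment; replace only the host and port
    return urlunsplit(
        (parts.scheme, f"{router_ip}:{forwarded_port}", parts.path, parts.query, parts.fragment)
    )


external_url = to_forwarded_url("http://localhost/mypage.html", "222.222.222.222", 12345)
print(external_url)  # http://222.222.222.222:12345/mypage.html
```

Unlike the IntraProxy approach, the internal address is replaced rather than appended, which is why this method suits pages whose resources are referenced with relative URLs.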

