A common task is to download images from a website, and GrabzIt's Web Scraper makes this easy. First, create a new scrape with the usual details, such as the starting page of the scrape and any other options.
To download all images on a website you can also use this template.
Then go to the Scrape Instructions tab and click the button that inserts the Page keyword into the scrape instructions; this also opens a drop-down. Select getTagAttributes from the list. Next, add 'src' as the first parameter, which tells the Web Scraper to extract the src attribute, then type a comma. Next, open the filter window, which lets you tell the Web Scraper which elements to extract the src attribute from. In the filter window, ensure the type is set to 'Web Page' and the restriction is 'tag name' and 'equal to'. Then enter img in the text box, click the Add button, and then click the Insert Filter button. Finish the instruction by adding a semi-colon to the end of the line.
You should be left with something like the following.
Page.getTagAttributes('src', {"tag":{"equals":"img"}});
The Page.getTagAttributes instruction extracts all image URLs from the web page, but we still need to use those URLs to save the images as files. To do this, wrap the command, minus the semi-colon, in a Data.saveFile command. Go to the beginning of the line and select the button that inserts the Data keyword, then select saveFile in the drop-down. Finally, go to the end of the line and add a ) before the semi-colon.
You should now have the following scrape instructions.
Data.saveFile(Page.getTagAttributes('src', {"tag":{"equals":"img"}}));
Now, if you run the scrape, it will download all images from the website. Much of this tutorial could also have been achieved by using the wizard button in the Scrape Instructions toolbar.
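For readers who want a feel for what the nested instruction does, the idea can be sketched in plain JavaScript outside the Web Scraper. This is only an illustration under stated assumptions: extractTagAttributes and saveFile below are hypothetical stand-ins written for this example, not GrabzIt's Page or Data objects, and the regular expressions are a simplification that only handles quoted attribute values. The sketch collects the src value of every img tag in a sample HTML string, then passes that list to a save step, mirroring the inner and outer calls of the scrape instruction.

```javascript
// Illustrative stand-ins only: these are NOT GrabzIt's Page or Data objects.

// Collects the value of one attribute from every tag with the given name.
function extractTagAttributes(html, attribute, tagName) {
  const values = [];
  // Match each opening tag, e.g. <img ...>, case-insensitively.
  const tagPattern = new RegExp('<' + tagName + '\\b[^>]*>', 'gi');
  // Match attribute="value" or attribute='value' inside a single tag.
  const attrPattern = new RegExp('\\s' + attribute + '\\s*=\\s*(["\'])(.*?)\\1', 'i');
  for (const tag of html.match(tagPattern) || []) {
    const found = tag.match(attrPattern);
    if (found) {
      values.push(found[2]);
    }
  }
  return values;
}

// Stand-in for a save step: records each URL instead of downloading it.
const saved = [];
function saveFile(urls) {
  for (const url of urls) {
    saved.push(url);
  }
}

const sampleHtml =
  '<p>Logo</p><img src="/logo.png" alt="Logo">' +
  '<IMG alt="x" src=\'photo.jpg\'><img alt="none">';

// Same nesting as the scrape instruction: the inner call runs first,
// and its result (a list of URLs) is passed to the outer call.
saveFile(extractTagAttributes(sampleHtml, 'src', 'img'));
console.log(saved);
```

The img tag without a src attribute contributes nothing to the list, so only tags that actually reference an image reach the save step. How the Web Scraper itself handles such cases is not covered by this sketch.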