GrabzIt's Online Community

ConvertHTML - difference between PDF and PNG/JPG


Hi

I've been using ConvertHTML with PNG successfully, and I am now trying to reuse the same code (modified slightly for the PDF parameters) to offer PDF export as well. However, the PDF export gives a slightly different result that is not quite right: it adds some extra white space (not margins) on the right and bottom of my HTML. I have tested the exact same HTML with the code below. Any ideas why the two exports differ? The only thing I can think of is that I have to round when I convert pixels to mm.

Thanks

Here is the code I have for PNG:

GrabzIt("").ConvertHTML(whatToConvert, {
     "target": "#captureThis",
     "address": theAddress,
     "format": "png",
     "transparent": 1,
     "bwidth": 1200,
     "bheight": 628,
     "width": 1200,
     "height": 628,
     "displayid": "finalImage",
}).DataURI(dataURLAfterGrabzIt);

 

And here is the code for PDF:

GrabzIt().ConvertHTML(whatToConvert, {
     "target": "#captureThis",
     "address": theAddress,
     "format": "pdf",
     "width": 317,  //Pixels converted to mm
     "height": 166,  //Pixels converted to mm
     "displayid": "finalImage",
     "mtop": 0,
     "mleft": 0,
     "mbottom": 0,
     "mright": 0
}).DataURI(dataURLAfterGrabzIt);
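
For reference, the pixel-to-mm conversion mentioned above can be sketched as follows. This is only an illustration: it assumes the CSS convention of 96 pixels per inch (the thread does not say which resolution was used), and pxToMm is a hypothetical helper name, not part of the GrabzIt API.

```javascript
// Assumption: 96 CSS pixels per inch, and 1 inch = 25.4 mm.
const MM_PER_INCH = 25.4;
const CSS_PX_PER_INCH = 96;

// Hypothetical helper: converts a length in CSS pixels to millimetres.
function pxToMm(px) {
  return (px * MM_PER_INCH) / CSS_PX_PER_INCH;
}

// The PNG capture above is 1200 x 628 pixels.
const exactWidthMm = pxToMm(1200);
const exactHeightMm = pxToMm(628);

// Difference between the exact values and the whole-mm values
// (317 and 166) passed in the PDF call.
const widthErrorMm = exactWidthMm - 317;
const heightErrorMm = exactHeightMm - 166;

console.log(exactWidthMm.toFixed(2), exactHeightMm.toFixed(2));
console.log(widthErrorMm.toFixed(2), heightErrorMm.toFixed(2));
```

Under that assumption the exact sizes are about 317.5 mm by 166.16 mm, so rounding to whole millimetres changes each dimension by half a millimetre or less. On its own that seems unlikely to account for clearly visible white space.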

 

Asked by Corey Alderin on the 16th of December 2019

Hi Corey,

Yes, they probably will be slightly different. This is because the image capture creates an exact image of what's in the browser and then crops it to the target element, whereas the PDF capture converts the page to PDF elements and then cuts out the matching PDF element.

Also, for PDF I don't think you should be specifying width and height.

Kind Regards

Answered by GrabzIt Support on the 16th of December 2019

Thanks. So is it possible to get the same results? I have been testing lots of different things with no success. Will I need to use a PNG to PDF converter instead? I was trying to avoid that, but maybe it is the only solution.

If I don't specify height and width, then it adds even more white space. I have tried specifying only the width and only the height; each combination gives a different result, but none gives the correct one.

Thanks

Answered by Corey Alderin on the 16th of December 2019

Could you provide the HTML you are trying to capture? It would probably make sense to email it to us.

Answered by GrabzIt Support on the 16th of December 2019

Sure, I can send that.  Where should I send it to?

Answered by Corey Alderin on the 16th of December 2019

OK, I think I can see what you are getting at. At the moment, when you capture a target as a PDF, it will still return the page size you asked for. So if you asked for a page size of A4, you would get the target cropped and left sitting in an A4 page.

However, you seem to want the targeted HTML element on its own, with the page size set to the size of the HTML element.

Is this correct? If so, this will need a code change. I think we can do it by allowing -1 to be passed to the page size parameters, in a similar way to creating a targeted image.
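
To make the proposal concrete, here is a rough sketch of what the options might look like if the change were made. This is hypothetical: at the time of this thread, passing -1 for the PDF width and height is only a suggested change, not a confirmed feature, and AUTO_SIZE and buildTargetedPdfOptions are illustrative names rather than part of the GrabzIt API. The other option names are taken from the code earlier in the thread.

```javascript
// Hypothetical: -1 would mean "size the page to the targeted element".
// This is the change proposed in the thread, not confirmed behaviour.
const AUTO_SIZE = -1;

// Illustrative helper that builds the options object for a targeted PDF.
function buildTargetedPdfOptions(targetSelector) {
  return {
    "target": targetSelector,
    "format": "pdf",
    "width": AUTO_SIZE,   // proposed: fit page width to the target
    "height": AUTO_SIZE,  // proposed: fit page height to the target
    "mtop": 0,
    "mleft": 0,
    "mbottom": 0,
    "mright": 0
  };
}

const pdfOptions = buildTargetedPdfOptions("#captureThis");
console.log(pdfOptions);

// If the change were implemented, the call might then look like:
// GrabzIt("").ConvertHTML(whatToConvert, pdfOptions).DataURI(dataURLAfterGrabzIt);
```

The idea is that the caller would no longer need to convert pixels to mm at all, since the page would take its size from the targeted element.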

Answered by GrabzIt Support on the 16th of December 2019

Yes, that is correct.  That would be great if that would be an option.  Thanks

Answered by Corey Alderin on the 16th of December 2019