
Convert URLs and HTML to DOCX: Ruby API

GrabzIt's Ruby API makes it easy to add the ability to convert HTML or webpages into Word documents to your application. Before you start, remember that after calling the url_to_docx, html_to_docx or file_to_docx method, the save or save_to method must be called to actually create the DOCX.

Basic Options

Capturing a webpage as DOCX converts the entire page into a Word document, which may span many pages. Only one parameter is required to convert a webpage, an HTML string or an HTML file to DOCX, as shown in the examples below.

grabzItClient.url_to_docx("http://www.google.com")
# Then call the save or save_to method

grabzItClient.html_to_docx("<html><body><h1>Hello World!</h1></body></html>")
# Then call the save or save_to method

grabzItClient.file_to_docx("example.html")
# Then call the save or save_to method
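Because each of the three methods only queues a conversion, a thin wrapper can help ensure the save step is never forgotten. The following is a minimal sketch and not part of GrabzIt's API: `docx_method_for` and `convert_and_save` are hypothetical helper names, the input detection is a simple heuristic, and `FakeClient` is a stand-in that only records calls so the example runs without the gem or a network connection. In real code you would pass a `GrabzIt::Client` instead.

```ruby
# Hypothetical helper: pick the conversion method for a given input string.
# Heuristic only: http(s) URLs, then anything starting with "<" is treated
# as raw HTML, and everything else is assumed to be a file path.
def docx_method_for(source)
  if source =~ %r{\Ahttps?://}i
    :url_to_docx
  elsif source.lstrip.start_with?("<")
    :html_to_docx
  else
    :file_to_docx
  end
end

# Hypothetical helper: queue the conversion, then always call save_to.
def convert_and_save(client, source, path)
  client.public_send(docx_method_for(source), source)
  client.save_to(path)
end

# Stand-in for GrabzIt::Client that records the calls it receives.
class FakeClient
  attr_reader :calls

  def initialize
    @calls = []
  end

  [:url_to_docx, :html_to_docx, :file_to_docx].each do |name|
    define_method(name) { |source, options = nil| @calls << [name, source] }
  end

  def save_to(path)
    @calls << [:save_to, path]
  end
end

client = FakeClient.new
convert_and_save(client, "http://www.google.com", "result.docx")
```

After this runs, `client.calls` holds the conversion call followed by the save_to call, which is the order the API requires.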

Custom Identifier

You can pass a custom identifier to the DOCX methods, as shown below; this value is then returned to your GrabzIt Ruby handler. For instance, the custom identifier could be a database identifier, allowing a DOCX document to be associated with a particular database record.

grabzItClient = GrabzIt::Client.new("Sign in to view your Application Key", "Sign in to view your Application Secret")

options = GrabzIt::DOCXOptions.new()
options.customId = "123456"

grabzItClient.url_to_docx("http://www.google.com", options)
# Then call the save method
grabzItClient.save("http://www.example.com/handler/index")

grabzItClient = GrabzIt::Client.new("Sign in to view your Application Key", "Sign in to view your Application Secret")

options = GrabzIt::DOCXOptions.new()
options.customId = "123456"

grabzItClient.html_to_docx("<html><body><h1>Hello World!</h1></body></html>", options)
# Then call the save method
grabzItClient.save("http://www.example.com/handler/index")

grabzItClient = GrabzIt::Client.new("Sign in to view your Application Key", "Sign in to view your Application Secret")

options = GrabzIt::DOCXOptions.new()
options.customId = "123456"

grabzItClient.file_to_docx("example.html", options)
# Then call the save method
grabzItClient.save("http://www.example.com/handler/index")
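
On the receiving side, the handler at the URL passed to save can read the custom identifier back and use it to find the matching record. The sketch below is framework-agnostic and uses only Ruby's standard library. It assumes the callback arrives as a request whose query string carries the custom identifier in a parameter named `customid` and the capture's identifier in `id`. Those parameter names, the `parse_callback` helper and the in-memory `RECORDS` hash are assumptions for illustration, so check them against the handler documentation before relying on them.

```ruby
require "uri"

# Stand-in for a database table keyed by the value passed as customId.
RECORDS = { "123456" => { title: "Quarterly report" } }

# Hypothetical helper: pull the fields of interest out of a callback query string.
# The parameter names "customid" and "id" are assumptions, not confirmed here.
def parse_callback(query_string)
  params = URI.decode_www_form(query_string).to_h
  { custom_id: params["customid"], capture_id: params["id"] }
end

callback = parse_callback("customid=123456&id=abc-123&format=docx")
record = RECORDS[callback[:custom_id]]
puts "Attach capture #{callback[:capture_id]} to #{record[:title]}" if record
```

Looking the record up by the custom identifier, rather than by the capture's own identifier, means the handler needs no extra bookkeeping between the request and the callback.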

Headers and Footers

To add a header or footer to a Word document, apply a template to the DOCX being generated. This template must be saved in advance and specifies the contents of the header and footer, along with any special variables. The example code below uses a template the user created called "my template".

grabzItClient = GrabzIt::Client.new("Sign in to view your Application Key", "Sign in to view your Application Secret")

options = GrabzIt::DOCXOptions.new()
options.templateId = "my template"

grabzItClient.url_to_docx("http://www.google.com", options)
# Then call the save or save_to method
grabzItClient.save_to("result.docx")

grabzItClient = GrabzIt::Client.new("Sign in to view your Application Key", "Sign in to view your Application Secret")

options = GrabzIt::DOCXOptions.new()
options.templateId = "my template"

grabzItClient.html_to_docx("<html><body><h1>Hello World!</h1></body></html>", options)
# Then call the save or save_to method
grabzItClient.save_to("result.docx")

grabzItClient = GrabzIt::Client.new("Sign in to view your Application Key", "Sign in to view your Application Secret")

options = GrabzIt::DOCXOptions.new()
options.templateId = "my template"

grabzItClient.file_to_docx("example.html", options)
# Then call the save or save_to method
grabzItClient.save_to("result.docx")

Convert HTML element to DOCX

If you just want to convert a single HTML element, such as a div or span, directly into a Word document, you can do so with GrabzIt's Ruby gem. Pass the CSS selector of the HTML element you wish to convert to the targetElement attribute of the DOCXOptions class.

...
<span id="Article">
<p>This is the content I am interested in.</p>
<img src="myimage.jpg">
</span>
...

In this example, we wish to capture all the content in the span that has the id Article, so we pass its CSS selector to the GrabzIt API as shown below.

grabzItClient = GrabzIt::Client.new("Sign in to view your Application Key", "Sign in to view your Application Secret")

options = GrabzIt::DOCXOptions.new()
options.targetElement = "#Article"

grabzItClient.url_to_docx("http://www.bbc.co.uk/news", options)
# Then call the save or save_to method
grabzItClient.save_to("result.docx")
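
The value given to targetElement is an ordinary CSS selector, so an element id is written with a leading `#`. As a small illustration, the selector can be derived from the id attribute and rejected early if the id is malformed. The `id_selector` helper is hypothetical and not part of GrabzIt, and for simplicity it only accepts ids made of letters, digits, hyphens and underscores, which need no escaping in a selector.

```ruby
# Hypothetical helper: turn an HTML id attribute value into a CSS id selector.
# Only simple ids (a letter followed by letters, digits, "_" or "-") are
# accepted, so no CSS escaping is needed.
def id_selector(element_id)
  id = element_id.to_s
  unless id =~ /\A[A-Za-z][\w-]*\z/
    raise ArgumentError, "unsupported id for a simple selector: #{id.inspect}"
  end
  "#" + id
end

selector = id_selector("Article")
```

The resulting `selector` can then be assigned to `options.targetElement` in place of the literal string used above.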