GrabzIt
Tools to Capture and Convert the Web

Capture HTML Tables from Websites with Java

Warning: At least an Entry Package is required to use HTML table capture. Try it for free with a 7 day free trial, then from $5.99 a month unless cancelled.

There are multiple ways of converting HTML tables into JSON, CSV and Excel spreadsheets using GrabzIt's Java API; some of the most useful techniques are detailed here. However, before you start, remember that after calling the URLToTable, HTMLToTable or FileToTable methods, the Save or SaveTo method must be called to capture the table. If you want to quickly see if this service is right for you, you can try a live demo of capturing HTML tables from a URL.

Basic Options

This code snippet will convert the first HTML table found in a specified webpage into a CSV document.

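// Call one of the following, depending on where the table comes from.

// From a URL:
grabzIt.URLToTable("http://www.google.com");

// From an HTML string (Java string literals cannot span lines, so concatenate):
grabzIt.HTMLToTable("<html><body><table><tr><th>Name</th><th>Age</th></tr>"
    + "<tr><td>Tom</td><td>23</td></tr><tr><td>Nicola</td><td>26</td></tr>"
    + "</table></body></html>");

// From a local HTML file:
grabzIt.FileToTable("tables.html");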
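
When the HTML passed to HTMLToTable is generated from data rather than typed by hand, it can be assembled programmatically. The following is a minimal sketch using only the Java standard library; the TableHtml class and its build and escape methods are hypothetical helpers introduced for illustration and are not part of the GrabzIt API.

```java
import java.util.List;

// Hypothetical helper (not part of the GrabzIt API): builds an HTML document
// containing a single table, suitable for passing to grabzIt.HTMLToTable(...).
final class TableHtml {

    // Builds the HTML from a header row and a list of data rows.
    static String build(List<String> headers, List<List<String>> rows) {
        StringBuilder html = new StringBuilder("<html><body><table><tr>");
        for (String header : headers) {
            html.append("<th>").append(escape(header)).append("</th>");
        }
        html.append("</tr>");
        for (List<String> row : rows) {
            html.append("<tr>");
            for (String cell : row) {
                html.append("<td>").append(escape(cell)).append("</td>");
            }
            html.append("</tr>");
        }
        return html.append("</table></body></html>").toString();
    }

    // Escapes the characters that would otherwise be read as markup.
    static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    public static void main(String[] args) {
        String html = build(
                List.of("Name", "Age"),
                List.of(List.of("Tom", "23"), List.of("Nicola", "26")));
        System.out.println(html);
    }
}
```

The resulting string could then be passed as the first argument of HTMLToTable in place of a hand-written literal.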

By default this converts the first table it identifies in the page. However, the second table in a web page could be converted by passing 2 to the setTableNumberToInclude method of the TableOptions class.

TableOptions options = new TableOptions();
options.setTableNumberToInclude(2);

// Call one of the following, depending on where the table comes from.

// From a URL:
grabzIt.URLToTable("http://www.google.com", options);

// From an HTML string:
grabzIt.HTMLToTable("<html><body><table><tr><th>Name</th><th>Age</th></tr>"
    + "<tr><td>Tom</td><td>23</td></tr><tr><td>Nicola</td><td>26</td></tr>"
    + "</table></body></html>", options);

// From a local HTML file:
grabzIt.FileToTable("tables.html", options);

You can also use the setTargetElement method to ensure that only tables within the element with the specified id are converted.

TableOptions options = new TableOptions();
options.setTargetElement("stocks_table");

// Call one of the following, depending on where the table comes from.

// From a URL:
grabzIt.URLToTable("http://www.google.com", options);

// From an HTML string:
grabzIt.HTMLToTable("<html><body><table><tr><th>Name</th><th>Age</th></tr>"
    + "<tr><td>Tom</td><td>23</td></tr><tr><td>Nicola</td><td>26</td></tr>"
    + "</table></body></html>", options);

// From a local HTML file:
grabzIt.FileToTable("tables.html", options);

Alternatively, you can capture all tables on a web page by passing true to the setIncludeAllTables method. However, this only works with the XLSX and JSON formats. This option puts each table in a new sheet within the generated spreadsheet workbook.

TableOptions options = new TableOptions();
options.setFormat(TableFormat.XLSX);
options.setIncludeAllTables(true);

// Call one of the following, depending on where the tables come from.

// From a URL:
grabzIt.URLToTable("http://www.google.com", options);

// From an HTML string:
grabzIt.HTMLToTable("<html><body><table><tr><th>Name</th><th>Age</th></tr>"
    + "<tr><td>Tom</td><td>23</td></tr><tr><td>Nicola</td><td>26</td></tr>"
    + "</table></body></html>", options);

// From a local HTML file:
grabzIt.FileToTable("tables.html", options);
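
Because capturing all tables only works with the XLSX and JSON formats, it can be useful to check the chosen format before enabling the option. The sketch below shows one possible guard. The AllTablesGuard class and its Format enum are hypothetical stand-ins defined here so the example compiles on its own; in real code the check would be made against the API's TableFormat value.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical guard (not part of the GrabzIt API).
final class AllTablesGuard {

    // Local stand-in for the formats named in this document.
    enum Format { CSV, JSON, XLSX }

    // Formats the document says are compatible with capturing all tables.
    private static final Set<Format> MULTI_TABLE_FORMATS =
            EnumSet.of(Format.JSON, Format.XLSX);

    // Returns true when every table on the page can be captured in this format.
    static boolean canIncludeAllTables(Format format) {
        return MULTI_TABLE_FORMATS.contains(format);
    }

    public static void main(String[] args) {
        for (Format format : Format.values()) {
            System.out.println(format + " supports all tables: "
                    + canIncludeAllTables(format));
        }
    }
}
```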

Convert HTML Tables to JSON

GrabzIt can also convert HTML tables found on the web to JSON; just specify the JSON format instead. In the example below the data is read synchronously and returned as a GrabzItFile object by the SaveTo method. However, it is generally recommended that you do this asynchronously.

When the conversion is completed, the toString method is called to get the JSON as a string, which can then be parsed by a library such as Google Gson.

TableOptions options = new TableOptions();
options.setFormat(TableFormat.JSON);
options.setTableNumberToInclude(1);

grabzIt.URLToTable("http://www.google.com", options);

GrabzItFile file = grabzIt.SaveTo();
if (file != null)
{
    String json = file.toString();
}

Custom Identifier

You can pass a custom identifier to the table methods as shown below; this value is then returned to your GrabzIt Java handler. For instance, this custom identifier could be a database identifier, allowing a captured table to be associated with a particular database record.

TableOptions options = new TableOptions();
options.setCustomId("123456");

// Call one of the following, depending on where the table comes from.

// From a URL:
grabzIt.URLToTable("http://www.google.com", options);

// From an HTML string:
grabzIt.HTMLToTable("<html><body><h1>Hello World!</h1></body></html>", options);

// From a local HTML file:
grabzIt.FileToTable("example.html", options);
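
On the receiving side, the handler needs to read the custom identifier back out of the callback request. The sketch below parses a query string with the standard library only. It assumes the identifier arrives as a query-string parameter named customid, which is not stated in this document and should be checked against the handler documentation. The CallbackQuery class and the sample query string are hypothetical.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of the GrabzIt API): parses a callback query string.
final class CallbackQuery {

    // Splits "a=1&b=2" into a map, URL-decoding each name and value.
    static Map<String, String> parse(String query) {
        Map<String, String> params = new LinkedHashMap<>();
        if (query == null || query.isEmpty()) {
            return params;
        }
        for (String pair : query.split("&")) {
            if (pair.isEmpty()) {
                continue;
            }
            int eq = pair.indexOf('=');
            String name = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            params.put(URLDecoder.decode(name, StandardCharsets.UTF_8),
                    URLDecoder.decode(value, StandardCharsets.UTF_8));
        }
        return params;
    }

    public static void main(String[] args) {
        // Assumed parameter names; the query string is made up for illustration.
        Map<String, String> params = parse("id=abc123&customid=123456");
        // The custom id can now be used to look up the matching database record.
        System.out.println("Custom id: " + params.get("customid"));
    }
}
```

In a servlet-based handler the same value would normally be available through request.getParameter, so a manual parser like this is mainly useful for testing or for non-servlet environments.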