GrabzIt
Tools to Capture and Convert the Web

Capture HTML Tables from Websites with PHP

Warning: At least an Entry package is required to use HTML table capture. You can try it with a 7 day free trial, then from $5.99 a month unless cancelled.

There are several ways to convert HTML tables into JSON, CSV or Excel spreadsheets using GrabzIt's PHP API; some of the most useful techniques are detailed here. Before you start, remember that after calling the URLToTable, HTMLToTable or FileToTable method, the Save or SaveTo method must be called to capture the table. To quickly see whether this service is right for you, try the live demo of capturing HTML tables from a URL.

Basic Options

The code examples below automatically convert the first HTML table discovered in a specified web page, HTML string or file into a CSV document.

// Capture from a URL
$grabzIt->URLToTable("http://www.google.com");

// Capture from an HTML string
$grabzIt->HTMLToTable("<html><body><table><tr><th>Name</th><th>Age</th></tr>
    <tr><td>Tom</td><td>23</td></tr><tr><td>Nicola</td><td>26</td></tr>
    </table></body></html>");

// Capture from a local file
$grabzIt->FileToTable("tables.html");
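
Once a capture has been saved, the resulting CSV can be read with PHP's standard functions. The following is a minimal sketch, assuming the CSV text is already held in a string and that its first row contains the column headers. The parseTableCsv helper and the $csv sample are illustrative only and are not part of the GrabzIt API.

```php
<?php
// Hypothetical helper, not part of the GrabzIt library: turns CSV text
// into an array of associative rows keyed by the header row.
function parseTableCsv(string $csv): array
{
    // Split on any newline style and drop empty lines.
    $lines = preg_split('/\r\n|\r|\n/', trim($csv));
    $lines = array_values(array_filter($lines, fn($line) => $line !== ''));
    if (count($lines) === 0) {
        return [];
    }

    // Assumption: the first line contains the column headers.
    $headers = str_getcsv(array_shift($lines), ',', '"', '\\');
    $rows = [];
    foreach ($lines as $line) {
        $cells = str_getcsv($line, ',', '"', '\\');
        // Skip rows whose cell count does not match the header.
        if (count($cells) === count($headers)) {
            $rows[] = array_combine($headers, $cells);
        }
    }
    return $rows;
}

// Sample data mirroring the Name/Age table used in the HTML examples.
$csv = "Name,Age\nTom,23\nNicola,26\n";
$rows = parseTableCsv($csv);
echo $rows[1]['Name'] . ' is ' . $rows[1]['Age'] . PHP_EOL;
```

Splitting on newlines keeps the sketch short, but it would break quoted cells that contain line breaks; for such data, reading the saved file with fgetcsv is likely a better fit.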

By default, the first table identified is converted. To convert the second table in a web page instead, pass 2 to the setTableNumberToInclude method.

// Capture from a URL
$options = new GrabzItTableOptions();
$options->setTableNumberToInclude(2);

$grabzIt->URLToTable("http://www.google.com", $options);

// Capture from an HTML string
$options = new GrabzItTableOptions();
$options->setTableNumberToInclude(2);

$grabzIt->HTMLToTable("<html><body><table><tr><th>Name</th><th>Age</th></tr>
    <tr><td>Tom</td><td>23</td></tr><tr><td>Nicola</td><td>26</td></tr>
    </table></body></html>", $options);

// Capture from a local file
$options = new GrabzItTableOptions();
$options->setTableNumberToInclude(2);

$grabzIt->FileToTable("tables.html", $options);

You can also use the setTargetElement method to ensure that only tables within the element that has the specified id are converted.

// Capture from a URL
$options = new GrabzItTableOptions();
$options->setTargetElement("stocks_table");

$grabzIt->URLToTable("http://www.google.com", $options);

// Capture from an HTML string
$options = new GrabzItTableOptions();
$options->setTargetElement("stocks_table");

$grabzIt->HTMLToTable("<html><body><table><tr><th>Name</th><th>Age</th></tr>
    <tr><td>Tom</td><td>23</td></tr><tr><td>Nicola</td><td>26</td></tr>
    </table></body></html>", $options);

// Capture from a local file
$options = new GrabzItTableOptions();
$options->setTargetElement("stocks_table");

$grabzIt->FileToTable("tables.html", $options);

Alternatively, you can capture all tables on a web page by passing true to the setIncludeAllTables method. This only works with the XLSX and JSON formats. When XLSX is used, each table is placed in a new sheet within the generated spreadsheet workbook.

// Capture from a URL
$options = new GrabzItTableOptions();
$options->setFormat('xlsx');
$options->setIncludeAllTables(true);

$grabzIt->URLToTable("http://www.google.com", $options);

// Capture from an HTML string
$options = new GrabzItTableOptions();
$options->setFormat('xlsx');
$options->setIncludeAllTables(true);

$grabzIt->HTMLToTable("<html><body><table><tr><th>Name</th><th>Age</th></tr>
    <tr><td>Tom</td><td>23</td></tr><tr><td>Nicola</td><td>26</td></tr>
    </table></body></html>", $options);

// Capture from a local file
$options = new GrabzItTableOptions();
$options->setFormat('xlsx');
$options->setIncludeAllTables(true);

$grabzIt->FileToTable("tables.html", $options);

Convert HTML Tables to JSON

Sometimes HTML tables need to be read programmatically. GrabzIt enables you to do this in PHP by converting HTML tables into JSON. To do this, specify json as the format. For instance, the example below converts an HTML table synchronously and then uses PHP's built-in json_decode function to parse the JSON string into an object that is easy to work with.

// Capture from a URL
$options = new GrabzItTableOptions();
$options->setFormat("json");
$options->setTableNumberToInclude(1);

$grabzIt->URLToTable("http://www.google.com", $options);

$json = $grabzIt->SaveTo();
if ($json != null)
{
    $tableObj = json_decode($json);
}

// Capture from an HTML string
$options = new GrabzItTableOptions();
$options->setFormat("json");
$options->setTableNumberToInclude(1);

$grabzIt->HTMLToTable("<html><body><table><tr><th>Name</th><th>Age</th></tr>
    <tr><td>Tom</td><td>23</td></tr><tr><td>Nicola</td><td>26</td></tr>
    </table></body></html>", $options);

$json = $grabzIt->SaveTo();
if ($json != null)
{
    $tableObj = json_decode($json);
}

// Capture from a local file
$options = new GrabzItTableOptions();
$options->setFormat("json");
$options->setTableNumberToInclude(1);

$grabzIt->FileToTable("tables.html", $options);

$json = $grabzIt->SaveTo();
if ($json != null)
{
    $tableObj = json_decode($json);
}
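
Because json_decode returns null for malformed input, it can help to check for decoding errors before using the result. The following is a minimal sketch of that check. The decodeTableJson helper is hypothetical, and the sample JSON is an assumed shape used only for illustration; the structure GrabzIt actually returns may differ.

```php
<?php
// Hypothetical helper: decodes a JSON string into PHP arrays and
// reports malformed input instead of silently returning null.
function decodeTableJson(string $json): array
{
    // Passing true returns associative arrays rather than stdClass objects.
    $decoded = json_decode($json, true);
    if (json_last_error() !== JSON_ERROR_NONE) {
        throw new InvalidArgumentException('Invalid JSON: ' . json_last_error_msg());
    }
    if (!is_array($decoded)) {
        throw new InvalidArgumentException('Expected a JSON array or object.');
    }
    return $decoded;
}

// Assumed sample only: the real structure returned by GrabzIt may differ.
$json = '[{"Name":"Tom","Age":"23"},{"Name":"Nicola","Age":"26"}]';
$table = decodeTableJson($json);
foreach ($table as $row) {
    echo $row['Name'] . ': ' . $row['Age'] . PHP_EOL;
}
```

Decoding to associative arrays is a design choice here: column headers that contain spaces or punctuation are awkward to access as object properties but work naturally as array keys.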

Custom Identifier

You can pass a custom identifier to the table methods as shown below. This value is then returned to your GrabzIt PHP handler. For instance, the custom identifier could be a database identifier, allowing an extracted table to be associated with a particular database record.

// Capture from a URL
$options = new GrabzItTableOptions();
$options->setCustomId(123456);

$grabzIt->URLToTable("http://www.google.com", $options);

// Capture from an HTML string
$options = new GrabzItTableOptions();
$options->setCustomId(123456);

$grabzIt->HTMLToTable("<html><body><h1>Hello World!</h1></body></html>", $options);

// Capture from a local file
$options = new GrabzItTableOptions();
$options->setCustomId(123456);

$grabzIt->FileToTable("example.html", $options);
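
When the identifier comes back to a handler, it is worth validating it before using it to look up a database record. The following is a minimal sketch of that step. It assumes the identifier arrives as a query-string parameter named customid, which should be checked against the handler documentation; the readCustomId helper is hypothetical and not part of the GrabzIt API.

```php
<?php
// Hypothetical helper: reads the custom identifier from the request
// parameters and returns it as an integer, or null if it is absent or
// not a whole number.
function readCustomId(array $query): ?int
{
    // Assumption: the identifier arrives in a "customid" parameter.
    if (!isset($query['customid'])) {
        return null;
    }
    $value = filter_var($query['customid'], FILTER_VALIDATE_INT);
    return $value === false ? null : $value;
}

// In a real handler the array would be $_GET; a literal is used here
// so the sketch runs on its own.
$recordId = readCustomId(['customid' => '123456']);
echo $recordId === null ? 'No custom id' : 'Record ' . $recordId, PHP_EOL;
```

Returning null for a missing or non-numeric value lets the handler skip the database lookup instead of querying with untrusted input.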