Capture HTML Tables from Websites with ASP.NET

GrabzIt's ASP.NET API offers several ways to convert HTML tables into JSON, CSV and Excel spreadsheets; the most useful techniques are detailed here. Before you start, remember that after calling the URLToTable, HTMLToTable or FileToTable method, the SaveAsync or SaveToAsync method must be called to capture the table. To quickly see whether this service is right for you, you can try a live demo of capturing HTML tables from a URL.

Basic Options

The following code examples convert the first HTML table found in a web page, an HTML string, or an HTML file into a CSV document.

grabzIt.URLToTable("https://www.tesla.com");
//Then call the SaveAsync or SaveToAsync method
//A verbatim string (@"...") lets the HTML span several lines
grabzIt.HTMLToTable(@"<html><body><table><tr><th>Name</th><th>Age</th></tr>
    <tr><td>Tom</td><td>23</td></tr><tr><td>Nicola</td><td>26</td></tr>
    </table></body></html>");
//Then call the SaveAsync or SaveToAsync method
grabzIt.FileToTable("tables.html");
//Then call the SaveAsync or SaveToAsync method

By default, the first table identified is converted. However, the second table in a web page can be converted instead by setting the TableNumberToInclude property to 2.

//The client should be stored somewhere and reused!
GrabzItClient grabzIt = new GrabzItClient("Sign in to view your Application Key", "Sign in to view your Application Secret", client);

TableOptions options = new TableOptions();
options.TableNumberToInclude = 2;

grabzIt.URLToTable("https://www.tesla.com", options);
//Then call the SaveAsync or SaveToAsync method
await grabzIt.SaveToAsync("result.csv");

//The client should be stored somewhere and reused!
GrabzItClient grabzIt = new GrabzItClient("Sign in to view your Application Key", "Sign in to view your Application Secret", client);

TableOptions options = new TableOptions();
options.TableNumberToInclude = 2;

//A verbatim string (@"...") lets the HTML span several lines
grabzIt.HTMLToTable(@"<html><body><table><tr><th>Name</th><th>Age</th></tr>
    <tr><td>Tom</td><td>23</td></tr><tr><td>Nicola</td><td>26</td></tr>
    </table></body></html>", options);
//Then call the SaveAsync or SaveToAsync method
await grabzIt.SaveToAsync("result.csv");

//The client should be stored somewhere and reused!
GrabzItClient grabzIt = new GrabzItClient("Sign in to view your Application Key", "Sign in to view your Application Secret", client);

TableOptions options = new TableOptions();
options.TableNumberToInclude = 2;

grabzIt.FileToTable("tables.html", options);
//Then call the SaveAsync or SaveToAsync method
await grabzIt.SaveToAsync("result.csv");

You can also set the TargetElement property, which ensures that only tables within the element with the specified id are converted.

//The client should be stored somewhere and reused!
GrabzItClient grabzIt = new GrabzItClient("Sign in to view your Application Key", "Sign in to view your Application Secret", client);

TableOptions options = new TableOptions();
options.TargetElement = "stocks_table";

grabzIt.URLToTable("https://www.tesla.com", options);
//Then call the SaveAsync or SaveToAsync method
await grabzIt.SaveToAsync("result.csv");

//The client should be stored somewhere and reused!
GrabzItClient grabzIt = new GrabzItClient("Sign in to view your Application Key", "Sign in to view your Application Secret", client);

TableOptions options = new TableOptions();
options.TargetElement = "stocks_table";

//A verbatim string (@"...") lets the HTML span several lines
grabzIt.HTMLToTable(@"<html><body><table id='stocks_table'><tr><th>Name</th><th>Age</th></tr>
    <tr><td>Tom</td><td>23</td></tr><tr><td>Nicola</td><td>26</td></tr>
    </table></body></html>", options);
//Then call the SaveAsync or SaveToAsync method
await grabzIt.SaveToAsync("result.csv");

//The client should be stored somewhere and reused!
GrabzItClient grabzIt = new GrabzItClient("Sign in to view your Application Key", "Sign in to view your Application Secret", client);

TableOptions options = new TableOptions();
options.TargetElement = "stocks_table";

grabzIt.FileToTable("tables.html", options);
//Then call the SaveAsync or SaveToAsync method
await grabzIt.SaveToAsync("result.csv");

Alternatively, you can capture all tables on a web page by setting the IncludeAllTables property to true. However, this only works with the XLSX or JSON format. If you choose the XLSX format, each table is placed in a new sheet within the generated spreadsheet workbook.

//The client should be stored somewhere and reused!
GrabzItClient grabzIt = new GrabzItClient("Sign in to view your Application Key", "Sign in to view your Application Secret", client);

TableOptions options = new TableOptions();
options.Format = TableFormat.xlsx;
options.IncludeAllTables = true;

grabzIt.URLToTable("https://www.tesla.com", options);
//Then call the SaveAsync or SaveToAsync method
await grabzIt.SaveToAsync("result.xlsx");

//The client should be stored somewhere and reused!
GrabzItClient grabzIt = new GrabzItClient("Sign in to view your Application Key", "Sign in to view your Application Secret", client);

TableOptions options = new TableOptions();
options.Format = TableFormat.xlsx;
options.IncludeAllTables = true;

//A verbatim string (@"...") lets the HTML span several lines
grabzIt.HTMLToTable(@"<html><body><table><tr><th>Name</th><th>Age</th></tr>
    <tr><td>Tom</td><td>23</td></tr><tr><td>Nicola</td><td>26</td></tr>
    </table></body></html>", options);
//Then call the SaveAsync or SaveToAsync method
await grabzIt.SaveToAsync("result.xlsx");

//The client should be stored somewhere and reused!
GrabzItClient grabzIt = new GrabzItClient("Sign in to view your Application Key", "Sign in to view your Application Secret", client);

TableOptions options = new TableOptions();
options.Format = TableFormat.xlsx;
options.IncludeAllTables = true;

grabzIt.FileToTable("tables.html", options);
//Then call the SaveAsync or SaveToAsync method
await grabzIt.SaveToAsync("result.xlsx");

Convert HTML Tables to JSON

GrabzIt can also convert HTML tables to JSON; just specify the JSON format as shown below. Here the data is read asynchronously into a GrabzItFile object by calling the SaveToAsync method without a file path.

Once we have the result, calling the ToString method returns the string representation of the JSON file. This can then be deserialized into a dynamic object using your favourite JSON library.

//The client should be stored somewhere and reused!
GrabzItClient grabzIt = new GrabzItClient("Sign in to view your Application Key", "Sign in to view your Application Secret", client);

TableOptions options = new TableOptions();
options.Format = TableFormat.json;
options.TableNumberToInclude = 1;

grabzIt.URLToTable("https://www.tesla.com", options);

GrabzItFile file = await grabzIt.SaveToAsync();
if (file != null)
{
    string json = file.ToString();
}

//The client should be stored somewhere and reused!
GrabzItClient grabzIt = new GrabzItClient("Sign in to view your Application Key", "Sign in to view your Application Secret", client);

TableOptions options = new TableOptions();
options.Format = TableFormat.json;
options.TableNumberToInclude = 1;

//A verbatim string (@"...") lets the HTML span several lines
grabzIt.HTMLToTable(@"<html><body><table><tr><th>Name</th><th>Age</th></tr>
    <tr><td>Tom</td><td>23</td></tr><tr><td>Nicola</td><td>26</td></tr>
    </table></body></html>", options);

GrabzItFile file = await grabzIt.SaveToAsync();
if (file != null)
{
    string json = file.ToString();
}

//The client should be stored somewhere and reused!
GrabzItClient grabzIt = new GrabzItClient("Sign in to view your Application Key", "Sign in to view your Application Secret", client);

TableOptions options = new TableOptions();
options.Format = TableFormat.json;
options.TableNumberToInclude = 1;

grabzIt.FileToTable("tables.html", options);

GrabzItFile file = await grabzIt.SaveToAsync();
if (file != null)
{
    string json = file.ToString();
}
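
Once the json string is available, it has to be parsed before the rows can be used. The following is a minimal sketch using System.Text.Json's JsonDocument rather than a dynamic object. The layout of the JSON is an assumption: the sample below pretends each table is an array of row objects keyed by column name, which may not match what GrabzIt actually returns. The names TableJsonExample, SampleJson and ParseRows are hypothetical and not part of the GrabzIt API.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

public static class TableJsonExample
{
    // ASSUMPTION: hypothetical sample data; the real GrabzIt JSON layout may differ.
    public const string SampleJson =
        "[{\"Name\":\"Tom\",\"Age\":\"23\"},{\"Name\":\"Nicola\",\"Age\":\"26\"}]";

    // Flattens a JSON array of row objects into a list of
    // column-name/value dictionaries.
    public static List<Dictionary<string, string>> ParseRows(string json)
    {
        var rows = new List<Dictionary<string, string>>();
        using JsonDocument document = JsonDocument.Parse(json);

        if (document.RootElement.ValueKind != JsonValueKind.Array)
        {
            return rows; // Not the assumed shape, so return no rows.
        }

        foreach (JsonElement row in document.RootElement.EnumerateArray())
        {
            if (row.ValueKind != JsonValueKind.Object)
            {
                continue; // Skip anything that is not a row object.
            }

            var cells = new Dictionary<string, string>();
            foreach (JsonProperty cell in row.EnumerateObject())
            {
                // Keep every cell as text, whatever its JSON type.
                cells[cell.Name] = cell.Value.ValueKind == JsonValueKind.String
                    ? cell.Value.GetString() ?? string.Empty
                    : cell.Value.GetRawText();
            }
            rows.Add(cells);
        }
        return rows;
    }

    public static void Main()
    {
        foreach (var row in ParseRows(SampleJson))
        {
            foreach (var cell in row)
            {
                Console.WriteLine($"{cell.Key} = {cell.Value}");
            }
        }
    }
}
```

In real use the string returned by file.ToString() would be passed to ParseRows in place of SampleJson, after checking the actual output and adjusting the parsing to match it.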

Custom Identifier

You can pass a custom identifier to the table methods as shown below; this value is then returned to your GrabzIt ASP.NET handler. For instance, this custom identifier could be a database identifier, allowing a captured table to be associated with a particular database record.

//The client should be stored somewhere and reused!
GrabzItClient grabzIt = new GrabzItClient("Sign in to view your Application Key", "Sign in to view your Application Secret", client);

TableOptions options = new TableOptions();
options.CustomId = "123456";

grabzIt.URLToTable("https://www.tesla.com", options);
//Then call the SaveAsync method
await grabzIt.SaveAsync("http://www.example.com/Home/Handler");

//The client should be stored somewhere and reused!
GrabzItClient grabzIt = new GrabzItClient("Sign in to view your Application Key", "Sign in to view your Application Secret", client);

TableOptions options = new TableOptions();
options.CustomId = "123456";

grabzIt.HTMLToTable("<html><body><h1>Hello World!</h1></body></html>", options);
//Then call the SaveAsync method
await grabzIt.SaveAsync("http://www.example.com/Home/Handler");

//The client should be stored somewhere and reused!
GrabzItClient grabzIt = new GrabzItClient("Sign in to view your Application Key", "Sign in to view your Application Secret", client);

TableOptions options = new TableOptions();
options.CustomId = "123456";

grabzIt.FileToTable("example.html", options);
//Then call the SaveAsync method
await grabzIt.SaveAsync("http://www.example.com/Home/Handler");
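
To illustrate how the custom identifier could link a capture back to a database record, here is a rough sketch of the lookup a handler might perform. It is not the GrabzIt handler itself. It assumes the identifier arrives on the callback URL as a query string parameter named customid, which should be checked against the handler documentation. The CallbackExample class, the GetCustomId method and the in-memory dictionary standing in for a database are all hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Web;

public static class CallbackExample
{
    // Reads the custom identifier from a callback URL.
    // ASSUMPTION: the identifier is sent as a query string
    // parameter named "customid".
    public static string GetCustomId(Uri callbackUrl)
    {
        var query = HttpUtility.ParseQueryString(callbackUrl.Query);
        return query["customid"] ?? string.Empty;
    }

    public static void Main()
    {
        // Hypothetical in-memory stand-in for a database table keyed by record id.
        var records = new Dictionary<string, string>
        {
            ["123456"] = "Quarterly stock report"
        };

        // Hypothetical callback URL of the kind a handler might receive.
        var url = new Uri("http://www.example.com/Home/Handler?customid=123456&id=abc");
        string customId = GetCustomId(url);

        if (records.TryGetValue(customId, out var record))
        {
            Console.WriteLine($"Capture belongs to record: {record}");
        }
    }
}
```

Inside a real handler the same lookup would normally read the value from the incoming request object instead of parsing a URL by hand.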