Adding Page Breaks to PDF and DOCX documents

When converting raw HTML or web pages to PDF or DOCX, you can force page breaks to appear at the exact locations you want by applying the break-after:page and break-inside:avoid CSS rules to each element that should become its own page. Note that this means you must control the HTML of the web page you are converting.

For instance, the HTML below will create three pages within the DOCX or PDF document.

<html>
  <head>
    <meta http-equiv="content-type" content="text/html;charset=UTF-8" />
    <style type="text/css">
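      div.page
      {
        /* Force a page break after each div.page element */
        break-after: page;
        /* Avoid splitting an element's content across two pages */
        break-inside: avoid;
      }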
    </style>
  </head>
  <body>
    <div class="page">
      <h1>This is Page 1</h1>
    </div>
    <div class="page">
      <h1>This is Page 2</h1>
    </div>
    <div class="page">
      <h1>This is Page 3</h1>
    </div>
  </body>
</html>
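
Some older rendering engines only recognise the legacy page-break-after and page-break-inside properties rather than break-after and break-inside. The variation below is a minimal sketch that declares both sets of rules. It assumes the renderer behind the conversion may be one of those older engines, which may not apply to your setup. The div.page:last-child rule is also an optional addition, not part of the original example. It removes the forced break after the final page in case a renderer would otherwise add a trailing blank page.

```html
<html>
  <head>
    <meta http-equiv="content-type" content="text/html;charset=UTF-8" />
    <style type="text/css">
      div.page
      {
        /* Legacy properties, for renderers that predate break-* (assumption) */
        page-break-after: always;
        page-break-inside: avoid;
        /* Modern equivalents, as used in the example above */
        break-after: page;
        break-inside: avoid;
      }
      /* Optional: no forced break after the last page */
      div.page:last-child
      {
        page-break-after: auto;
        break-after: auto;
      }
    </style>
  </head>
  <body>
    <div class="page">
      <h1>This is Page 1</h1>
    </div>
    <div class="page">
      <h1>This is Page 2</h1>
    </div>
    <div class="page">
      <h1>This is Page 3</h1>
    </div>
  </body>
</html>
```

Because a renderer ignores declarations it does not understand, listing both the legacy and the modern properties is harmless where only one set is supported.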