How To Make Complicated PDFs From Basic HTML And CSS

Many web applications require PDF reports and downloads. There are numerous open-source libraries that can build basic PDFs, but if you need a more complicated PDF document or are sending your document to a printer, you’ll need a commercial PDF generation tool.

Unlike the browser-based open-source HTML-to-PDF libraries, commercial engines are designed specifically for generating PDFs and give you far more control over print layout, such as per-page headers and footers.

Today, I’ll walk you through DocRaptor’s online PDF generation API, how to use its Python library, and how to add some advanced headers and footers to your PDF. The library is built for Python 3 but should still work on Python 2, though you should upgrade if you’re still on 2!

First, install via pip:

pip install --upgrade docraptor

Then, import the module, instantiate the class, and add your API key:

import docraptor
doc_api = docraptor.DocApi()
doc_api.api_client.configuration.username = 'YOUR_API_KEY_HERE'

If you don’t have a DocRaptor account, the literal YOUR_API_KEY_HERE key works without creating one. It only works for test-mode (watermarked) documents, though.
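If you would rather not hard-code a real key, one option is to read it from the environment and fall back to the test key. This is a minimal sketch, not part of the DocRaptor library: the variable name DOCRAPTOR_API_KEY and the helper resolve_api_key are assumptions made up for this example.

```python
import os

# Test key mentioned above; it only produces watermarked test documents.
TEST_API_KEY = "YOUR_API_KEY_HERE"


def resolve_api_key(environ=os.environ, var_name="DOCRAPTOR_API_KEY"):
    """Return the API key from the environment, or the test key if unset.

    DOCRAPTOR_API_KEY is a hypothetical variable name chosen for this sketch.
    """
    key = environ.get(var_name, "").strip()
    return key or TEST_API_KEY


# Usage with the client created above:
# doc_api.api_client.configuration.username = resolve_api_key()
```

Falling back to the test key keeps local experiments working without an account, at the cost of watermarked output.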

Once the client is set up, creating a PDF is as simple as:

response = doc_api.create_doc({
  "test": True,
  "document_content": "<html><body>Hello World</body></html>",
  # "document_url": "http://docraptor.com/examples/invoice.html",
  "document_type": "pdf",
  # "javascript": True,
  # "prince_options": {
  #   "media": "screen",                              
  #   "baseurl": "http://hello.com",
  # },
})

As you can see, you can use HTML content or a document URL. You can also enable JavaScript parsing, which is disabled by default to speed up the PDF generation time.
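The call above leaves the PDF in the response variable, so you still need to write it somewhere. Here is a minimal sketch, assuming (as the library's README describes) that create_doc returns the PDF as binary data. The helpers build_request and save_pdf are names invented for this example, not part of the docraptor package.

```python
from pathlib import Path


def build_request(content=None, url=None, javascript=False, test=True):
    """Build a create_doc payload from either inline HTML or a URL (not both)."""
    if (content is None) == (url is None):
        raise ValueError("pass exactly one of content or url")
    request = {"test": test, "document_type": "pdf", "javascript": javascript}
    if content is not None:
        request["document_content"] = content
    else:
        request["document_url"] = url
    return request


def save_pdf(pdf_data, path):
    """Write the binary data returned by create_doc to disk and return the path."""
    target = Path(path)
    # Assumption: pdf_data is bytes-like (bytes or bytearray).
    target.write_bytes(bytes(pdf_data))
    return target


# Usage with the client from above (needs network access and the docraptor package):
# response = doc_api.create_doc(build_request(content="<html><body>Hello World</body></html>"))
# save_pdf(response, "hello.pdf")
```

Building the payload in one place makes it harder to send both document_content and document_url by accident.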

That’s the basics of using DocRaptor’s Python library. Where it gets fun, though, is with more of their advanced functionality.

For example, you can add a footer with simple CSS and HTML, like this:

<body>
<style>
footer {
  flow: static(footer-html);
}
@page {
  margin-bottom: 80px;
  @bottom {
    content: flow(footer-html);  
  }
}
</style>

<footer>
  <a href="https://serhiipuzyrov.com">Serhii is awesome!</a>
</footer>
<p>Normal document content </p>
</body>

This CSS tells the PDF engine to take the HTML in <footer> out of the normal document flow and put it into a separate flow named “footer-html”. It then sets the content of each page’s bottom margin region (the footer area) to that same flow, so the HTML block is repeated on every page.
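To send this from Python, the footer markup and CSS can be assembled into the string you pass as document_content. This is a rough sketch: with_running_footer is a hypothetical helper, and moving the style block into a head element is my own choice rather than something the example above requires.

```python
def with_running_footer(body_html, footer_html, margin_bottom="80px"):
    """Wrap body_html in a document whose <footer> is flowed into the page footer."""
    css = (
        "footer { flow: static(footer-html); }\n"
        f"@page {{ margin-bottom: {margin_bottom}; "
        "@bottom { content: flow(footer-html); } }"
    )
    # The footer element is placed before the main content, as in the example above.
    return (
        "<html><head><style>" + css + "</style></head><body>"
        f"<footer>{footer_html}</footer>{body_html}"
        "</body></html>"
    )


# Usage with the client from earlier (requires network access):
# html = with_running_footer("<p>Normal document content</p>", "Page footer text")
# response = doc_api.create_doc(
#     {"test": True, "document_type": "pdf", "document_content": html}
# )
```

The margin_bottom parameter mirrors the 80px bottom margin in the CSS above, which leaves room on the page for the footer.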

But if you wanted a different footer on a particular page, such as a title page, you could use CSS to define a named page:

<style type="text/css">
  /* This page is named "title-page"! */
  @page title-page {
    /* The title-page uses a different flow than the rest of the pages */
    @bottom {
      content: flow(title-footer);
    }
  }

  /* This defines where the named page is to be used */
  .title-page {
    page: title-page;
  }

  /* The new footer for the title page */
  footer#title-footer {
    flow: static(title-footer);
  }
</style>


<footer id="title-footer">
  <a href="https://serhiipuzyrov.com">Serhii is incredibly awesome! :)</a>
</footer>

<div class="title-page">
  <h1>This is the title</h1>
</div>
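Putting the two footers together in one document might look like the following Python sketch. The helper build_titled_document and the id default-footer are assumptions added for illustration; giving each footer its own id just avoids relying on selector specificity to tell them apart.

```python
TITLE_PAGE_CSS = """\
footer#default-footer { flow: static(footer-html); }
footer#title-footer { flow: static(title-footer); }
@page { margin-bottom: 80px; @bottom { content: flow(footer-html); } }
@page title-page { @bottom { content: flow(title-footer); } }
.title-page { page: title-page; }
"""


def build_titled_document(title_html, body_html, footer_html, title_footer_html):
    """Return one HTML document with a title-page footer and a default footer."""
    return (
        '<html><head><style type="text/css">' + TITLE_PAGE_CSS + "</style></head><body>"
        # Both footers come before the content, matching the order used above.
        f'<footer id="default-footer">{footer_html}</footer>'
        f'<footer id="title-footer">{title_footer_html}</footer>'
        f'<div class="title-page">{title_html}</div>'
        f"{body_html}"
        "</body></html>"
    )


# Usage: pass the result as "document_content" in the create_doc call shown earlier.
# html = build_titled_document("<h1>This is the title</h1>", "<p>Body</p>",
#                              "Default footer", "Title page footer")
```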

This same approach to varying content across pages can also be used for headers, full-bleed page backgrounds, or page orientations.

With DocRaptor, you can easily add page or chapter counters to your headers (or anywhere else in the document), add a dynamic table of contents, watermarks, or footnotes, or even do text substitutions. All of this is covered in DocRaptor’s documentation and knowledge base. I hope this helps you if you’re struggling to create advanced PDF documents with Python!