Introduction

PdfClerk is secure desktop software for creating feature-rich PDF document collections and binders. It offers a comprehensive set of tools for converting document references into hyperlinks, bundling submissions with stamps and bookmarks, and managing metadata for headers, footers, and bookmarks.

Main features

  • Automatically convert document references to hyperlinks, typically between submissions (briefs, motions, pleadings, etc.) and exhibits.
  • Bundle submissions (court bundles) with stamps, bookmarks and a table of contents
  • Create stamps with graphic elements such as borders, solid or transparent backgrounds and company logos in addition to text derived from metadata.
  • Manage metadata like exhibit numbers, titles, and dates for use in headers & footers (stamps) and bookmarks
  • Extract metadata from the filename of the document
  • Import and export metadata using Microsoft Excel format
  • Create customized tables of contents
  • Create hyperlinks to specific pages in the same or other binders using Clerk Links.
  • Edit and split binders
  • Reuse settings and workflows with configurable presets

PdfClerk’s autolink capabilities can be configured to work in different languages and on various types of documents, such as legal submissions, contracts, technical documentation and reports.

PdfClerk has several built-in configuration settings (aka Presets). Example documents and files can be found here.

Files

Input files

The input files are the files that you want to process, for example by merging, stamping or linking them.

You select the input files using the dropdown button Open input files. The button has three options:

  • Folder - Use this to select a folder as input. All pdf files in this folder and subfolders will be loaded into the document list.
  • Binder - Opens an existing binder (document collection) and loads the bookmark hierarchy of that binder into the document list.
  • Single file to link - Use this to select a single pdf or Word file that contains textual references to other documents to convert those references into links. It loads the selected file and all pdfs in the folder and subfolders into the document list. Only files marked “Source” are processed for references.

Note that when the source of the documents in the document list is an existing binder, you can add stamps and bookmarks to the binder, but you cannot use Autolinking or create a table of contents.

Select output folder

Use this button to select the output folder where files created or updated by PdfClerk should be saved.

Click the “New folder” item on the ribbon in the Select folder dialog to create a new folder.

Document list metadata

The metadata for each item in the document list is shown in the document list. Metadata can be used for stamps, bookmarks and table of contents. By default, the name of the item is the filename without extension for single documents, and existing bookmarks for binders.

Each item can be edited by clicking the text to be edited. You can navigate between entries using the up and down arrows.

Additional metadata columns can be added by clicking the plus sign in the top right of the document list.

Metadata is read from filenames or bookmarks using the regular expression specified under Metadata extraction. This is useful to extract information already present in filenames and bookmarks. Metadata can be imported and exported in Excel format.

Document Level

In a pdf, document bookmarks are nested in a hierarchy. In PdfClerk this is achieved by changing the level of an item. Since this is a tree structure, the first item in the document list needs to have 1 as level and the increase in level from one item to the next cannot be more than one.
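The two level rules above can be checked mechanically. The following Python sketch is only an illustration of the rules; `validate_levels` is a hypothetical helper and not part of PdfClerk.

```python
def validate_levels(levels):
    """Return a list of problems found in a sequence of document levels.

    Rules taken from the text above: the first item must have level 1, and
    the level may increase by at most one from one item to the next.
    """
    problems = []
    if levels and levels[0] != 1:
        problems.append(f"item 1: first item must have level 1, got {levels[0]}")
    for index in range(1, len(levels)):
        previous, current = levels[index - 1], levels[index]
        if current - previous > 1:
            problems.append(f"item {index + 1}: level jumps from {previous} to {current}")
    return problems


print(validate_levels([1, 2, 2, 3, 1]))  # a valid hierarchy
print(validate_levels([1, 3]))           # level 2 is skipped
```

Note that a level may drop by any amount from one item to the next; only increases are limited.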

The level can be changed for multiple items at once by selecting several items as described in Sort & reorder documents and then using the arrows next to the level in the document list for one of the selected items.

Sort & reorder documents

By default, the documents in the document list are sorted alphabetically on the name field.

You can sort documents in the document list using the sort menu. Click the sort arrows after the column name to open the sort menu. In addition to selecting the column you would like to sort on, you also need to specify what level you want to sort on. This is required since you normally would like to keep documents in the same place in the hierarchy when you sort them.

The sorting function is useful if you have added exhibit numbers manually and would like to sort the documents in accordance with their numbering.

Documents can be reordered manually. Select a document by clicking the circle to the right of the document in the document list. Select additional documents by holding the shift key and clicking them. You can then use the arrows in the floating menu to the left to move the selected documents up or down.

Use the following keyboard shortcuts to reorder documents more efficiently.

  • ctrl+up/down: Move selected document(s) up or down
  • ctrl+shift+up/down: Move selected document(s) up or down 10 positions
  • alt+up/down: Move document selection (cursor) up or down
  • shift+up/down: Expand or reduce selection
  • left or right arrow: Change the document level

If you like working with keyboard shortcuts, see also the global keyboard shortcuts.

Folders in the document list can be collapsed and expanded by clicking the - and + signs to the left of their name in the document list.

Add bookmarks

Additional bookmarks can be added by selecting the wrench icon on the right of the document in the document list. The bookmark will link to the first page of the relevant document. The folders are added as bookmarks when loading documents from a folder structure.

Document settings

By clicking the wrench icon on the right of the document in the document list, you can change the following properties of each individual document/bookmark:

  • Name - The name (and metadata) of each document and bookmark can also be edited from the document list by clicking on the text that should be edited.
  • Exclusion - Each document and bookmark can be excluded from stamping, from being included as a bookmark, and from appearing in the table of contents
  • Bookmark appearance - The appearance of each bookmark can be adjusted by setting font color and font style
  • Bookmark expansion - By default, all bookmarks are expanded. This can be changed on a per-level basis in the bookmarks section in the Output tab. In addition, this can be set per bookmark by selecting either default, collapsed, or expanded in the document properties.

Working with Existing binders

If an existing binder is selected as input via the Open input files button, PdfClerk will work on that binder only. The document structure based on the bookmarks will be loaded into the document list. In this mode, you can add stamps to the binder, rebuild bookmarks, update the document properties or split the binder into individual files based on the bookmarks. The information to be used in stamps and bookmarks can be retrieved from existing bookmarks, edited in PdfClerk, or loaded from Excel via the Load/save menu. You can also save bookmarks and their page numbers from an existing binder to Excel. This is useful for creating index files using Clerk Links.

Metadata extraction

PdfClerk can extract information from the name of each document, like exhibit numbers or dates. Extraction uses regular expressions with named capturing groups. The named capturing groups will be available as metadata for the document. Setting up regular expressions requires some technical knowledge. Here are some examples that can be adapted to your use case:

  • “(.*)exhibit ?(?<exhibit>[0-9]{1,5}[A-Za-z]{0,1})” - extracts 24 from “Letter to IBM, Exhibit 24” as exhibit
  • “.{6}_(?<doc>(.)+)” - extracts “Document 1” from “2001-02-03_Document 1.pdf” as doc
  • “(?<date>.+?)\s{1,3}(?<name>.*)(\s){0,3}\[(?<doc>[0-9]{1,5})?(-(?<exhibit>[0-9]{1,5})\])?” - extracts date as 2019-03-08, name as “Minutes from meeting”, doc as 4 and exhibit as 7 from “2019-03-08 Minutes from meeting [4-7]”
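To try out patterns like these outside PdfClerk, you can run them in any regex engine that supports named groups. The sketch below uses Python, which spells named groups `(?P<name>…)` instead of `(?<name>…)`. It assumes case-insensitive matching, a search anywhere in the document name (without file extension), and trimming of surrounding whitespace; none of these is confirmed PdfClerk behavior. `extract_metadata` is a hypothetical helper.

```python
import re

# Patterns adapted from the examples above, using Python's (?P<name>...) spelling.
EXHIBIT = r"(.*)exhibit ?(?P<exhibit>[0-9]{1,5}[A-Za-z]{0,1})"
DOC = r".{6}_(?P<doc>(.)+)"
FULL = (
    r"(?P<date>.+?)\s{1,3}(?P<name>.*)(\s){0,3}"
    r"\[(?P<doc>[0-9]{1,5})?(-(?P<exhibit>[0-9]{1,5})\])?"
)


def extract_metadata(pattern, name):
    """Return the named groups that matched, with surrounding whitespace trimmed."""
    match = re.search(pattern, name, flags=re.IGNORECASE)
    if match is None:
        return {}
    # Groups that did not take part in the match are None and are left out.
    return {key: value.strip() for key, value in match.groupdict().items() if value}


print(extract_metadata(EXHIBIT, "Letter to IBM, Exhibit 24"))
print(extract_metadata(DOC, "2001-02-03_Document 1"))
print(extract_metadata(FULL, "2019-03-08 Minutes from meeting [4-7]"))
```

Each key of the returned dictionary corresponds to a metadata column named after the capturing group.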

Press the refresh icon to run a new or edited regular expression.

To extract different data, you can run different regular expressions with different group names. If you run a regular expression with the same group name(s), the new values will overwrite the data in the existing columns.

Click “Add extractor” to add additional extractors. If you select another level than All in the Level selector, metadata will only be extracted for the selected level.

Edit Document List

  • Set column value – Use this function to batch update the value of a column in the document list. You can use the same metadata text functions as when creating stamps and bookmarks.
  • Remove entries – Use this function to remove entries with a level at or above the provided level.

Map Exhibits

When you have a main document with exhibits, this can be organized in PdfClerk by saving the exhibits in a subfolder and placing the main document right before that folder in the document list.

Instead of manually reordering the documents, you can use the function Map Exhibits to tell PdfClerk what folder contains the exhibits to each of the main documents.

In cases where there are several series of exhibits belonging to different main documents, this mapping is required for PdfClerk to understand references of the type “Exhibit X of Document Y, page 22”.

Any existing cross-links between pdf files merged into a bundle will be maintained. This is useful for links added to a document using Microsoft Word or a pdf editor. Links with a relative path are matched against both the source folder and the target folder. For links added via Microsoft Word you can also specify which page number to link to by adding a ‘#’ and the page number at the end of the filename (for example, c:\files\myFile.pdf#3 links to page 3).

Settings

Output Format

PdfClerk can update individual files or bundle all documents into a binder.

You can select what documents PdfClerk outputs by selecting one of the options under Output.

  • All documents – Copies all documents, modified or not, to the output folder.
  • Modified documents – Copies only modified documents to the output folder. Note that when using this setting, links to documents that are not modified, and therefore not copied to the output folder, will not work until those documents are copied to the same folder.
  • Merged binder – Merges the documents in the document list in a binder.

General Settings

  • Start documents on odd page – Ensures that all documents start on an odd page to allow double-sided printing
  • Add blank page before – Use this option to add a blank page in front of other documents. Use in combination with stamps to create named separation sheets and similar. By default, a blank page is added in front of all documents, except the first document in the binder. Select a field from the dropdown list to only add blank pages in front of documents that have a value for this field, including the first document. Optionally, select a background color.
  • Create binders as attachment binders – Creates a pdf binder with the first document in the document list as the main document and the other documents in the document list as attachments to that document.
  • Open output folder – Opens the output folder after the document(s) have been created
  • Debug – If this option is selected additional information is collected during execution to enable debugging, including more detailed information in the Log-tab. The information is stored in a folder named “debug” in the output folder. In addition any temporary files created during execution will not be deleted and may be found in the “_temp” folder located in the output directory.

Table of Contents

PdfClerk can create a clickable table of contents based on the documents in the document list.

  • Select the template to use in the dropdown.
  • An entry for the table of contents then appears in the document list. The name as it appears in the document list will be the title of the table of contents. You can change it by clicking and editing it (the default name is Table of contents).
  • For templates in Word format, you can edit the project template by clicking the wrench icon next to the document in the document list.
  • Once inserted the table of contents can be moved to another location in the document list.
  • Optionally, select a background color.
  • Templates are Word or HTML documents located in PdfClerk’s Templates folder (normally found in the Windows Documents folder).
  • You can edit the templates or add your own by adding them to the Templates folder. In addition to changing the layout and adding company stationery, you can configure them to output additional metadata fields for the documents like exhibit numbers and dates. Read more about how to edit and create your own table of contents templates in the Create Index section.
  • PdfClerk has an alternative, even more flexible, method for creating tables of contents. See Create Index.

Stamps

General

PdfClerk can put Bates numbering, exhibit numbers, page numbers, document titles and other information on documents.

There are two types of stamps:

  • Text stamps – Inserts the text defined in the Text field.
  • Graphic stamps – Inserts a graphic element in addition to the text. The graphic element can be a border, a solid or transparent background, a logo, fixed text or other elements. If no text is provided in the Text field, only the graphic will be added.

Each stamp has the following settings:

  • Text – The text to be inserted. Text enclosed in < > will be handled as metadata and replaced with the corresponding value. Available fields are shown below the stamps. Click the value to insert it in the active stamp. See Accessing Metadata on how to use metadata.
  • Position – Defines where on the page the stamp should be placed
  • X-margin – The horizontal margin for the stamp. For right and center positions, it is calculated from the right edge of the page. For left positions, it is calculated from the left edge. Negative values are allowed.
  • Y-margin – The vertical margin for the stamp. For top and middle positions, it is calculated from the top edge of the page. For bottom positions, it is calculated from the bottom edge. Negative values are allowed.
  • Font – The font used for the text in the stamp
  • Color – Colors are specified using the color picker or alternatively using their HEX values.
  • Level – Specify if a stamp should only be printed on documents at a specific level.
  • Document pages – The default setting “all” prints the stamp on all the pages of a document, “first page” prints it only on the first page of the document, while “other” prints on all document pages other than the first page. If you use the function Add blank page before, the added page is considered the first page of the document.
  • First binder page – Uncheck this box to not print the stamp on the first page of a binder.
  • No text – For graphic stamps, tick this box to prevent any text from being printed on the page together with the stamp. This is useful if you use a textual expression to control which documents the stamp is printed on. If the expression evaluates to an empty string, then the graphic stamp is not added.
  • Use this – Uncheck this box to not apply the stamp.

More settings

For graphic stamps, there might be optional settings. The options available depend on the type of the stamp and the configuration options provided in the configuration file of the stamp.

For graphic stamps where the placement of the text is not defined in the svg file, the placement of the text relative to the graphic element can be adjusted:

  • Horizontal position - Left, Right or Center
  • Vertical position - Top, Middle or Bottom
  • textAdjustX - Adjustment to the position on the X-axis - both positive and negative numbers can be used
  • textAdjustY - Adjustment to the position on the Y-axis - both positive and negative numbers can be used

PdfClerk ships with some predefined stamps located in the stamps folder. Additional stamps can be added in the stamps folder by adding a configuration file and an image (svg, png, jpeg, gif, bmp or tiff). If you need help adding a stamp, like your company logo, let us know and we will help.

Bookmarks

Select this option to add bookmarks to a new binder or replace bookmarks in an existing binder. Bookmarks in pdfs are sometimes called outlines. Specify the metadata fields to use for the bookmark text in the Build string field. See Accessing Metadata below for a reference on how to create bookmarks based on the available metadata.

You can use different build strings for different bookmark levels by adding additional build strings and specifying the level for each build string. Bookmarks are only added for levels that have a build string defined.

Note that adding bookmarks to an existing binder will remove any existing bookmarks. To remove all bookmarks in a binder without adding new bookmarks, check the Remove existing option.

By default all bookmarks are expanded, meaning that the whole bookmark tree is shown when the binder is opened. Use the Collapse Bookmarks function to collapse one or more levels of the bookmarks. These settings can be overridden for individual documents using the properties for each document available using the wrench icon in the document list.

Rename Files

By default individual files keep the name they have. You can change the name of individual files by editing the name column in the document list. Alternatively, you can rename all files by checking the Rename files checkbox. You can then provide the pattern for the file names in the File name text box using the metadata items as building blocks.

Split binder by bookmarks

Use this function to split a binder by its bookmarks. You can select which bookmark level should be used for splitting the binder. The default name of each file will be the bookmark’s title. This can be changed by specifying another value in the Filename pattern field, which works like stamps and bookmarks. Different levels of metadata can be accessed using the <l1-name> syntax (where 1 represents the level). Documents can be organized in subfolders by adding a backslash (\) between different property names. If there are file attachments to the binder, these will be extracted to a subfolder named attachments.

This setting is only available when an existing binder is loaded.

Accessing Metadata

Default and additional properties

The following default metadata items are available for use in stamps and bookmarks:

  • <name>: the name of the bookmark or file without extension (like “filename”)
  • <filename>: the filename of the document (like “filename.pdf”)
  • <page>: the page number of the document
  • <numpages>: the number of pages in the document or binder
  • <docpage>: the page number of the merged in document (stamps only)
  • <filenumber>: counter increased by one per file saved (split binder only)

You can add additional metadata items by clicking the plus sign in the floating menu to the right of the document list in the Files tab view. You can also load metadata from an Excel Workbook. Additional metadata items may typically be exhibit numbers, dates etc.

Formatting

For metadata the following formatting may be applied:

  • add - Use this function if you would like to start the pagination on another page number than 1. If you want to start the pagination of a binder on page 101 use the following format: <page|add:100>
  • pad - Use this function to ensure that a number is represented by a fixed number of digits by adding leading zeros. Example: “003” instead of “3”. Use the following format <page|pad:3>
  • max - Use this function to limit text length to a specific number of letters. Use the following format <title|max:30>
  • maxe - Same as above but an ellipsis (…) is added to the end of the string if it is shortened by the function. Use the following format <title|maxe:29>

The formatting functions may be chained.
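The effect of the formatting functions can be illustrated with a small Python sketch. It is not PdfClerk's implementation. It assumes that chained functions are written one after another, each separated by “|” (for example <page|add:100|pad:4>), and that maxe cuts the text to the given length before appending the ellipsis; both are assumptions. `apply_format` and `render` are hypothetical names.

```python
import re


def apply_format(value, spec):
    """Apply one 'name:argument' formatting step to a metadata value (as text)."""
    name, _, argument = spec.partition(":")
    amount = int(argument)
    if name == "add":    # shift a number, e.g. to start pagination at 101
        return str(int(value) + amount)
    if name == "pad":    # left-pad with zeros to a fixed number of digits
        return value.zfill(amount)
    if name == "max":    # cut text to at most `amount` characters
        return value[:amount]
    if name == "maxe":   # as max, but mark a cut with an ellipsis (assumed semantics)
        return value if len(value) <= amount else value[:amount] + "…"
    raise ValueError(f"unknown formatting function: {name}")


def render(template, metadata):
    """Replace each <field|fn:arg|...> placeholder with its formatted value."""
    def substitute(match):
        field, *specs = match.group(1).split("|")
        value = str(metadata.get(field, ""))
        if not value:            # empty metadata stays empty
            return ""
        for spec in specs:       # chained functions run left to right
            value = apply_format(value, spec)
        return value

    return re.sub(r"<([^<>]+)>", substitute, template)


print(render("Page <page|add:100|pad:4>", {"page": 1}))
print(render("<title|maxe:10>", {"title": "Minutes from meeting"}))
```

In the chained example the offset is applied before the padding, so page 1 is first turned into 101 and then padded to four digits.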

Empty values & text control

If metadata for an item is unset or empty PdfClerk ignores the metadata item but inserts the rest of the string provided in the input box. This can be changed by enclosing the relevant part of the string in curly brackets. Examples:

  • “Exhibit <exhibit>” inserts “Exhibit ” if there is no value for exhibit
  • “{Exhibit <exhibit>}” inserts nothing if there is no value for the exhibit.
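The difference between the two examples can be expressed as a short Python sketch. It assumes that a curly-bracket group is dropped as soon as any metadata item inside it is empty; how PdfClerk treats groups with several items is not stated here, so treat that as an assumption. `render_with_groups` is a hypothetical helper.

```python
import re

TOKEN = re.compile(r"\{([^{}]*)\}|<([^<>]+)>")   # a {...} group or a bare <field>
FIELD = re.compile(r"<([^<>]+)>")


def render_with_groups(template, metadata):
    """Fill <field> placeholders; drop a {...} group if a field inside it is empty."""
    def value_of(field):
        value = metadata.get(field)
        return "" if value is None else str(value)

    def replace(match):
        group_body, field = match.groups()
        if field is not None:                     # plain <field> outside curly brackets
            return value_of(field)
        names = FIELD.findall(group_body)
        if any(value_of(name) == "" for name in names):
            return ""                             # assumption: one empty field hides the group
        return FIELD.sub(lambda m: value_of(m.group(1)), group_body)

    return TOKEN.sub(replace, template)


print(repr(render_with_groups("Exhibit <exhibit>", {})))
print(repr(render_with_groups("{Exhibit <exhibit>}", {})))
print(repr(render_with_groups("{Exhibit <exhibit>}", {"exhibit": "7"})))
```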

Accessing metadata at specific levels

When accessing metadata items (like <name>), PdfClerk by default uses the highest metadata level for the relevant item. For stamps and for file names when splitting binders, lower-level metadata items can be accessed by adding “l”, the level of the metadata item to be accessed, and a hyphen in front of the property name, like <l1-name>. This is useful if a metadata item like the exhibit number is available at level 2 and should also be printed on documents at level 3. It is also useful for including the full context of a document in the document's header or footer, like “Movies / Action / Die Hard series”.
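As a rough illustration of level-specific access, the Python sketch below resolves placeholders against a chain of metadata entries, one per level. The data model (a list of dictionaries from level 1 down to the document itself) and the rule that a plain field name reads the document's own entry are assumptions made for the example; `resolve` and `render_levels` are hypothetical names.

```python
import re


def resolve(field, chain):
    """Look up a field for a document.

    `chain` holds one metadata dict per level, from level 1 down to the
    document itself. A plain field name reads the document's own entry
    (an assumption about the default); "l<N>-field" reads level N.
    """
    match = re.fullmatch(r"l(\d+)-(.+)", field)
    if match:
        level, name = int(match.group(1)), match.group(2)
        entry = chain[level - 1] if 1 <= level <= len(chain) else {}
    else:
        name, entry = field, chain[-1]
    return str(entry.get(name, ""))


def render_levels(template, chain):
    """Replace every <field> or <lN-field> placeholder in the template."""
    return re.sub(r"<([^<>]+)>", lambda m: resolve(m.group(1), chain), template)


chain = [{"name": "Movies"}, {"name": "Action"}, {"name": "Die Hard series"}]
print(render_levels("<l1-name> / <l2-name> / <name>", chain))
```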

PDF Document Properties

For binders the document properties title, author and subject can be set under Document Properties. If the checkbox next to the value is not checked, the original values will be kept.

When working with individual files or when splitting a binder into multiple files these properties can be set using dynamic metadata in the same way as for the file names.

Restricted Documents

Electronically signed documents or documents locked for editing cannot be merged or stamped. To overcome this limitation PdfClerk creates a copy of the document with the digital signature removed. For the same purposes, PdfClerk ignores the protected status for some pdfs solely for the purpose of merging and stamping such documents.

Autolink

General

PdfClerk’s Autolink function identifies textual references to other documents or pages in the text of pdf documents and adds hyperlinks to the document and page being referenced. As an example, PdfClerk can add a hyperlink to the text “Pleading of 23rd December 2021, Exhibit 4, page 86” pointing to page 86 in the document that is Exhibit 4 to that pleading.

For each reference PdfClerk can handle four items:

  • Document – the target document to link to
  • Exhibit – a target document which is an exhibit to a document
  • Page number – the page number in the target document
  • Paragraph number – the paragraph number in the target document

When the autolink feature is enabled in the Autolink tab, PdfClerk looks for references in all documents that have the checkbox Source in the document list checked. By default, all documents at level 1 are checked.

How PdfClerk recognises references can be configured to fit different use cases and languages. Look below in the section Searches to learn how to configure reference Searches in PdfClerk.

Documents

The document part of the reference is matched against the field Alias in the document list. By default the Alias contains the filename of a document (without the file extension). The Alias can be edited manually. There may be several aliases for each document. To add multiple entries, separate them using a comma. If the entry itself contains a comma, escape it using a backslash (\,). You can configure PdfClerk to read the entries from another column in the document list using the setting Show document field search mapping.
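The comma and backslash rule for the Alias field can be sketched in Python as follows. This only illustrates the escaping convention; trimming of spaces around each alias is an assumption, and `split_aliases` is a hypothetical helper.

```python
import re


def split_aliases(alias_field):
    """Split a comma-separated Alias value, honouring backslash-escaped commas."""
    # Split only on commas that are not preceded by a backslash.
    parts = re.split(r"(?<!\\),", alias_field)
    # Turn each escaped comma back into a plain comma and drop empty entries.
    return [part.replace("\\,", ",").strip() for part in parts if part.strip()]


print(split_aliases(r"Statement of Claim, SoC, Smith\, Jones & Co. letter"))
```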

Exhibits

An exhibit is a document that is a sub document to a main document. Typical examples are exhibits to briefs or annexes to agreements. In PdfClerk documents on level 1 are considered as main documents, while documents on level 2 are considered as exhibits to the parent level 1 document.

When opening documents in PdfClerk all files in the folder selected are loaded as main (level 1) documents while documents in subfolders are loaded as exhibits (level 2). You can map which subfolder is assigned to each level 1 document by using the Map Exhibits feature. You can change the level of documents. By enabling the feature All documents share all exhibits, all exhibits are shared by all documents.

For exhibits, references are by default matched against the Exhibit column in the document list and not the Alias column. By checking the box Exhibits are also documents in the Autolink-tab, level 2 documents are also matched against the value in the Alias column.

By default, PdfClerk only looks for references in level 1 documents. This may be changed by checking the Source checkbox in the document list for any other document.

If PdfClerk identifies a reference to an exhibit without any document reference, it assumes that the reference refers to an exhibit of the main document being processed.

You can configure PdfClerk to read the entries from another column than Exhibits in the document list using the setting Show document field search mapping.

Page Numbers

Page numbers are the number or numbers in the reference that represent the page or pages of the target document that should be linked to. If no page number is provided, PdfClerk creates a link to the first page of the document.

Sometimes a reference references multiple pages in the same document. PdfClerk supports automatically creating links to each page number. This works by creating a page number search that accepts multiple numbers. PdfClerk then identifies that the search result includes multiple page numbers and creates a link to each of them.

Example searches
page (?<page>\d+) = matches single page number
pages (?<page>[\d+,\S]) = matches multiple page numbers like “23, 45”
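The idea of one search result yielding several links can be sketched in Python. The pattern below is an assumed multi-page search (the word “pages” followed by digits, commas and whitespace), written with Python's `(?P<name>…)` spelling; `page_targets` is a hypothetical helper and not PdfClerk's implementation.

```python
import re

# Assumed search: the word "pages" followed by digits, commas and whitespace.
PAGES = re.compile(r"pages (?P<page>[\d,\s]+)")


def page_targets(text):
    """Return every page number found in a multi-page reference, in order."""
    match = PAGES.search(text)
    if match is None:
        return []
    # One captured string such as "23, 45" yields one link target per number.
    return [int(number) for number in re.findall(r"\d+", match.group("page"))]


print(page_targets("See Exhibit 4, pages 23, 45 for details"))
```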

Paragraph Numbers

Paragraph numbers are the number or numbers in the reference that represent the paragraph or paragraphs of the target document that should be linked to. PdfClerk analyzes the target document and identifies paragraph numbers by looking for sequential numbers at the same indent on multiple pages. The function works best with native electronic documents. As with page numbers, PdfClerk supports multiple paragraph numbers in the same reference.

Searches

PdfClerk identifies the references in the text of the pdf files by performing searches. These searches are specified in Searches in the Autolink-tab.

For each search there are two text fields. The first text field is to give the search a name (does not have any effect on the searches). The second text field contains the actual search.

The main building blocks of the searches are the base patterns {d}, {e}, {p} and {pp}, which represent a document, an exhibit, a page and a paragraph respectively. If an exclamation mark is added after the letter, the element is mandatory: the search will only match text that contains this element. This is useful, for example, to enable a search that finds references both with and without page numbers.

Examples

  • {d!} {e!} {p} : Finds references of the form “Document Exhibit page number”, where the page number is not mandatory.

There must be at least one mandatory base pattern in each search.

A phrase of text in a document may match against more than one of the searches. In most cases, a reference to an exhibit of the document itself may be simply referred to as “Exhibit B” while a reference to an exhibit of another document may be referred to as “Exhibit B to Pleading of 3rd January”. A search for exhibits without qualifying document references will also find the qualified references. For that reason, the searches should be executed in order, starting with the most specific first. The searches are done in the same order as they are listed in the interface, and the order may be changed by drag and drop.
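Why the order matters can be shown with a small Python sketch. It assumes that a stretch of text receives at most one link, so a later search cannot link text that an earlier search has already claimed. The two regular expressions are simplified stand-ins for real searches, and `find_links` is a hypothetical helper.

```python
import re


def find_links(text, searches):
    """Run named searches in order; keep a match only if its text is still unlinked."""
    links = []  # (start, end, search name)
    for name, pattern in searches:
        for match in re.finditer(pattern, text):
            start, end = match.span()
            overlaps = any(start < e and s < end for s, e, _ in links)
            if not overlaps:
                links.append((start, end, name))
    return sorted(links)


# Most specific search first, as recommended above.
searches = [
    ("exhibit of another document", r"Exhibit [A-Z] to Pleading of \d+\w* \w+"),
    ("exhibit of this document", r"Exhibit [A-Z]"),
]
text = "See Exhibit B to Pleading of 3rd January and Exhibit C."
for start, end, name in find_links(text, searches):
    print(f"{text[start:end]!r} -> {name}")
```

If the two searches are swapped, the general search claims “Exhibit B” first and the qualified reference is never linked to the other pleading.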

Base Patterns

Each of the base patterns is a regular expression defined in the Base patterns setting. The base patterns are the building blocks for the searches described above.

The base pattern for Document {d} uses the name “doc” as the extraction group. The “#doclist#” placeholder in the definition is replaced with a list of all the aliases when the search is performed.

The base pattern for Exhibit {e} is defined as a regular expression with “exhibit” as the extraction group name. Typically, the references for an exhibit would consist of a fixed name like “Exhibit” and an exhibit number.

The base pattern for Page Number(s) {p} is defined as a regular expression with “page” as the extraction group name. Typically, the references for a page number would consist of an identifier like “page” or “p.” followed by a number.

The base pattern for Paragraph Number(s) {pp} is defined as a regular expression with “para” as the extraction group name. Typically, the references for a paragraph number would consist of an identifier like “para” or “ ¶” followed by a number.

Besides basic patterns, there is also a building block for whitespace. When the search is performed any whitespace in the base patterns and searches is replaced with the content of this configuration.
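How a search could be assembled from these building blocks is sketched below in Python. Everything in it is an assumption made for illustration: the base pattern definitions, the whitespace pattern, and the way optional elements are compiled are hypothetical and not PdfClerk's actual configuration or implementation. The sketch only handles searches made up of base pattern tokens separated by spaces.

```python
import re

# Hypothetical base pattern definitions; the real ones are configured in
# the Base patterns setting and may look different.
BASE_PATTERNS = {
    "d": r"(?P<doc>#doclist#)",
    "e": r"Exhibit (?P<exhibit>\d+)",
    "p": r"page (?P<page>\d+)",
}
WHITESPACE = r"\s+"  # assumed content of the whitespace building block


def build_search(search, aliases):
    """Expand a search such as "{d!} {e!} {p}" into a compiled regex."""
    # Each alias becomes one alternative; spaces inside it use WHITESPACE.
    doclist = "|".join(
        WHITESPACE.join(re.escape(word) for word in alias.split())
        for alias in aliases
    )
    parts = []
    for token in search.split():
        key, mandatory = token.strip("{}!"), token.endswith("!}")
        piece = BASE_PATTERNS[key].replace(" ", WHITESPACE)
        piece = piece.replace("#doclist#", doclist)
        separator = WHITESPACE if parts else ""
        if mandatory:
            parts.append(separator + piece)
        else:
            # An optional element takes its leading whitespace with it.
            parts.append(f"(?:{separator}{piece})?")
    return re.compile("".join(parts))


pattern = build_search("{d!} {e!} {p}", ["Pleading of 3rd January", "Reply"])
match = pattern.search("See Pleading of 3rd January Exhibit 4 page 86.")
print(match.groupdict())
```

Because whitespace is replaced by a pattern rather than matched literally, the same search also matches a reference that is broken across two lines.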

In most cases, PdfClerk also recognizes patterns when they are spread over two lines or when the page has multiple columns. PdfClerk will also handle line breaks within single words if the words are separated with a hyphen or underscore.

Advanced

  • Regex lookaheads and lookbehinds in a search pattern need to be wrapped in a comment with the text “prtIgnore”. Example lookahead for “:” is (?#prtIgnore)(?=:)(?#/prtIgnore)
  • By default PdfClerk only allows one link on each item of text. That means that if there are two searches that match the same text, only the search that runs first results in a link. Add #i# before the search to override this behavior.
  • The #doclist# in the definition of the document base pattern is replaced at runtime with a list of the document aliases (separated by |)
  • Underline Links – If checked, the links will be underlined in the selected color. Width is the thickness of the line. Default is 1 pt.
  • Highlight Links – If checked, the links will be highlighted in the selected color. Note that the highlight color becomes part of the pdf once applied and cannot be removed. Link highlighting may not work on some documents.
  • Open Links in New Window – If checked, the links are opened in a window (or tab, based on which pdf viewer is used) instead of reusing the same window.
  • Adjust page link target – Sometimes there is an offset between the page number stated in the reference and the page you would like to link to. The target page to link to can be adjusted by selecting the wrench icon on the right of the document in the Files tab and changing the value for Adjust page link target. Positive and negative numbers can be used.
  • Documents to ignore – If there are references in the document that should not be linked, you can exclude them by adding the names used to the Documents to Ignore section below the document list in the Files tab. Multiple references can be added using a comma as delimiter. No links will be added to references listed in the Documents to Ignore box.
  • Tag Links and Documents to Enable Binder Recognition – Enable this option to add tags to the target file of a link to enable links to work also after a linked document is inserted into another document collection using software other than PdfClerk.
  • Show document field search mapping – By default, PdfClerk looks for references using the data field Alias for documents and the field exhibit for exhibits. Use this option to set alternative columns to match documents and exhibit references.
  • Exhibits are also documents – If checked, PdfClerk will also use the Alias field (or any other field configured for matching document references) to match references to exhibits. Click the Populate link to load default alias names from the file names.
  • All documents share all exhibits – By default, exhibits in a subfolder belong to one or more documents as configured by the Map Exhibits function in the Files tab. That means that if there is a reference like “Exhibit 1” in a level 1 document, PdfClerk understands that this refers to Exhibit 1 of this document and not to an Exhibit 1 of another document. When this function is enabled, all exhibits belong to all documents. This works as long as no two exhibits belonging to different documents share the same reference.
  • Case-insensitive searches – Enable this option to use case-insensitive searches.
  • Ignore zero-padding – If checked, PdfClerk will ignore zero padding when looking up references against documents and exhibits. The reference Exhibit 0001 will match the alias and exhibit Exhibit 1. Only the first zero padding is ignored. If a single zero is not followed by a digit, it will not be ignored in the matching.
  • Limit pages to search for references – Click the wrench icon for the document in the document list and specify which pages to search in the Pages to search input field. You can provide one or more ranges like “3, 5-10,!19”. Each range can be either a single page or a span of pages. The ranges are separated by commas, and you can exclude a range by prefixing it with an exclamation mark.
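As an illustration of the page range syntax, the sketch below expands a specification into a list of page numbers. It is not PdfClerk's own parser: the function name parse_page_spec is hypothetical, and it assumes that a specification containing only exclusions starts from all pages of the document.

```python
def parse_page_spec(spec: str, total_pages: int) -> list[int]:
    """Expand a spec like "3, 5-10,!19" into a sorted list of page numbers."""
    included: set[int] = set()
    excluded: set[int] = set()
    for raw in spec.split(","):
        part = raw.strip()
        if not part:
            continue
        # A leading "!" marks the range as an exclusion.
        target = excluded if part.startswith("!") else included
        part = part.lstrip("!").strip()
        first, _, last = part.partition("-")
        start = int(first)
        end = int(last) if last else start
        target.update(range(start, end + 1))
    # Assumption: a spec with only exclusions starts from every page.
    if not included:
        included = set(range(1, total_pages + 1))
    return sorted(p for p in included - excluded if 1 <= p <= total_pages)


pages = parse_page_spec("1-5,!3", total_pages=10)
```

Here pages holds 1, 2, 4 and 5: the range 1-5 with page 3 excluded.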

By default the aliases and the locations of the documents to be linked to are defined in the document list. By enabling “Use link targets from Excel” the list of the aliases and their corresponding documents will instead be loaded from an Excel worksheet. This is useful when you want to add links to documents in existing binders.

The first step is to create the link targets list, which is an Excel worksheet with the following columns:

  • Alias – the alias used as a reference in the text like “RE-123”, “Exhibit 1” or other
  • File – the PDF binder with the documents to be linked to
  • Page – the page number for the first page of the document in the binder

The first row of each column should be the name of the column. The worksheet may contain additional columns.

If a document has more than one alias you can include several aliases for each document by using a comma separated list. Alternatively you can add more columns with names starting with “Alias”.
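The structure of the link targets list can be pictured as a lookup from alias to binder and page. The sketch below builds such a lookup from rows that mimic the worksheet; it only models the layout described above and is not how PdfClerk reads the file. The function name, the column name Alias2 and the file name Binder1.pdf are made up for the example.

```python
def build_alias_index(rows: list[dict[str, str]]) -> dict[str, tuple[str, int]]:
    """Map every alias to the (file, page) it should link to."""
    index: dict[str, tuple[str, int]] = {}
    for row in rows:
        target = (row["File"], int(row["Page"]))
        for column, value in row.items():
            # Any column whose name starts with "Alias" may hold aliases.
            if not column.startswith("Alias") or not value:
                continue
            # A cell may hold several aliases as a comma separated list.
            for alias in value.split(","):
                alias = alias.strip()
                if alias:
                    index[alias] = target
    return index


# Hypothetical rows: the first uses a comma separated list,
# the second uses an extra column named "Alias2".
rows = [
    {"Alias": "RE-123, Exhibit 1", "File": "Binder1.pdf", "Page": "1"},
    {"Alias": "RE-124", "Alias2": "Exhibit 2", "File": "Binder1.pdf", "Page": "15"},
]
index = build_alias_index(rows)
```

Both ways of listing several aliases end up in the same lookup, so “RE-124” and “Exhibit 2” point to page 15 of the same binder.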

PdfClerk can help you create the link targets list. First load the binder by using “Open input files” and “Binder”. PdfClerk will then load the binder and show the document names and their location in the binder. You can then add a metadata column named “Alias” and add the aliases manually or you can use the Metadata extraction function to extract the aliases. You can save the list to Excel by using the “Save To Excel” in the Load & Save tab.

Before you can use the Excel file with the Use link targets from Excel option, you also need to add a column named File with the filename of the binder. If the document you want to add links to will not be in the same folder as the binder, you can add the relative path in front of the filename. The process described here can be repeated for multiple binders and the Excel files combined into one master file containing all the aliases for all the documents in all the different binders in a case.

To use the link targets list, first load the document or documents that you want to add links to using “Open input files” > “Folder” or “Single file to link”. Make sure that the Source checkbox is checked for each document you want to add links to. Then configure the search and load the link targets list using Use link targets from Excel in the Autolink tab.

There is an example set of documents showing how this works, including an example Excel file here.

This method can also be applied if documents are not organized in binders. It provides an overview of all documents in the case and their aliases, enabling you to easily add links to new documents.

You can also use such a master file to create custom indexes and tables of contents for all the documents in the case with the Create Index function.

Guided Autolinking

PdfClerk’s Autolink function uses formal pattern recognition to automatically find references and translate them into clickable links. For some advanced referencing styles, not all references can be automatically converted into clickable links. This may be the case if the references do not contain sufficient information to determine the target document and page for the reference.

In such situations, you can use PdfClerk’s Guided Autolinking feature to first identify and extract the references and then go through the references and modify them manually.

To use Guided Autolinking, enable the option Export references to Excel (found under Advanced in the Autolink tab). Identified references are then saved to an Excel spreadsheet named “References.xlsx” located in the Output folder. This Excel spreadsheet can then be manually modified to edit, add and remove links. After the Excel spreadsheet has been edited, run PdfClerk using the same configuration and with the same output folder, and select Use links from Excel (in the Autolink tab). PdfClerk will then read the references from the Excel spreadsheet instead of performing searches, and apply the updated links to the documents.

The Excel spreadsheet contains the following columns:

  • Document – The document containing the reference
  • Page – The page of the document containing the reference
  • X, Y, Width, Height – Defines the rectangle on the page where the reference has been identified
  • Hit – The text of the identified reference
  • Modified – The text of the identified reference repeated. Used when editing the link
  • TargetPage – The page in the target document that should be linked to
  • TargetDoc – The target document to be linked to

PdfClerk supports the following modifications to the links identified in “References.xlsx”:

  • Edit the target of a link – Edit the TargetDoc and/or the TargetPage properties to change where the link will point to
  • Remove text from link – If the Hit column contains text that should not be part of the link, remove the part of the text that should not be part of the link from the Modified column
  • Split one reference into several links – If a reference is identified as one reference, but the reference is in fact several references, the reference can be split up into several links pointing to different documents and/or pages. To split one reference into several links, copy the row containing the reference to a new row. Then modify the text in the Modified column to contain the text for each link. Edit the TargetDoc and/or TargetPage accordingly.
  • Add new link – Add a new row and supply the information for Document, Page, TargetDoc and TargetPage. In the Modified column, write the text of the reference. Leave the X, Y, Width, Height columns blank.
  • Discard a link – Delete the relevant row in the Excel spreadsheet
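To make the split operation concrete, the sketch below models one spreadsheet row as a Python dictionary and produces the rows you would end up with after splitting it by hand in Excel. The helper split_reference and all values in the example row (file names, coordinates, text) are hypothetical.

```python
def split_reference(row: dict, parts: list[tuple[str, str, int]]) -> list[dict]:
    """Return one copy of `row` per (modified_text, target_doc, target_page)."""
    result = []
    for modified, target_doc, target_page in parts:
        new_row = dict(row)  # copy, so the original row stays untouched
        new_row["Modified"] = modified
        new_row["TargetDoc"] = target_doc
        new_row["TargetPage"] = target_page
        result.append(new_row)
    return result


# Hypothetical row: one hit that actually refers to two exhibits.
row = {
    "Document": "Brief.pdf", "Page": 4,
    "X": 72.0, "Y": 510.0, "Width": 120.0, "Height": 12.0,
    "Hit": "Exhibits 3 and 4", "Modified": "Exhibits 3 and 4",
    "TargetDoc": "Exhibit 3.pdf", "TargetPage": 1,
}
rows = split_reference(row, [
    ("Exhibits 3", "Exhibit 3.pdf", 1),
    ("4", "Exhibit 4.pdf", 1),
])
```

Each resulting row keeps the original Hit and rectangle, while Modified, TargetDoc and TargetPage differ per link.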

By selecting the option “Include unresolved references in export”, PdfClerk will also include in “References.xlsx” the references that match the search criteria in the Searches but are not matched to a document. This can be useful for extracting more complex references and also in other situations where you want to extract information from PDF documents.

Multi binder collections

Sometimes a document collection with sequential page numbers is split over multiple pdf binders (aka split binders). Autolinking supports routing the link to the correct binder and page. To enable PdfClerk to map references to the correct page in the correct binder, add the page ranges and page adjustments in the options for the document in the document list (click the wrench icon). The binders in the logical document collection should have the same Alias.

Example: All exhibits are collected in two binders named ExhibitsPart1.pdf with pages 1-1000 and ExhibitsPart2.pdf with pages 1001-2000. To convert references like Exhibits page 12 and Exhibits page 1222 into links to the relevant page in the corresponding binder:

  • Add “Exhibit” as an alias for both files
  • For ExhibitsPart1.pdf set “Link Page Start” to 1 and “End link page” to 1000
  • For ExhibitsPart2.pdf set “Link Page Start” to 1001 and “End link page” to 2000
  • For ExhibitsPart2.pdf set the “Adjust page” to -1000.
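The routing in this example can be sketched as follows. This only illustrates the arithmetic and is not PdfClerk's implementation; the class BinderPart and the function route are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BinderPart:
    filename: str
    start: int   # first logical page covered by this binder
    end: int     # last logical page covered by this binder
    adjust: int  # offset added to get the physical page in the file


def route(parts: list[BinderPart], page: int) -> Optional[tuple[str, int]]:
    """Find the binder holding a logical page and the page inside that file."""
    for part in parts:
        if part.start <= page <= part.end:
            return part.filename, page + part.adjust
    return None  # the page is outside every configured range


parts = [
    BinderPart("ExhibitsPart1.pdf", 1, 1000, 0),
    BinderPart("ExhibitsPart2.pdf", 1001, 2000, -1000),
]
first = route(parts, 12)     # ("ExhibitsPart1.pdf", 12)
second = route(parts, 1222)  # ("ExhibitsPart2.pdf", 222)
```

A reference to page 1222 falls in the range of the second binder, and the adjustment of -1000 turns it into page 222 of that file.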

If PdfClerk adds links that are not correct, the links can be removed in a pdf editing tool such as Adobe Acrobat (not Adobe Reader) or Foxit Reader (free, go to “Home”, “Links” in order to edit and remove links). Please note that if the option Highlight links is checked, the highlighting will remain after a link has been removed in the pdf editing tool.

Norwegian Civil Litigation Configuration

The Norwegian Civil Litigation Configuration included in PdfClerk identifies the patterns:

  • Bilag 3
  • Bilag 3 side 5
  • Stevningen side 4
  • Stevningens bilag 3 side 4
  • Stevningen bilag nr 3
  • Bilag 4 til stevningen, side 4

PdfClerk understands “side”, “s” and “s.” as indicating page numbers.

When looking for documents PdfClerk accepts alternative forms of the alias. This means that “stevning” will also match “stevningen” and “stevningens”.
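For readers who want a feel for this kind of pattern recognition, the sketch below uses a regular expression to find the simplest forms (“Bilag 3”, “Bilag 3 side 5”, “bilag nr 3”). It is an illustration only, not the pattern PdfClerk ships with, and it ignores references that name a pleading, such as “Bilag 4 til stevningen, side 4”, where it would find the exhibit but not the page.

```python
import re

# Illustrative pattern: "bilag", an optional "nr", the exhibit number,
# then an optional page part introduced by "side", "s" or "s.".
EXHIBIT_REF = re.compile(
    r"\bbilag\s+(?:nr\.?\s+)?(?P<exhibit>\d+)"
    r"(?:,?\s+(?:side|s\.?)\s*(?P<page>\d+))?",
    re.IGNORECASE,
)


def find_exhibit_refs(text: str) -> list[tuple[int, "int | None"]]:
    """Return (exhibit number, page number or None) for each reference found."""
    return [
        (int(m["exhibit"]), int(m["page"]) if m["page"] else None)
        for m in EXHIBIT_REF.finditer(text)
    ]


refs = find_exhibit_refs("Se bilag 3 side 5 og Bilag 7.")
```

In this sample sentence the first reference yields exhibit 3, page 5, and the second yields exhibit 7 with no page.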

For an example of resolving links in a US-style submission, have a look at the example “Motion to dismiss” on the Example documents page.

Load/Save

Projects

PdfClerk allows configurations and documents to be saved and loaded. The configuration files are saved as .clerk files, which are associated with PdfClerk. An alternative way to load a configuration is to double-click the configuration file.

The settings are separated into three different categories to enable easy reuse of configurations - Files, Searches, and Configuration. Which categories to load or save can be selected in the interface. Files are the documents in the document list and the input folder and output folder. Searches is configuration related to Autolinking while Configuration is all other settings.

The internal format of the configuration files is JSON. This means that more technical users with particular needs can create and alter PdfClerk project files using a text editor.

Excel Format

PdfClerk can read and write document metadata from and to Microsoft Excel format (.xlsx). This enables the use of a familiar and powerful tool as part of the pdf creation workflow.

The columns and format used are different depending on whether the input is separate files or an existing binder.

Separate Files

Document names and other metadata are stored in a sheet named “Documents” with the column names:

  • Level - the bookmark level
  • Filename - the full name of the file - including extension, but not the path
  • Name - the name of the file - not including the extension

The path to the documents is stored in the “Config” sheet as “SourceFolder”.

An example file for a document assembly can be found here.

Existing Binder

Bookmarks and other metadata are stored in a sheet named “Binder” with the column names:

  • Level - the bookmark level
  • Name - the name of the bookmark

The full path and filename to the binder are stored in the “Config” sheet as “BinderToBeUpdated”.

An example file for a binder can be found here.

General

Additional metadata is saved in additional metadata columns with the name of the column as the header.

Column order or formatting does not matter. There should be no empty columns or rows before non-empty ones.

Tools

The Tools section provides various tools to work with existing files and binders.

The Convert Clerk Links function translates Clerk Links to pdf links. Clerk Links can be used to create custom tables of contents or other documents with direct links to specific pages in pdf binders. Custom documents can be created as Microsoft Word or HTML documents. The Clerk Links are inserted in the custom document as hyperlinks using the format described below.

The default way to create a table of contents in PdfClerk is to automatically create and insert one when creating a binder using the insert Table of contents function on the Output tab. With Clerk Links you can create linked documents with other data sources and have full control over how the resulting document will look.

Clerk links can be inserted manually into the source document, or you can use the function Create Index (described below) to combine a template and data from a Microsoft Excel datasheet into a document.

The created pdf can either be a separate pdf that links to other pdfs or be included in a binder referencing other pages in that binder.

The Clerk Links support the following link formats:

  • clerk://MyBinder.pdf - Creates a link to page 1 in MyBinder.pdf
  • clerk://MyExcelSheet.xlsx - Creates a link to open the Excel spreadsheet (or any other file type)
  • clerk://MyBinder.pdf?page=27 - Creates a link to page 27 in MyBinder.pdf
  • clerk://Myfolder/MyOtherBinder.pdf?page=3 - Creates a link to page 3 in MyOtherBinder.pdf located in the folder Myfolder.
  • clerk://n/?page=3 - Creates a link to page 3 in the same file
  • clerk://n?anchor=someName - Creates an anchor (named destination) that can be linked to
  • clerk://n/?dest=someName - Links to an anchor (named destination)
  • clerk://n/?embed=MyExcelSheet.xlsx - Embeds MyExcelSheet.xlsx into the binder as an attachment and inserts a link that will open the embedded file. If the path is not absolute, PdfClerk looks for the file in the source folder
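If you generate many Clerk Links, for example from a list of documents, a small helper can assemble the file and page formats listed above. The sketch below is not part of PdfClerk; the function name clerk_link is hypothetical and it covers only links to a file, to a page in a file, and to a page in the same file.

```python
from typing import Optional


def clerk_link(file: Optional[str] = None, page: Optional[int] = None) -> str:
    """Build a Clerk Link to a file, a page in a file, or a page in the same file."""
    if file is None and page is None:
        raise ValueError("provide a file, a page, or both")
    # "n/" is the host part used when the link points into the same file.
    target = file if file is not None else "n/"
    query = f"?page={page}" if page is not None else ""
    return f"clerk://{target}{query}"


links = [
    clerk_link("MyBinder.pdf"),
    clerk_link("MyBinder.pdf", 27),
    clerk_link("Myfolder/MyOtherBinder.pdf", 3),
    clerk_link(page=3),
]
```

The four calls reproduce the first, third, fourth and fifth formats in the list above.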

The pdf file created should be placed in the same folder as the files linked to or, if paths are provided (like the example link to MyOtherBinder above), in the parent directory.

When you want to include a file in a binder that links to other parts of that binder, you should just convert the file from Word to pdf and then merge the pdf using PdfClerk. You can convert from Word format to pdf directly using Word, or you can use this tool in PdfClerk and check Do not convert links.

A possible workflow is to create links in Excel using the Excel Hyperlink function for each relevant document, then copy the links from Excel to Word as a table, and then format the table in Word before activating the links using PdfClerk.

Optionally, select a background color.

To be able to convert a file from Word to pdf format, MS Word version 2010 or newer needs to be installed on the computer. If the file has been converted from Word format to pdf format using another tool, PdfClerk can update the links as described above.

(The “/n/” part (the host part) of the URL is optional for files and can be added to comply with the URL specification: clerk://n/MyBinder.pdf)

Create Index

You can include a table of contents at the same time you create the binder using the Add Table of Contents in the Settings tab. If you want to link to multiple binders, add a table of contents to an existing binder or for other reasons need more control over the output, you can use the Create Index function found under the Tools tab.

This function works by combining entries from an Excel spreadsheet into a Word or HTML template. If you want to create a table of contents for one or more existing binders with bookmarks you can export the bookmarks to Excel using the save to Excel function found under the Load/Save tab.

The Create Index function creates an editable .docx or HTML file with Clerk Links on the relevant entries. This allows you to edit the file before you proceed to convert it to pdf using the Convert Clerk Links function. The Clerk Links format is described above.

Use the option to Adjust target page number if you need to offset the page numbers provided in the Excel file. This could be handy if you plan to insert the created index at the beginning of the binder that the index links to.

The Excel datasheet

The Excel datasheet should have a sheet named Catalog (or Documents or Binder) with the following column headers:

  • file – the file that the entry should link to. If empty it links to the stated page in the same file
  • page – the page in the document that the entry should link to
  • level – The hierarchy level of the entry. If the level is not provided, entries will be handled as level 1
  • name – the text used for the link

Any other columns are handled as metadata and inserted into the document if the template references them by column title.

PdfClerk saves and reads all values either as strings, booleans or numbers. PdfClerk does not support formulas. To display dates consistently, PdfClerk requires that dates are saved as text in the Excel spreadsheet. If you have a date stored as a date, you need to convert it to text format to properly render it in the template. One way to convert dates stored as dates to text is to copy the column containing the date values to Notepad, create a new column in Excel, set the format of the column to Text, and copy the dates back from Notepad. Alternatively, you can use Excel’s TEXT function.

Using a Word template

The Word template can either be defined using an ordinary paragraph or be based on a table. If a table is used, it should be in this format:

Document        Page
<name><#1#>     <page>
<name><#2#>     <page>

  • In order for PdfClerk to identify which table in the Word document contains the relevant table, the title of the table should be set to index. To name the table, open Table Properties, go to the tab Alt Text and write “index” in the Title field.
  • The row with <#1#> is the template for level 1 entries, the row with <#2#> is the template for level 2 entries, and so on, up to 10 levels. The same applies if you use a template based on a paragraph, except that PdfClerk then uses items on the same line instead of items on the same row.
  • The text enclosed in <> is replaced with the corresponding text from the column with the same name in the Excel datasheet. Any number of columns can be added.
  • By default, all text becomes clickable links to the relevant entry. If you do not want a particular entry to be a link you can either make sure that there is no data in the file or page columns or explicitly set the item as no-linking by adding “:0” to the property name like <title:0>.
  • An example template file is provided in the example files.
  • The text will be styled with the style that is applied to the relevant cell in the template. Only one style can be applied to text in a table cell. The style should be based on a paragraph style.
  • If the Excel workbook has a datasheet named Meta, the first two columns of this datasheet are handled as key and value pairs that are accessible in the template using <m-key> where key is the name assigned in the first column and the value written to the template is the corresponding value in the second column.
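The substitution rules above can be pictured with a short sketch that fills one template row from one spreadsheet entry. It models the rules on plain strings and does not touch Word files; the function names are hypothetical, the sketch assumes the level marker itself is dropped from the output, and it does not handle <m-key> values from the Meta sheet.

```python
import re


def fill_row(template: str, entry: dict[str, str]) -> str:
    """Replace <column> placeholders in a template row with entry values."""
    # Assumption: the level marker such as <#1#> does not appear in the output.
    text = re.sub(r"<#\d+#>", "", template)
    # <name> and <name:0> both read the column "name"; a missing column gives "".
    return re.sub(
        r"<(\w+)(?::\d+)?>",
        lambda m: str(entry.get(m.group(1), "")),
        text,
    )


def render_entry(templates: dict[int, str], entry: dict[str, str]) -> str:
    """Pick the template row for the entry's level (level 1 when not provided)."""
    level = int(entry.get("level") or 1)
    return fill_row(templates[level], entry)


templates = {1: "<name><#1#> | <page>", 2: "  <name><#2#> | <page>"}
line = render_entry(templates, {"name": "Exhibit 1", "page": "12", "level": "2"})
```

With the level 2 template selected, line becomes the indented text “  Exhibit 1 | 12”.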

Using an HTML template

PdfClerk also supports creating index files based on HTML templates. You can edit the supplied standard templates or create your own templates. Note the following:

  • Data is filled into the templates using the Liquid template language.
  • Data loaded from the Excel spreadsheet is exposed to the template in the object “catalog” with all lowercase letter properties.
  • By adding “_L” after the name of the property it will be translated into a link. Alternatively, the link uri is available as “uri” for each item.
  • If the Excel workbook has a datasheet named Meta, the first two columns of this datasheet are handled as key and value pairs that are accessible in the template using meta.key where key is the name assigned in the first column and the value written to the template is the corresponding value in the second column.
  • PdfClerk supports full-page coloured background without margins using the custom in-file CSS property “-clerk-page-background-color”, which accepts hexadecimal color codes.
  • PdfClerk supports extracting headings as bookmarks when converting html files to pdf by adding the custom in-file CSS property -clerk-extract-headings-as-outline: all.
  • You can use ordinary HTML, CSS and JavaScript to render the templates, and load external stylesheets, images and fonts. To work with the document catalog using JavaScript, you can include the catalog as a JavaScript variable named “catalog” by outputting the variable "jsonCatalog" in the template.

PdfClerk supports alternative links for the same entry (Excel spreadsheet row). Add additional link columns in the Excel spreadsheet (page, file) by adding a number after the name (page2, file2).

In table-based Word templates the alternative links are available by adding :X, where X is the number used as postfix for the name in the Excel spreadsheet, like <title:2>.

In HTML-templates the alternative links are available in two separate ways:

  • by adding “_Lx” to the property name, where x is the number used at the end of the column name in the Excel spreadsheet, like “title_L2”.
  • as additional .uri properties (uri2, uri3 etc)

If the setting Tag links and documents to enable binder recognition under the tab Autolink is enabled when references are recognized and converted to links in a document, information about the link’s target is embedded in the document and the target files. If documents are combined in binders via software other than PdfClerk, this information will remain intact. However, links will have to be reactivated. Using this function, you can reactivate the links in the binder.

You can also use this function to reconnect links if you have changed the name of one or more of the target files for a link created using the Autolink feature.

Use this function to update link target and link type of links in an existing pdf document.

The function searches for external links starting with the pattern provided in the Link address starts with input box and changes this pattern to the pattern provided in the Replacement input box. Links that do not start with the pattern are not changed.

If you select an input file without selecting any changes, PdfClerk will look through the file for links and list out their addresses. This is handy when determining what patterns should be changed.

You can specify that only links of a certain type should be updated. PdfClerk will then only search for links of the type specified. You can specify that links should be converted to another type of links. PdfClerk maintains direct page links when converting between GotoR links and web links.

By selecting the option Include bookmarks, PdfClerk will also search in the bookmarks.

PdfClerk currently supports updating links of the types GotoR (for pdfs), Launch and Web (URI).

You can process multiple search and replace patterns at once by separating them using the “|”-symbol. As an example, putting “folder1|folder2” in the Link address starts with box and “folderA|folderB” in the Replacement box will replace all links starting with folder1 with links starting with folderA, and links starting with folder2 with links starting with folderB.
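The pairing of search and replacement patterns can be sketched as below. This illustrates the rule only and is not PdfClerk's code; the function name rewrite_link is hypothetical, and the sketch assumes the first matching pattern wins.

```python
def rewrite_link(address: str, search: str, replacement: str) -> str:
    """Replace a matching prefix of `address`, pairing patterns by position."""
    searches = search.split("|")
    replacements = replacement.split("|")
    if len(searches) != len(replacements):
        raise ValueError("search and replacement need the same number of patterns")
    for old, new in zip(searches, replacements):
        # Only links that start with the pattern are changed.
        if address.startswith(old):
            return new + address[len(old):]
    return address  # no pattern matched, so the link is left as it is


updated = rewrite_link("folder2/Exhibits.pdf", "folder1|folder2", "folderA|folderB")
```

The link above starts with folder2, so updated becomes “folderB/Exhibits.pdf”; a link starting with anything else is returned unchanged.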

Presets and example files

PdfClerk ships with predefined settings for various pdf building tasks. The presets are available via the presets menu in the top left corner of PdfClerk. There are examples and sample files showing PdfClerk functionality available here.

The presets and example files may serve as an introduction to how PdfClerk works. You can create your own presets by saving a project to the presets folder (by default located in Documents\PdfClerk\Presets\).

Preset: Simple Merge

Simple Merge is the default setting, merging documents and adding the filenames (without the filename extension) as bookmarks.

  1. Select and load the Simple Merge preset
  2. Click the Open input files button, select Folder, and then select the folder with the pdf files you want to merge
  3. Use the Select output folder to select the folder to store the merged file
  4. Click the Generate button

Preset: Merge with header and footer

Same as Simple Merge but with the filename (without extension) as document name in the upper left corner and page number and total pages of the binder in the bottom right corner.

  1. Select and load the Merge with header and footer preset
  2. Click the Open input files button and select Folder to select the folder with the pdf files you want to merge
  3. Click the Select output folder to select the folder to store the merged file
  4. Click the Generate button

Preset: Simple linking

Converts textual references to other documents into clickable links to the referenced document. The reference text is the filename (less the extension) of the file to be referenced. If a reference is followed by a page number, the link will link to this particular page. The links are underlined.

  1. Select Single file to link using the Open input files button
  2. Select a Word or pdf document. The documents to link to should be in the same folder or a subfolder.
  3. The document list loads all pdf documents in the folder, with the chosen file selected.
  4. Click the Generate button

Preset: Linked Pleading as Binder

Merges a pleading with its exhibits and adds links to the exhibit references. Adds a bookmark for each exhibit with Bilag X as the title. Adds exhibit numbering inside a rounded red rectangle on the top right of each exhibit and page count on the bottom right of each page. This preset is based on Norwegian template documents. Use the example files from the Norwegian litigation example to understand this preset.

  1. Select and load the Linked Pleading as Binder preset
  2. Click the Open input files button and then select Folder
  3. Select a folder with the pleading stored as a pdf with the exhibits in a subfolder
  4. Set the exhibit folder to the folder that contains the exhibits using the Map Exhibits link
  5. Use the Select output folder to select the folder to store the linked file
  6. Click the Generate button

Preset: Linked Pleading Attachment Binder

Binds a pleading together with its exhibits and creates blue underlined links between references to the exhibits and the exhibits. The exhibits are added as attachments to the pleading and exhibit numbers are padded to allow natural ordering. Large exhibit numbering is added on the top right of the first page of each exhibit. Exhibit numbers are read from the filename using the pattern Bilag X.

Adding exhibits as attachments is only designed to work when there is one document with one set of exhibits.

  • Same as above, except select this preset.

Preset: Set of Linked Pleadings

Converts references to exhibits and pleadings into links as described in section Norwegian Civil Litigation Configuration. In addition, this preset merges all the pleadings and their exhibits into one binder, creates page numbers in red and adds exhibit numbers on the exhibits in the top left corner.

  • Same as above, except select this preset and in step 3 the folder should contain multiple sets of pleadings and their exhibits. This example is based on Norwegian template documents. It can be used with the example documents provided in Link & bundle all submissions in a case into one bundle.

Preset: Stamp existing binder

Extracts exhibit numbers from the bookmarks of an existing binder and stamps the first page of each exhibit with the relevant exhibit number with a border around the exhibit number.

  1. Select and load the Stamp existing Binder preset
  2. Open an existing binder by selecting Binder in the Open input files button
  3. Click the Generate button

Preset: Paragraph Search

Finds references to one or more paragraphs in the form “Document ¶ 3” and adds links to the correct page in the target document using underlined blue links. Uses the same reference search as the example document Motion to Dismiss.

  1. Select and load the Paragraph Search preset
  2. Select a folder containing both the document that has the references and the documents that are referenced, using Open input files and then Folder.
  3. Make sure that the Source box is ticked for the document that contains the references (the referencing document)
  4. Make sure that the Aliases for the documents to be referenced are the same as the text used in the references in the referencing document
  5. Click the Generate button

Global keyboard shortcuts

PdfClerk has the following global keyboard shortcuts:

  • ? = Show the keyboard shortcuts
  • ctrl+g = Generate
  • ctrl+s = Save project
  • ctrl+shift+o = Open project
  • ctrl+1 = Open Files tab
  • ctrl+2 = Open Settings tab
  • ctrl+3 = Open Autolink tab
  • ctrl+4 = Open Tools tab
  • ctrl+5 = Open Load & Save tab
  • ctrl+6 = Open Log tab
  • ctrl+7 = Open Help tab

In addition, there are keyboard shortcuts for rearranging documents in the document list.

Legal stuff

Data privacy

PdfClerk is a Windows desktop application installed on the user’s computer. The application does not share information about documents processed with any external systems.

License

PdfClerk is the property of Protosys AS and its licensors. Third-party software included in the software is listed on the Help tab of the application.

Use of PdfClerk is only permitted for license holders and is then limited to the usage, the number of users and the time period described in the license grant. PdfClerk may only be used for its intended purpose as described in the documentation.

The purpose of a trial license is to test the software. The output pdfs are marked as not for production use. Use of a trial license for production purposes is not permitted. Terms and conditions for using PdfClerk can be found here.

Last updated 2025-09-03 for version 2.6.1