Introduction

PdfClerk is a secure desktop application for creating feature-rich pdf document collections and binders.

Main features:

  • Automatically convert document references to hyperlinks, typically between submissions (briefs, motions, pleadings, etc.) and exhibits.
  • Bundle submissions (court bundles) with stamps, bookmarks and a table of contents
  • Create stamps with graphic elements like borders, solid or transparent backgrounds and company logos in addition to text derived from metadata.
  • Extract metadata from the filename for use in headers and footers (stamps) and bookmarks, and add and manage additional metadata like exhibit numbers, titles, and dates
  • Import and export metadata using Microsoft Excel format
  • Create customized tables of contents
  • Create hyperlinks to specific pages in the same or other binders using Clerk Links.
  • Edit and split binders
  • Access standard settings and workflows using configurable presets

PdfClerk’s autolink capabilities can be configured to work in different languages and on different types of documents like legal submissions, contracts, technical documentation and reports.

PdfClerk ships with several built-in configuration settings (also known as Presets). Example documents and example files are also available.

Files

Input files

The input files are the files that you want to process, for example by merging, stamping or linking them.

For operations that change multiple files, PdfClerk operates on a folder basis. That means that an entire folder should be selected as input. For operations that only involve changing one file - like adding links to one file or adding stamps to a binder - the individual source file can be selected.

You select the input files using the dropdown button Open input files. The button has three options:

  • Folder - Use this to select a folder as input. All pdf files in this folder and subfolders will be loaded into the document list.
  • Binder - Opens an existing binder (document collection) and loads the bookmark hierarchy of that binder into the document list.
  • Single file to link - Use this to select a single pdf or Word file that contains textual references to other documents in order to convert those references into links. It loads the selected file and all pdfs in the folder and subfolders into the document list. Only the files marked as “Source” are processed for references.

Select output folder

Use this button to select the output folder where files created or updated by PdfClerk should be saved.

Click the “New folder” item on the ribbon in the Select folder dialog to create a new folder.

Document list metadata

The document list shows the metadata for each item. Metadata can be used for stamps, bookmarks and the table of contents. By default, the name of the item is the filename without extension for single documents, and the existing bookmark text for binders.

Each item can be edited by clicking the text to be edited. You can navigate between entries using the up and down arrows.

Additional metadata columns can be added by clicking the plus sign in the top right of the document list.

Metadata is read from filenames or bookmarks using the regular expression specified under Metadata extraction. This is useful to extract information already present in filenames and bookmarks. Metadata can be imported and exported in Excel format.

Document Level

In a pdf, document bookmarks are nested in a hierarchy. In PdfClerk this is achieved by changing the level of an item. Since this is a tree structure, the first item in the document list needs to have 1 as level and the increase in level from one item to the next cannot be more than one.
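
The two rules above can be checked mechanically. The following Python sketch is only an illustration of the rules as stated; the function name validate_levels is hypothetical and not part of PdfClerk.

```python
def validate_levels(levels):
    """Return True if a list of document levels forms a valid bookmark tree.

    Rules taken from the text above: the first item must have level 1, and
    the level may increase by at most one from one item to the next. A level
    may decrease by any amount. An empty list is treated as valid.
    """
    if not levels:
        return True
    if levels[0] != 1:
        return False
    for previous, current in zip(levels, levels[1:]):
        # Levels below 1 or jumps of more than one level break the tree.
        if current < 1 or current - previous > 1:
            return False
    return True


print(validate_levels([1, 2, 3, 1, 2]))  # each step up is a single level
print(validate_levels([1, 3]))           # jumps from level 1 to level 3
print(validate_levels([2, 3]))           # first item is not level 1
```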

The level can be changed for multiple items at once by selecting several items as described in Sort & reorder documents and then using the arrows next to the level in the document list for one of the selected items.

Sort & reorder documents

By default, the documents in the document list are sorted alphabetically on the name field.

You can sort documents in the document list using the sort menu. Click the sort arrows after the column name to open the sort menu. In addition to selecting the column you would like to sort on, you also need to specify what level you want to sort on. This is required since you normally would like to keep documents in the same place in the hierarchy when you sort them.

The sorting function is useful if you have added exhibit numbers manually and would like to sort the documents in accordance with their numbering.

Documents can be reordered manually. Select a document by clicking the circle to the right of the document in the document list. Select additional documents by holding down the shift key and clicking additional documents. You can then use the arrows in the floating menu to the left to move documents up or down.

Use the following keyboard shortcuts to reorder documents more efficiently.

  • ctrl+up/down: Move selected document(s) up or down
  • ctrl+shift+up/down: Move selected document(s) up or down 10 positions
  • alt+up/down: Move document selection (cursor) up or down
  • shift+up/down: Expand or reduce selection
  • left or right arrow: Change the document level

In addition to these shortcuts, there are also global keyboard shortcuts.

Folders in the document list can be collapsed and expanded by clicking the - and + signs to the left of their name in the document list.

Map Exhibits

When you have a main document with exhibits, this can be organized in PdfClerk by saving the exhibits in a subfolder and placing the main document right before that folder in the document list.

Instead of manually reordering the documents, you can use the function Map Exhibits to tell PdfClerk which folder contains the exhibits for each of the main documents.

In cases where there are several series of exhibits belonging to different main documents, this mapping is required for PdfClerk to understand references of the type “Exhibit X of Document Y, page 22”.

Add bookmarks

Additional bookmarks can be added by selecting the wrench icon on the right of the document in the document list. The bookmark will link to the first page of the relevant document. The folders are added as bookmarks when loading documents from a folder structure.

Document settings

By clicking the wrench icon on the right of the document in the document list, you can change the following properties of each individual document/bookmark:

  • Name - The name (and metadata) of each document and bookmark can also be edited from the document list by clicking on the text that should be edited.
  • Exclusion - Each document and bookmark can be excluded from stamping, from being included as a bookmark, and from appearing in the table of contents
  • Bookmark appearance - The appearance of each bookmark can be adjusted by setting font color and font style
  • Bookmark expansion - By default, all bookmarks are expanded. This can be changed on a per-level basis in the bookmarks section in the Output tab. In addition, this can be set per bookmark by selecting either default, collapsed, or expanded in the document properties.

Working with Existing binders

If an existing binder is selected as input via the Open input files button, PdfClerk will work on that binder only and the document structure based on the bookmarks will be loaded into the document list. In this mode, you can add stamps to the binder, rebuild bookmarks, update the document properties or split the binder into individual files based on the bookmarks. The information to be used in stamps and bookmarks can be retrieved from existing bookmarks, edited in PdfClerk, or loaded from Excel via the Load/save menu. You can also save bookmarks and their page numbers from an existing binder to Excel. This is useful for creating index files using Clerk Links.

Metadata extraction

PdfClerk can extract information from the filename of each file, like exhibit numbers or dates. Extraction is made using regular expressions and named capturing groups. The named capturing groups will be available as metadata for the document. Setting up regular expressions requires some technical knowledge. Here are some examples that can be edited to fit your use case:

  • “(.*)exhibit ?(?<exhibit>[0-9]{1,5}[A-Za-z]{0,1})” - extracts 24 from “Letter to IBM, Exhibit 24” as exhibit
  • “.{6}_(?<doc>(.)+)” - extracts “Document 1” from “2001-02-03_Document 1.pdf” as doc
  • “(?<date>.+?)\s{1,3}(?<name>.*)(\s){0,3}\[(?<doc>[0-9]{1,5})?(-(?<exhibit>[0-9]{1,5})\])?” - extracts date as 2019-03-08, name as “Minutes from meeting”, doc as 4 and exhibit as 7 from “2019-03-08 Minutes from meeting [4-7]”
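
As a way to try out a pattern before entering it in PdfClerk, the last example above can be tested in Python. This is a minimal sketch, not PdfClerk's implementation: Python spells named groups as (?P<name>...) instead of (?<name>...), the function name extract_metadata is hypothetical, and trimming of surrounding whitespace is an assumption of the sketch.

```python
import re

# The last example pattern from the list above, rewritten with Python's
# (?P<name>...) spelling for named capturing groups.
PATTERN = re.compile(
    r"(?P<date>.+?)\s{1,3}(?P<name>.*)(\s){0,3}"
    r"\[(?P<doc>[0-9]{1,5})?(-(?P<exhibit>[0-9]{1,5})\])?"
)


def extract_metadata(filename):
    """Return the named groups found in filename, or an empty dict."""
    match = PATTERN.search(filename)
    if match is None:
        return {}
    # Skip groups that did not take part in the match. Values are trimmed
    # because the greedy "name" group keeps the space before the bracket
    # (trimming is an assumption of this sketch).
    return {
        key: value.strip()
        for key, value in match.groupdict().items()
        if value is not None
    }


metadata = extract_metadata("2019-03-08 Minutes from meeting [4-7]")
print(metadata)
```

Each key of the resulting dictionary corresponds to a metadata column that the named group would populate.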

Press the refresh icon to run a new or edited regular expression.

To extract different data, you can run different regular expressions with different group names. If you run a regular expression with the same group name(s), it will overwrite the data in the existing columns.

Click “Add extractor” to add additional extractors. If you select another level than All in the Level selector, metadata will only be extracted for the selected level.

Any existing cross-links between pdf files merged into a bundle will be maintained. This is useful for links added to a document using Microsoft Word or a pdf editor. Links with a relative path are matched against both the source folder and the target folder. For links added via Microsoft Word you can also specify which page number to link to by adding a ‘#’ and the page number at the end of the filename (c:\files\myFile.pdf#3 will link to page 3).

Settings

Output format

PdfClerk can either update individual files or bundle individual files together to a binder.

When the source of the documents in the document list is an existing binder, you can add stamps and bookmarks to the binder, but you cannot use Autolinking or create a table of contents.

General settings

  • Start documents on odd page – Ensures that all documents start on an odd page to allow double-sided printing
  • Add blank page before – Use this option to add a blank page in front of other documents. Use in combination with stamps to create named separation sheets and similar. By default, a blank page is added in front of all documents, except the first document in the binder. Select a field from the dropdown list to only add blank pages in front of documents that have a value for this field, including the first document. Optionally select a background color.
  • Create binders as attachment binders – Creates a pdf binder with the first document in the document list as the main document and the other documents in the document list as attachments to that document.
  • Open output folder – Opens the output folder after the document(s) have been created
  • Debug – If this option is selected additional information is collected during execution to enable debugging, including more detailed information in the Log-tab. The information is stored in a folder named “debug” in the output folder. In addition any temporary files created during execution will not be deleted and may be found in the “_temp” folder located in the output directory.

Table of contents

PdfClerk can create a clickable table of contents based on the documents in the document list.

  • Select the template to use in the dropdown.
  • Once you have selected to add a table of contents, it appears in the document list. By default, the name as it appears in the document list will be the title of the table of contents. You can change it by clicking on the name in the document list (the default name is Table of contents).
  • For templates in Word format, you can edit the project template by clicking the wrench icon next to the document in the document list.
  • Once inserted the table of contents can be moved to another location in the document list.
  • Templates are Word or HTML documents located in the PdfClerk’s Templates folder (normally found in the Windows Documents folder).
  • You can edit the templates or add your own by adding them to the Templates folder. In addition to changing the layout and adding company stationery, you can configure them to output additional metadata fields for the documents like exhibit numbers and dates. Read more about how to edit and create your own table of contents templates in the Create Index section.
  • The metadata in the document list for the item table of contents is accessible as <m-name> in Word templates and as meta.name in HTML templates. This is useful in different scenarios like setting a title in the table of contents. If you update the name of the item table of contents to “List of documents in project Green”, this text becomes the title in the actual table of contents. Other metadata items are accessible using the same naming convention (<m-xxxx> and meta.xxxx).
  • PdfClerk has an alternative, even more flexible, method for creating tables of contents. See Create Index.

Stamps

General

PdfClerk can apply header and footer text to documents. This function is typically used to mark documents with exhibit numbers, page numbers and document titles.

There are two types of stamps:

  • Text stamps – Inserts the text defined in the Text field.
  • Graphic stamps – Inserts a graphic element in addition to the text. The graphic elements can be borders, solid or transparent backgrounds, logos, fixed text or other elements. If no text is provided in the Text field, only the graphic will be added. If an expression is added in the Text field that evaluates to an empty string, then the graphic stamp is not added. To use a textual expression to control whether the stamp is printed without printing any text, tick the “No text” checkbox.

Each stamp has the following settings:

  • Position – Defines where on the page the stamp should be placed
  • X-margin – The vertical margin for the stamp. For top and middle positions, it is calculated from the top edge of the page. The bottom position is calculated from the bottom of the page. Negative values are allowed.
  • Y-margin – The horizontal margin for the stamp. For right and center positions, it is calculated from the right edge of the page. For left positions, it is calculated from the left edge. Negative values are allowed.
  • Font – The font used for the text in the stamp
  • Color – Colors are specified using the color picker or alternatively using their HEX values.
  • Level – Specify if a stamp should only be printed on a specific level.
  • Text – The text to be inserted. Any text enclosed in < > will be handled as metadata and looked up in the metadata table. Available values are listed below the stamps. Click the value to insert it in the active stamp. See Accessing Metadata for a reference on how to use metadata.
  • Document pages – The default setting “all” prints the stamp on all the pages of a document, “first page” prints it only on the first page of the document, while “other” prints on all document pages other than the first page. If you use the function Add blank page before, the added page is considered the first page of the document.
  • First binder page - Uncheck this box to not print the stamp on the first page of a binder.
  • No text - For graphic stamps, tick this box to prevent any text from being printed on the page together with the stamp. This is useful if you use a textual expression to control which documents the stamp is printed on.
  • Use this – Uncheck this box in order to not apply the stamp.

More settings

For graphic stamps, there might be optional settings. The options available depend on the type of the stamp and the configuration options provided in the configuration file of the stamp.

For graphic stamps where the placement of the text is not defined in the svg file, the placement of the text relative to the graphic element can be adjusted:

  • Horizontal position - Left, Right or Center
  • Vertical position - Top, Middle or Bottom
  • textAdjustX - Adjustment to the position on the X-axis - both positive and negative numbers can be used
  • textAdjustY - Adjustment to the position on the Y-axis - both positive and negative numbers can be used

PdfClerk ships with some predefined stamps located in the stamps folder. Additional stamps can be added in the stamps folder by adding a configuration file and an image (svg, png, jpeg, gif, bmp or tiff). If you need help adding a stamp, like your company logo, let us know and we will help.

Bookmarks

Select this option to add bookmarks to a new binder or to replace the bookmarks in an existing binder. Bookmarks in pdfs are sometimes referred to as outlines. Specify the metadata fields to use for the bookmark text in the Build string field. See Accessing Metadata below for a reference on how to create bookmarks based on the available metadata.

Note that adding bookmarks to an existing binder will remove any existing bookmarks. In order to remove all bookmarks in a binder without adding new bookmarks, click the Remove existing option.

By default all bookmarks are expanded, meaning that the whole bookmark tree is shown when the binder is opened. Use the Collapse Bookmarks function to collapse one or more levels of the bookmarks. These settings can be overridden for individual documents using the properties for each document available using the wrench icon in the document list.

You can use different build strings for different bookmark levels by adding additional build strings and specifying the level for each build string.

Rename Files

By default individual files keep the name they have. You can change the name of individual files by editing the name column in the document list. Alternatively, you can rename all files by checking the Rename files checkbox. You can then provide the pattern for the file names in the File name text box using the metadata items as building blocks.

Split binder by bookmarks

Use this function to split a binder by its bookmarks. You can select which level (of the bookmarks) that should be used for splitting the binder. The default name of the file will be the text of the bookmark. This can be changed by specifying another value in the Filename pattern field. This works like stamps and bookmarks. Different levels of metadata can be accessed using the <l1-name> syntax (where 1 represents the level). Documents can be organised in subfolders by adding a backslash (\) between different property names. If there are file attachments to the binder, these will be extracted to a subfolder named attachments.

This setting is only available when an existing binder is loaded.

Accessing Metadata

Default and additional properties

The following default metadata items are available for use in stamps and bookmarks:

  • <name>: the name of the bookmark or file without extension (like “filename”)
  • <filename>: the filename of the document (like “filename.pdf”)
  • <page>: the page number of the document
  • <numpages>: the number of pages in the document or binder
  • <docpage>: the page number of the merged in document (stamps only)
  • <filenumber>: counter increased by one per file saved (split binder only)

You can add additional metadata items by clicking the plus sign in the floating menu to the right of the document list in the Files tab view. You can also load metadata from an Excel Workbook. Additional metadata items may typically be exhibit numbers, dates etc.

Formatting

For metadata the following formatting may be applied:

  • add - Use this function if you would like to start the pagination on another page number than 1. If you want to start the pagination of a binder on page 101 use the following format: <page|add:100>
  • pad - Use this function to ensure that a number is represented by a fixed number of digits by adding leading zeros. Example: “003” instead of “3”. Use the following format <page|pad:3>
  • max - Use this function to limit the length of the text to a specific number of letters. Use the following format <title|max:30>
  • maxe - Same as above but an ellipsis (...) is added to the end of the string if it is shortened by the function. Use the following format <title|maxe:29>
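
The effect of the four formatting functions can be illustrated with a short Python sketch. It is not PdfClerk's implementation. The helper names apply_format and format_tag are hypothetical, and two details are assumptions: that the ellipsis added by maxe does not count towards the length limit, and that several functions can be chained in one tag.

```python
def apply_format(value, spec):
    """Apply one "function:argument" spec (for example "pad:3") to a value."""
    function, _, argument = spec.partition(":")
    amount = int(argument)
    text = str(value)
    if function == "add":
        return str(int(value) + amount)
    if function == "pad":
        return text.zfill(amount)  # add leading zeros up to the given width
    if function == "max":
        return text[:amount]
    if function == "maxe":
        # Assumption: the ellipsis is appended after cutting to the limit.
        return text if len(text) <= amount else text[:amount] + "..."
    raise ValueError(f"unknown formatting function: {function}")


def format_tag(tag, metadata):
    """Resolve a tag such as "<page|add:100>" against a metadata dict."""
    name, *specs = tag.strip("<>").split("|")
    value = metadata[name]
    for spec in specs:  # chaining several functions is an assumption
        value = apply_format(value, spec)
    return str(value)


print(format_tag("<page|add:100>", {"page": 1}))
print(format_tag("<page|pad:3>", {"page": 3}))
print(format_tag("<title|max:7>", {"title": "Minutes from meeting"}))
print(format_tag("<title|maxe:7>", {"title": "Minutes from meeting"}))
```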

Empty values & text control

If metadata for an item is unset or empty PdfClerk ignores the metadata item but inserts the rest of the string provided in the input box. This can be changed by enclosing the relevant part of the string in curly brackets. Examples:

  • “Exhibit <exhibit>” inserts “Exhibit ” if there is no value for exhibit
  • “{Exhibit <exhibit>}” inserts “Exhibit 1” if value “1” is given for the exhibit and nothing if there is no value for the exhibit.
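
The difference between the two examples can be reproduced with a small Python sketch. It only models the behaviour described above and is not PdfClerk's code. The function name render is hypothetical, and the sketch assumes that a bracketed part is dropped as soon as any metadata item inside it is empty.

```python
import re

TAG = re.compile(r"<([^<>]+)>")
# Either a {...} part or a single <tag> outside curly brackets.
TOKEN = re.compile(r"\{(?P<group>[^{}]*)\}|<(?P<tag>[^<>]+)>")


def render(template, metadata):
    """Fill <tags> from metadata; drop {...} parts that contain empty tags."""

    def value_of(name):
        value = metadata.get(name)
        return "" if value is None else str(value)

    def replace(match):
        group = match.group("group")
        if group is None:
            # Plain tag: an empty value simply disappears from the text.
            return value_of(match.group("tag"))
        # Bracketed part: all or nothing (assumption for several tags).
        if any(value_of(name) == "" for name in TAG.findall(group)):
            return ""
        return TAG.sub(lambda m: value_of(m.group(1)), group)

    return TOKEN.sub(replace, template)


print(repr(render("Exhibit <exhibit>", {})))
print(repr(render("{Exhibit <exhibit>}", {"exhibit": "1"})))
print(repr(render("{Exhibit <exhibit>}", {})))
```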

Accessing metadata at specific levels

When accessing metadata items (like <name>), PdfClerk by default uses the highest metadata level for the relevant item. For stamps and when splitting binders, lower-level metadata items can be accessed by adding the letter “l”, the level of the metadata item to be accessed and a hyphen in front of the property name like <l1-name>. This is useful to include the full context on the header or footer of a document.

Pdf Document Properties

For binders the document properties title, author and subject can be set under Document Properties. If the checkbox next to the value is not checked, the original values will be kept.

When working with individual files or when splitting a binder into multiple files these properties can be set using dynamic metadata in the same way as for the file names.

Restricted Documents

Electronically signed documents or documents locked for editing cannot be merged or stamped. To overcome this limitation PdfClerk creates a copy of the document with the digital signature removed. For the same purposes, PdfClerk ignores the protected status for some pdfs solely for the purpose of merging and stamping such documents.

Autolink

General

PdfClerk’s Autolink identifies textual references to other documents or pages in the text of pdf documents and adds hyperlinks to the document and page being referenced. As an example, PdfClerk can add a hyperlink to the text “Pleading of 23rd December 2021, Exhibit 4, page 86” pointing to page 86 in the document that is Exhibit 4 to that pleading.

For each reference PdfClerk can handle three items of information:

  • Document – the document to link to
  • Exhibit – a document that belongs to another document (optional)
  • Page number – the page number in the document (optional)

When the autolink feature is enabled in the Autolink tab, PdfClerk looks for references in all documents that have the checkbox Source in the document list checked. By default, all documents at level 1 are checked.

How PdfClerk recognises references can be configured to fit different use cases and languages. Look below in the section Searches to learn how to configure reference Searches in PdfClerk.

Documents

The document part of the reference is matched against the field Alias in the document list. By default the Alias contains the filename of a document (without the file extension). Alias can be manually edited. There may be several aliases for each document. To add multiple entries, separate them using the pipe symbol (“|”). You can configure PdfClerk to read the entries from another column in the document list using the setting Show document field search mapping.

Exhibits

An exhibit is a document that is a sub document to a main document. Typical examples are exhibits to briefs or annexes to agreements. In PdfClerk documents on level 1 are considered as main documents, while documents on level 2 are considered as exhibits to the parent level 1 document.

When opening documents in PdfClerk, all files in the selected folder are loaded as main (level 1) documents while documents in subfolders are loaded as exhibits (level 2). You can map which subfolder is assigned to each level 1 document by using the Map Exhibits feature. You can change the level of documents as explained in the Document Level section. By enabling the feature All documents share all exhibits, all exhibits are shared by all documents.

For exhibits, references are by default matched against the Exhibit column in the document list and not the Alias column. By checking the box Exhibits are also documents in the Autolink-tab, level 2 documents are also matched against the value in the Alias column.

By default PdfClerk only looks for references in level 1 documents. This may be changed by checking the Source checkbox in the document list for any other document.

If PdfClerk identifies a reference to an exhibit without any document reference, it assumes that the reference points to an exhibit of the main document being processed.

You can configure PdfClerk to read the entries from another column than Exhibits in the document list using the setting Show document field search mapping.

Page numbers

The page number is the number in the reference that identifies the page of the target document to link to.

Searches

PdfClerk identifies the references in the text of the pdf files by performing searches. These searches are configurable under the Searches heading in the Autolink-tab.

For each search there are two text fields. The first text field is to give the search a name (does not have any effect on the searches). The second text field contains the actual search.

The main building blocks of the searches are the base patterns {d}, {e} and {p} which represent a document, an exhibit, and a page respectively. If an exclamation mark is added after the letter, it means that the search will only match against text that contains this element. One example of when this is useful is to enable a search that finds references both with and without page numbers.

Examples

  • {d!} {e!} {p} : Finds references of the form “Document Exhibit page number”, where the page number is not mandatory.

There needs to be at least one mandatory base pattern in each search.

A phrase of text in a document may match against more than one of the searches. In most cases, a reference to an exhibit of the document itself may be simply referred to as “Exhibit B” while a reference to an exhibit of another document may be referred to as “Exhibit B to Pleading of 3. January”. A search for exhibits without qualifying document references will also find the qualified references. For that reason, the searches should be executed in order, starting with the most specific first. The searches are done in the same order as they are listed in the interface, and the order may be changed by drag and drop.

Base patterns

Each base pattern is defined using regular expression syntax in the Base patterns setting. The base patterns are building blocks for the searches described above.

The base pattern for Document {d} uses the name “doc” as the extraction group. The “#doclist#” placeholder in the definition is replaced with a list of all the aliases when the search is performed.

The base pattern for Exhibit {e} is defined as a regular expression with “exhibit” as the extraction group name. Typically the references for an exhibit would consist of a fixed name like “Exhibit” and an exhibit number.

The base pattern for Page Number {p} is defined as a regular expression with “page” as the extraction group name. Typically the references for a page number would consist of an identifier like “page” or “p.” followed by a number.

In addition to the base patterns, there is also a building block for white space. When the search is performed any whitespace in the searches is replaced with the content of this configuration.

In most cases, PdfClerk is able to recognize patterns even when they are spread over two lines, and also if the page has multiple columns. PdfClerk will also handle line breaks within single words as long as the parts of the word are separated by a hyphen or underscore.
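
To make the relationship between searches and base patterns more concrete, the Python sketch below expands the search {d!} {e!} {p} into a single regular expression. It is an illustration under stated assumptions, not PdfClerk's implementation: the base pattern definitions, the whitespace pattern and the function name build_search are all hypothetical, the sketch only handles searches that consist of base patterns separated by spaces, and it lets an optional element absorb the separator in front of it.

```python
import re

# Hypothetical base pattern definitions using the group names from the text
# (doc, exhibit, page), written with Python's (?P<name>...) syntax.
BASE_PATTERNS = {
    "d": r"(?P<doc>#doclist#)",
    "e": r"Exhibit\s+(?P<exhibit>[0-9]{1,5}[A-Za-z]?)",
    "p": r"(?:page|p\.)\s*(?P<page>[0-9]{1,5})",
}
WHITESPACE = r"[\s,]+"  # hypothetical content of the white space setting


def build_search(search, aliases):
    """Expand a search such as "{d!} {e!} {p}" into a compiled regex."""
    doclist = "|".join(re.escape(alias) for alias in aliases)
    regex = ""
    for index, token in enumerate(search.split()):
        pattern = BASE_PATTERNS[token[1]].replace("#doclist#", doclist)
        separator = WHITESPACE if index > 0 else ""
        if token.endswith("!}"):
            regex += separator + pattern  # mandatory element
        else:
            # Optional element: the separator is optional along with it.
            regex += f"(?:{separator}{pattern})?"
    return re.compile(regex)


search = build_search("{d!} {e!} {p}", ["Pleading of 23 December 2021", "Report A"])
with_page = search.search("See Pleading of 23 December 2021, Exhibit 4, page 86.")
without_page = search.search("As shown in Report A Exhibit 12B.")
print(with_page.groupdict())
print(without_page.groupdict())
```

The same search matches both texts because only the document and exhibit elements are mandatory.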

Advanced

  • Regex lookaheads and lookbehinds in a search pattern need to be wrapped in a comment with the text “prtIgnore”. Example lookahead for “:” is (?#prtIgnore)(?=:)(?#/prtIgnore)
  • By default PdfClerk only allows one link on each section of text. That means that if two searches match the same text, only the search that runs first is applied. Add #i# in front of the search to override this behavior. This could be useful in order to link to both page 100 and 155 in the reference “Report A page 100 and 155”.
  • The #doclist# in the definition of the document base pattern is replaced at runtime with a list of the document aliases (separated by |)
  • Underline Links – If checked, the links will be underlined in the selected color. Width is the thickness of the line. Default is 1 pt.
  • Highlight Links – If checked, the links will be highlighted in the selected color. Note that the highlight color becomes part of the pdf once applied and cannot be removed. Link highlighting may not work on some documents.
  • Open Links in New Window – If checked, the links are opened in a window (or tab, based on which pdf viewer is used) instead of reusing the same window.
  • Adjust page link target – Sometimes there is an offset between the page number stated in the reference and the page you would like to link to. The target page to link to can be adjusted by selecting the wrench icon on the right of the document in the Files tab and changing the value for Adjust page link target. Both positive and negative numbers can be used.
  • Documents to ignore – If there are references in the document that should not be linked, they can be excluded by adding the names used to the Documents to Ignore section below the document list in the Files tab. Multiple references can be added using a pipe (“|”) as delimiter. No links will be added to references listed in the Documents to Ignore box.
  • Tag Links and Documents to Enable Binder Recognition – Enable this option to add tags to the target file of a link to enable links to work also after a linked document is inserted into another document collection using software other than PdfClerk.
  • Show document field search mapping – By default, PdfClerk looks for references using the data field Alias for documents and the field Exhibit for exhibits. Use this option to set alternative columns to match document and exhibit references against.
  • Exhibits are also documents – If checked, PdfClerk will also match exhibits against the Alias field (or any other field configured as the field to match document references against). Click the Populate link to load default alias names from the file names.
  • All documents share all exhibits – By default, exhibits in a subfolder belong to one or more documents as configured by the Map Exhibits function in the Files tab. If this option is enabled, all exhibits are shared by all documents.
  • Case-insensitive searches – Enable this option to use case-insensitive searches.
  • Ignore zero-padding – Use this setting for PdfClerk to ignore zero padding when looking up references against documents and exhibits. The reference Exhibit 0001 will match the alias and exhibit Exhibit 1. If a single zero is not followed by a digit, it will not be ignored in the matching.
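
The zero-padding rule in the last item can be expressed as a single regular expression. The Python sketch below is only one way to model the described behaviour and is not taken from PdfClerk; the function name strip_zero_padding is hypothetical.

```python
import re

# One or more zeros at the start of a number, followed by at least one more
# digit. A zero that is not followed by a digit is left alone.
LEADING_ZEROS = re.compile(r"(?<![0-9])0+(?=[0-9])")


def strip_zero_padding(reference):
    """Remove leading zeros from every number in a reference or alias."""
    return LEADING_ZEROS.sub("", reference)


print(strip_zero_padding("Exhibit 0001"))
print(strip_zero_padding("Exhibit 0"))
print(strip_zero_padding("Exhibit 100"))
```

Applying the same normalisation to both the reference and the alias before comparing them makes “Exhibit 0001” and “Exhibit 1” equal, while “Exhibit 0” and “Exhibit 100” are left unchanged.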

Guided Autolinking

PdfClerk’s Autolink function uses formal pattern recognition to automatically find references and translate them into clickable links. For some advanced referencing styles, not all references can be automatically converted into clickable links. This may be the case if the references do not contain sufficient information to determine the target document and page for the reference.

In such situations, you can use PdfClerk’s Guided Autolinking feature to first identify and extract the references and then go through the references and modify them manually.

To use Guided Autolinking, enable the option Export references to Excel (Autolink tab, Advanced). Identified references are then saved to an Excel spreadsheet named “References.xlsx” located in the Output folder. This Excel spreadsheet can then be manually modified to edit, add and remove links. After the Excel spreadsheet has been edited, run PdfClerk using the same configuration and with the same output folder, and select Use links from Excel sheet (in the Autolink tab). PdfClerk will then read the references from the Excel spreadsheet instead of performing searches, and apply the updated links to the documents.

The Excel spreadsheet contains seven columns:

  • Document – The document containing the reference
  • Page – The page of the document containing the reference
  • Rectangle – Defines the rectangle on the page where the reference has been identified
  • Hit – The text of the identified reference
  • Modified – The text of the identified reference repeated. Used when editing the link
  • TargetPage – The page in the target document that should be linked to
  • TargetDoc – The target document to be linked to

PdfClerk supports the following modifications to the links identified in “References.xlsx”:

  • Edit the target of a link – Edit the TargetDoc and/or the TargetPage properties to change where the link will point to
  • Remove text from link – If the Hit column contains text that should not be part of the link, delete that text from the Modified column
  • Split one reference into several links – If a reference is identified as one reference, but the reference is in fact several references, the reference can be split up into several links pointing to different documents and/or pages. To split one reference into several links, copy the row containing the reference to a new row. Then modify the text in the Modified column to contain the text for each link. Edit the TargetDoc and/or TargetPage accordingly.
  • Add new link – Add a new row and supply the information for Document, Page, TargetDoc and TargetPage. In the Modified column, write the text of the reference. Leave the Rectangle column blank.
  • Discard a link – Delete the relevant row in the Excel spreadsheet
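
The "split one reference into several links" modification can be sketched on rows held in memory. This is a minimal Python illustration, not part of PdfClerk: reading and writing the .xlsx file is left out, the helper split_reference and all sample values are hypothetical, and the Rectangle value is shown as a placeholder because its exported format is not described here.

```python
from typing import Dict, List, Tuple

Row = Dict[str, object]


def split_reference(row: Row, parts: List[Tuple[str, str, int]]) -> List[Row]:
    """Copy one reference row once per link.

    Each part is (modified_text, target_doc, target_page). Document, Page,
    Rectangle and Hit are copied unchanged, as when a row is duplicated in Excel.
    """
    new_rows = []
    for modified, target_doc, target_page in parts:
        copy = dict(row)
        copy["Modified"] = modified
        copy["TargetDoc"] = target_doc
        copy["TargetPage"] = target_page
        new_rows.append(copy)
    return new_rows


# Hypothetical row: one hit that actually refers to two exhibits.
original = {
    "Document": "Brief.pdf",
    "Page": 4,
    "Rectangle": "<rectangle as exported>",
    "Hit": "Exhibits 3 and 4",
    "Modified": "Exhibits 3 and 4",
    "TargetPage": 1,
    "TargetDoc": "Exhibit 3.pdf",
}
rows = split_reference(
    original,
    [("Exhibits 3", "Exhibit 3.pdf", 1), ("4", "Exhibit 4.pdf", 1)],
)
```

The two resulting rows replace the original row in the spreadsheet; each keeps the same Hit and Rectangle but carries its own Modified text and target.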

Multi binder collections

Sometimes a logical document collection with sequential page numbers is spread over multiple pdf binders (split binders). To enable PdfClerk to map references to the correct page in the correct binder, add the page ranges and page adjustments in the options for the document in the document list (click the wrench icon). The binders in the logical document collection should have the same Alias.

Example: All exhibits are collected in two binders named ExhibitsPart1.pdf with pages 1-1000 and ExhibitsPart2.pdf with pages 1001-2000. To convert references like Exhibits page 12 and Exhibits page 1222 to links to the relevant page in the corresponding binder, do the following:

  • Add “Exhibit” as an alias for both files
  • For ExhibitsPart1.pdf set “Start link page” to 1 and “End link page” to 1000
  • For ExhibitsPart2.pdf set “Start link page” to 1001 and “End link page” to 2000
  • For ExhibitsPart2.pdf set the “Adjust page link target” to -1000.
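
The effect of these settings can be sketched as a simple lookup. The Python below is only an illustration of the arithmetic under the assumption that the adjustment is added to the referenced page number; the class BinderPart and the function resolve are hypothetical names, not PdfClerk code.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class BinderPart:
    filename: str
    start_link_page: int
    end_link_page: int
    adjust_page_link_target: int = 0


# The two binders from the example above.
PARTS = [
    BinderPart("ExhibitsPart1.pdf", 1, 1000),
    BinderPart("ExhibitsPart2.pdf", 1001, 2000, adjust_page_link_target=-1000),
]


def resolve(parts: List[BinderPart], page: int) -> Optional[Tuple[str, int]]:
    """Return (binder file, page inside that file) for a referenced page."""
    for part in parts:
        if part.start_link_page <= page <= part.end_link_page:
            return part.filename, page + part.adjust_page_link_target
    return None  # the page is outside every configured range
```

Under these assumptions, a reference to page 12 resolves to page 12 of ExhibitsPart1.pdf, and a reference to page 1222 resolves to page 222 of ExhibitsPart2.pdf.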

If PdfClerk adds links that are not correct, the links can be removed in a pdf editing tool such as Adobe Acrobat (not Adobe Reader) or Foxit Reader (free, go to “Home”, “Links” in order to edit and remove links). Please note that if the option Highlight links is checked, the highlighting will remain after a link has been removed in the pdf editing tool.

Norwegian Civil Litigation Configuration

The Norwegian Civil Litigation Configuration included in PdfClerk identifies the following patterns:

  • Bilag 3
  • Bilag 3 side 5
  • Stevningen side 4
  • Stevningens bilag 3 side 4
  • Stevningen bilag nr 3
  • Bilag 4 til stevningen, side 4

PdfClerk understands “side”, “s” and “s.” as indicating page numbers.

When looking for documents PdfClerk accepts alternative forms of the alias, meaning that “stevning” will also match “stevningen” and “stevningens”.
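
To give a feel for what such a pattern looks like, here is a deliberately simplified Python sketch that recognises only the "Bilag N" and "Bilag N side M" forms, including the "s" and "s." page markers. The regular expression is an illustration written for this document, not the pattern shipped with PdfClerk, and it does not handle forms such as "Bilag 4 til stevningen, side 4" or alias matching.

```python
import re
from typing import Optional, Tuple

# Hypothetical, simplified pattern: "bilag", an optional "nr"/"nr.", the
# exhibit number, and an optional page part introduced by side / s / s.
EXHIBIT_REF = re.compile(
    r"\bbilag\s+(?:nr\.?\s+)?(?P<exhibit>\d+)"
    r"(?:,?\s+(?:side|s\.?)\s+(?P<page>\d+))?",
    re.IGNORECASE,
)


def find_exhibit_reference(text: str) -> Optional[Tuple[int, Optional[int]]]:
    """Return (exhibit number, page or None) for the first match in text."""
    match = EXHIBIT_REF.search(text)
    if match is None:
        return None
    page = match.group("page")
    return int(match.group("exhibit")), (int(page) if page else None)
```

For example, find_exhibit_reference("Stevningens bilag 3 s. 4") yields exhibit 3 and page 4, while "Bilag 3" yields exhibit 3 with no page.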

For an example of resolving links in a US-style submission, have a look at the example “Motion to dismiss” on the Example documents page.

Load/Save

Projects

PdfClerk allows configurations and documents to be saved and loaded. The configuration files are saved as .clerk files, which are associated with PdfClerk. An alternative way to load a configuration is to double-click the configuration file.

The settings are separated into three different categories to enable easy reuse of configurations. Which of the categories to load or save can be selected in the interface. Files are the documents in the document list and the input folder and output folder. Searches is configuration related to Autolinking while Configuration is all other settings.

The internal format of the configuration files is JSON, meaning that more technical users with particular needs can create and alter PdfClerk project files using a text editor.
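
Because the files are JSON, they can also be inspected with a script before editing them by hand. The Python sketch below only lists the top-level keys of a project file; the structure of the file is not documented here, so the sketch assumes nothing beyond the file being a JSON object, and the function name top_level_keys is hypothetical.

```python
import json
from typing import List


def top_level_keys(project_path: str) -> List[str]:
    """Return the sorted top-level keys of a .clerk project file."""
    # "utf-8-sig" reads plain UTF-8 and also tolerates a byte order mark.
    with open(project_path, encoding="utf-8-sig") as handle:
        data = json.load(handle)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    return sorted(data)
```

Working on a copy of the project file is advisable when experimenting with manual edits.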

Excel format

PdfClerk can read and write document metadata from and to Microsoft Excel format (.xlsx). This enables the use of a familiar and powerful tool as part of the pdf creation workflow.

The columns and format used are different depending on whether the input is separate files or an existing binder.

Separate files

Document names and other metadata are stored in a sheet named “Documents” with the column names:

  • Level - the bookmark level
  • Filename - the full name of the file - including extension, but not the path
  • Name - the name of the file - not including the extension

Additional metadata is saved in additional columns with the name of the column as the header. The path to the documents is stored in the “Config” sheet as “SourceFolder”.
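
The relationship between the Filename and Name columns can be shown with a short Python sketch. It builds one row per file as a dictionary; writing the rows to an .xlsx file is left out, and the function document_row, the example path and the extra "Exhibit number" column are hypothetical.

```python
from pathlib import PurePosixPath
from typing import Dict, Optional


def document_row(path: str, level: int = 1,
                 metadata: Optional[Dict[str, object]] = None) -> Dict[str, object]:
    """Build one row of the Documents sheet for a file."""
    file = PurePosixPath(path)
    row: Dict[str, object] = {
        "Level": level,         # the bookmark level
        "Filename": file.name,  # with extension, without the path
        "Name": file.stem,      # without the extension
    }
    row.update(metadata or {})  # additional columns, keyed by header
    return row


row = document_row("exhibits/Exhibit 1.pdf", level=2,
                   metadata={"Exhibit number": 1})
```

Here row holds "Exhibit 1.pdf" as Filename and "Exhibit 1" as Name, plus the additional metadata column.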

An example file for a document assembly can be found here.

Existing binder

Bookmarks and other metadata are stored in a sheet named “Binder” with the column names:

  • Level - the bookmark level
  • Name - the name of the bookmark
  • Additional metadata is saved in additional metadata columns with the name of the column as the header

The full path and filename to the binder are stored in the “Config” sheet as “BinderToBeUpdated”.

An example file for a binder can be found here.

General

The order of the columns or formatting does not matter. There should not be any empty columns or rows before non-empty ones.

Tools

The Tools section provides various tools to work with existing files and binders.

The Convert Clerk Links function translates Clerk Links to pdf links. Clerk Links can be used to create custom tables of contents or other documents with direct links to specific pages in pdf binders. The custom documents can be created as Microsoft Word or HTML documents. The Clerk Links are inserted in the custom document as hyperlinks using the format described below.

The default way to create a table of contents in PdfClerk is to automatically create and insert one when creating a binder using the insert Table of contents function on the Output tab. With Clerk Links you can create linked documents with other data sources and have full control over how the resulting document will look.

Clerk Links can be inserted manually into the source document, or you can use the function Create Index (described below) to combine a template and data from a Microsoft Excel datasheet into a document.

The created pdf can either be a separate pdf that links to other pdfs or be included in a binder referencing other pages in that binder.

The Clerk Links support the following link formats:

  • clerk://MyBinder.pdf - Creates a link to page 1 in MyBinder.pdf
  • clerk://MyExcelSheet.xlsx - Creates a link to open the Excel spreadsheet (or any other file type)
  • clerk://MyBinder.pdf?page=27 - Creates a link to page 27 in MyBinder.pdf
  • clerk://subdir/MyOtherBinder.pdf?page=3 - Creates a link to page 3 in MyOtherBinder.pdf located in the subfolder subdir.
  • clerk://n/?page=3 - Creates a link to page 3 in the same file
  • clerk://n?anchor=someName - Creates an anchor (named destination) that can be linked to
  • clerk://n/?dest=someName - Links to an anchor (named destination)
  • clerk://n/?embed=MyExcelSheet.xlsx - Embeds MyExcelSheet.xlsx into the binder as an attachment. The link will open the file. If the path is not absolute, PdfClerk looks for the file in the source folder
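
The structure of these links can be explored with Python's standard URL parser. The sketch below is an illustration rather than PdfClerk's own parser: the function parse_clerk_link is a hypothetical name, and it assumes that a leading "n" host is always a placeholder, which means a real subfolder named "n" would be misread.

```python
from typing import Dict, Optional, Tuple
from urllib.parse import parse_qs, urlsplit


def parse_clerk_link(link: str) -> Tuple[Optional[str], Dict[str, str]]:
    """Split a Clerk Link into (target file or None, query options)."""
    parts = urlsplit(link)
    if parts.scheme != "clerk":
        raise ValueError(f"not a Clerk Link: {link!r}")
    # The file name may sit in the host position, in the path, or in both.
    target = parts.netloc + parts.path
    # Assumption: "n" in the host position is only a placeholder.
    if target == "n" or target.startswith("n/"):
        target = target[2:]
    options = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return (target or None), options
```

A link such as clerk://subdir/MyOtherBinder.pdf?page=3 then yields the target "subdir/MyOtherBinder.pdf" with the option page set to "3", while clerk://n/?page=3 yields no target file, meaning the same file.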

The pdf file created should be placed in the same folder as the files linked to or, if paths are provided (like the example link to MyOtherBinder above), in the parent directory.

When you want to include a file in a binder that links to other parts of that binder, you should just convert the file from Word to pdf and then merge the pdf using PdfClerk. You can convert from Word format to pdf directly using Word, or you can use this tool in PdfClerk and check Do not convert links.

A possible workflow is to create links in Excel using the Excel Hyperlink function for each relevant document, then copy the links from Excel to Word as a table, and then format the table in Word before activating the links using PdfClerk.

In order to be able to convert a file from Word to pdf format, MS Word version 2010 or newer needs to be installed on the computer. If the file has been converted from Word format to pdf format using another tool, PdfClerk can update the links as described above.

(The “/n/” part (the host part) of the URL is optional for files and can be added to comply with the URL specification: clerk://n/MyBinder.pdf)

Create Index

You can include a table of contents at the same time as you create the binder using the Add Table of Contents in the Settings tab. If you want to link to multiple binders, add a table of contents to an existing binder, or otherwise need full control over the output, you can use the Create Index function found under the Tools tab.

This function works by combining entries from an Excel spreadsheet into a Word or HTML template. If you want to create a table of contents for one or more existing binders with bookmarks you can export the bookmarks to Excel using the save to Excel function found under the Load/Save tab.

The Create Index function creates an editable .docx or HTML file with Clerk Links on the relevant entries. This allows you to edit the file before you proceed to convert it to pdf using the Convert Clerk Links function.

The Excel datasheet

The Excel datasheet should have a sheet named Catalog with the following column headers:

  • file – the file that the entry should link to. If empty it links to the stated page in the same file
  • page – the page in the document that the entry should link to
  • level – The hierarchy level of the entry. If the level is not provided entries will be handled as level 1
  • title – the text used for the link. Could be title or any other value set in the template

Any other columns are handled as metadata and are inserted into the document wherever the template references them by column title.
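
How a Catalog row relates to a Clerk Link can be sketched in Python. This is an assumption-laden illustration, not PdfClerk's code: the function clerk_link is hypothetical, file names are inserted without any URL encoding, and a row with neither file nor page is treated as having no link.

```python
from typing import Mapping, Optional


def clerk_link(row: Mapping[str, object]) -> Optional[str]:
    """Build a Clerk Link from the file and page values of a Catalog row."""
    file = str(row.get("file") or "").strip()
    page = row.get("page")
    if page in (None, ""):
        # No page given: link to the file itself, or no link at all.
        return f"clerk://{file}" if file else None
    # An empty file value means "the stated page in the same file".
    target = file if file else "n/"
    return f"clerk://{target}?page={int(page)}"
```

For instance, a row with file MyBinder.pdf and page 27 gives clerk://MyBinder.pdf?page=27, and a row with only page 3 gives clerk://n/?page=3.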

PdfClerk saves and reads all values either as strings, booleans or numbers. In order to display dates consistently, PdfClerk requires that dates are saved as text in the Excel spreadsheet. If you have a date stored as a date, you need to convert it to text format to properly render the template. The easiest way is probably to copy the column containing the date values to Notepad, change the format of the column in Excel to Text and copy them back from Notepad. Alternatively, you can use Excel’s Text function.

If the Excel workbook only has one datasheet, this datasheet is used.

Using a Word template

The Word template can either be defined using an ordinary paragraph or be based on a table. If a table is used, it should be in this format:

Column title (optional) | Column title (optional)
<title><#1#>            | <page>
<title><#2#>            | <page>

  • In order for PdfClerk to identify which table in the Word document contains the relevant table, the title of the table should be set to index. To name the table open Table Properties, go to the tab Alt Text and write “index” in the Title field.
  • The row with <#1#> is the template for level 1 entries, the row with <#2#> is the template for level 2 entries, and so on, up to 10 levels. The same applies if you use a template based on a paragraph, except that PdfClerk then uses items on the same line instead of items on the same row.
  • The text enclosed in <> is replaced with the corresponding text from the column with the same name in the Excel datasheet. Any number of columns can be added.
  • By default, all text becomes clickable links to the relevant entry. If you do not want a particular entry to be a link you can either make sure that there is no data in the file or page columns or explicitly set the item as no-linking by adding “:0” to the property name like <title:0>.
  • An example template file is provided in the example files.
  • The text will be styled with the style that is applied to the relevant cell in the template. Only one style can be applied to text in a table cell. The style should be based on a paragraph style.
  • If the Excel workbook has a datasheet named Meta, the first two columns of this datasheet are handled as key and value pairs that are accessible in the template using <m-key> where key is the name assigned in the first column and the value written to the template is the corresponding value in the second column.
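
The placeholder rules above can be mimicked on plain text. The following Python sketch only performs the text substitution; it does not create links or apply Word styles, and the function fill_row, the sample values and the regular expressions are hypothetical illustrations rather than PdfClerk's template engine.

```python
import re
from typing import Mapping, Optional

# <name> or <name:N>; names starting with "m-" refer to the Meta sheet.
PLACEHOLDER = re.compile(r"<([A-Za-z][\w-]*)(?::\d+)?>")
# <#1#>, <#2#>, ... mark the level a template row belongs to.
LEVEL_MARK = re.compile(r"<#(\d+)#>")


def fill_row(template: str, row: Mapping[str, object],
             meta: Optional[Mapping[str, object]] = None) -> str:
    """Replace placeholders in one template row with values from the data."""
    meta = meta or {}

    def replace(match: "re.Match[str]") -> str:
        name = match.group(1)
        if name.startswith("m-"):
            return str(meta.get(name[2:], ""))
        return str(row.get(name, ""))

    # The level marker selects the template row and is not printed itself.
    return PLACEHOLDER.sub(replace, LEVEL_MARK.sub("", template))
```

With a hypothetical row where title is "Contract" and page is 12, fill_row("<title><#1#> - <page>", row) produces "Contract - 12".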

Using an HTML template

PdfClerk also supports creating index files using HTML templates. You can edit the supplied standard templates or create your own templates. Note the following:

  • Data is filled into the templates using the Liquid template language.
  • Data loaded from the Excel spreadsheet is exposed to the template in the object “catalog” with all lowercase letter properties.
  • By adding “_L” after the name of the property it will be translated into a link. Alternatively, the link uri is available as “uri” for each item.
  • If the Excel workbook has a datasheet named Meta, the first two columns of this datasheet are handled as key and value pairs that are accessible in the template using meta.key where key is the name assigned in the first column and the value written to the template is the corresponding value in the second column.
  • PdfClerk supports full-page coloured background without margins using the custom in-file CSS property “-clerk-page-background-color”, which accepts hexadecimal color codes.
  • PdfClerk supports extracting headings as bookmarks when converting html files to pdf by adding the custom in-file CSS property -clerk-extract-headings-as-outline: all.
  • You can use ordinary HTML, CSS and Javascript to render the templates, and load external stylesheets, images and fonts.

General

The Create Index function creates a Word/HTML document based on the provided template with Clerk Links on all items. The Word/HTML file can be edited before you convert it to pdf using the function Convert Clerk Links. The format of the Clerk Links is described above.

Use the option to Adjust target page number if you need to offset the page numbers provided in the Excel file. This could be handy if you plan to insert the created index at the beginning of the binder that the index links to.

PdfClerk supports alternative links for the same entry (Excel spreadsheet row). Add additional link columns in the Excel spreadsheet (page, file) by adding a number after the name (page2, file2).

In table-based Word templates, the alternative links are available by adding :X, where X is the number used as postfix for the name in the Excel spreadsheet, for example <title:2>.

In HTML templates, the alternative links are available in two separate ways:

  • by adding “_Lx” to the property name, where x is the number used at the end of the column name in the Excel spreadsheet, for example “title_L2”.
  • as additional .uri properties (uri2, uri3 etc)

If the setting Tag links and documents to enable binder recognition under the tab Autolink is enabled when references are recognized and converted to links in a document, information about the link’s target is embedded in the document and the target files. If documents are combined in binders via software other than PdfClerk, this information will remain intact. However, the links will need to be reactivated. By using this function, you can reactivate the links in the binder.

You can also use this function to reconnect links if you have changed the name of one or more of the target files for a link created using the Autolink feature.

Use this function to update link target and link type of links in an existing pdf document.

The function searches for external links starting with the pattern provided in the Link address starts with input box and changes this pattern to the pattern provided in the Replacement input box. Links that do not start with the pattern are not changed.

If you select an input file without selecting any changes, PdfClerk will look through the file for links and list out their addresses. This is handy when determining what patterns should be changed.

You can specify that only links of a certain type should be updated. PdfClerk will then only search for links of the type specified. You can specify that links should be converted to another type of links. PdfClerk maintains direct page links when converting between GotoR links and web links.

By selecting the option Include bookmarks, PdfClerk will also search in the bookmarks.

PdfClerk currently supports updating links of the types GotoR (for pdfs), Launch and Web (URI).

You can process multiple search and replace patterns at once by separating them using the “|”-symbol. As an example putting “folder1|folder2” in the search box and “folderA|folderB” in the replace box will replace all links starting with folder1 with links starting with folderA and links starting with folder2 with folderB.
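
The pairing of search and replacement patterns can be sketched as follows. This Python function is an illustration of the behaviour described above, not PdfClerk's code; the name replace_link_prefix is hypothetical, and raising an error for unequal pattern counts is an assumption.

```python
def replace_link_prefix(address: str, search: str, replacement: str) -> str:
    """Replace the start of a link address using "|"-separated pattern lists."""
    searches = search.split("|")
    replacements = replacement.split("|")
    if len(searches) != len(replacements):
        raise ValueError("search and replacement need the same number of patterns")
    for old, new in zip(searches, replacements):
        if address.startswith(old):
            return new + address[len(old):]
    return address  # links that do not start with any pattern are unchanged
```

Using the example above, "folder1/a.pdf" becomes "folderA/a.pdf" and "folder2/b.pdf" becomes "folderB/b.pdf", while an address starting with anything else is returned as it is.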

Presets and example files

PdfClerk ships with predefined settings for various pdf building tasks. The presets are available via the presets menu in the top left corner of PdfClerk. There are examples and example files showing the functionality of PdfClerk available here.

The presets and example files may serve as an introduction to how PdfClerk works. You can create your own presets by saving a project to the presets folder (by default located in Documents\PdfClerk\Presets\).

Preset: Simple Merge

Simple Merge is the default setting, merging documents and adding the filenames (without the filename extension) as bookmarks.

  1. Select and load the Simple Merge preset
  2. Click the Open input files button, select Folder and then select the folder with the pdf files you want to merge
  3. Use Select output folder to select the folder in which to store the merged file
  4. Click the Generate button

Preset: Merge with header and footer

Same as Simple Merge, but with the filename (without extension) as the document name in the upper left corner and the page number and total pages of the binder in the bottom right corner.

  1. Select and load the Merge with header and footer preset
  2. Click the Open input files button and select Folder to select the folder with the pdf files you want to merge
  3. Click Select output folder to select the folder in which to store the merged file
  4. Click the Generate button

Preset: Simple linking

Converts textual references to other documents into clickable links to the referenced document. The reference text is the filename (less the extension) of the file to be referenced. If a reference is followed by a page number, the link will link to this particular page. The links are underlined.

  1. Select Single file to link using the Open input files button
  2. Select a Word or pdf document. The documents to link to should be in the same folder or a subfolder.
  3. The document list loads with all pdf documents in the folder and with the chosen file selected.
  4. Click the Generate button

Preset: Linked Pleading as Binder

Merges a pleading with its exhibits and adds links to the exhibit references. Adds a bookmark for each exhibit with Bilag X as the title. Adds exhibit numbering inside a rounded red rectangle on the top right of each exhibit and page count on the bottom right of each page. This preset is based on Norwegian template documents. Use the example files from the Norwegian litigation example to understand this preset.

  1. Select and load the Linked Pleading as Binder preset
  2. Click the Open input files button and then select Folder
  3. Select a folder with the pleading stored as a pdf with the exhibits in a subfolder
  4. Set the exhibit folder to the folder that contains the exhibits using the Map Exhibits link
  5. Use Select output folder to select the folder in which to store the linked file
  6. Click the Generate button

Preset: Linked Pleading Attachment Binder

Binds a pleading together with its exhibits and creates blue underlined links between references to the exhibits and the exhibits. The exhibits are added as attachments to the pleading and exhibit numbers are padded to allow natural ordering. A large exhibit number is added on the top right of the first page of each exhibit. Exhibit numbers are read from the filename using the pattern Bilag X.

Adding exhibits as attachments is only designed to work when there is one document with one set of exhibits.

  • Same as above, except select this preset.

Preset: Set of Linked Pleadings

Converts references to exhibits and pleadings into links as described in section Norwegian Civil Litigation Configuration. In addition, this preset merges all the pleadings and their exhibits into one binder, creates page numbers in red and adds exhibit numbers on the exhibits in the top left corner.

  • Same as above, except select this preset and in step 3 the folder should contain multiple sets of pleadings and their exhibits. This example is based on Norwegian template documents and can be used with the example documents provided in Link & bundle all submissions in a case into one bundle.

Preset: Stamp existing binder

Extracts exhibit numbers from the bookmarks of an existing binder and stamps the first page of each exhibit with the relevant exhibit number with a border around the exhibit number.

  1. Select and load the Stamp existing Binder preset
  2. Open an existing binder by selecting Binder in the Open input files button
  3. Click the Generate button

Global keyboard shortcuts

PdfClerk has the following global keyboard shortcuts:

  • ? = Show the keyboard shortcuts
  • ctrl+g = Generate
  • ctrl+s = Save project
  • ctrl+shift+o = Open project
  • ctrl+1 = Open Files tab
  • ctrl+2 = Open Settings tab
  • ctrl+3 = Open Autolink tab
  • ctrl+4 = Open Tools tab
  • ctrl+5 = Open Load & Save tab
  • ctrl+6 = Open Log tab
  • ctrl+7 = Open Help tab

In addition, there are keyboard shortcuts for rearranging documents in the document list.

Legal stuff

Data privacy

PdfClerk is a desktop application installed on the user’s computer. The application does not share any information about the documents processed with any external systems.

License

PdfClerk is the property of Protosys AS and its licensors. Third-party software included in the software is listed on the Help tab of the application.

Use of PdfClerk is only permitted for license holders and is then limited to the usage, the number of users and the time period described in the license grant. PdfClerk may only be used for its intended purpose as described in the documentation.

The purpose of a trial license is to test the software, and the output pdfs are marked as not for production use. Use of a trial license for production purposes is not permitted.

Last updated 2022-09-15 for version 2.2