Merging Multiple PDFs Into One With Python

3 min read · Mar 15, 2023

Recently, I needed to merge multiple PDFs into one file. I didn’t have an Adobe Acrobat Pro subscription, and I didn’t want to upload my files to online tools because of privacy concerns. To solve this, I built my own merge script in Python.

Hi, I’m Da Data Guy! Before I begin, if you are interested in Power BI, Python, data visualization, and SQL, please follow me on Medium and on LinkedIn. I focus on writing quality articles that break down each step in the process so you can follow along and learn too. You can also visit my GitHub page to view my published resources.

Getting Your Files Prepped

To follow along, create a folder on your local computer to hold the PDF files you want to merge, along with a Jupyter Notebook (.ipynb) file. The script searches the notebook’s working directory and all of its subfolders, so the PDFs can sit in a subfolder. For this example, I swapped the original PDF files I used for Power BI PDFs.

If you don’t have the PyPDF2 library installed, you can find installation instructions and documentation here: PyPDF2 · PyPI. I’m using version 3.0.0, which I believe is the latest at the time of this writing (3/15/2023).
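If you are not sure whether PyPDF2 is already installed, or which version you have, a quick check along these lines may help. This is only a sketch: installed_version is a hypothetical helper name (not part of PyPDF2), and it assumes Python 3.8 or newer, where the standard library provides importlib.metadata.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# "PyPDF2" is the distribution name used by: pip install PyPDF2
found = installed_version("PyPDF2")
if found is None:
    print("PyPDF2 is not installed. Try: pip install PyPDF2")
else:
    print(f"PyPDF2 version {found} is installed.")
```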

Let’s Get Started

# Importing the required libraries, including the PyPDF2 classes.
# If you need to install PyPDF2, type: pip install PyPDF2
import os
import PyPDF2
from PyPDF2 import PdfReader, PdfWriter, PdfMerger

pdfFiles = []  # List that will hold the full path of every PDF found.

for root, dirs, filenames in os.walk(os.getcwd()):  # Walk the current working directory and all of its subfolders.
    for filename in filenames:
        if filename.lower().endswith('.pdf'):  # Keep only files whose name ends in .pdf (case-insensitive).
            pdfFiles.append(os.path.join(root, filename))
            # Join the folder path and the file name into a full path and store it.

# Sort the paths case-insensitively; this order is the order the files are merged in.
pdfFiles.sort(key=str.lower)

# Create a PdfWriter object that will collect the pages of the merged file.
pdfWriter = PyPDF2.PdfWriter()

Here are the file paths the script found on my local machine.

# Display the file paths that were found.
pdfFiles

['C:\\Users\\NAME\\Desktop\\Data Science\\Jupyter Notebook\\Sandbox\\Merge PDFs\\Test_PDFs\\Harry Potter - Final.pdf',
 'C:\\Users\\NAME\\Desktop\\Data Science\\Jupyter Notebook\\Sandbox\\Merge PDFs\\Test_PDFs\\Mexico Restaurant Ratings.pdf',
 'C:\\Users\\NAME\\Desktop\\Data Science\\Jupyter Notebook\\Sandbox\\Merge PDFs\\Test_PDFs\\Super Bowl-Final.pdf']
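The sort order matters because it decides the order of the pages in the merged file. As a small illustration of why the script sorts with key=str.lower, here is a sketch using made-up file names (they are not from my folder) that compares Python’s default sort with the case-insensitive one.

```python
# Hypothetical file names, used only to illustrate the two sort orders.
names = ["b_report.pdf", "A_summary.pdf", "a_appendix.pdf", "B_cover.pdf"]

# Default sort compares character codes, so uppercase letters come first.
default_order = sorted(names)

# key=str.lower compares the lowercased names, so case is ignored.
case_insensitive_order = sorted(names, key=str.lower)

print(default_order)
# ['A_summary.pdf', 'B_cover.pdf', 'a_appendix.pdf', 'b_report.pdf']
print(case_insensitive_order)
# ['a_appendix.pdf', 'A_summary.pdf', 'B_cover.pdf', 'b_report.pdf']
```

Keep in mind that the script sorts full paths, not just file names, so the folder a file lives in also affects where it lands in the merged output.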

If you are merging many PDF files, you might want to count them first. Be careful: if you have already run this script, the count will include the output file you generated, and that file will be merged in again on the next run.

# Count the number of PDF files that were found.
print(len(pdfFiles))
# Output: 3
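One way to avoid counting (and re-merging) a previous output file is to skip it while collecting the paths. The following is a minimal sketch rather than part of my original script: find_pdfs and its output_name parameter are hypothetical names I am introducing here, and the match is done on the file name only.

```python
import os


def find_pdfs(folder, output_name):
    """Collect .pdf paths under folder, skipping any file named output_name."""
    found = []
    for root, dirs, filenames in os.walk(folder):
        for filename in filenames:
            is_pdf = filename.lower().endswith(".pdf")
            is_output = filename.lower() == output_name.lower()
            if is_pdf and not is_output:
                found.append(os.path.join(root, filename))
    # Same case-insensitive sort as the main script.
    found.sort(key=str.lower)
    return found
```

Calling find_pdfs(os.getcwd(), 'Power_BI_Test_Files.pdf') should then return the same list as before, minus any earlier merged output with that name.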

The next step is to append the pages of each of the 3 files, in order, using the file paths we’ve stored in the pdfFiles variable.

for filename in pdfFiles:  # Loop over every stored file path.
    pdfFileObj = open(filename, 'rb')  # Open the file in binary read mode.
    pdfReader = PyPDF2.PdfReader(pdfFileObj)  # Create a reader for this PDF.
    for pageNum in range(len(pdfReader.pages)):  # Loop over every page in this PDF.
        pageObj = pdfReader.pages[pageNum]  # Get the page at this index.
        pdfWriter.add_page(pageObj)  # Add the page to the end of the merged document.

Lastly, we need to take the pages that have been collected in the pdfWriter object and write them to a new PDF output file. Because the file name below is a relative path, the new file is created in the current working directory, which is the folder the notebook runs from.

# The name of the output PDF file can be changed here.
pdfOutput = open('Power_BI_Test_Files.pdf', 'wb')

# Write the collected pages to the output file, then close it.
pdfWriter.write(pdfOutput)
pdfOutput.close()

URL to view Output File.

https://github.com/DaDataGuy/PDF_Merged_Script/blob/main/Final_PDF_Output/Power_BI_Test_Files.pdf

Thank you!

If you enjoyed this article, please give me a follow. If you’re interested in using Python to transform multiple CSV files, see: Using Python To Transform Data From Multiple CSV Files | by Da Data Guy | Medium

Written by Da Data Guy
