SearchWP offers the unique feature of extracting plain text from PDF files uploaded to your WordPress website. Out of the box, SearchWP attempts to do this using only PHP, but due to the complexity and variation of the PDF format, that sometimes results in content not being accurately extracted. Enter Xpdf.
Xpdf has a set of command line tools that must be installed on your server in order for this Extension to work. Installation instructions are included below.
Using the Xpdf Integration Extension you can offload all the work PHP has to do in processing your PDF files to Xpdf's command line tools, which are extremely fast and accurate when extracting content from your PDFs. After activating the Extension, you will need to follow the installation instructions. Once installed, SearchWP will offload the PDF content extraction process to Xpdf.
Using this Extension you can utilize Xpdf to extract the content from your PDFs.
IMPORTANT: Xpdf command line tools are not provided in this Extension download. You must follow these instructions to download the command line tools and upload them to a non-public (outside your Web root) location.
You can download the Xpdf command line tools for both Windows and Linux at http://www.xpdfreader.com/download.html.
Once you have downloaded the command line tools for your server type:
1. Extract xpdf-tools-linux-4.03.tar.gz (the version number may be different).
2. Upload the pdftotext binary (found in either the bin32 or bin64 directory after extracting, depending on your server architecture) to a non-public location, outside your Web root.
3. Upload the pdfinfo binary (found in either the bin32 or bin64 directory after extracting, depending on your server architecture) to a non-public location, outside your Web root.
4. Ensure pdftotext and pdfinfo have execute permissions for the PHP user on your server.
The last step is to tell SearchWP Xpdf Integration where you installed pdftotext and pdfinfo. To do this:
Add the following to your SearchWP Customizations plugin, replacing the example paths with the actual paths to the pdftotext and pdfinfo binaries (the binaries themselves, not the folder that contains them) on your server.
// Tell SearchWP the location of the pdftotext binary.
add_filter( 'searchwp_xpdf_path', function() {
    return '/home/johndoe/pdftotext'; // Full absolute path to the binary. NOT A FOLDER, NOT A URL.
} );
// Tell SearchWP the location of the pdfinfo binary.
add_filter( 'searchwp_pdfinfo_path', function() {
    return '/home/johndoe/pdfinfo'; // Full absolute path to the binary. NOT A FOLDER, NOT A URL.
} );
That's it!
See also: Adding PDF password support
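Before relying on the filters above, it can help to confirm that PHP can actually see and execute both binaries. The following is a minimal sketch and not part of SearchWP or the Extension: my_check_xpdf_binary() is a hypothetical helper name, and the paths are the same example paths used above. It relies only on PHP's built-in is_file() and is_executable() functions.

```php
<?php
// Hypothetical helper (not part of SearchWP): describe any problem with a binary path.
// Returns an empty string when the path points to a file that PHP can execute.
function my_check_xpdf_binary( $path ) {
    if ( ! is_file( $path ) ) {
        return 'Not a file (folder, URL, or missing): ' . $path;
    }
    if ( ! is_executable( $path ) ) {
        return 'Not executable by the PHP user: ' . $path;
    }
    return '';
}

// Example paths only; replace them with the paths returned by your filters.
foreach ( array( '/home/johndoe/pdftotext', '/home/johndoe/pdfinfo' ) as $binary ) {
    $problem = my_check_xpdf_binary( $binary );
    echo ( '' === $problem ? 'OK: ' . $binary : $problem ) . PHP_EOL;
}
```

Checking is_file() first matters because a folder can also carry the execute bit, which would otherwise pass the is_executable() test even though the filter needs the binary itself.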
After uploading and activating the Xpdf Integration Extension and defining your path to pdftotext, you can manually confirm that Xpdf text extraction is working as expected on specific PDFs uploaded to your Media library. Begin by going to the SearchWP Settings screen (Settings > SearchWP) and find the Xpdf Integration link under Extensions.
On the Xpdf Integration Testing screen, you can enter the ID of the PDF you'd like to test:

The ID can be found by navigating to your Media section and then clicking the Edit link of your PDF. The ID appears in the URL, immediately after post=
After submitting a valid ID you will be given a detailed log of the steps taken by the Xpdf Integration Extension as well as any failure points that may have occurred. You're also shown the exact content Xpdf extracted from the PDF:

If the log displays a point of failure, please include that in any support requests you submit.
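If the testing screen reports a failure, it can be useful to run pdftotext on the same PDF outside of SearchWP and compare the result. The sketch below is an assumption-laden illustration, not the Extension's own code: my_build_pdftotext_command() is a hypothetical helper, both paths are placeholders, and it assumes your host allows PHP's exec() function. It uses pdftotext's documented convention that a text-file argument of "-" sends the extracted text to standard output.

```php
<?php
// Hypothetical helper: build the shell command that prints a PDF's text to stdout.
// The trailing "-" asks pdftotext to write to standard output instead of a .txt file.
function my_build_pdftotext_command( $binary, $pdf ) {
    return escapeshellarg( $binary ) . ' ' . escapeshellarg( $pdf ) . ' -';
}

// Example paths only (assumptions); adjust both for your server.
$binary = '/home/johndoe/pdftotext';
$pdf    = '/home/johndoe/sample.pdf';

if ( is_executable( $binary ) && is_file( $pdf ) && function_exists( 'exec' ) ) {
    $lines = array();
    exec( my_build_pdftotext_command( $binary, $pdf ), $lines, $exit_code );
    echo 'Exit code: ' . $exit_code . PHP_EOL;
    // Print only the first 20 lines of extracted text.
    echo implode( PHP_EOL, array_slice( $lines, 0, 20 ) ) . PHP_EOL;
} else {
    echo 'Skipped: binary or PDF not found, or exec() is unavailable.' . PHP_EOL;
}
```

A non-zero exit code or empty output here points to a problem with the binary or the PDF itself, which is worth mentioning alongside the log in a support request.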