Extract text from an image or a PDF

10 March 2021

How can you effectively extract text from a PDF or an image? This is commonly called OCR (optical character recognition). I found two extremely powerful tools based on the open-source engine Tesseract (Official website).

I am using Windows, and both tools can be used on this OS. The first converts a scanned PDF into a searchable PDF (so the text can also be copied). The second takes a screenshot of an area of your screen, converts it to text and stores the result in your clipboard.

  • OCRmyPDF
    • you need to use Ubuntu on Windows (more info here)
    • update your apt package lists: sudo apt-get update
    • install it: sudo apt install ocrmypdf
  • NormCap: https://github.com/dynobo/normcap
    • easy to install, just use the exe
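
Once OCRmyPDF is installed, converting a scanned PDF is a single command of the form "ocrmypdf input.pdf output.pdf". Here is a minimal Python sketch that wraps that call, to be run inside Ubuntu on Windows. The helper names (build_ocr_command, ocr_pdf), the file names and the "_ocr" suffix are my own assumptions for illustration. The -l (language) and --skip-text options are regular OCRmyPDF options, but check "ocrmypdf --help" on your version before relying on them.

```python
import shutil
import subprocess
from pathlib import Path


def build_ocr_command(input_pdf, output_pdf, language="eng", skip_text=True):
    """Return the ocrmypdf argument list for one file (nothing is executed)."""
    cmd = ["ocrmypdf", "-l", language]
    if skip_text:
        # --skip-text leaves pages that already contain text untouched
        cmd.append("--skip-text")
    cmd += [str(input_pdf), str(output_pdf)]
    return cmd


def ocr_pdf(input_pdf, output_pdf=None, language="eng"):
    """Run ocrmypdf on input_pdf and return the path of the searchable PDF."""
    input_pdf = Path(input_pdf)
    if output_pdf is None:
        # Hypothetical naming convention: scan.pdf -> scan_ocr.pdf
        output_pdf = input_pdf.with_name(input_pdf.stem + "_ocr.pdf")
    if shutil.which("ocrmypdf") is None:
        raise RuntimeError("ocrmypdf not found on PATH; install it first")
    # check=True raises CalledProcessError if ocrmypdf exits with an error
    subprocess.run(build_ocr_command(input_pdf, output_pdf, language), check=True)
    return Path(output_pdf)
```

For example, ocr_pdf("scan.pdf") should write scan_ocr.pdf next to the original file. Keeping the command building separate from the execution makes it easy to print the command and check it before running it on a whole folder.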

Give them a try :)

