Did you try Simple Scan, Ubuntu's default scanning program, only to be disappointed that it doesn't support OCR? Is XSane too complicated for the simple task you have in mind? Do you miss how easy it was to scan documents with OmniPage?
Well, no wonder... Let's see how to scan documents and run OCR on them in a very simple way. You will be amazed by the results.
How to scan in 2 simple steps
1.- Install gscan2pdf and tesseract-ocr, along with the language pack you need. For documents in English, install tesseract-ocr-eng; for documents in Spanish, install tesseract-ocr-spa; and so on.
sudo apt-get install gscan2pdf tesseract-ocr tesseract-ocr-eng
2.- The rest is pretty straightforward for anyone who has ever scanned and OCRed a document in Windows. Open gscan2pdf, scan the document, go to Options> OCR and select Tesseract as the OCR engine. There are other engines, but in my experience Tesseract performs best by far. Finally, save the result as PDF, DjVu, etc. by going to File> Save.
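For those who prefer the terminal, the package naming rule from step 1 and a quick language check can be sketched in shell. This is only an illustration under a few assumptions: the helper names ocr_packages and has_lang are made up for this example, page.png stands for an image you have already scanned, and the tesseract commands shown are not necessarily what gscan2pdf runs internally.

```shell
#!/bin/sh
# Hypothetical helper: print the packages needed for the given
# Tesseract language codes (eng, spa, ...).
ocr_packages() {
    pkgs="gscan2pdf tesseract-ocr"
    for lang in "$@"; do
        pkgs="$pkgs tesseract-ocr-$lang"
    done
    echo "$pkgs"
}

# Hypothetical helper: succeed if the language code in $1 appears as a
# whole line in the language list read from standard input.
has_lang() {
    grep -qxF -- "$1"
}

# Examples, commented out so nothing is installed or run by accident:
# sudo apt-get install $(ocr_packages eng spa)
# tesseract --list-langs 2>&1 | has_lang spa && echo "Spanish is installed"
# tesseract page.png page -l spa pdf   # assumes page.png exists; should write page.pdf
```

If the language check fails after installing a language pack, the pack may not have been installed correctly, which would also explain a language missing from the OCR options in gscan2pdf.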
The following video is in English, but watching it is enough to understand how everything works.
Note that the packages are also available in Fedora. 🙂
I have two scanners: a Canon Scan 5000f for A4 documents and a Braun NovoScan for scanning negatives and slides. After installing gscan2pdf and rebooting, neither scanner shows up. What happened? Why can't I see the scanners?
No offense friends, but there is no point in OCRing math functions.
In any case, you should OCR the surrounding text (which explains those functions or whatever) and leave the functions as images.
Cheers! Paul.
Hey, if you've come up with a solution to your problem, I'd like to know.
I think I'm a little late but I have a question. I'm an engineering student and I'm looking for a way to digitize and clean my notes, but the problem is that most of those notes are full of mathematical symbols, graphs, and functions. Is there currently something that can help me?
Great! Good tip! In Arch, Tesseract is in the official repositories, but gscan2pdf is not. You have to install it through yaourt.
Thank you very much, it helped me a lot. This makes Linux friendlier. Thanks again.
You're welcome! It is a pleasure to have been able to help.
A hug! Paul.
Very good, I was looking for this. I'll try it and tell you how it goes.
Thanks, I'll try!
When I run the OCR with the Tesseract engine, it only offers English as a language, even though I installed the tesseract-ocr-spa package. What can I do?
I downloaded gscan2pdf, but it does not scan; it only searches for devices and is still searching after 15 minutes. What's going on?