TopOCR's Accessible User Interface

With just a single keyboard command, TopOCR can be transformed into a PC-based Reading Machine application for use with document cameras. It has an easy-to-learn Visually Accessible User Interface and a powerful state-of-the-art OCR (Optical Character Recognition) system that transforms text images into speech that can be played on a headset or saved to an MP3 player or flash drive. TopOCR can read and speak in eleven languages (English, Danish, Dutch, Finnish, French, German, Italian, Norwegian, Portuguese, Spanish, and Swedish), and a single key, F8, selects any of the supported languages.

TopOCR automatically performs text orientation correction on each scanned image, so even if your documents are placed sideways or upside down, TopOCR will always be able to read and display them in the proper reading orientation. TopOCR will also use its Text To Speech System to notify you if the light level is too low for effective reading.

TopOCR's Straighten Mode can be controlled with the F5 key. This will allow you to automatically use the Neural Warp Image Correction filter to correct for page curvature and document skew. Neural Warp can also remove clipped pages as well as graphics and is combined with a Noise Reduction Filter to further enhance accuracy and readability.

TopOCR also supports a Super Resolution image capture mode with the F4 key. This mode can be perfect for users with low vision who want to have a magnified high resolution display image. In many cases, it will also give slightly higher OCR accuracy.

TopOCR's Accessible User Interface is completely "self voicing" so it doesn't require an external screen reader. All OCR processing is performed locally on your PC, making the process faster and more convenient. This also makes it possible for you to read documents in situations where you don't have a network connection.

TopOCR also includes the "SeeHear Visual Translator". This webcam based object identification system can recognize up to 1,000 different objects from your webcam and automatically uses TopOCR's Text To Speech system to announce the results!

What About OCR Accuracy?


When TopOCR was compared to KNFB Reader (the leading Accessible OCR Smartphone application), TopOCR with the Tesseract LSTM OCR classifier had 80% fewer OCR errors thanks to its advanced image processing and OCR technology! Our Sample Camera Images page shows that TopOCR is able to read all 8 sample camera images with no errors!

TAO OCR - Tesseract Accelerated OCR (Windows 10 Only!)


TAO OCR is a very high performance multilingual OCR engine optimized for document cameras. It can read with greater than 99.8% accuracy with a 5.0 MP document camera, even with lower quality images. TAO OCR is not only super-accurate, it's also super-fast, with an average reading speed of under two seconds per page!
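To put the 99.8% figure in perspective, the short Python sketch below converts an accuracy percentage into an expected number of misread characters per page. It assumes the accuracy is measured per character, which the text above does not state, and the 2,000-character page length is an illustrative value only.

```python
def expected_errors(chars_on_page: int, accuracy_percent: float) -> float:
    """Expected number of misread characters, assuming per-character accuracy."""
    return chars_on_page * (100.0 - accuracy_percent) / 100.0


# Hypothetical page of 2,000 characters read at 99.8% accuracy.
print(round(expected_errors(2000, 99.8), 1))  # prints 4.0
```

Under these assumptions, a full page of text would contain about four misread characters.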

Note: TAO OCR is a derivative of the same OCR engine used in Microsoft's "Seeing AI" application! However, instead of an Android or iPhone app, it is available as a Windows DeskTop application with OCR processing that executes directly on your PC instead of through a "cloud" interface. As a result, TopOCR is much faster than any of the current mobile versions of Seeing AI, on top of being more accurate than any previous PC-based Accessible OCR application.

TopOCR allows you to select which OCR classifier you want Tesseract OCR to use. You can select between using either the TAO OCR classifier or the LSTM OCR classifier by pressing Control-R.

SeeHear Visual Translator


[Figure: a SeeHear webcam input image and its list of recognized objects: convertible, limousine, pickup, racer, beach wagon]

The SeeHear Visual Translator is a Deep Convolutional Neural Network that can recognize 1,000 different types of objects based on the ImageNet Image DataBase. When you press the SeeHear function key Esc, it will announce to you through TopOCR's Text To Speech System a list of 5 objects that it has recognized in a frame captured from your webcam.
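The step from network output to a spoken list of 5 objects can be pictured with a small Python sketch. This is not SeeHear's code: the function names, the short label list, and the scores are hypothetical, and it simply assumes the network produces one raw score per object class, which softmax turns into probabilities before the five most likely labels are picked.

```python
import heapq
import math


def softmax(scores):
    """Convert raw network scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_five(labels, scores):
    """Return the five most probable labels, most likely first."""
    probs = softmax(scores)
    best = heapq.nlargest(5, zip(probs, labels))
    return [label for _, label in best]


# Hypothetical labels and scores; the real network has 1,000 classes.
labels = ["convertible", "limousine", "pickup", "racer",
          "beach wagon", "teapot", "banana"]
scores = [4.1, 3.2, 2.9, 2.5, 2.0, -1.0, -3.0]
print(top_five(labels, scores))
```

In a real system, the resulting list would then be handed to the Text To Speech system to be announced.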

SeeHear 27 Layer Deep Neural Network
  1. arbitrary resolution RGB webcam image is automatically scaled to Neural Network input layer
  2. 19 deep convolutional layers
  3. 5 down sampling layers
  4. 1 non-linear down-sampling layer
  5. 1 softmax loss layer
  6. 1 cost layer
  7. recognition output is automatically converted to Text To Speech and played on your headset

SeeHear's Deep Convolutional Neural Network requires nearly 25 billion floating point calculations to process the full network pipeline with all 27 layers. It takes about 3 seconds for this network to perform an inference on a VGA-sized (640x480) webcam image using a 4-core Intel 3.4GHz i7-6700 (CPU only).
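As a quick check on the figures above, the following Python sketch adds up the listed layer counts and estimates the sustained throughput they imply. Both input numbers are rounded in the text ("nearly 25 billion", "about 3 seconds"), so the result is only a rough estimate.

```python
# Layer counts as listed above.
layers = {
    "convolutional": 19,
    "down-sampling": 5,
    "non-linear down-sampling": 1,
    "softmax": 1,
    "cost": 1,
}
total_layers = sum(layers.values())

# Figures quoted above: about 25 billion operations in about 3 seconds.
flops_per_inference = 25e9
seconds_per_inference = 3.0
gflops_per_second = flops_per_inference / seconds_per_inference / 1e9

print(total_layers)                 # prints 27
print(round(gflops_per_second, 1))  # prints 8.3
```

The listed layers do sum to 27, and the quoted timing corresponds to roughly 8 billion floating point operations per second on the CPU mentioned.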

Before you use SeeHear, make sure you have a webcam with a resolution of 640x480 or greater attached to your PC!

To use SeeHear, first, press the F6 key to select the proper webcam, then point your webcam at any scene you want to recognize, and then press the Esc key. You will immediately hear a "ticking clock" sound that can last from 2-12 seconds depending on the speed of your PC, followed by the list of recognized objects spoken to you through TopOCR's Text To Speech interface. If you want to return to document scanning, then press the F6 key and select your document camera instead of your webcam, or you can press the Esc key again to recognize another frame from your webcam.

A recommended platform to use with SeeHear would be a notebook PC running Windows 10 in the $299 to $599 price range with a built-in webcam and HDMI port.

Note: Currently, SeeHear's list of 1,000 recognizable objects is English only.

TopOCR's Accessible User Interface Function Keys


               



TopOCR's Accessible Interface has 13 Function Keys that are organized as 6 command keys plus 7 option keys. Whenever you use an option key to change a particular setting, the change is permanently stored on your PC until you decide to change it again. Here is a brief functional description of all of the Accessible Function Keys:

Esc - SeeHear Visual Translator - announce a list of objects recognized in a webcam image
--
F1 - Scan an image, OCR the image, and read the OCR Output
F2 - Pause/Resume Text To Speech reading
F3 - Save the OCR text output as an MP3 audio file and copy it to an MP3 player or save it as text and copy it to a flash drive
F4 - Turn ON/OFF Super Resolution mode - default is OFF
F5 - Straighten Mode - NONE, Straighten Columns, or Neural Warp - default is NONE
--
F6 - Select Document Camera or WebCam - defaults to the only camera when just 1 is present
F7 - Select Capture Delay Timer for Image Capture - Default is 10 seconds
F8 - Select Language for OCR and Speech - Default is English
F9 - Select Voice - select the Text To Speech voice using either MS Speech or SAPI Text To Speech voices
F10 - Select Volume - select the volume of the voice
--
F11 - Audio and Screen Help Information
F12 - Exit TopOCR

Note: After changing either Voice or Volume, you will need to exit TopOCR and restart the program.
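The key layout above can be summarized as a simple lookup table. The Python sketch below is only an illustration of that layout: the "command" and "option" labels are our reading of the 6 + 7 split described earlier (option keys being the ones that change a stored setting), and the names `FUNCTION_KEYS` and `keys_of` are hypothetical.

```python
# Key assignments as listed above. The "command"/"option" labels are an
# interpretation of the 6 + 7 split, not something TopOCR exposes.
FUNCTION_KEYS = {
    "Esc": ("command", "SeeHear Visual Translator"),
    "F1": ("command", "Scan, OCR, and read the output"),
    "F2": ("command", "Pause/Resume Text To Speech"),
    "F3": ("command", "Save the OCR output"),
    "F4": ("option", "Super Resolution mode ON/OFF"),
    "F5": ("option", "Straighten Mode"),
    "F6": ("option", "Select camera"),
    "F7": ("option", "Select capture delay timer"),
    "F8": ("option", "Select language"),
    "F9": ("option", "Select voice"),
    "F10": ("option", "Select volume"),
    "F11": ("command", "Help"),
    "F12": ("command", "Exit TopOCR"),
}


def keys_of(kind):
    """Return the keys of one kind, in keyboard order."""
    return [key for key, (k, _) in FUNCTION_KEYS.items() if k == kind]


print(len(keys_of("command")), len(keys_of("option")))  # prints 6 7
```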

TopOCR's Accessible User Interface Control Keys


TopOCR's Accessible User Interface also uses 5 easy to locate (Control+QWERT) top level control keys that are described below:

Control-Q - switch between the Accessible User Interface mode and the standard Windows GUI mode - the default is the standard Windows GUI mode.

Control-W - turn ON/OFF "Debug Mode" (announces the number of spelling errors before the OCR text is spoken) - default is OFF.

Control-E - list all of the installed languages for TAO OCR.

Control-R - switch between TAO OCR and LSTM OCR for recognition.

Control-T - switch the format of the saved OCR output between an MP3 audio file and a raw text file.

TopOCR's Scroll Control Keys


TopOCR Reader also has four keys to allow you to scroll a displayed image.



Down Arrow - Scroll the currently displayed image downward
Up Arrow - Scroll the currently displayed image upward
Left Arrow - Scroll the currently displayed image to the left
Right Arrow - Scroll the currently displayed image to the right

TopOCR's Accessible User Interface Text To Speech System with Multiple Voices


TopOCR's Accessible User Interface provides a very easy to use interface to a very powerful concept! It merges free medium quality Microsoft Speech voices with any optional SAPI voices you may have installed for a particular language into a single list of available voices. TopOCR maintains a pointer to the current voice index in this voice list, and all speech is performed using this pointer. To change to a different voice, all you have to do is move this pointer, and you can easily do that using the F9 key.
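The merged voice list and its pointer can be sketched in a few lines of Python. This is an illustration of the idea only, not TopOCR's implementation: the class name, the voice names, and the wrap-around behavior at the end of the list are all assumptions.

```python
class VoiceList:
    """Merged list of voices with a pointer to the current one."""

    def __init__(self, ms_speech_voices, sapi_voices):
        # Merge both sources into a single list, as described above.
        self.voices = list(ms_speech_voices) + list(sapi_voices)
        if not self.voices:
            raise ValueError("at least one voice is required")
        self.index = 0  # pointer to the current voice

    @property
    def current(self):
        """The voice that all speech would be performed with."""
        return self.voices[self.index]

    def next_voice(self):
        """Advance the pointer; wrapping to the start is an assumption."""
        self.index = (self.index + 1) % len(self.voices)
        return self.current


# Hypothetical voice names for illustration.
voices = VoiceList(["MS Speech Voice A", "MS Speech Voice B"], ["SAPI Voice C"])
print(voices.current)
```

In this sketch, each press of F9 would correspond to one call to `next_voice()`.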

Please note that TopOCR is configured by default as a US English system, so its TTS system depends on a US English MS Speech voice being available. If you are running a version of Windows with a locale other than US English, you will need to download the Microsoft Speech US English voice and, optionally, any additional voices you may require. Please go to our Voices page for more information.

Note: Currently, MP3 creation is restricted to MS Speech voices only.

TopOCR Installation


After you double-click on the TopOCR installation file there are 3 steps to complete the installation:

  1. Select "Yes" to allow User Account Control to install the program
  2. Press "Enter" to begin installation
  3. Press "Enter" to finish installation

TopOCR's Accessible User Interface Configuration


Once you've installed TopOCR on your PC then you're ready to configure it for use with your document camera. This generally only needs to be done once and takes just a few seconds to complete.

1. Plug your document camera into a USB 2.0 port on your PC.
2. Launch the TopOCR application by pressing Ctrl+Alt+Q from the DeskTop Window.
3. Press "Control-Q" to put TopOCR into Accessible Mode.
4. TopOCR's default language is English. If you want to use the Accessible User Interface with another language, use the F8 key to select it.
5. If you have a PC with more than one webcam/document camera, use the F6 key to select the correct document camera.

TopOCR's Accessible User Interface configuration, like its operation, is 100% accessible: there are no on-screen dialogs, menus, or buttons that would require the use of an external screen reader. All functions are handled through the keyboard.

TopOCR's Accessible User Interface and the Clipboard


After OCR, TopOCR automatically places the recognized text on the clipboard. It also lets you paste an image from the clipboard into TopOCR with "Control-V" and have it read to you automatically. A "quick and dirty" screen reader can therefore be implemented in TopOCR's Accessible User Interface by typing "Ctrl + Alt + Print Screen" and then typing "Control-V" in TopOCR.

TopOCR's Accessible User Interface Global Illumination Check


Whenever you press the F1 key to OCR a document, TopOCR will measure the total amount of available light on the scanned image. If this value is too low then TopOCR will give you a text to speech warning and will also provide you with the global illumination value. If this number is very low, below 10 for instance, it means that there is very little illumination. Please note that if you have a dark background and a small text image, it can give a lower than expected value, and as a result may generate a false warning.
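The sketch below illustrates one plausible way such a check could work, and why a dark background with a small text area can trigger a false warning. TopOCR's actual measurement and scale are not documented here, so the mean-brightness formula, the 0-255 scale, and the threshold of 10 are all assumptions made for illustration.

```python
def global_illumination(gray_pixels):
    """Mean brightness of a grayscale image given as rows of 0-255 values."""
    total = sum(sum(row) for row in gray_pixels)
    count = sum(len(row) for row in gray_pixels)
    return total / count


def needs_light_warning(gray_pixels, threshold=10):
    """True if the mean brightness falls below the (assumed) threshold."""
    return global_illumination(gray_pixels) < threshold


# A well-lit white page.
bright_page = [[230] * 8 for _ in range(8)]

# A dark background with only a small bright text patch.
dark_with_small_text = [[0] * 8 for _ in range(8)]
dark_with_small_text[0][0] = 255
dark_with_small_text[0][1] = 255

print(needs_light_warning(bright_page))           # prints False
print(needs_light_warning(dark_with_small_text))  # prints True
```

The second image is dominated by its dark background, so its average falls below the threshold even though the text patch itself is bright.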

TopOCR's Accessible User Interface Keyboard Shortcut


When you install TopOCR, it automatically creates a keyboard shortcut. Pressing the Ctrl+Alt+Q key combination will launch TopOCR from the DeskTop. To change the shortcut to use another letter, follow this procedure:

1. Right-click on the DeskTop TopOCR shortcut, and then click Properties
2. In the Shortcut Properties dialog box, click the Shortcut tab
3. Click in the Shortcut key box, press the key on your keyboard that you want to use in combination with Ctrl+Alt (for instance, "G"), and then click OK.

After this step has been performed, pressing the Ctrl+Alt+G key combination will launch TopOCR from the DeskTop.