Abstract:
A real-time object detection and recognition application was developed to help visually impaired (VI) people move safely in indoor surroundings using a mobile device. The application translates the visual world into sound to help visually impaired users understand their surroundings. Objects and text (e.g., signboards) detected from the environment by the smartphone camera are identified by their labels and converted into speech, so that users can easily learn what is near them. The application comprises several modules. Video captured by the mobile camera on the user side is sent to a server for real-time image recognition with existing object detection models (TensorFlow). The location of each object is estimated from the position and size of its bounding box produced by the detection algorithm. A sound generation module based on IBM Watson or Google text-to-speech then renders binaural sound with the object locations encoded. The sound is delivered to the user through the mobile device speaker. The prototype was tested in a scenario where a VI person is exposed to a new environment; the device helped the user successfully find a table 3-5 meters away and walk toward it.
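The abstract's pipeline (bounding box → object location → spatialized audio) can be sketched in a few lines. The following is a minimal illustration, not the paper's implementation: it assumes a simple pinhole-camera model, a hypothetical camera field of view, and a known real-world object height for the rough distance estimate; the constant-power panning stands in for the binaural rendering described above.

```python
import math

def object_direction(bbox, image_width, hfov_deg=60.0):
    """Estimate the horizontal angle (degrees) of a detected object
    relative to the camera axis, from its bounding box.

    bbox: (xmin, ymin, xmax, ymax) in pixels.
    hfov_deg: assumed horizontal field of view (hypothetical value).
    Negative angles are left of center, positive are right.
    """
    xmin, _, xmax, _ = bbox
    cx = (xmin + xmax) / 2.0  # horizontal center of the box
    # Pinhole model: focal length in pixels from the field of view.
    focal_px = (image_width / 2.0) / math.tan(math.radians(hfov_deg / 2.0))
    return math.degrees(math.atan((cx - image_width / 2.0) / focal_px))

def approximate_distance(bbox, image_height, real_height_m, vfov_deg=45.0):
    """Rough distance from the apparent box height, assuming the
    object's real-world height (real_height_m) is known."""
    _, ymin, _, ymax = bbox
    box_h = ymax - ymin
    focal_px = (image_height / 2.0) / math.tan(math.radians(vfov_deg / 2.0))
    return real_height_m * focal_px / box_h

def pan_gains(angle_deg, max_angle=30.0):
    """Constant-power stereo panning as a crude stand-in for
    binaural rendering: returns (left_gain, right_gain)."""
    a = max(-max_angle, min(max_angle, angle_deg))  # clamp to the pan range
    t = (a + max_angle) / (2.0 * max_angle) * (math.pi / 2.0)
    return math.cos(t), math.sin(t)
```

For example, a box centered in a 640-pixel-wide frame yields an angle of 0 and equal left/right gains, while a box near the right edge yields a positive angle and a louder right channel.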