Imagine you are a tourist in a country with an unfamiliar alphabet, Japan, say. You see a vending machine, but you cannot tell which products it sells because you can't read the labels.
Or you're in an airport or a train station in that same country, and you can't read the departure board to find out whether your flight is delayed or on time. The same problem arises even at home, if you are one of the many visually impaired customers who have trouble accessing information displayed in public spaces.
So, what do these situations have in common? In each of them, the layout is fixed (the vending machine, the departure board) but the content changes dynamically and, for one reason or another, can't be accessed the usual way. One solution is to identify the products with QR codes; another is to use an iPhone app like VizWiz, which crowdsources the recognition to an online community. Each approach has its pros and cons: QR codes, for instance, require connectivity, while the effectiveness of crowdsourcing depends on how quickly the community responds.
IBM's researchers in Brazil are trying a different path: instead of focusing on the content, which can be too time-consuming, demand too much computational power, or require connectivity, they focus on the layout. The trick is to place identifying markers around a display and then streamline the recognition of its content with a set of training templates.
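The article does not publish IBM's implementation, but the idea can be illustrated with a minimal sketch. Assume the markers have already been used to rectify the camera frame, so every content cell of the display sits at a known, fixed offset; recognizing the content then reduces to comparing each cell's pixels against a small set of training templates. All names here (`TEMPLATES`, `read_display`, the toy 3x3 glyphs) are hypothetical, chosen for illustration only:

```python
# Hypothetical sketch, not IBM's actual system: once fiducial markers
# fix the display's layout, each content cell has a known offset, so
# recognition is just nearest-template matching on the cell's pixels.

TEMPLATES = {  # tiny 3x3 binary "glyphs" standing in for trained templates
    "ON":  [[1, 1, 1], [1, 0, 1], [1, 1, 1]],
    "OFF": [[0, 0, 0], [0, 1, 0], [0, 0, 0]],
}

def sad(a, b):
    """Sum of absolute pixel differences between two equal-sized bitmaps."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def classify_cell(cell):
    """Label the cell with the template that differs from it the least."""
    return min(TEMPLATES, key=lambda name: sad(cell, TEMPLATES[name]))

def read_display(frame, cells):
    """Read every cell of a rectified frame.

    frame: 2D list of pixels (already warped via the markers).
    cells: {slot_name: (row, col)} offsets, known from the fixed layout.
    """
    result = {}
    for slot, (r, c) in cells.items():
        patch = [row[c:c + 3] for row in frame[r:r + 3]]
        result[slot] = classify_cell(patch)
    return result

# Usage: a 3x6 synthetic frame holding two side-by-side cells.
frame = [
    [1, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 1, 0],
    [1, 1, 1, 0, 0, 0],
]
cells = {"slot_A": (0, 0), "slot_B": (0, 3)}
print(read_display(frame, cells))  # -> {'slot_A': 'ON', 'slot_B': 'OFF'}
```

In a real system the rectification step would come from detecting the markers (e.g. with a computer-vision library) and warping the frame, and the templates would be learned from training images of the display, but the division of labor is the point: the markers make the layout cheap to find, so the per-cell matching can stay simple enough to run offline on a phone.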