YIBADA

New US Imaging System that can Read Closed Books is a Big Security Problem for CAPTCHA

Sep 11, 2016 10:35 PM EDT


Researchers at the Massachusetts Institute of Technology (MIT) and the Georgia Institute of Technology (Georgia Tech) in Atlanta are designing an imaging system capable of reading closed books.

They have announced a prototype, which they tested on a stack of papers, each with one letter printed on it. The system correctly identified the letters on the top nine sheets.

The MIT researchers developed the algorithms that acquire images from individual sheets in stacks of paper while the Georgia Tech researchers developed the algorithm that interprets the often distorted or incomplete images as individual letters.

The MIT/Georgia Tech system exploits the fact that trapped between the pages of a book are tiny air pockets only about 20 micrometers deep. Air and paper have different refractive indices (the degree to which a material bends light), so the boundary between the two reflects terahertz radiation back to a detector.
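For a rough sense of how strong such a boundary reflection might be, the standard normal-incidence Fresnel formula can be evaluated. This is only an illustrative sketch: the refractive index used for paper (1.5) is an assumed round number rather than a value reported by the researchers, and `fresnel_reflectance` is a hypothetical helper name.

```python
def fresnel_reflectance(n1: float, n2: float) -> float:
    """Fraction of incident power reflected at normal incidence
    at a flat boundary between media with refractive indices n1 and n2."""
    return ((n1 - n2) / (n1 + n2)) ** 2


N_AIR = 1.0
N_PAPER = 1.5  # assumption for illustration, not a measured terahertz value

reflected_fraction = fresnel_reflectance(N_AIR, N_PAPER)
print(f"Reflected fraction at one air/paper boundary: {reflected_fraction:.3f}")
```

Under these assumed values only a few percent of the power is reflected at each boundary, which fits the article's point that reflections from deeper pages become weak.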

At the moment, the algorithm can correctly deduce the distance from the camera to the top 20 pages in a stack.

Past a depth of nine pages, however, the energy of the reflected signal is so low the differences between frequency signatures are swamped by noise.

Terahertz imaging is still a relatively young technology, and researchers are constantly working to improve both the accuracy of detectors and the power of the radiation sources, so deeper penetration should be possible.

"It's actually kind of scary," said Barmak Heshmat, a research scientist at the MIT Media Lab and corresponding author on the new paper, of the letter-interpretation algorithm.

"A lot of websites have these letter certifications (CAPTCHAs) to make sure you're not a robot, and this algorithm can get through a lot of them."

CAPTCHA is a type of challenge-response test widely used on the internet to determine whether or not a user is human. The most common form requires a user to type the letters or digits shown in a distorted image, sometimes with part of the sequence obscured.

CAPTCHA stands for "Completely Automated Public Turing test to tell Computers and Humans Apart."

The MIT/Georgia Tech system uses terahertz radiation, the band of electromagnetic radiation between microwaves and infrared light, to read text inside a closed book.

Terahertz radiation has several advantages over other types of waves that can penetrate surfaces, such as X-rays or sound waves. It has been widely researched for use in security screening because different chemicals absorb different frequencies of terahertz radiation to different degrees, yielding a distinctive frequency signature for each.

By the same token, terahertz frequency profiles can distinguish between ink and blank paper in a way that X-rays can't.

Terahertz radiation can also be emitted in such short bursts that the distance it has traveled can be gauged from the difference between its emission time and the time at which reflected radiation returns to a sensor. That gives it much better depth resolution than ultrasound.

In the researchers' setup, a standard terahertz camera emits ultrashort bursts of radiation and the camera's built-in sensor detects their reflections. From the reflections' time of arrival, the MIT researchers' algorithm can gauge the distance to the individual pages of the book.
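The time-of-flight idea described here can be sketched in a few lines. This is a minimal illustration of the general principle, not the researchers' algorithm: the function names are hypothetical, and the pulse is assumed to travel at the vacuum speed of light unless a refractive index is supplied.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s


def delay_to_distance(round_trip_s: float, refractive_index: float = 1.0) -> float:
    """One-way distance (m) to a reflector whose echo arrives after round_trip_s seconds."""
    return C * round_trip_s / (2.0 * refractive_index)


def gap_to_delay(gap_m: float, refractive_index: float = 1.0) -> float:
    """Extra round-trip time (s) added by a layer of thickness gap_m."""
    return 2.0 * gap_m * refractive_index / C


# An echo that returns 1 picosecond after emission
print(f"{delay_to_distance(1e-12) * 1e6:.1f} micrometers")

# Extra delay contributed by a 20-micrometer air pocket (figure from the article)
print(f"{gap_to_delay(20e-6) * 1e15:.0f} femtoseconds")
```

Under these assumptions a 20-micrometer air gap shifts the arrival time by only about a tenth of a picosecond, which suggests why ultrashort bursts are needed to tell neighboring pages apart.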

While most of the radiation is either absorbed or reflected by the book, some of it bounces around between pages before returning to the sensor, producing a spurious signal. The sensor's electronics also produce a background hum. One of the tasks of the MIT researchers' algorithm is to filter out all this "noise."

The information about the pages' distance helps. It allows the algorithm to home in on just the terahertz signals whose arrival times suggest that they are true reflections.
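One simple way to picture this step is time gating: keep only the samples that arrive close to a time at which a page reflection is expected. The sketch below is a generic illustration with made-up numbers; `gate_signal`, the window width, and the sample values are all hypothetical, and the researchers' actual filtering is more sophisticated than this.

```python
def gate_signal(times, amplitudes, expected_arrivals, half_width):
    """Return amplitudes with every sample zeroed unless its time lies
    within half_width of at least one expected arrival time."""
    gated = []
    for t, a in zip(times, amplitudes):
        near_expected = any(abs(t - t0) <= half_width for t0 in expected_arrivals)
        gated.append(a if near_expected else 0.0)
    return gated


# Made-up example: sample times in arbitrary units, echoes expected at t=1 and t=4
times = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
amplitudes = [0.1, 0.9, 0.2, 0.1, 0.7, 0.3]

print(gate_signal(times, amplitudes, expected_arrivals=[1.0, 4.0], half_width=0.5))
# → [0.0, 0.9, 0.0, 0.0, 0.7, 0.0]
```

Samples that fall outside every window, such as stray bounces between pages, are discarded before any further analysis.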

Then, it relies on two different measures of the reflections' energy and assumptions about both the energy profiles of true reflections and the statistics of noise to extract information about the chemical properties of the reflecting surfaces.

"The Metropolitan Museum in New York showed a lot of interest in this, because they want to, for example, look into some antique books that they don't even want to touch," said Heshmat.

He said the system can also be used to analyze any materials organized in thin layers, such as coatings on machine parts or pharmaceuticals.

Heshmat is joined on the paper by Ramesh Raskar, the NEC Career Development Associate Professor of Media Arts and Sciences; Albert Redo Sanchez, a research specialist in the Camera Culture group at the Media Lab; two of the group's other members; and by Justin Romberg and Alireza Aghasi of Georgia Tech.

The study was published in the latest issue of Nature Communications.
