How brain architecture relates to consciousness and abstract thought

Could lead to better ways to identify and treat brain diseases and to new deep-learning AI systems
December 29, 2015
Human brain connectome (credit: NIH Human Connectome Project)

Ever wonder how your brain creates your thoughts, based on everything that’s happening around you (and within you), and where these thoughts are actually located in the brain?

UMass Amherst computational neuroscientist Hava Siegelmann has, and she created a geometry-based method for doing just that. Her team did a massive data analysis of 20 years of functional magnetic resonance imaging (fMRI) data from tens of thousands of brain imaging experiments. The goal was to understand how abstract thought arises from brain structure, which could lead to better ways to identify and treat brain disease and even to new deep-learning artificial intelligence (AI) systems.

Details appear in an open-access article in the current issue of Nature Scientific Reports.

How abstract thoughts are formed

KurzweilAI has covered more than 200 research projects involving fMRI. Basically, fMRI detects changes in neural blood flow, which relates to specific brain activities (such as imagining what an object looks like, or talking). More blood flow means higher levels of neural activity in that specific brain region. While fMRI-based research has done an impressive job of relating specific brain areas with activities, surprisingly, “no one had ever tied together the tens of thousands of experiments performed over decades to show how the physical brain could give rise to abstract thought,” Siegelmann notes.

For this study, the researchers took a data-science approach. First, they defined a physiological directed network (a form of a graph with nodes and links) of the whole brain, starting at input areas and labeling each brain area with the distance (or “depth”) from sensory inputs. For example, in the drawing below, the visual cortex (in green) is located far away from the eyes (on the left) while the auditory cortex (in yellow) is relatively close to the ears (although routing via the thalamus makes this more complex).

OK, so what does that mean in terms of thinking? To find out, they processed a massive repository of fMRI data from about 17,000 experiments (representing about one fourth of the fMRI literature).

Regions of motor and sensory cortex (credit: Blausen.com staff/Blausen gallery 2014/Wikiversity)

“The idea was to project the active regions for a cognitive behavior onto the network depth and describe that cognitive behavior in terms of its depth distribution,” says Siegelmann. “We momentarily thought our research failed when we saw that each cognitive behavior showed activity through many network depths. Then we realized that cognition is far richer; it wasn’t the simple hierarchy that everyone was looking for. So, we developed our geometrical ‘slope’ algorithm.”

Ranking cognitive behaviors

The researchers summed all neural activity for a given behavior over all related fMRI experiments, then analyzed it using the slope algorithm. “With a slope identifier, behaviors could now be ordered by their relative depth activity, with no human intervention or bias,” she adds. They ranked slopes for all cognitive behaviors from the fMRI databases from negative to positive and found that they ordered from more tangible to highly abstract. An independent test of an additional 500 study participants supported the result.
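The article does not spell out the slope algorithm, so the following is only a minimal sketch of the general idea. It assumes that a behavior's summed activity is binned by network depth and summarized by the least-squares slope of activity versus depth. The function names and the toy activity values are hypothetical and are not taken from the paper.

```python
def depth_slope(activity_by_depth):
    """Least-squares slope of activity against depth index 0, 1, 2, ...

    Assumption: a negative slope means activity concentrated near sensory
    inputs; a positive slope means activity concentrated deeper in the network.
    """
    n = len(activity_by_depth)
    depths = range(n)
    mean_d = sum(depths) / n
    mean_a = sum(activity_by_depth) / n
    cov = sum((d - mean_d) * (a - mean_a) for d, a in zip(depths, activity_by_depth))
    var = sum((d - mean_d) ** 2 for d in depths)
    return cov / var


def rank_behaviors(profiles):
    """Order behavior names from most negative to most positive slope."""
    return sorted(profiles, key=lambda name: depth_slope(profiles[name]))


# Toy, made-up depth profiles (summed activity at depths 0..3).
toy_profiles = {
    "finger tapping": [9.0, 6.0, 3.0, 1.0],
    "naming": [1.0, 3.0, 6.0, 9.0],
    "attention": [4.0, 5.0, 5.0, 4.0],
}
ranking = rank_behaviors(toy_profiles)
```

In this toy data the ordering runs from the profile weighted toward shallow depths to the one weighted toward deep ones, which mirrors the tangible-to-abstract ordering the researchers describe.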

She and colleagues found that cognitive function and abstract thought exist as a combination of many cortical sources ranging from those close to sensory cortices to far deeper from them along the brain connectome, or connection wiring diagram.

Generated by human-blind automated procedures, this diagram depicts an oversimplified graphical model of the information representation flow from sensory inputs (bottom) to abstract representations (top) in human cortex. The bottom layer of the pyramid includes a sample representative description of the 20th percentile of behavioral elements closest to sensory inputs, the next layer up includes a sample description of behavioral elements from the 20–40th percentile…with the top layer containing a sample description of the behavioral elements distributed deepest in the cortical network, at the structural pinnacle of cognition. (credit: P. Taylor et al./Nature Scientific Reports)

The authors say their work demonstrates that all cognitive behaviors exist on a hierarchy, starting with the most tangible behaviors (such as finger tapping or pain), then to consciousness, and extending to the most abstract thoughts and activities such as naming. This hierarchy of abstraction is related to the connectome structure of the whole human brain — the connections between different regions of the brain — they add.

Creating a massively recurrent deep learning network

Siegelmann says this work will have great impact in computer science, especially in deep learning. “Deep learning is a computational system employing a multi-layered neural net, and is at the forefront of artificial intelligence (AI) learning algorithms,” she explains. “It bears similarity to the human brain in that higher layers are agglomerations of previous layers, and so provides more information in a single neuron.

“But the brain’s processing dynamic is far richer and less constrained because it has recurrent interconnection, sometimes called feedback loops. In current human-made deep learning networks that lack recurrent interconnections, a particular input cannot be related to other recent inputs, so they can’t be used for time-series prediction, control operations, or memory.”

Her lab is now creating a “massively recurrent deep learning network,” she says, for a more brain-like and superior learning AI, along with a new geometric data-science tool, which may find widespread use in other fields where massive data is difficult to view coherently due to data overlap.

New hope for biomarkers of brain disorders

Siegelmann believes this work will also have far-reaching effects for brain disorders. “Many brain disorders are implicated by non-standard processing or abnormal combination of sensory information,” she says. “Currently, many brain disorders lack a clear biological identifier, and are diagnosed by symptoms, such as confusion, memory loss and depression.

“Our research suggests an entirely new method for analyzing brain abnormalities and is a source of new hope for developing biomarkers for more accurate and earlier diagnoses of psychiatric and neurological diseases.”

Siegelmann is director of the Biologically Inspired Neural and Dynamical Systems Laboratory at UMass Amherst and one of 16 recipients in 2015 of the National Science Foundation’s (NSF) Brain Research through Advancing Innovative Neurotechnologies (BRAIN) program initiated by President Obama to advance understanding of the brain. The work is supported by the U.S. Office of Naval Research.

Abstract of The global landscape of cognition: hierarchical aggregation as an organizational principle of human cortical networks and functions

Though widely hypothesized, limited evidence exists that human brain functions organize in global gradients of abstraction starting from sensory cortical inputs. Hierarchical representation is accepted in computational networks, and tentatively in visual neuroscience, yet no direct holistic demonstrations exist in vivo. Our methods developed network models enriched with tiered directionality, by including input locations, a critical feature for localizing representation in networks generally. Grouped primary sensory cortices defined network inputs, displaying global connectivity to fused inputs. Depth-oriented networks guided analyses of fMRI databases (~17,000 experiments; ~1/4 of fMRI literature). Formally, we tested whether network depth predicted localization of abstract versus concrete behaviors over the whole set of studied brain regions. For our results, new cortical graph metrics, termed network-depth, ranked all databased cognitive function activations by network-depth. Thus, we objectively sorted stratified landscapes of cognition, starting from grouped sensory inputs in parallel, progressing deeper into cortex. This exposed escalating amalgamation of function or abstraction with increasing network-depth, globally. Nearly 500 new participants confirmed our results. In conclusion, data-driven analyses defined a hierarchically ordered connectome, revealing a related continuum of cognitive function. Progressive functional abstraction over network depth may be a fundamental feature of brains, and is observed in artificial networks.

references:
P. Taylor, J. N. Hobbs, J. Burroni, H. T. Siegelmann. The global landscape of cognition: hierarchical aggregation as an organizational principle of human cortical networks and functions. Scientific Reports, 2015; 5: 18112 DOI: 10.1038/srep18112 (open access)
Supplementary Information

http://www.kurzweilai.net/how-brain-architecture-relates-to-consciousness-and-abstract-thought

The top A.I. breakthroughs of 2015

December 29, 2015

(credit: iStock)

By Richard Mallah
Courtesy of Future of Life Institute

Progress in artificial intelligence and machine learning has been impressive this year. Those in the field acknowledge progress is accelerating year by year, though it is still a manageable pace for us. The vast majority of work in the field these days actually builds on previous work done by other teams earlier the same year, in contrast to most other fields where references span decades.

Creating a summary of a wide range of developments in this field will almost invariably lead to descriptions that sound heavily anthropomorphic, and this summary does indeed. Such metaphors, however, are only convenient shorthands for talking about these functionalities.

It’s important to remember that even though many of these capabilities sound very thought-like, they’re usually not very similar to how human cognition works. The systems are all of course functional and mechanistic, and, though increasingly less so, each is still quite narrow in what it does. Be warned though: in reading this article, these functionalities may seem to go from fanciful to prosaic.

The biggest developments of 2015 fall into five categories of intelligence: abstracting across environments, intuitive concept understanding, creative abstract thought, dreaming up visions, and dexterous fine motor skills. I’ll highlight a small number of important threads within each that have brought the field forward this year.

Abstracting Across Environments

A long-term goal of the field of AI is to achieve artificial general intelligence, a single learning program that can learn and act in completely different domains at the same time, able to transfer some skills and knowledge learned in, e.g., making cookies and apply them to making brownies even better than it would have otherwise. A significant stride forward in this realm of generality was provided by Parisotto, Ba, and Salakhutdinov. They built on DeepMind’s seminal DQN, published earlier this year in Nature, that learns to play many different Atari games well.

ZeitgeistMinds | Demis Hassabis, CEO, DeepMind Technologies — The Theory of Everything

Instead of using a fresh network for each game, this team combined deep multitask reinforcement learning with deep-transfer learning to be able to use the same deep neural network across different types of games. This leads not only to a single instance that can succeed in multiple different games, but to one that also learns new games better and faster because of what it remembers about those other games. For example, it can learn a new tennis video game faster because it already gets the concept — the meaningful abstraction — of hitting a ball with a paddle from when it was playing Pong. This is not yet general intelligence, but it erodes one of the hurdles to get there.

Reasoning across different modalities has been another bright spot this year. The Allen Institute for AI and University of Washington have been working on test-taking A.I.s over the years, working up from 4th grade level tests to 8th grade level tests, and this year announced a system that addresses the geometry portion of the SAT. Such geometry tests contain combinations of diagrams, supplemental information, and word problems.

In more narrow A.I., these different modalities would typically be analyzed separately, essentially as different environments. This system combines computer vision and natural language processing, grounding both in the same structured formalism, and then applies a geometric reasoner to answer the multiple-choice questions, matching the performance of the average American 11th grade student.

Intuitive Concept Understanding

A more general method of multimodal concept grounding has come about from deep learning in the past few years: Subsymbolic knowledge and reasoning are implicitly understood by a system rather than being explicitly programmed in or even explicitly represented. Decent progress has been made this year in the subsymbolic understanding of concepts that we as humans can relate to.

This progress helps with the age-old symbol grounding problem: how symbols or words get their meaning. The increasingly popular way to achieve this grounding these days is by joint embeddings — deep distributed representations where different modalities or perspectives on the same concept are placed very close together in a high-dimensional vector space.
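As a minimal sketch of what "placed very close together" means in practice, the following compares vectors by cosine similarity, a common closeness measure for embeddings. The three-dimensional vectors and the names `word_vectors` and `image_vector_of_a_dog` are hypothetical. Real joint embeddings are learned by a network and have far more dimensions.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest_concept(query, candidates):
    """Return the candidate name whose embedding is closest to the query."""
    return max(candidates, key=lambda name: cosine_similarity(query, candidates[name]))


# Hypothetical 3-D embeddings; real systems learn vectors with many more dimensions.
word_vectors = {"dog": [0.9, 0.1, 0.0], "car": [0.0, 0.2, 0.9]}
image_vector_of_a_dog = [0.8, 0.2, 0.1]

best_match = nearest_concept(image_vector_of_a_dog, word_vectors)
```

Because the image vector and the word vector for the same concept point in nearly the same direction, a nearest-neighbor lookup in the shared space links the picture to the word.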

Last year, this technique helped power abilities like automated image caption writing, and this year a team from Stanford and Tel Aviv University have extended this basic idea to jointly embed images and 3D shapes to bridge computer vision and graphics. Rajendran et al. then extended joint embeddings to support the confluence of multiple meaningfully related mappings at once, across different modalities and different languages.

As these embeddings get more sophisticated and detailed, they can become workhorses for more elaborate A.I. techniques. Ramanathan et al. have leveraged them to create a system that learns a meaningful schema of relationships between different types of actions from a set of photographs and a dictionary.

As single systems increasingly do multiple things, the lines between the features of the data and the learned concepts will blur away, a blurring on which deep learning is predicated. Another demonstration of this deep feature grounding, by a team from Cornell and WUStL, uses a dimensionality reduction of a deep net’s weights to form a surface of convolutional features that can simply be slid along to meaningfully, automatically, photorealistically alter particular aspects of photographs, e.g., changing people’s facial expressions or their ages, or colorizing photos.

(credit: Jacob R. Gardner et al.)

One hurdle in deep learning techniques is that they require a lot of training data to produce good results. Humans, on the other hand, are often able to learn from just a single example. Salakhutdinov, Tenenbaum, and Lake have overcome this disparity with a technique for human-level concept learning through Bayesian program induction from a single example. This system is then able to, for instance, draw variations on symbols in a way indistinguishable from those drawn by humans.

Creative Abstract Thought

Beyond understanding simple concepts lies grasping aspects of causal structure — understanding how ideas tie together to make things happen or tell a story in time — and to be able to create things based on those understandings. Building on the basic ideas from both DeepMind’s neural Turing machine and Facebook’s memory networks, combinations of deep learning and novel memory architectures have shown great promise in this direction this year. These architectures provide each node in a deep neural network with a simple interface to memory.

Kumar and Socher’s dynamic memory networks improved on memory networks with better support for attention and sequence understanding. Like the original, this system could read stories and answer questions about them, implicitly learning 20 kinds of reasoning, like deduction, induction, temporal reasoning, and path finding. It was never programmed with any of those kinds of reasoning. The new end-to-end memory networks of Weston et al. added the ability to perform multiple computational hops per output symbol, expanding modeling capacity and expressivity to be able to capture things like out-of-order access, long-term dependencies, and unordered sets, further improving accuracy on such tasks.

Programs themselves are of course also data, and they certainly make use of complex causal, structural, grammatical, sequence-like properties, so programming is ripe for this approach. Last year neural Turing machines proved deep learning of programs to be possible.

This year Grefenstette et al. showed how programs can be transduced, or generatively figured out from sample output, much more efficiently than with neural Turing machines, by using a new type of memory-based recurrent neural networks (RNNs) where the nodes simply access differentiable versions of data structures such as stacks and queues. Reed and de Freitas of DeepMind have also recently shown how their neural programmer-interpreter can represent lower-level programs that control higher-level and domain-specific functionalities.

Another example of proficiency in understanding time in context, and applying that to create new artifacts, is a rudimentary but creative video summarization capability developed this year. Park and Kim from Seoul National U. developed a novel architecture called a coherent recurrent convolutional network, applying it to creating novel and fluid textual stories from sequences of images.

(credit: Cesc Chunseong Park and Gunhee Kim)

Another important modality that includes causal understanding, hypotheticals, and creativity in abstract thought is scientific hypothesizing. A team at Tufts combined genetic algorithms and genetic pathway simulation to create a system that arrived at the first significant new A.I.-discovered scientific theory of how exactly flatworms are able to regenerate body parts so readily as they do. In a couple of days, it had discovered what eluded scientists for a century. This should provide a resounding answer to those who question why we would ever want to make A.I.s curious in the first place.

Dreaming Up Visions

A.I. did not stop at writing programs, travelogues, and scientific theories this year. There are A.I.s now able to imagine, or using the technical term, hallucinate, meaningful new imagery as well. Deep learning isn’t only good at pattern recognition, but indeed pattern understanding and therefore also pattern creation.

A team from MIT and Microsoft Research have created a deep convolution inverse graphic network, which, among other things, contains a special training technique to get neurons in its graphics code layer to differentiate to meaningful transformations of an image. In so doing, they are deep-learning a graphics engine, able to understand the 3D shapes in novel 2D images it receives, and able to photorealistically imagine what it would be like to change things like camera angle and lighting.

A team from NYU and Facebook devised a way to generate realistic new images from meaningful and plausible combinations of elements it has seen in other images. Using a pyramid of adversarial networks — with some trying to produce realistic images and others critically judging how real the images look — their system is able to get better and better at imagining new photographs. Though the examples online are quite low-res, offline, I’ve seen some impressive related high-res results.

Also significant in ’15 is the ability to deeply imagine entirely new imagery based on short English descriptions of the desired picture. While scene renderers taking symbolic, restricted vocabularies have been around a while, this year has seen the advent of a purely neural system doing this in a way that’s not explicitly programmed. This University of Toronto team applies attention mechanisms to generation of images incrementally based on the meaning of each component of the description, in any of a number of ways per request. So androids can now dream of electric sheep.


(credit: Elman Mansimov et al.)

There has even been impressive progress in computational imagination of new animated video clips this year. A team from U. Michigan created a deep analogy system, which recognizes complex implicit relationships in exemplars and is able to apply that relationship as a generative transformation of query examples. They’ve applied this in a number of synthetic applications, but most impressive is the demo (from the 10:10-11:00 mark of the video embedded below) where an entirely new short video clip of an animated character is generated based on a single still image of the never-before-seen target character and a comparable video clip of a different character at a different angle.

University of Michigan | Oral Session: Deep Visual Analogy-Making

While the generation of imagery was used in these for ease of demonstration, their techniques for computational imagination are applicable across a wide variety of domains and modalities. Picture these applied to voices or music, for instance.

Agile and Dexterous Fine Motor Skills

This year’s progress in A.I. hasn’t been confined to computer screens.

Earlier in the year, a German primatology team recorded the hand motions of primates in tandem with corresponding neural activity, and they’re able to predict, based on brain activity, what fine motions are going on. They’ve also been able to teach those same fine motor skills to robotic hands, aiming at neural-enhanced prostheses.

In the middle of the year, a team at U.C. Berkeley announced a much more general, and easier, way to teach robots fine motor skills. They applied deep reinforcement learning based guided policy search to get robots to be able to screw caps on bottles, to use the back of a hammer to remove a nail from wood, and other seemingly everyday actions. These are the kind of actions that are typically trivial for people but very difficult for machines, and this team’s system matches human dexterity and speed at these tasks. It actually learns to do these actions by trying to do them using hand-eye coordination, and by practicing, refining its technique after just a few tries.

Watch This Space

This is by no means a comprehensive list of the impressive feats in A.I. and ML for the year. There are also many more foundational discoveries and developments that have occurred this year, including some that I fully expect to be more revolutionary than any of the above, but those are in early days and so out of the scope of these top picks.

This year has certainly provided some impressive progress. But we expect to see even more in 2016. Coming up next year, I expect to see some more radical deep architectures, better integration of the symbolic and subsymbolic, some impressive dialogue systems, an A.I. finally dominating the game of Go, deep learning being used for more elaborate robotic planning and motor control, high-quality video summarization, and more creative and higher-resolution dreaming, which should all be quite a sight.

What’s even more exciting are the developments we don’t expect.

Richard Mallah is the Director of A.I. Projects at technology beneficence nonprofit Future of Life Institute, and is the Director of Advanced Analytics at knowledge integration platform firm Cambridge Semantics, Inc.

http://www.kurzweilai.net/the-top-ai-breakthroughs-of-2015

Single-molecule detection of contaminants, explosives or diseases now possible

December 30, 2015

Artistic illustration showing an ultrasensitive detection platform termed “slippery liquid infused porous surface-enhanced Raman scattering” (SLIPSERS). An aqueous or oil droplet containing gold nanoparticles and captured analytes is allowed to evaporate on a slippery substrate, leading to the formation of a highly compact nanoparticle aggregate for use in surface-enhanced Raman scattering (SERS) detection. (credit: Shikuan Yang, Birgitt Boschitsch Stogin and Tak-Sing Wong/Penn State)

Penn State researchers have invented a way to detect single molecules of a number of chemical and biological species from gaseous, liquid or solid samples, with applications in analytical chemistry, molecular diagnostics, environmental monitoring and national security.

The invention is called SLIPSERS, an acronym combining “slippery liquid-infused porous surfaces” (SLIPS) and surface enhanced Raman scattering (SERS).*

“Being able to identify a single molecule is already very difficult. Being able to detect those molecules in all three phases [air, liquid, or bound to a solid] — that is really challenging,” said Tak-Sing Wong, assistant professor of mechanical engineering and the Wormley Family Early Career Professor in Engineering.

Although there are other techniques that allow researchers to concentrate molecules on a surface, those techniques mostly work with water as the medium. SLIPS can be used with any organic liquid, which makes it useful for environmental detection in soil samples, for example.

One of the researchers’ next steps will be to detect biomarkers in blood for disease diagnosis at very early stages of cancer when the disease is more easily treatable.

Although the SLIPS technology is patented and licensed, the team has not sought patent protection on their SLIPSERS work. “We believe that offering this technology to the public will get it developed at a much faster pace,” said Professor Wong. “This is a powerful platform that we think many people will benefit from.”

Their work appears in an open-access online paper in Proceedings of the National Academy of Sciences (PNAS). It was funded by the National Science Foundation.

* Raman spectroscopy is a well-known method of analyzing materials in a liquid form using a laser to interact with the vibrating molecules in the sample, generating scattered light. The molecule’s unique vibration shifts the frequency of the photons in the laser light beam up or down in a way that is characteristic of only that type of molecule, allowing it to be uniquely identified.
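The frequency shift described above is conventionally reported as a Raman shift in wavenumbers, the difference between the reciprocal wavelengths of the laser light and the scattered light. The helper below is a minimal sketch of that standard relation. The function name and the example wavelengths are illustrative assumptions, not values from the paper.

```python
def raman_shift_cm1(excitation_nm, scattered_nm):
    """Raman shift in wavenumbers (cm^-1) from two wavelengths in nanometers.

    Positive values are Stokes shifts (scattered photon lost energy);
    negative values are anti-Stokes shifts (scattered photon gained energy).
    """
    nm_per_cm = 1e7  # 1 cm = 10^7 nm
    return nm_per_cm / excitation_nm - nm_per_cm / scattered_nm


# Hypothetical example: 532 nm laser, scattered light observed at 563 nm.
shift = raman_shift_cm1(532.0, 563.0)
```

A spectrum of such shifts acts as the fingerprint that lets a molecule be identified.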

However, the Raman signal is typically very weak and has to be enhanced in some way for detection. In the 1970s, researchers found that chemically roughening the surface of a silver substrate concentrated the Raman signal of the material adsorbed on the silver, and SERS was born.

Wong developed SLIPS as a post-doctoral researcher at Harvard University. SLIPS is composed of a surface coated with regular arrays of nanoscale posts infused with a liquid lubricant that does not mix with other liquids. The small spacing of the nanoposts traps the liquid between the posts and the result is a slippery surface that nothing adheres to.

“The problem,” Wong said, “is that trying to find a few molecules in a liquid medium is like trying to find a needle in a haystack. But if we can develop a process to gradually shrink the size of this liquid volume, we can get a better signal. To do that we need a surface that allows the liquid to evaporate uniformly until it gets to the micro or nanoscale. Other surfaces can’t do that, and that is where SLIPS comes in.”
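To put rough numbers on the needle-in-a-haystack comparison, the sketch below counts molecules in a droplet and estimates how much the concentration rises as the droplet shrinks. The droplet volumes are hypothetical, and the enrichment estimate assumes no analyte is lost during evaporation. Only the femtomolar scale comes from the paper's abstract.

```python
AVOGADRO = 6.02214076e23  # molecules per mole (exact SI value)


def molecules_in_droplet(molar_concentration, volume_liters):
    """Number of analyte molecules in a droplet: N = C * V * N_A."""
    return molar_concentration * volume_liters * AVOGADRO


def enrichment_factor(initial_volume_liters, final_volume_liters):
    """Concentration gain if no analyte is lost while the droplet shrinks."""
    return initial_volume_liters / final_volume_liters


# Hypothetical droplet: 50 microliters at 1 femtomolar (1e-15 mol/L).
count = molecules_in_droplet(1e-15, 50e-6)
# Hypothetical shrinkage from 50 microliters to 50 picoliters.
gain = enrichment_factor(50e-6, 50e-12)
```

Under these assumptions the droplet holds roughly 30,000 analyte molecules, and shrinking its volume a millionfold raises their concentration by the same factor.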

The researchers assemble the gold nanoparticles so they have nanoscale gaps, called SERS “hot spots.” “Using a laser with the right wavelength, the electrons will oscillate and a strong magnetic field will form in the gap area. This gives us very strong SERS signals of the molecules located within the gaps.”

If a droplet of liquid is placed on any normal surface, it will begin to shrink from the top down. When the liquid evaporates, the target molecules are left in random configurations with weak signals. But if all the molecules can be clustered among the gold nanoparticles, they will produce a very strong Raman signal.

Abstract of Ultrasensitive surface-enhanced Raman scattering detection in common fluids

Many analytes in real-life samples, such as body fluids, soil contaminants, and explosives, are dispersed in liquid, solid, or air phases. However, it remains a challenge to create a platform to detect these analytes in all of these phases with high sensitivity and specificity. Here, we demonstrate a universal platform termed slippery liquid-infused porous surface-enhanced Raman scattering (SLIPSERS) that enables the enrichment and delivery of analytes originating from various phases into surface-enhanced Raman scattering (SERS)-sensitive sites and their subsequent detection down to the subfemtomolar level (<10⁻¹⁵ mol⋅L⁻¹). Based on SLIPSERS, we have demonstrated detection of various chemicals, biological molecules, and environmental contaminants with high sensitivity and specificity. Our platform may lead to ultrasensitive molecular detection for applications related to analytical chemistry, diagnostics, environmental monitoring, and national security.

references:
Shikuan Yang, Xianming Dai, Birgitt Boschitsch Stogin, and Tak-Sing Wong. Ultrasensitive surface-enhanced Raman scattering detection in common fluids. PNAS 2015, doi: 10.1073/pnas.1518980113 (open access)

http://www.kurzweilai.net/single-molecule-detection-of-contaminants-explosives-or-diseases-now-possible

Looks like Google Glass is back

by Tracey Lien
Remember Google Glass, the optical head-mounted display prototype that Google stopped selling in January?

It looks like the wearable, which never really took off, is getting a second chance.

Photos of a new version of Google Glass appeared in a filing on the Federal Communications Commission website on Monday, showing new features such as foldable arms, a larger glass prism (which acts as a screen) and a new charging port.

The filing also included a manual with user instructions, not too different from the original Google Glass.

Google announced in January that Google Glass was “graduating” from the tech giant’s secretive lab for developing new technologies and would be spun off as its own entity within the company. As part of the transition, it stopped selling the wearable.

The company did not reveal at the time what plans it had for Google Glass, but reports emerged months later that the technology was now being developed for business use.

Google did not immediately respond to requests for comment, and it is unclear whether the model that appeared on the FCC’s website is a consumer or enterprise product.

Unlike other electronic wearables such as fitness trackers and smart watches, Google Glass received a lackluster response when it launched to the public in 2014. In addition to the head-mounted display’s camera stoking privacy concerns, the wearable wasn’t discreet because people wore it on the bridge of their nose like a pair of glasses.

http://www.afr.com/technology/looks-like-google-glass-is-back-20151230-glwonz

Algorithm turns smartphones into 3-D scanners

December 28, 2015

Structured light 3-D scanning normally requires a projector and camera to be synchronized. A new technique eliminates the need for synchronization, which makes it possible to do structured light scanning with a smartphone. (credit: Taubin Lab/Brown University)

An algorithm developed by Brown University researchers may help bring high-quality 3-D depth-scanning capability to standard commercial digital cameras and smartphones.

“The 3-D scanners on the market today are either very expensive or unable to do high-resolution image capture, so they can’t be used for applications where details are important,” like 3-D printing, said Gabriel Taubin, a professor in Brown’s School of Engineering.

Most of the high-quality 3-D scanners capture images using a technique known as structured light. A projector casts a series of light patterns on an object, while a camera captures images of the object. The way these patterns deform when striking surfaces allows the structured-light 3-D scanner to calculate the depth and surface configurations of the objects in the scene, creating a 3-D image.
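The article does not say which patterns the Brown team projects. As a minimal sketch of how a series of light patterns can tag each point in the scene, the following assumes the widely used Gray-code stripe scheme, in which every pattern contributes one bit and the bits seen at a camera pixel spell out a projector column. The function names are hypothetical.

```python
def gray_code_patterns(num_columns, num_bits):
    """Build num_bits stripe patterns; patterns[b][c] is 1 (lit) or 0 (dark)."""
    patterns = []
    for bit in range(num_bits - 1, -1, -1):  # most significant bit first
        patterns.append([((c ^ (c >> 1)) >> bit) & 1 for c in range(num_columns)])
    return patterns


def decode_column(bits):
    """Recover the projector column from the bits one camera pixel observed."""
    gray = 0
    for b in bits:  # reassemble the Gray code, most significant bit first
        gray = (gray << 1) | b
    column = 0
    while gray:  # Gray-to-binary conversion: XOR of all right shifts
        column ^= gray
        gray >>= 1
    return column


patterns = gray_code_patterns(num_columns=8, num_bits=3)
# Simulate a camera pixel that sees projector column 5 in every pattern.
observed = [p[5] for p in patterns]
recovered = decode_column(observed)
```

Once the projector column behind a camera pixel is known, depth can be estimated by triangulating between the camera ray and the projector.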

No sync required

The limitation with current 3-D depth scanners is that the pattern projector and the camera have to be precisely synchronized, which requires specialized and expensive hardware.

The problem in trying to capture 3-D images without synchronization is that the projector could switch from one pattern to the next while the image is in the process of being exposed. As a result, the captured images are mixtures of two or more patterns. A second problem is that most modern digital cameras use a rolling shutter mechanism. Rather than capturing the whole image in one snapshot, cameras scan the field either vertically or horizontally, sending the image to the camera’s memory one pixel row at a time. As a result, parts of the image are captured at slightly different times, which also can lead to mixed patterns.
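Both effects can be reproduced with a small timing model. The sketch below is an illustration under assumed numbers, not a description of any particular camera: times are integer microseconds, the projector is assumed to switch patterns at a fixed period, and each rolling-shutter row is assumed to start its exposure a fixed delay after the previous row. The function names and parameters are hypothetical.

```python
def mixing_weights(start_us, exposure_us, pattern_period_us):
    """Fraction of one exposure window spent on each projected pattern.

    The projector is assumed to show pattern k during
    [k * pattern_period_us, (k + 1) * pattern_period_us).
    Returns a dict mapping pattern index to its share of the exposure.
    """
    weights = {}
    t = start_us
    end = start_us + exposure_us
    while t < end:
        k = t // pattern_period_us
        segment_end = min(end, (k + 1) * pattern_period_us)
        weights[k] = weights.get(k, 0.0) + (segment_end - t) / exposure_us
        t = segment_end
    return weights


def rolling_shutter_weights(frame_start_us, num_rows, row_delay_us,
                            exposure_us, pattern_period_us):
    """Mixing weights for every row of one frame under a rolling shutter."""
    return [
        mixing_weights(frame_start_us + row * row_delay_us,
                       exposure_us, pattern_period_us)
        for row in range(num_rows)
    ]


# Illustration: the projector switches every 1000 us; the frame starts at
# 700 us, each row exposes for 400 us and starts 100 us after the previous one.
rows = rolling_shutter_weights(700, 4, 100, 400, 1000)
for row, w in enumerate(rows):
    print(row, w)
```

With these numbers the first rows record a blend of pattern 0 and pattern 1, while the last row, whose exposure begins exactly at the switch, records pattern 1 alone. A single frame therefore contains different mixtures in different rows.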

The fix

The algorithm Taubin and his students have developed enables the structured light technique to be done without synchronization between projector and camera. That means an off-the-shelf camera can be used with an untethered (unconnected by a wire) structured light flash. The camera just needs to have the ability to capture uncompressed images in burst mode (several successive frames per second), which many DSLR cameras and smartphones can do.

After the camera captures a burst of images, the algorithm calibrates the timing of the image sequence using the binary information embedded in the projected pattern. Then it goes through the images, pixel by pixel, to assemble a new sequence of images that captures each pattern in its entirety. Once the complete pattern images are assembled, a standard structured light 3-D reconstruction algorithm can be used to create a single 3-D image of the object or space.
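The assembly step can be sketched in a heavily simplified form. This is not the researchers' algorithm: it assumes the timing is already known (the calibration step is skipped), that patterns repeat cyclically at a fixed period, and that each image row is reduced to a single brightness value. For every pattern and every row, it simply keeps the captured row whose exposure overlapped that pattern the most. All names and numbers are hypothetical.

```python
def pattern_share(start_us, exposure_us, period_us, num_patterns):
    """Share of one exposure window spent on each cyclically repeated pattern."""
    share = [0.0] * num_patterns
    t, end = start_us, start_us + exposure_us
    while t < end:
        slot = t // period_us
        segment_end = min(end, (slot + 1) * period_us)
        share[slot % num_patterns] += (segment_end - t) / exposure_us
        t = segment_end
    return share


def capture(pure, frame_start_us, row_delay_us, exposure_us, period_us):
    """Simulate one unsynchronized rolling-shutter frame (one value per row)."""
    frame = []
    for r in range(len(pure[0])):
        share = pattern_share(frame_start_us + r * row_delay_us,
                              exposure_us, period_us, len(pure))
        frame.append(sum(s * pure[k][r] for k, s in enumerate(share)))
    return frame


def assemble(frames, frame_starts_us, row_delay_us, exposure_us,
             period_us, num_patterns):
    """For each pattern and row, keep the row exposed mostly to that pattern.

    Returns (images, best): images[k][r] is the selected row value and
    best[k][r] is the share of its exposure that pattern k occupied.
    """
    num_rows = len(frames[0])
    images = [[None] * num_rows for _ in range(num_patterns)]
    best = [[0.0] * num_rows for _ in range(num_patterns)]
    for frame, frame_start in zip(frames, frame_starts_us):
        for r in range(num_rows):
            share = pattern_share(frame_start + r * row_delay_us,
                                  exposure_us, period_us, num_patterns)
            for k in range(num_patterns):
                if share[k] > best[k][r]:
                    best[k][r] = share[k]
                    images[k][r] = frame[r]
    return images, best


# Illustration: two binary patterns, four rows, projector period 1000 us,
# camera frames every 700 us, so the two devices drift relative to each other.
pure = [[1, 0, 1, 0], [0, 0, 1, 1]]
starts = [0, 700, 1400]
frames = [capture(pure, s, 100, 200, 1000) for s in starts]
images, best = assemble(frames, starts, 100, 200, 1000, 2)
recovered = [[1 if v > 0.5 else 0 for v in image] for image in images]
```

In this toy setup every row eventually gets an exposure that falls entirely within each pattern, so thresholding the selected rows reproduces the original binary patterns, which could then be passed to a standard decoder.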

The researchers presented a paper describing the algorithm last month at the SIGGRAPH Asia computer graphics conference. In their paper, the researchers showed that the technique works just as well as synchronized structured light systems. During testing, the researchers used a fairly standard structured light projector, but the team envisions working to develop a structured light flash that could eventually be used as an attachment to any camera.

Northwestern University engineers have developed another inexpensive solution to the problem (see A fast, high-quality, inexpensive 3-D camera), but it uses a proprietary 3-D capture camera instead of an existing smartphone.

Abstract of Unsynchronized structured light

Various Structured Light (SL) methods are used to capture 3D range images, where a number of binary or continuous light patterns are sequentially projected onto a scene of interest, while a digital camera captures images of the illuminated scene. All existing SL methods require the projector and camera to be hardware or software synchronized, with one image captured per projected pattern. A 3D range image is computed from the captured images. The two synchronization methods have disadvantages, which limit the use of SL methods to niche industrial and low quality consumer applications. Unsynchronized Structured Light (USL) is a novel SL method which does not require synchronization of pattern projection and image capture. The light patterns are projected and the images are captured independently, at constant, but possibly different, frame rates. USL synthesizes new binary images as would be decoded from the images captured by a camera synchronized to the projector, reducing the subsequent computation to standard SL. USL works both with global and rolling shutter cameras. USL enables most burst-mode-capable cameras, such as modern smartphones, tablets, DSLRs, and point-and-shoots, to function as high quality 3D snapshot cameras. Beyond the software, which can run in the devices, a separate SL Flash, able to project the sequence of patterns cyclically, during the acquisition time, is needed to enable the functionality.

references:
Daniel Moreno, Fatih Calakli, Gabriel Taubin. Unsynchronized structured light. ACM Transactions on Graphics (TOG) – Proceedings of ACM SIGGRAPH Asia 2015, Volume 34 Issue 6, November 2015, Article No. 178; DOI: 10.1145/2816795.2818062

http://www.kurzweilai.net/algorithm-turns-smartphones-into-3-d-scanners