Articles

What is Deep Learning?

I am writing a series of blog posts based on a paper from a colleague, Andreas Tolk. Andreas is presenting a paper entitled "The Next Generation of Modeling & Simulation: Integrating Big Data and Deep Learning" at the Summer Simulation Multi-Conference. The ideas contained herein are the intellectual property of Andreas Tolk.

Deep Learning is often understood as a new domain of machine learning research that deals with learning multiple levels of representation and abstraction, which can be discovered in structured as well as unstructured data. However, many deep learning algorithms are rooted in the domain of artificial intelligence (AI); they only seem new because they take full advantage of new computational resources and recent developments. Deep learning algorithms implement supervised as well as unsupervised learning. For a good introduction to Deep Learning from a computational perspective, the interested reader is referred to tutorials such as [1] or [2].

Deep Learning has to address challenges on several levels, such as how a machine can learn representations from observing the perceptual world, or how we can learn abstractions from observing and evaluating several instances. Is it possible, and if so how, to learn hierarchical representations with a few algorithms? Deep learning tries to solve these challenges by using trainable feature hierarchies built from a series of trainable feature transformations, where each transformation connects two internal representations. The algorithms developed help to learn all steps and representations by supervised and unsupervised learning. In other words, the algorithms help to learn the structures as well as the transformations of these structures into each other. The tools and methods applied will immediately be recognized by AI researchers:

  • Multilayer and convolutional neural nets process information through a set of layers of interconnected "neurons." As a rule, supervised learning is applied to learn the connections and weights between neurons in hierarchical layers (see the sketch after this list). Convolutional neural nets add the spatially local aspect by applying a sort of filter with the first set of neurons. These neural nets are trained using supervised learning: usually, a set of training data is used to learn the desired behavior, and then a set of control data is used to validate the result.
  • Deconvolutional neural nets and stacked sparse coding are trained by backpropagation, similar to convolutional nets. Again, supervised learning is used as a rule to calibrate solutions that can then be validated with control data. However, these nets are also used to discover hierarchical decompositions, extracting features from complex inputs such as images.
  • Deep Boltzmann machines and stacked auto-encoders are the most complex types. They use forward as well as backward propagation, supervised and unsupervised learning, and often combine several of the approaches described in this itemization. Very often, energy-based methods are used to bring the calibrated system into a state of minimum energy, which also minimizes the deviation of the learned functionality from the observable functionality.
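
To make the training loop concrete, here is a minimal sketch of a multilayer feed-forward net trained by supervised backpropagation. It is not code from the paper; the XOR task, layer sizes, and learning rate are illustrative assumptions. Each layer is one trainable feature transformation connecting two internal representations, in exactly the sense described above.

    # Minimal sketch, not the paper's implementation: a two-layer
    # feed-forward net trained with supervised backpropagation on XOR.
    # Layer sizes and learning rate are illustrative choices.
    import numpy as np

    rng = np.random.default_rng(0)

    # Training data: the XOR function, a classic non-linearly-separable task.
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    y = np.array([[0], [1], [1], [0]], dtype=float)

    # One hidden layer of 8 units; each weight matrix is one trainable
    # transformation connecting two internal representations.
    W1 = rng.normal(size=(2, 8)); b1 = np.zeros(8)
    W2 = rng.normal(size=(8, 1)); b2 = np.zeros(1)

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    lr = 1.0
    for step in range(5000):
        # Forward pass: input -> hidden representation -> output.
        h = sigmoid(X @ W1 + b1)
        out = sigmoid(h @ W2 + b2)

        # Backward pass: propagate the output error back through the
        # layers and adjust connection weights (supervised learning).
        d_out = (out - y) * out * (1 - out)
        d_h = (d_out @ W2.T) * h * (1 - h)
        W2 -= lr * h.T @ d_out
        b2 -= lr * d_out.sum(axis=0)
        W1 -= lr * X.T @ d_h
        b1 -= lr * d_h.sum(axis=0)

    print(np.round(out, 2))  # approaches [[0], [1], [1], [0]]

The convolutional and stacked variants above follow this same forward/backward pattern; they differ mainly in how the layers are wired and in which parts are trained without labels.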

All these techniques and methods have statistical counterparts: in the end, the various neural networks learn to approximate observable functionality from the data set used to train them. The better the observed data correlate with the representation by the neural net, the smaller the errors. Neural nets, even in the complex and complicated forms used here, are statistically speaking non-linear regression models. The universal approximation theorem states that a standard multilayer feed-forward network with just one single hidden layer, containing a finite number of hidden neurons, is a universal approximator among continuous functions on compact subsets of the real numbers. All the methods above are extensions of this relatively simple network, so they can approximate the hidden functionality that deep learning is interested in. The interested reader is referred to [3] for the details.
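
As a small numerical illustration of this point (again an illustrative sketch, not material from the paper), a single hidden layer of tanh units can be fitted by gradient descent to approximate a continuous function such as sin(x) on a compact interval; the hidden-layer size and learning rate are arbitrary choices.

    # Sketch of the universal approximation theorem in practice:
    # one hidden layer fitted to sin(x) on [-pi, pi] by gradient
    # descent on mean squared error. Sizes are arbitrary choices.
    import numpy as np

    rng = np.random.default_rng(1)
    x = np.linspace(-np.pi, np.pi, 200).reshape(-1, 1)
    t = np.sin(x)

    H = 20                              # finite number of hidden neurons
    W1 = rng.normal(size=(1, H)); b1 = np.zeros(H)
    W2 = rng.normal(size=(H, 1)) * 0.1; b2 = np.zeros(1)

    lr = 0.01
    for step in range(20000):
        h = np.tanh(x @ W1 + b1)        # hidden representation
        yhat = h @ W2 + b2              # linear output: a non-linear regression model
        err = yhat - t
        d_h = (err @ W2.T) * (1 - h**2)
        W2 -= lr * h.T @ err / len(x); b2 -= lr * err.mean(axis=0)
        W1 -= lr * x.T @ d_h / len(x); b1 -= lr * d_h.mean(axis=0)

    # The root-mean-square error shrinks as training proceeds.
    print(float(np.sqrt((err**2).mean())))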

Deep Learning is used to find functional connections between provided input and observed output data. These can be highly non-linear, complex functions. The software learns to recognize patterns in structured and unstructured data; it can therefore recognize sounds, images, or other applicable information. With the right amount of data, the right mix of algorithms, and the right computational power, many objectives of AI can now be realized.

Works Cited

  1. Deng, L. A tutorial survey of architectures, algorithms, and applications for deep learning. APSIPA Transactions on Signal and Information Processing 3 (2014), e2.
  2. LeCun, Y., and Ranzato, M. Deep learning tutorial. In Tutorials in International Conference on Machine Learning (ICML13), Citeseer (2013).
  3. Kůrková, V. Kolmogorov's theorem and multilayer neural networks. Neural Networks 5, 3 (1992), 501–506.

 

Blogged by:
Jeffrey Strickland, Ph.D.

Jeffrey Strickland, Ph.D., is the author of Predictive Analytics Using R and a Senior Analytics Scientist with Clarity Solution Group. He has performed predictive modeling, simulation, and analysis for the Department of Defense, NASA, the Missile Defense Agency, and the financial and insurance industries for over 20 years. Jeff is a Certified Modeling and Simulation Professional (CMSP) and an Associate Systems Engineering Professional (ASEP). He has published nearly 200 blogs on LinkedIn, is a frequently invited guest speaker, and is the author of 20 books including:

  • Operations Research using Open-Source Tools
  • Discrete Event Simulation using ExtendSim
  • Crime Analysis and Mapping
  • Missile Flight Simulation
  • Mathematical Modeling of Warfare and Combat Phenomenon
  • Predictive Modeling and Analytics
  • Using Math to Defeat the Enemy
  • Verification and Validation for Modeling and Simulation
  • Simulation Conceptual Modeling
  • System Engineering Process and Practices


Authored by:
Andreas Tolk, Ph.D.

Dr. Tolk is Chief Scientist at SimIS Inc. in Portsmouth, Virginia. He is responsible for the evaluation of emerging technologies regarding their applicability for Modeling and Simulation applications, in particular in the domains of medical simulation, defense simulations, and architectures of complex systems. He is an adjunct professor at Old Dominion University.

Dr. Tolk was a faculty member (Professor) in the Department of Engineering Management and Systems Engineering at Old Dominion University from 2006 to 2013, where he held a joint appointment with the Modeling, Simulation, and Visualization Engineering department. He received his Ph.D. in Computer Science (1995) and an M.S. in Computer Science (1988) from the University of the Federal Armed Forces, Germany, with an emphasis on Applied Systems Science and Military Operations Research. He has authored several books.
