Machine learning (ML) and deep learning (DL) are popular right now—and for good reason, as there’s a lot of interesting work happening there. The hype makes it easy to forget about more tried and true mathematical modeling methods, but that doesn’t make those methods any less relevant. They are all complementary tools in a larger toolbox.

We like to look at the landscape in terms of the Gartner Hype Cycle.


We think that ML, and DL in particular, is at the Peak of Inflated Expectations (or, at least, very close to it). Meanwhile, there are many other methods in the Plateau of Productivity. They are workhorses—people understand them and use them all the time, but nobody talks about them. They’re still important, though, and we understand that at Manifold. To create successful data products, you must often deploy the full breadth of available tools, far beyond ML.

What does that mean in practice?

Tried and True Tools

Let’s look at a few of these mature tools that continue to be useful: control theory, signal processing, and mathematical optimization.

Control theory, which became a field in its own right in the late 1950s, deals with real-time observation, inference, and control of the (potentially unobserved) states of a dynamic system. It’s particularly useful when you understand the physics of a system, i.e., when the dynamics aren’t arbitrary. This is an important distinction, since ML shines precisely when we don’t fully understand the underlying mechanisms, as with retail demand behavior or ad buying on the internet. Consider vehicular motion, which obeys physical laws that we don’t need an ML algorithm to learn: we know how Newton’s equations work, and we can write down the differential equations governing the motion of a vehicle. Building ML models to learn these physics would wastefully burn reams of data and compute cycles rediscovering something already known. By putting the known physics into a state-space model instead, and posing the inference in the language of control theory, we can learn what matters much more quickly.
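As an illustrative sketch of that idea (not from the original article), here is a minimal one-dimensional Kalman filter that tracks a vehicle’s position and velocity from noisy position measurements. The constant-velocity dynamics are written down directly as the state-transition matrix rather than learned from data; all the numbers (time step, noise covariances, true speed) are invented for the example.

```python
import numpy as np

# Minimal Kalman filter for a 1-D constant-velocity vehicle model.
# State x = [position, velocity]; we observe noisy position only.
dt = 0.1                                   # time step (s)
F = np.array([[1.0, dt], [0.0, 1.0]])      # known Newtonian dynamics
H = np.array([[1.0, 0.0]])                 # we measure position only
Q = 0.01 * np.eye(2)                       # process noise covariance
R = np.array([[0.25]])                     # measurement noise covariance

def kalman_step(x, P, z):
    """One predict/update cycle given state estimate x, covariance P, measurement z."""
    # Predict: propagate the known physics forward.
    x = F @ x
    P = F @ P @ F.T + Q
    # Update: correct the prediction with the measurement.
    y = z - H @ x                          # innovation
    S = H @ P @ H.T + R                    # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)         # Kalman gain
    x = x + (K @ y).ravel()
    P = (np.eye(2) - K @ H) @ P
    return x, P

# Track a vehicle moving at a true speed of 2 m/s from noisy readings.
rng = np.random.default_rng(0)
x, P = np.zeros(2), np.eye(2)
for k in range(100):
    true_pos = 2.0 * dt * (k + 1)
    z = np.array([true_pos + rng.normal(0, 0.5)])
    x, P = kalman_step(x, P, z)
print(x)  # estimated [position, velocity]
```

Note how little data this needs: the filter recovers the unobserved velocity from position measurements alone because the dynamics are baked in rather than learned.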

Another useful tool is signal processing, which deals with the representation and transformation of signals of all kinds, from time series to hyperspectral images. Classical signal processing transformations, like spectrograms and wavelet transforms, often make excellent features for ML techniques; in fact, many advances in speech ML use these representations as inputs to a deep neural network. At the same time, classical signal processing filters, like the Kalman filter, are often very good first solutions, getting you 80% of the way there with 20% of the effort. Techniques like these are also typically far more interpretable than sophisticated DL models.
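To make the spectrogram idea concrete, here is a minimal hand-rolled short-time Fourier transform in NumPy (library routines such as scipy.signal.spectrogram do the same job more robustly); the toy signal, window size, and hop length are invented for illustration.

```python
import numpy as np

# Toy signal: a 5 Hz tone for the first two seconds, then a 20 Hz tone.
fs = 200                                   # sampling rate (Hz)
t = np.arange(0, 4, 1 / fs)
sig = np.where(t < 2, np.sin(2 * np.pi * 5 * t), np.sin(2 * np.pi * 20 * t))

# Hand-rolled STFT: slide a tapered window along the signal and take the
# FFT of each frame. Each row of `spec` is a ready-made feature vector
# localizing the signal's energy in time and frequency.
win, hop = 128, 64
frames = np.array([sig[i:i + win] * np.hanning(win)
                   for i in range(0, len(sig) - win + 1, hop)])
spec = np.abs(np.fft.rfft(frames, axis=1))  # magnitude spectrogram
freqs = np.fft.rfftfreq(win, d=1 / fs)
dominant = freqs[spec.argmax(axis=1)]       # dominant frequency per frame
print(dominant)
```

The dominant-frequency track jumps from roughly 5 Hz to roughly 20 Hz partway through, which is exactly the kind of compact, interpretable feature a downstream ML model can consume.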

Lastly, mathematical optimization deals with finding the best solution to a given objective function. Classical applications include linear programming to optimize inventory allocation, and nonlinear programming to optimize financial portfolio allocations. Advances in DL are themselves partly due to innovations in the underlying optimization techniques, such as stochastic gradient descent with momentum, which helps training escape local minima.
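As a small example of the linear-programming case, here is a hypothetical inventory-allocation problem solved with SciPy’s linprog; the products, capacities, and per-unit profits are all invented.

```python
import numpy as np
from scipy.optimize import linprog

# Allocate limited inventory of two products (A, B) across two warehouses
# (W1, W2) to maximize profit. Decision variables:
# x = [A->W1, A->W2, B->W1, B->W2].
profit = np.array([4.0, 3.0, 5.0, 4.5])   # profit per unit shipped
c = -profit                               # linprog minimizes, so negate

A_ub = np.array([
    [1, 1, 0, 0],   # units of product A available
    [0, 0, 1, 1],   # units of product B available
    [1, 0, 1, 0],   # warehouse 1 capacity
    [0, 1, 0, 1],   # warehouse 2 capacity
])
b_ub = np.array([100, 80, 120, 90])

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * 4, method="highs")
print(res.x, -res.fun)   # optimal allocation and total profit
```

The solver resolves the contention for warehouse 1 in favor of product A, whose profit gap between warehouses (1.0 per unit) is larger than B’s (0.5 per unit); this kind of marginal trade-off is exactly what LP handles exactly and instantly at much larger scale.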

As with the other techniques, mathematical optimization is highly complementary to ML. These tools don’t work against each other; they offer fascinating ways to be combined.

Blending Old and New

Many successful solutions across disparate domains meld the new world of ML/DL with classical mathematical modeling techniques. For example, in a thermodynamic parameter estimation problem, you can combine state-space modeling techniques with ML to infer unobserved parameters of the system. Or, in a marketing coupon problem, you can combine ML-based forecasting of customer behavior with a mathematical optimization layer that decides which coupons to send.
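Here is a sketch of that coupon pattern, with an invented uplift matrix standing in for the ML forecast and a simple greedy heuristic standing in for the full optimization (a real deployment would typically use an LP/ILP solver for the allocation step).

```python
import numpy as np

# Hypothetical setup: an ML model has forecast the incremental revenue
# (uplift) each coupon would generate for each customer. The optimization
# layer then sends at most one coupon per customer under a total budget.
rng = np.random.default_rng(1)
n_customers, n_coupons = 5, 3
uplift = rng.uniform(0, 10, size=(n_customers, n_coupons))  # stand-in for model output
cost = np.array([1.0, 2.0, 4.0])                            # cost of sending each coupon
budget = 8.0

# Greedy heuristic: rank candidate (customer, coupon) pairs by uplift per
# dollar and send while budget remains.
pairs = sorted(((uplift[i, j] / cost[j], i, j)
                for i in range(n_customers) for j in range(n_coupons)),
               reverse=True)
chosen, spent, served = {}, 0.0, set()
for ratio, i, j in pairs:
    if i not in served and spent + cost[j] <= budget:
        chosen[i] = j
        spent += cost[j]
        served.add(i)
print(chosen, spent)   # coupon assignment per customer, total spend
```

The key design point is the clean interface: the forecasting model can be retrained or swapped out without touching the allocation logic, and vice versa.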

Manifold has substantial experience at the interface of signal processing and ML. A common pattern we have deployed is to use signal processing for feature engineering, then apply modern ML to classify temporal events based on those features. Features inspired by signal processing of multivariate time series, such as short-time Fourier transform (STFT) coefficients, exponential moving averages, and edge detectors, let domain experts quickly encode their knowledge into the modeling problem. Using ML on top enables the system to continuously learn from additional annotated data and improve its performance over time.
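A minimal sketch of this pattern on synthetic data, using FFT-magnitude features and an off-the-shelf classifier; the signal, labels, and frequencies are invented, and any real deployment would use richer features and proper cross-validation.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical event classifier: label windows of a signal by their dominant
# frequency content. Signal-processing features (here, the FFT magnitudes of
# each window) feed a standard ML classifier, which can keep improving as
# more labeled windows arrive.
rng = np.random.default_rng(0)
fs, win = 100, 100

def make_window(freq):
    """One noisy sinusoidal window at the given frequency (Hz)."""
    t = np.arange(win) / fs
    return np.sin(2 * np.pi * freq * t) + 0.3 * rng.normal(size=win)

# Class 0: ~5 Hz events; class 1: ~15 Hz events.
X_raw = np.array([make_window(5 + rng.normal()) for _ in range(100)] +
                 [make_window(15 + rng.normal()) for _ in range(100)])
y = np.array([0] * 100 + [1] * 100)

# Feature engineering: magnitude spectrum of each window.
X = np.abs(np.fft.rfft(X_raw, axis=1))

# Shuffle, then hold out the last quarter for evaluation.
idx = rng.permutation(len(y))
X, y = X[idx], y[idx]
split = 150
clf = LogisticRegression(max_iter=1000).fit(X[:split], y[:split])
acc = clf.score(X[split:], y[split:])
print(f"held-out accuracy: {acc:.2f}")
```

Because the spectral features already separate the classes cleanly, a simple linear model suffices here; the ML layer earns its keep as events get messier and more labeled data accumulates.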

In the end, that’s what’s important to understand: All of these tools are complementary, and you need to understand many of them to create data products that solve real business problems. An overly narrow focus on ML misses the forest for the trees.