Train a small neural network! Feed it a photo in the 2D tab and fiddle with the options to see how close an approximation you can get it to produce after, say, 5000 steps. If you don't have a preference, stay on the Adam optimizer, since it's the least fragile; but do check how rerolling it with a different activation function (the nonlinearity building block) changes the shape of the result.
The network chains together a bunch of "layers" of the form
input numbers → some weighted averages → activation → output numbers.
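To make that a bit more concrete, here's a minimal sketch in Python with NumPy (not the demo's actual code); the layer sizes, the tanh activation, and the "pixel coordinates in, color out" framing are all assumptions for illustration:

```python
import numpy as np

def layer(x, W, b, activation=np.tanh):
    # Weighted sums of the inputs, then a pointwise response curve.
    return activation(W @ x + b)

# A tiny network: 2 inputs -> 16 hidden values -> 3 outputs,
# e.g. a pixel coordinate in, an RGB color out.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(16, 2)), np.zeros(16)
W2, b2 = rng.normal(size=(3, 16)), np.zeros(3)

def network(x):
    h = layer(x, W1, b1)                              # hidden layer
    return layer(h, W2, b2, activation=lambda z: z)   # linear output layer

print(network(np.array([0.2, -0.7])))                 # three output numbers
```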
It learns to approximate a function f : x → y by repeatedly checking a random sample of inputs and tweaking the averaging weights to match f more closely on those inputs. The "activation function" is a response curve (vaguely like gamma, tone mapping, or dynamic range compression) it can use to build up nonlinear effects.
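A hypothetical version of that loop, continuing the sketch above but using plain gradient descent rather than Adam, and a made-up stand-in for "look up the photo's color at this coordinate" — the target function, sizes, and learning rate are illustrative, not the demo's:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for f: in the demo this would be the photo's color at coordinate x.
def f(x):
    return np.array([np.sin(3 * x[0]), np.cos(3 * x[1]), x[0] * x[1]])

# Parameters of a 2 -> 16 -> 3 network.
W1, b1 = 0.5 * rng.normal(size=(16, 2)), np.zeros(16)
W2, b2 = 0.5 * rng.normal(size=(3, 16)), np.zeros(3)
lr = 0.01

for step in range(5000):
    x = rng.uniform(-1, 1, size=2)        # a random sample input
    y = f(x)                              # what f says the output should be

    # Forward pass: weighted sums, activation, weighted sums.
    h = np.tanh(W1 @ x + b1)
    y_hat = W2 @ h + b2
    err = y_hat - y                       # how far off we are at this x

    # Backward pass: how much each weight contributed to the error.
    dW2 = np.outer(err, h)
    db2 = err
    dh = (W2.T @ err) * (1 - h ** 2)      # tanh'(z) = 1 - tanh(z)^2
    dW1 = np.outer(dh, x)
    db1 = dh

    # Tweak the weights a little to match f more closely at this x.
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

    if step % 1000 == 0:
        print(step, float(err @ err))     # squared error should shrink
```

Adam does the same thing, but adapts the step size per weight, which is a big part of why it's the least fragile choice in the demo.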
As always, the details matter, and you can quickly get some intuition for how they matter by watching it learn live.