Why Convolutions?
A lot of attention has been paid to attention mechanisms: where they come from, how they connect to prior techniques (like kernel methods), and so on, and rightfully so. Attention layers have been instrumental in the biggest AI revolution we've seen in our lifetime: the large language model. Comparatively, the successful exemplars and applications f...
Gwern is Wrong About Variational Principles
What is a Variational Principle and Why Do We Care?
First, I would like to clarify what I mean by a variational principle: a system of PDEs that arises as the Euler-Lagrange equations of a specific functional. A functional is a mapping from a suitable space of functions (or trajector...
Subjectivism in Probability
Subjectivism in probability is the belief that probabilities correspond to degrees of belief. This more or less reduces to each subject (e.g. you) assigning a weight, representing something like "believability" or "how likely I think it is to be true," to each proposition or judgment about the world, e.g. "it will rain tomorrow." What is a ...
Hegel on Arithmetic
Hegel wrote a decent bit on mathematics and associated topics; relevant today are his writings on
arithmetic. Specifically, it's interesting to consider the analogies between a modern description of the integers (as a ring)
and Hegel's treatment.
A Mathematical Prelude
Suppose we begin with a free group generated by a single symbo...
Exterior Differential Systems
An isometry of inner product spaces preserves the inner product and therefore preserves both lengths and angles.
For real inner product spaces, the group of linear isometries is \(O(n)\), the orthogonal group in dimension \(n\). Its subgroup \(SO(n)\), the special orthogonal group,
consists of the rotations: those orthogonal transformations with determinant equal to one. In a more gen...