There's a good deal of information about the Discrete Fourier Transform online, and you can certainly learn from what exists. However, a long time ago I saw a nice characterization of it that I can no longer find, and think is worth writing about.
Consider a sequence \( (a_0, a_1, \ldots, a_{N-1}) \) of length \( N \). Think of these as the coefficients of a polynomial \( f(x) = \sum_{k=0}^{N-1}a_kx^k \). Clearly the coefficients determine the polynomial, but as we know we can also define \( f \) uniquely by specifying its value at \( N \) points, since we know it has degree \( N-1 \). The Discrete Fourier Transform of \( (a_0, a_1, \ldots, a_{N-1}) \) is the sequence \( (f(1), f(\zeta), f(\zeta^2), \ldots, f(\zeta^{N-1})) \) where \( \zeta \) is a primitive \(N\)th root of unity (generally \( e^{\frac{2i\pi}{N}} \)).
This formulation makes the circular convolution theorem almost trivial. If we have two sequences \( (a_0, \ldots, a_{N-1}) \) and \( (b_0, \ldots, b_{N-1}) \) and we consider their associated polynomials \( f(x) = \sum_{k=0}^{N-1} a_kx^k \) and \( g(x) = \sum_{k=0}^{N-1} b_kx^k \), then their product is of the form \( f(x)g(x) = \sum_{k=0}^{2N-2}\sum_{j=0}^{k}a_jb_{k-j}x^{k} \). The coefficients are almost the circular convolution of the two sequences -- to get there we just need to combine the terms for \( x^k \) with the terms for \( x^{k+N} \). In other words, we need to take \( f(x)g(x) \) modulo \( x^N - 1 \).
This means that \( h(x) = f(x)g(x) - r(x)(x^N - 1) \) is the polynomial whose coefficients are the circular convolution of \( (a_0, \ldots, a_{N-1}) \) and \( (b_0, \ldots, b_{N-1}) \). Remember that to determine \( h \), we just need to determine its value at \( N \) distinct points, and whenever \( x^N - 1 = 0 \), we have \( h(x) = f(x)g(x) \). As \( x^N - 1 \) has the \( N \) distinct roots \( 1, \zeta, \zeta^2, \ldots, \zeta^{N-1} \), if we know the values of \( f \) and \( g \) at these points, we also know the value of \( h \) there, and can therefore know all of \( h \). In other words, we're just looking for the sequence \( (f(1)g(1), f(\zeta)g(\zeta), \ldots, f(\zeta^{N-1})g(\zeta^{N-1})) \).