Some data on the shape of the forgetting curve
The forgetting curve is often schematically pictured like this, as on Wikipedia:
Learners often take this to mean that their retention of a given fact will, over time and on average, tend to look something like that. So, for example, the Wikipedia entry on Ebbinghaus glosses it as "describ[ing] the exponential loss of information that one has learned." But:
- Ebbinghaus's original forgetting curve is defined in terms of "savings," a metric we tend not to use: it describes how much less time it takes to relearn something after having studied it, relative to the time it took to learn it initially.
- Ebbinghaus's 1885 formula is b = 100k / ((log t)^c + k), which involves an exponent and decays but is not a literal exponential curve.
- I've never found strong evidence that my own forgetting curves--here defined as I think it's generally understood, in terms of the probability of my getting a flashcard correct over time--are exponentially distributed.
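To make the "not a literal exponential" point concrete, here is a minimal sketch that evaluates the savings formula next to a true exponential. The constants k = 1.84 and c = 1.25, the base-10 logarithm, and t measured in minutes are the values usually quoted for Ebbinghaus's fit; they are not given above, so treat them as assumptions. The function names and the 600-minute half-life are made up for illustration.

```python
import math

# Assumed constants: the values commonly quoted for Ebbinghaus's fit,
# with t in minutes and a base-10 logarithm.
K, C = 1.84, 1.25

def ebbinghaus_savings(t_minutes, k=K, c=C):
    """Percent savings b = 100k / ((log10 t)^c + k), for t >= 1 minute."""
    if t_minutes < 1:
        raise ValueError("t must be at least 1 minute so that log10(t) >= 0")
    return 100.0 * k / (math.log10(t_minutes) ** c + k)

def exponential_retention(t_minutes, half_life_minutes):
    """A literal exponential for comparison: 100 * 2^(-t / half_life)."""
    return 100.0 * 0.5 ** (t_minutes / half_life_minutes)

# An exponential loses the same fraction over every equal time step.
# The savings formula does not: its step-to-step ratio keeps changing.
times = [60, 120, 180, 240]  # minutes, equally spaced
exp_ratios = [
    exponential_retention(later, 600) / exponential_retention(earlier, 600)
    for earlier, later in zip(times, times[1:])
]
savings_ratios = [
    ebbinghaus_savings(later) / ebbinghaus_savings(earlier)
    for earlier, later in zip(times, times[1:])
]
print("exponential step ratios:", [round(r, 4) for r in exp_ratios])
print("savings step ratios:    ", [round(r, 4) for r in savings_ratios])
```

The exponential's ratios are identical by construction, while the savings ratios drift toward 1 as t grows, which is one simple way to see that the 1885 formula decays more slowly than any fixed-rate exponential.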
Here's my performance (LOWESS-smoothed) on my fourth response after I get the first three responses on a flashcard correct:
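For readers who want to try this on their own review logs, here is a rough sketch of the idea behind LOWESS: a locally weighted straight-line fit with tricube weights. It is a simplified stand-in, not the code used for the plot: it omits the robustness iterations of full LOWESS, and the function names, the `frac` value, and the toy data are all hypothetical.

```python
import math

def tricube(u):
    """Tricube kernel: weight 1 at distance 0, falling to 0 at |u| >= 1."""
    u = abs(u)
    return (1 - u ** 3) ** 3 if u < 1 else 0.0

def local_linear_smooth(xs, ys, frac=0.5):
    """Smooth ys against xs with a weighted line fit around each x.

    frac is the share of points that defines each local neighbourhood.
    """
    n = len(xs)
    k = max(2, math.ceil(frac * n))
    smoothed = []
    for x0 in xs:
        # Bandwidth: distance from x0 to its k-th nearest point.
        h = sorted(abs(x - x0) for x in xs)[k - 1] or 1e-12
        ws = [tricube((x - x0) / h) for x in xs]
        sw = sum(ws)
        mx = sum(w * x for w, x in zip(ws, xs)) / sw
        my = sum(w * y for w, y in zip(ws, ys)) / sw
        sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
        sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
        slope = sxy / sxx if sxx > 0 else 0.0
        smoothed.append(my + slope * (x0 - mx))
    return smoothed

# Toy data, invented for illustration: interval in days, 1 = correct, 0 = wrong.
interval_days = [1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
correct = [1, 1, 1, 1, 0, 1, 1, 0, 1, 0]
for x, s in zip(interval_days, local_linear_smooth(interval_days, correct)):
    print(f"{x:>3} days: smoothed P(correct) ~ {s:.2f}")
```

Because the inputs are 0/1 outcomes, the smoothed values are local estimates of the probability of a correct answer; a local line fit can stray slightly outside [0, 1] near the edges.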

I've chosen the correct-correct-correct ("CCC") prefix because it has a large sample size and a reasonable spread of intervals between the third and fourth responses.1 When I run Bayesian information criterion ("BIC") analyses on these data, BIC consistently favors the models with the fewest parameters, because more or less any kind of distribution can fit the data very well. (Even a linear fit does almost as well as anything else.)
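As a rough illustration of that kind of comparison, the sketch below scores three candidate curves with BIC. It rests on several assumptions that are not from my analysis: the data points are invented, the fit is least squares on binned retention rates with Gaussian errors (so BIC = n ln(RSS/n) + k ln n), the exponential decay rate is found by a simple grid search, and every function name is hypothetical.

```python
import math

def bic(rss, n, k):
    """Gaussian-error BIC, up to a constant: n*ln(RSS/n) + k*ln(n). Lower is better."""
    return n * math.log(rss / n) + k * math.log(n)

def fit_linear(xs, ys):
    """Ordinary least squares for y = intercept + slope*x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    rss = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return intercept, slope, rss

def fit_exponential(ts, ys, rates):
    """Least squares for y = a*exp(-r*t): grid over r, closed-form a for each r."""
    best = None
    for r in rates:
        es = [math.exp(-r * t) for t in ts]
        a = sum(y * e for y, e in zip(ys, es)) / sum(e * e for e in es)
        rss = sum((y - a * e) ** 2 for y, e in zip(ys, es))
        if best is None or rss < best[2]:
            best = (a, r, rss)
    return best

def fit_exponential_with_floor(ts, ys, rates):
    """Least squares for y = floor + a*exp(-r*t): linear in exp(-r*t) for each r."""
    best = None
    for r in rates:
        es = [math.exp(-r * t) for t in ts]
        floor, a, rss = fit_linear(es, ys)
        if best is None or rss < best[3]:
            best = (floor, a, r, rss)
    return best

# Invented retention rates by interval, for illustration only.
days = [1, 2, 4, 7, 10, 14, 21, 30, 45, 60]
retention = [0.97, 0.96, 0.96, 0.94, 0.95, 0.93, 0.92, 0.91, 0.88, 0.87]
rates = [i / 1000 for i in range(1, 201)]  # candidate decay rates per day
n = len(days)

# k counts curve parameters only; the shared noise variance shifts every score equally.
results = {
    "linear (2 params)": bic(fit_linear(days, retention)[2], n, 2),
    "exponential (2 params)": bic(fit_exponential(days, retention, rates)[2], n, 2),
    "exponential + floor (3 params)":
        bic(fit_exponential_with_floor(days, retention, rates)[3], n, 3),
}
for name, score in sorted(results.items(), key=lambda kv: kv[1]):
    print(f"{name}: BIC = {score:.1f}")
```

The extra parameter always lowers the residual error at least slightly, but BIC charges ln(n) for it, so a richer model only wins when the improvement in fit is large enough to pay that penalty. With real review logs, a Bernoulli likelihood on the raw correct/incorrect outcomes would be the more natural choice than this binned Gaussian shortcut.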
If I didn't know these were spaced-repetition data, or if I hadn't read that they are supposed to be exponentially distributed,2 neither looking at the data, nor exploring them, nor running BIC or any similar analysis would tempt me to think they are exponentially distributed.
As always, the hard question is what to do about this. I still take the pragmatic lesson that we should worry less about fine details of algorithms and more about the ergonomics of the broader learning system. Others disagree (explicitly or implicitly). But whatever lesson you draw from it, I've never seen a retention-versus-time curve from my own data that is obviously exponential, and I've certainly never seen one that looks anything like the standard Wikipedia-style schematic.