I recently became interested in identifying the pseudonymous creator of Bitcoin, Satoshi Nakamoto. I started from the Bitcoin whitepaper published in late 2008 and ran a reverse textual analysis: essentially, searching the internet for highly unusual turns of phrase and vocabulary patterns (particularly in places where you would expect a cryptography researcher to contribute), then evaluating the fitness of each match by running textual similarity metrics on several pages of the candidate’s writing.
Which led me rather directly to several articles from Nick Szabo’s blog.
For those who don’t know Nick Szabo and his documented links to Bitcoin: before Bitcoin appeared, Nick had spent several years (since 1998) developing the enabling mechanism for a decentralized digital currency, eventually converging on a system he called “bit gold”, which is the direct precursor to the Bitcoin architecture.
According to what seems to be a widely accepted origin story of Bitcoin, Satoshi Nakamoto was a highly skilled computer scientist (or group thereof) who found out about Nick’s proposal for bit gold, hit upon an idea for improving it, published the Bitcoin whitepaper, and decided to turn it into reality by developing the original Bitcoin client. Nick denies being Satoshi, and has stated his official opinion on Satoshi and Bitcoin in a May 2011 article.
I would argue that Satoshi is actually Nick Szabo himself, probably together with one or more technical collaborators.
As I mention above, what originally led me to this hypothesis is that reverse-searching for content similar to the Bitcoin whitepaper led me to Nick’s blog, completely independently of any knowledge of the official Bitcoin story. I must stress this: an open, unbiased search over the entire Internet for texts similar in writing to the Bitcoin whitepaper identifies Nick’s bit gold articles as the best candidates. This could still be a coincidence, if an unlikely one: cryptocurrencies were a fairly niche topic in 2008 and earlier (seemingly involving 3 or 4 people), so every contributor to the field was going to reuse the same shared expressions and vocabulary. Satoshi would have been a reader of Nick’s blog, so you would expect him to describe the same concepts in a similar way. But there’s more.
Running similarity metrics on the whitepaper and Nick’s bit gold articles, as well as his paper “formalizing and securing relationships on public networks”, indicated an excellent match over content-neutral expressions as well, so either Nick wrote the whitepaper, or it was written by somebody imitating Nick’s writing style. Here is a brief summary of some of the more salient common points. For each expression, where possible and relevant, we will mention the proportion of cryptography papers containing the expression (according to Google Scholar), to measure how common its use is among researchers, and later provide a rough value for the probability of the null hypothesis. Of course, we’ll only do this for content-neutral expressions.
- Repeated use of “of course” without isolating commas, contrary to convention (“the problem of course is”)
- Expression “can be characterized”, frequent in Nick’s blog (found in 1% of crypto papers)
- Use of “for our purposes” when describing hypotheses (found in 1.5% of crypto papers)
- Starting sentences with “It should be noted” (found in 5.25% of crypto papers)
- Use of “preclude” (found in 1.5% of crypto papers)
- Expression “a level of” + noun (“achieves a level of privacy by…”) as a standalone qualifier
Content-bearing terms that have common synonyms in the field and thus could easily have been expressed in a different way:
- Expression “timestamp server”, central in the Bitcoin paper, used in Nick’s blog as early as January 2006
- Repeated use of expression “trusted third party”
- Expressions “cryptographic proof” and “digital signatures”
- Repeated use of “timestamp” as a verb
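The per-expression percentages quoted in the first list are document frequencies: the share of papers in which an expression appears at least once. The post obtained them from Google Scholar; the sketch below is not that procedure, only a minimal local equivalent, assuming you already have the texts as plain strings. The function names (`contains_phrase`, `document_frequency`, `shared_phrases`) are hypothetical, chosen for illustration.

```python
import re


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so phrases match across line breaks."""
    return re.sub(r"\s+", " ", text.lower())


def contains_phrase(text: str, phrase: str) -> bool:
    """True if `phrase` occurs in `text` as whole words (case-insensitive)."""
    pattern = r"\b" + re.escape(normalize(phrase)) + r"\b"
    return re.search(pattern, normalize(text)) is not None


def document_frequency(corpus: list[str], phrase: str) -> float:
    """Fraction of documents in `corpus` that contain `phrase` at least once."""
    if not corpus:
        return 0.0
    hits = sum(contains_phrase(doc, phrase) for doc in corpus)
    return hits / len(corpus)


def shared_phrases(text_a: str, text_b: str, phrases: list[str]) -> list[str]:
    """Phrases from `phrases` that appear in both texts, in the given order."""
    return [
        p for p in phrases
        if contains_phrase(text_a, p) and contains_phrase(text_b, p)
    ]
```

With a corpus of cryptography papers loaded as strings, `document_frequency(corpus, "for our purposes")` would give the kind of figure listed above, and `shared_phrases(whitepaper, blog_post, phrases)` would list which marker expressions two texts have in common.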
Consider this: suppose that when a content-neutral expression is part of a researcher’s vocabulary, they use it in at least one in ten papers (for instance, “for our purposes” appears in 1.5% of papers, so we’ll assume that at most 15% of researchers would be likely to use it in a paper). Then, treating the expressions as independent of one another, the probability of finding all of “it should be noted”, “for our purposes”, “can be characterized” and “preclude” in a given researcher’s vocabulary has the upper bound 0.08%. That’s our p-value right there (8e-4): this particular combination could pinpoint one researcher in a thousand.
Of course, the “one in ten papers” hypothesis is purely arbitrary, so it’s up to you to judge if it is acceptable. It seems rather generous to me, as most researchers actually tend to constantly reuse the same handful of expressions.
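The arithmetic behind this estimate can be sketched as follows. If a researcher who has an expression in their vocabulary uses it in at least one paper in ten, then the share of such researchers is at most ten times the share of papers containing it; multiplying the per-expression shares gives the joint bound, under the additional assumption that the expressions are independent. The names `PAPER_FREQUENCY`, `joint_upper_bound` and `papers_per_use` are made up for this sketch.

```python
from math import prod

# Share of cryptography papers containing each expression, as listed above.
PAPER_FREQUENCY = {
    "can be characterized": 0.01,
    "for our purposes": 0.015,
    "it should be noted": 0.0525,
    "preclude": 0.015,
}


def joint_upper_bound(paper_freq: dict[str, float],
                      papers_per_use: float = 10.0) -> float:
    """Upper bound on the share of researchers using every expression.

    Assumes (as the post does) that a researcher who has an expression in
    their vocabulary uses it in at least 1 out of `papers_per_use` papers,
    and (an extra assumption) that the expressions are independent.
    """
    # Each per-expression share is capped at 1.0, since it is a probability.
    researcher_share = [min(1.0, f * papers_per_use) for f in paper_freq.values()]
    return prod(researcher_share)


if __name__ == "__main__":
    # Show how strongly the bound depends on the arbitrary "one in N" choice.
    for n in (5, 10, 20):
        print(n, joint_upper_bound(PAPER_FREQUENCY, papers_per_use=n))
```

With the percentages listed above and the one-in-ten assumption, this works out to roughly 0.12%, slightly higher than the 0.08% quoted but of the same order of magnitude: about one researcher in a thousand. Because the bound scales with the fourth power of `papers_per_use`, halving or doubling that assumption moves the result by a factor of sixteen, which is why the choice deserves the scrutiny invited above.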
In short: most of the unusual wording found in the Bitcoin whitepaper also recurs in Nick’s articles. Not all of it, though: the Britishism “favour” used by Satoshi is not used by Nick, who writes “favor”. However, the Bitcoin paper may have had several authors, Nick being merely the main one. In fact, since the entire paper is written in American English except for this one word, it is highly probable that either the paper had several authors, or this one word was a deliberate attempt to add confusion as to the origins of the paper.
Then, there is secondary evidence. It is obvious that Satoshi did extensive research into prior mentions of concepts similar to Bitcoin, as any proper scientist writing a paper would have. This is evidenced by Satoshi’s references to Wei Dai’s b-money and to hashcash, even though neither seems to have been a direct inspiration for Bitcoin. However, he made no mention of Nick Szabo’s bit gold, even though Bitcoin is quite visibly built directly on top of the bit gold ideas. If Satoshi had been writing independently from Nick, wouldn’t he have cited his work as per proper scientific etiquette?
There is also the remarkable lack of public reaction on Nick’s part when Bitcoin started taking off. For somebody as deeply involved in these concepts as Nick, it strikes me as surprising that it took Nick many months to even mention Bitcoin, while his ideas were coming to life in an exciting way.
Another interesting fact, which may or may not be significant, is that the main mentions of bit gold on Nick’s blog have been retroactively post-dated to appear slightly posterior to the Bitcoin whitepaper, and this happened right after the publication of the whitepaper. There are two major articles on bit gold on Nick’s blog: one originally posted in December 2005 and post-dated to December 2008, and one from April 2008, also post-dated to December 2008 (note: it is possible to manually edit the dates of blog posts on Blogger, but the original date is still visible in the uneditable URL of each post).
It is unclear why this post-dating occurred. It cannot really be an effort to confuse the dating of the bit gold system, since the system is widely documented to have been publicized prior to 2008 (and again, Nick once asserted that he had started working on the idea as early as 1998). I would guess that, shortly after the publication of the Bitcoin whitepaper, Nick found something to edit in both of his bit gold articles.
Lastly, one thing to consider is that the profiles of Nick and Satoshi match perfectly. Satoshi is highly likely to have an academic background (Nick is a professor with a significant publication history), as demonstrated by his mastery of scientific writing: writing a paper that follows proper scientific convention is difficult to improvise if you haven’t already done it a few times. In fact, the whole idea of getting an idea out there by writing a scientific paper, of all things, is very academically minded. And the idea of a decentralized digital currency was a central project of Nick’s, one that only a handful of people were interested in around the time the whitepaper was published. Who was on the 2008 list of academics passionate about cryptocurrencies and who wrote like Nick Szabo? Nick Szabo.
In summary, it seems highly likely to me that Satoshi is Nick (and collaborators). At the very least, there is strong textual evidence that Nick wrote significant parts of the Bitcoin whitepaper. I would suppose either that Nick, wanting to advance his long-time dream of a decentralized currency, contacted one or more technical collaborators who helped him address the shortcomings of bit gold and ship the first client under the collective name Satoshi Nakamoto, or that a brilliant engineer happened to hit upon a better solution for bit gold, contacted Nick, and they decided to bring it to life together. As a side note, it seems much more likely that a Satoshi-like character inventing Bitcoin would first contact the original father of the project, rather than start devoting all of their resources to shipping what was largely somebody else’s pet idea.
The scenario in which Szabo goes to a technically minded computer scientist to get help turning bit gold into a reality is strongly backed by the fact that in April 2008, just a few months before the announcement of Bitcoin, Nick was actively looking for collaborators on the bit gold project. He asked on his blog:
“[bit gold] would greatly benefit from a demonstration, an experimental market (with e.g. a trusted third party substituted for the complex security that would be needed for a real system). Anybody want to help me code one up?”
So, after 10 years of thinking about bit gold, Nick becomes interested in producing a concrete implementation of his decentralized currency dream. What happens right after? The Bitcoin whitepaper and software.
Then again, keep in mind that these two scenarios are pure speculation on my part. The only thing I have serious evidence for is the authorship of the Bitcoin whitepaper.