The most common letter to start a sentence with in English

It’s not hard to find out which letters commonly start a word. You can look up a letter frequency table on Wikipedia, or you can analyze the Brown Corpus. The Brown Corpus is a compilation of roughly one million words of American English text that (as I understand it) is used to examine the statistical properties of the English language, among other things. The order of word-initial letters, from most to least common in the Brown Corpus, is:

T A O S I W H C B F P M D R E L N G U Y V J K Q Z X

But here’s a question: what is the most frequent letter to start a sentence? That’s a bit trickier. For the Brown Corpus, you get the following letter order:

T I A H S W B M O F N P C D E Y L R G J U V K Q Z X

These sentence-initial letters correspond, in the same order, to the following counts:

11928 7006 4830 4653 3225 3100 2412 1836 1735 1462 1314 1034 981 941 868 859 713 666 517 354 299 143 143 41 10 2
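Here is a minimal sketch of how counts like these could be produced. It is not necessarily how the original analysis was done. The function names and the toy sentences are my own assumptions, and the choice to skip leading punctuation (such as an opening quotation mark) when deciding what "starts" a sentence is just one reasonable convention. If NLTK and its copy of the Brown Corpus are installed, `nltk.corpus.brown.sents()` should be usable as the input in place of the toy data, though the exact numbers may differ from the ones above depending on tokenization.

```python
from collections import Counter


def initial_letter_counts(sentences):
    """Count first letters of words and of sentences.

    `sentences` is an iterable of token lists, e.g. [["The", "cat"], ...].
    Returns (word_initial_counts, sentence_initial_counts).
    """
    word_initial = Counter()
    sentence_initial = Counter()
    for tokens in sentences:
        # Keep only tokens that begin with an ASCII letter; this drops
        # punctuation tokens and numbers (an assumption, not a standard).
        words = [t for t in tokens if t[:1].isascii() and t[:1].isalpha()]
        if not words:
            continue
        # The first remaining word decides the sentence-initial letter.
        sentence_initial[words[0][0].upper()] += 1
        for w in words:
            word_initial[w[0].upper()] += 1
    return word_initial, sentence_initial


def ranking(counts):
    """Letters from most to least common; ties broken alphabetically."""
    return [letter for letter, _ in
            sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))]


# Toy input, made up for illustration (not Brown Corpus data).
sample = [
    ["The", "cat", "sat", "on", "the", "mat", "."],
    ["It", "was", "a", "tiny", "cat", "."],
    ["``", "The", "end", "''", "."],
]
word_counts, sentence_counts = initial_letter_counts(sample)
print(ranking(word_counts))
print(ranking(sentence_counts))
```

Letters are folded to upper case before counting, so "the" and "The" both count toward T in the word-initial tally.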

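Putting the two lists on top of each other (word-initial first, sentence-initial second), it’s easy to see that there are some systematic differences. For example, I moves up from fifth place to second and H from seventh to fourth, while C drops from eighth to thirteenth: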

T A O S I W H C B F P M D R E L N G U Y V J K Q Z X

T I A H S W B M O F N P C D E Y L R G J U V K Q Z X
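To make the comparison less of an eyeball exercise, here is a small sketch that computes how many places each letter moves between the two orderings. The two lists are copied from this post; the function name `rank_shifts` and the sign convention (positive means the letter ranks higher as a sentence starter) are my own choices for illustration.

```python
# Orderings copied from the post: most common first.
word_initial = "T A O S I W H C B F P M D R E L N G U Y V J K Q Z X".split()
sentence_initial = "T I A H S W B M O F N P C D E Y L R G J U V K Q Z X".split()


def rank_shifts(before, after):
    """Return {letter: places gained}; positive = ranks higher in `after`."""
    pos_before = {letter: i for i, letter in enumerate(before, start=1)}
    pos_after = {letter: i for i, letter in enumerate(after, start=1)}
    return {letter: pos_before[letter] - pos_after[letter] for letter in before}


shifts = rank_shifts(word_initial, sentence_initial)

# Print letters from biggest climb to biggest drop, skipping unchanged ones.
for letter, shift in sorted(shifts.items(), key=lambda kv: -kv[1]):
    if shift != 0:
        print(f"{letter}: {shift:+d}")
```

On these two orderings, N climbs the most (six places) and O falls the most (six places), while T, W, F, E, K, Q, Z, and X stay where they are. Rank shifts only compare positions, so they say nothing about how large the underlying count differences are.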

Pretty neat!

Thanks to Gary Shannon for submitting this analysis on the Conlang listserv, and for letting me repost it on my blog.

About glossarch

The word "glossarch" doesn't exist. At least, not yet. But let's pretend it does for a second. The first part is "gloss," a word that comes to us from Ancient Greek via Latin and English. It means "language." The second part also comes from Ancient Greek and can mean "having power over." So "glossarch" means simply "language controller." So what am I doing making up words? Well, I made up an entire language once. It's called Angosey. So I'm the Glossarch of Angosey. I'm currently a doctoral student in volcano seismology (a branch of geophysics). I enjoy writing fiction and poetry, launching balloons, programming, and hanging out with my lovely wife! Follow me on Twitter! Writing and language creation: @glossarch. Balloons and science: @bovineaerospace.
This entry was posted in Natural Languages.
