
HOW MANY "THE" IN A SENTENCE?

 

      We know that "the" is the most frequently occurring word in the English language. I want to find out how many times 'the' normally occurs in a sentence.  In a passage of 100 words, 'the' occurs about 7 times, so the probability that any given word is 'the' is 0.07.  It is also found that a typical English sentence contains about 19 words.

     The probability of 'the' occurring exactly two times in a sentence of 19 words is
= C(19,2) * 0.07^2 * 0.93^17 = 0.244 = 24.4%
Let us understand the formula step by step.
1. The probability that a given word is not 'the' is 1 - 0.07 = 0.93.  The probability that none of the other 17 words is 'the' is 0.93^17.
2. The chance of 'the' appearing in two given positions is 0.07^2.
3. C(19,2) is called the combination or binomial coefficient.  It gives the number of ways the two 'the' can be placed among the 19 word positions, with the remaining 17 words filling the rest.  See the footnote.  All three factors must be multiplied to get the correct probability: 171 * 0.0049 * 0.2912 = 0.244, or 24.4%.
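
The three factors can be combined into one small function. Here is a minimal Python sketch, assuming each word is 'the' independently with probability 0.07 (the binomial model used in this post). The function name prob_k_the is made up for illustration.

```python
from math import comb

def prob_k_the(k, n=19, p=0.07):
    """Probability of exactly k 'the' among n words (binomial model)."""
    ways = comb(n, k)               # number of ways to place k 'the' in n positions
    hits = p ** k                   # chance that those k positions are all 'the'
    misses = (1 - p) ** (n - k)     # chance that the other n - k words are not 'the'
    return ways * hits * misses

print(round(prob_k_the(2), 3))  # 0.244
```

Changing k, n or p lets you try other counts, sentence lengths or word frequencies.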

     Similarly, what is the chance of 'the' appearing exactly one time in a 19-word sentence?
= C(19,1) * 0.07^1 * 0.93^18 = 0.36 = 36%
Again, the chance of 'the' not being present even one time is
= C(19,0) * 0.07^0 * 0.93^19 = 0.252 = 25.2%

Add all three probabilities.
Zero times = 25.2%
One time   = 36.0%
Two times  = 24.4%
Total      = 85.6%

Hence the chance of 'the' appearing zero, one or two times in a sentence is more than 85%.  'The' occurring more than two times in a normal sentence is much less likely: about 14.4%.
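
The total can be checked in a few lines of Python. This is only a sketch under the same assumptions as the post (19 words per sentence, 0.07 chance per word, words independent of each other). The helper names are my own.

```python
from math import comb

N_WORDS = 19   # typical sentence length used in this post
P_THE = 0.07   # chance that any one word is 'the'

def prob_exactly(k, n=N_WORDS, p=P_THE):
    """Binomial probability of exactly k occurrences in n words."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def prob_at_most(k_max, n=N_WORDS, p=P_THE):
    """Probability of 0, 1, ..., k_max occurrences added together."""
    return sum(prob_exactly(k, n, p) for k in range(k_max + 1))

up_to_two = prob_at_most(2)
more_than_two = 1 - up_to_two

print(round(up_to_two, 3))      # 0.856
print(round(more_than_two, 3))  # 0.144
```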
     In a similar way, we can analyze English text.  One more result is that the average word length is about 5 letters.  You can do text analysis using the 'advanced text analysis' feature on the website English.com.
--------------------------------------------------------------------------------------------------
Footnote:
     In how many ways can you arrange one A and two B?
ABB, BAB, BBA - 3 ways
Combinations = factorial(3) / (factorial(1) * factorial(3-1)) = (3*2*1) / (1 * 2*1) = 3
    This is how we find the number of ways of arranging two 'the' and 17 other words: C(19,2) = 171.  Online calculators are available to find combinations.
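
If you prefer code to an online calculator, the footnote can be reproduced with Python's standard library. This is just an illustrative sketch: it lists the arrangements by brute force, then compares the count with the factorial formula and with math.comb.

```python
from itertools import permutations
from math import comb, factorial

# All distinct orderings of one A and two B, found by brute force
arrangements = sorted(set("".join(p) for p in permutations("ABB")))
print(arrangements)  # ['ABB', 'BAB', 'BBA']

# The same count from the factorial formula: 3! / (1! * (3-1)!)
by_formula = factorial(3) // (factorial(1) * factorial(3 - 1))
print(by_formula)  # 3

# Ways to place two 'the' among 19 word positions
ways_19_2 = comb(19, 2)
print(ways_19_2)  # 171
```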