
HOW MANY "THE" IN A SENTENCE?

 

      We know "the" is the most frequently occurring word in the English language. I want to find out how many times 'the' normally occurs in a sentence.  In a passage of 100 words, 'the' occurs about 7 times, so the probability that any given word is 'the' is 0.07.  Also, a typical English sentence is found to contain about 19 words.

     If we treat each word as an independent trial, the probability of exactly two 'the' occurring in a sentence of 19 words is
= C(19,2) * 0.07^2 * 0.93^17 = 0.244 = 24.4%
Let us understand the formula step by step.
1. The probability of a word not being 'the' is 1 - 0.07 = 0.93.  The probability that none of the other 17 words is 'the' is 0.93^17.
2. The chance of 'the' appearing in two given positions is 0.07^2.
3. C(19,2) is called a combination or binomial coefficient.  It gives the number of ways two 'the' and the remaining 17 words can be arranged, which is 171.  Refer to the footnote.  All three factors must be multiplied to get the correct probability.  The result is 0.244 or 24.4%.
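
The three factors above can be put together in a few lines of Python. This is only a minimal sketch: the function name binomial_pmf is my own choice for illustration, and it assumes each word is 'the' independently with the same probability.

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    ways = comb(n, k)              # step 3: number of arrangements, C(n, k)
    hits = p ** k                  # step 2: 'the' appears k times
    misses = (1 - p) ** (n - k)    # step 1: the other n - k words are not 'the'
    return ways * hits * misses

# Figures from the post: 19 words per sentence, 0.07 chance that a word is 'the'.
prob_two = binomial_pmf(2, 19, 0.07)
print(f"{prob_two:.3f}")  # 0.244
```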

     Similarly, what is the chance of 'the' appearing exactly one time in a 19-word sentence?
= C(19,1) * 0.07^1 * 0.93^18 = 0.36 = 36%
Again, the chance of 'the' not being present even one time is
= C(19,0) * 0.07^0 * 0.93^19 = 0.252 = 25.2%

Add all three probabilities.
Zero times = 25.2%
One time   = 36.0%
Two times  = 24.4%
Total      = 85.6%

Hence the chance of 'the' appearing zero, one or two times in a sentence is more than 85%.  The chance of 'the' occurring more than two times in a normal sentence is the remainder, about 14.4%, so it is comparatively unlikely.
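
The addition can be checked with a short Python sketch. The names N_WORDS and P_THE are mine, chosen for illustration; the figures 19 and 0.07 are the ones used in this post, and the words are again assumed to be independent.

```python
from math import comb

N_WORDS = 19   # typical sentence length used in this post
P_THE = 0.07   # chance that any given word is 'the'

# Probability of exactly 0, 1 and 2 occurrences of 'the'.
probs = [
    comb(N_WORDS, k) * P_THE**k * (1 - P_THE) ** (N_WORDS - k)
    for k in range(3)
]

at_most_two = sum(probs)
more_than_two = 1 - at_most_two

for k, p in enumerate(probs):
    print(f"{k} times: {p:.1%}")
print(f"two or fewer: {at_most_two:.1%}")
print(f"more than two: {more_than_two:.1%}")
```

Working from the complement (1 minus the sum) avoids adding up the 17 remaining cases one by one.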
     In a similar way, we can analyze English text.  One more result is that the average word length is about 5 letters.  You can do text analysis using the 'advanced text analysis' feature on the website English.com.
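
If you prefer to measure these numbers yourself, here is a rough Python sketch. The function text_stats and the sample sentence are made up for illustration, and the simple splitting rules (sentences end at '.', '!' or '?'; words are runs of letters and apostrophes) are an assumption that will not suit every text.

```python
import re

def text_stats(text: str) -> dict:
    """Share of 'the', words per sentence and letters per word for a text."""
    # Assumption: a sentence ends at one or more of . ! ?
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Assumption: a word is a run of letters (apostrophes are kept)
    words = re.findall(r"[A-Za-z']+", text)
    the_count = sum(1 for w in words if w.lower() == "the")
    return {
        "the_share": the_count / len(words),
        "words_per_sentence": len(words) / len(sentences),
        "letters_per_word": sum(len(w) for w in words) / len(words),
    }

# A tiny made-up sample, far too short to give realistic figures.
sample = "The cat sat on the mat. The dog barked."
stats = text_stats(sample)
print(stats)
```

A long passage is needed before the share of 'the' settles near the 0.07 used above.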
--------------------------------------------------------------------------------------------------
Footnote:
     In how many ways can you arrange one A and two B's?
ABB, BAB, BBA - 3 ways
Combination = factorial(3) / (factorial(1) * factorial(3-1)) = (3*2*1) / (1 * 2*1) = 3
     This is how we find the number of ways of arranging two 'the' and 17 other words: C(19,2) = factorial(19) / (factorial(2) * factorial(17)) = (19*18) / 2 = 171.  Online calculators are available to find combinations.
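
The footnote can also be checked in Python instead of an online calculator. This is a small sketch of my own: it lists the arrangements of "ABB" by brute force and compares the count with the factorial formula.

```python
from itertools import permutations
from math import comb, factorial

# List every distinct ordering of one A and two B's.
arrangements = sorted(set("".join(p) for p in permutations("ABB")))
print(arrangements)  # ['ABB', 'BAB', 'BBA']

# The same count from the factorial formula in the footnote.
by_formula = factorial(3) // (factorial(1) * factorial(3 - 1))
print(by_formula)  # 3

# math.comb gives the same thing directly, including the sentence case.
print(comb(3, 1))   # 3
print(comb(19, 2))  # 171
```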
