
HOW MANY ENGLISH WORDS DO YOU USE IN YOUR WRITING?

   

     In a normal dictionary, the words are arranged in alphabetical order. But we can arrange the words by frequency of occurrence. That is, the most used words appear first and less used words follow. Each word has a rank and a frequency. If we multiply rank by frequency, we get a number. That number is roughly the same for all the words; it is a constant. That is Zipf's law.
[In the above paragraph, 'the' appears five times in 70 words. Hence its frequency percentage is 5*100/70 = 7.1%.]
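The counting in the bracketed note can be automated. The following is a minimal Python sketch, not part of the original method: the sample sentence and the function name rank_words are made up for illustration. It counts every word, sorts the words by count, and prints rank, word, count and frequency percentage.

```python
import re
from collections import Counter

def rank_words(text):
    """Return (rank, word, count, percent) tuples, most frequent word first."""
    # Lower-case the text so that 'The' and 'the' are counted together.
    words = re.findall(r"[a-z']+", text.lower())
    total = len(words)
    counts = Counter(words)
    ranked = []
    for rank, (word, count) in enumerate(counts.most_common(), start=1):
        ranked.append((rank, word, count, 100.0 * count / total))
    return ranked

# Made-up sample text, only for illustration.
SAMPLE = "the cat sat on the mat and the dog sat on the rug"

for rank, word, count, percent in rank_words(SAMPLE):
    print(f"{rank:>2}  {word:<5} {count}  {percent:.1f}%")
```

A tiny sample like this will not follow Zipf's law. A long text is needed before the product of rank and frequency settles down.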
     Frequency percentage * rank = constant.
     The most used word in English is 'the'. It occurs about 7 times in every 100 words, so its frequency percentage is about 7%. A list is given below.

Rank R    Word    Frequency percentage F (%)    Constant F * R = C

   1      the               6.8                  1 * 6.8 = 6.8
   2      of                3.1                  2 * 3.1 = 6.2
   3      to                2.7                  3 * 2.7 = 8.1
   4      and               2.6                  4 * 2.6 = 10.4
   5      in                1.8                  5 * 1.8 = 9.0
   6      is                1.2                  6 * 1.2 = 7.2
   7      for               1.0                  7 * 1.0 = 7.0
   8      that              0.8                  8 * 0.8 = 6.4

     The constant of proportionality for the English language is about 7.5%, or 0.075.
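
The last column of the table can be recomputed with a few lines of Python. This is only a sketch: the numbers are copied from the table above, and the names TABLE and zipf_products are chosen for illustration. The average of the products comes out a little above 7.5, which is in line with the rounded constant quoted above.

```python
# (word, frequency percentage) in rank order, copied from the table in the post.
TABLE = [("the", 6.8), ("of", 3.1), ("to", 2.7), ("and", 2.6),
         ("in", 1.8), ("is", 1.2), ("for", 1.0), ("that", 0.8)]

def zipf_products(rows):
    """Multiply each frequency percentage by its rank (1, 2, 3, ...)."""
    return [rank * freq for rank, (_, freq) in enumerate(rows, start=1)]

products = zipf_products(TABLE)
average = sum(products) / len(products)

for (word, freq), c in zip(TABLE, products):
    print(f"{word:<5} F = {freq:.1f}  C = {c:.1f}")
print(f"average C = {average:.2f}")
```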

     In a nutshell: we can have an English frequency dictionary. The frequency of a word is inversely proportional to its rank. The constant of proportionality is about 0.075.

     Coming to our title question. Let us find out how much of a typical English text is made up of the top 1000 words, that is, the 1000 most commonly used words.
     We have to note down the frequency percentage of each word (ranked 1 to 1000) from the frequency dictionary and add up all of the percentages. That gives the total percentage of English text that is written using the top 1000 words.
     There is a mathematical short-cut. If you are allergic to math, you can skip this portion.

     Frequency * rank = constant
     Frequency = constant / rank
     So the total frequency of the first 1000 words
           = 0.075/1 + 0.075/2 + ..... + 0.075/1000
or         = 0.075 (1/1 + 1/2 + 1/3 + .... + 1/1000)
           = 0.075 (ln(1000) + 0.58)     math formula: 1/1 + 1/2 + ... + 1/n is approximately ln(n) + 0.58, where ln is the natural logarithm
           = 0.075 * 7.49
           = 0.56
           = 56%
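
The short-cut can be checked by adding up the 1000 terms directly. The Python sketch below is an illustration under the post's own assumption that frequency = 0.075 / rank; the function names exact_share and approx_share are hypothetical. It compares the direct sum with the logarithm formula, using the more precise value 0.5772 (the Euler-Mascheroni constant) where the text rounds to 0.58.

```python
import math

ZIPF_CONSTANT = 0.075        # constant of proportionality used in the post
EULER_GAMMA = 0.5772156649   # the text rounds this to 0.58

def exact_share(top_n, c=ZIPF_CONSTANT):
    """Add c/1 + c/2 + ... + c/top_n term by term."""
    return c * sum(1.0 / rank for rank in range(1, top_n + 1))

def approx_share(top_n, c=ZIPF_CONSTANT):
    """Short-cut: c * (ln(top_n) + gamma), with the natural logarithm."""
    return c * (math.log(top_n) + EULER_GAMMA)

for n in (100, 1000, 3000):
    print(f"top {n:>4} words: exact {exact_share(n):.3f}, "
          f"short-cut {approx_share(n):.3f}")
```

Both methods give about 0.56 for the top 1000 words, so the short-cut is safe to use here.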

     That is, more than half of the English text that we write or read is made up of only 1000 words. A layman or a learned man may mostly use about 3000 words. Even Shakespeare is said to have known only around 30 thousand words. But the mighty English language has about 300 thousand words. How little we know!

     Zipf's law holds good, at least approximately, when ordering:
1. Companies by staff.
2. Universities by number of students.
3. Languages by number of speakers.
4. Websites by hits.
5. Cities by population.
6. Countries by area.
 and so on.

FOOT NOTE:
     Amazon e-books can be ranked according to daily sales, and Zipf's law can be applied. Even though there are millions of e-books, only the top few hundred books make up more than 50% of the daily sales.
