TNPSC – STATISTICS
=> The word ‘Statistics’ has been derived from the Latin word ‘status’ (or) the Italian word ‘Statista’ (or) the German word ‘Statistik’ (or) the French word ‘Statistique’, each of which means a political state.
=> By the word statistics we mean “numerical statements as well as statistical methodology”.
=> Nowadays statistics is used for solving (or) analysing the problems of the state.
=> It supplies essential information for development activities in all departments. It is used in all disciplines.
=> Father of Statistics – Sir Ronald A. Fisher.
=> He applied statistics to several fields such as psychology, genetics and education.
=> Any collection of information in the form of numerical figures giving the required information is called data.
=> The marks obtained by the students of a class in a subject in an examination.
=> The heights of the students of a class.
=> Raw data is unprocessed and unclassified data.
=> The marks obtained in a mathematics test by the students of a class are a collection of observations gathered initially.
=> The information which is collected initially and presented randomly is called raw data.
=> The data which is arranged in groups (or) classes is called grouped data.
=> Sometimes the collected raw data may be huge in number and give us no information as such. Whenever the data is very large, we have to group it meaningfully and then analyse it.
Data are of two kinds:
Primary data: The data collected by the investigator himself is known as primary data.
Secondary data: Sometimes an investigator uses primary data that another investigator collected for a different purpose. Such data is known as secondary data.
=> Classification and Tabulation of Data
=> Grouping the collected data according to certain common properties is called classification of data.
=> After classification of data, the process of arranging the numerical figures in groups (or) classes is called tabulation.
=> The number of times a particular score repeats itself is called its frequency.
=> The tabular form of representing data showing each score and the corresponding frequency is called frequency distribution.
=> The lower value of the class interval is called the lower limit and the upper value of the class interval is called the upper limit. The difference between the upper limit and the lower limit of a class interval is called the size (or) width of the class interval.
|Class intervals||Tally marks||Frequency|
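The tabulation described above can be sketched in a few lines. This is only an illustration: the marks are made up, the function name `frequency_table` is hypothetical, and the sketch assumes that each class interval includes its lower limit and excludes its upper limit.

```python
from collections import Counter

def frequency_table(scores, lower, width, n_classes):
    """Count how many scores fall in each class interval [lo, lo + width)."""
    # Class index of a score = how many widths it lies above the first lower limit.
    counts = Counter((s - lower) // width for s in scores)
    table = []
    for k in range(n_classes):
        lo = lower + k * width
        table.append((lo, lo + width, counts.get(k, 0)))
    return table

# Hypothetical marks of 10 students (not taken from the notes).
marks = [12, 25, 37, 18, 29, 31, 22, 15, 38, 27]
table = frequency_table(marks, lower=10, width=10, n_classes=3)

# Print class interval, tally marks and frequency for each class.
for lo, hi, f in table:
    print(f"{lo}-{hi}", "|" * f, f)
```

Here the three classes 10-20, 20-30 and 30-40 get frequencies 3, 4 and 3, and the frequencies add up to the number of students.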
=> The histogram of a frequency distribution consists of a number of rectangles erected on the class intervals of the distribution.
=> The areas of the rectangles are proportional to the frequencies of the respective classes.
=> Class intervals are marked successively along the x-axis.
=> Frequencies are marked along the y-axis.
=> Example: To draw a histogram for a given frequency table.
|No of Students||4||6||8||10||7||5|
Frequency Polygon:
=> Plot the points on a graph paper, taking the mid values of the class intervals as x-coordinates and the corresponding frequencies as y-coordinates. Join the consecutive points by straight line segments.
Example: To construct a frequency polygon for a given frequency distribution
|No of Students||4||6||8||10||7||5|
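The points of the frequency polygon can be worked out as below. The frequencies are the "No of Students" row above. The class intervals are an assumption made only for illustration, since the table does not show them.

```python
# Frequencies from the "No of Students" row above.
frequencies = [4, 6, 8, 10, 7, 5]
# Hypothetical class intervals; the source table does not list them.
class_intervals = [(10, 20), (20, 30), (30, 40), (40, 50), (50, 60), (60, 70)]

def polygon_points(intervals, freqs):
    """Return (mid value, frequency) pairs, the vertices of the polygon."""
    return [((lo + hi) / 2, f) for (lo, hi), f in zip(intervals, freqs)]

points = polygon_points(class_intervals, frequencies)
for x, y in points:
    print(x, y)
```

Each point has the mid value of a class interval as its x-coordinate and the frequency of that class as its y-coordinate.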
=> The arithmetic mean (A.M) (or) simply the mean (or) average of n observations x1, x2, …, xn is defined to be the number x̅ such that the sum of the deviations of the observations from x̅ is 0.
x̅ = (x1 + x2 + … + xn) / n
x̅ = ∑ xi / n
∑ – sigma notation used to represent summation.
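A small sketch of the mean of raw data, using made-up observations. It also checks the defining property that the deviations from the mean add up to 0.

```python
def mean(observations):
    """Arithmetic mean: sum of the observations divided by their number."""
    return sum(observations) / len(observations)

# Hypothetical observations, chosen only for illustration.
observations = [2, 4, 6, 8, 10]
x_bar = mean(observations)

# Deviation of each observation from the mean.
deviations = [x - x_bar for x in observations]

print(x_bar)            # 6.0
print(sum(deviations))  # 0.0
```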
If the observations are represented in the form of a frequency table, the mean x̅ is given by
x̅ = (f1x1 + f2x2 + … + fnxn) / (f1 + f2 + … + fn), where x1, x2, …, xn are the individual values (or) mid values of class intervals whose frequencies are f1, f2, …, fn
x̅ = ∑ fixi / N, Where N = f1+f2+…fn
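The frequency-table formula can be checked with a short sketch. The values and frequencies are made up, and the function name `grouped_mean` is hypothetical.

```python
def grouped_mean(values, freqs):
    """Mean of a frequency table: sum(f_i * x_i) / N, with N = sum(f_i)."""
    total_frequency = sum(freqs)                               # N
    weighted_sum = sum(f * x for f, x in zip(freqs, values))   # sum of f_i * x_i
    return weighted_sum / total_frequency

# Hypothetical values (or mid values of class intervals) and their frequencies.
values = [10, 20, 30]
freqs = [2, 3, 5]
x_bar = grouped_mean(values, freqs)
print(x_bar)  # 23.0
```

Here ∑ fixi = 20 + 60 + 150 = 230 and N = 10, so the mean is 23.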
=> When the given raw data are arranged in ascending order (or) descending order, the central value (or) the middle most value is called the median of the data.
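A minimal sketch of finding the median of raw data. The notes do not say what to do when the number of observations is even. The sketch assumes the usual convention of taking the average of the two middle values.

```python
def median(observations):
    """Middle value of the data after arranging it in ascending order."""
    ordered = sorted(observations)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd number of observations: a single middle value exists.
        return ordered[mid]
    # Even number of observations (assumed convention): average the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median([7, 3, 9, 1, 5]))  # 5
print(median([4, 1, 3, 2]))     # 2.5
```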
Mode is also a measure of central tendency.
=> In a set of individual observations, Mode is defined as the value which occurs most often.
=> If the data are arranged in the form of a frequency table, the class corresponding to the maximum frequency is called the Modal Class.
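Both ideas can be sketched as follows. The observations and the class intervals are hypothetical, and the function names `modes` and `modal_class` are made up for this illustration. A list is returned for the mode because more than one value may occur most often.

```python
from collections import Counter

def modes(observations):
    """Value(s) that occur most often in a set of individual observations."""
    counts = Counter(observations)
    highest = max(counts.values())
    return [value for value, count in counts.items() if count == highest]

def modal_class(intervals, freqs):
    """Class interval with the maximum frequency in a frequency table."""
    best = max(range(len(freqs)), key=lambda i: freqs[i])
    return intervals[best]

print(modes([2, 3, 3, 5, 3, 2]))  # [3]

# Hypothetical class intervals paired with sample frequencies.
intervals = [(10, 20), (20, 30), (30, 40), (40, 50), (50, 60), (60, 70)]
print(modal_class(intervals, [4, 6, 8, 10, 7, 5]))  # (40, 50)
```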
Standard deviation is the positive square root of the mean of the squared deviations of the data from the mean.
The square of the standard deviation is called the variance and is denoted by σ².
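The definition can be followed step by step in a short sketch. The data is made up, and the sketch divides by the number of observations n, as the wording "mean of the squared deviations" suggests.

```python
import math

def variance(observations):
    """Mean of the squared deviations of the data from the mean."""
    n = len(observations)
    x_bar = sum(observations) / n
    return sum((x - x_bar) ** 2 for x in observations) / n

def standard_deviation(observations):
    """Positive square root of the variance."""
    return math.sqrt(variance(observations))

# Hypothetical data, chosen so that the numbers work out evenly.
data = [2, 4, 4, 4, 5, 5, 7, 9]
print(variance(data))            # 4.0
print(standard_deviation(data))  # 2.0
```

For this data the mean is 5, the squared deviations add up to 32, and 32 / 8 = 4, so the standard deviation is 2.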
Coefficient of variation:
σ = Standard deviation
x̅ = Arithmetic Mean
C.V = (σ / x̅)*100%
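A minimal sketch of the C.V formula, using made-up data and a hypothetical function name. It takes σ to be the standard deviation computed with divisor n.

```python
import math

def coefficient_of_variation(observations):
    """C.V = (standard deviation / arithmetic mean) * 100, as a percentage."""
    n = len(observations)
    x_bar = sum(observations) / n
    sigma = math.sqrt(sum((x - x_bar) ** 2 for x in observations) / n)
    return 100 * sigma / x_bar

# Hypothetical data: mean 5 and standard deviation 2.
data = [2, 4, 4, 4, 5, 5, 7, 9]
cv = coefficient_of_variation(data)
print(cv)  # 40.0
```

A smaller C.V means the data is more consistent relative to its mean.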