Monday, June 28, 2010

Bar chart

A bar chart or bar graph is a chart with rectangular bars with lengths proportional to the values that they represent. The bars can also be plotted horizontally.
Bar charts are used for plotting discrete (or 'discontinuous') data, i.e. data which takes discrete values and is not continuous. Examples of discontinuous data include 'shoe size' and 'eye colour', for which you would use a bar chart. In contrast, examples of continuous data are 'height' and 'weight'. A bar chart can nevertheless be useful for recording information of either kind, whether the data is continuous or not.

Example

The following table lists the number of seats allocated to each party group in European elections in 1999 and 2004. The results of 1999 have been multiplied by 1.16933 to compensate for the change in the total number of seats between those years.
This bar chart shows both the results of 2004, and those of 1999:
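
The rescaling described above can be sketched in a few lines of Python. The factor 1.16933 matches the ratio 732/626, which assumes totals of 626 seats in 1999 and 732 seats in 2004 (these totals are not stated in the text). The group names and seat counts below are hypothetical placeholders, not the actual election results.

```python
# Assumed seat totals (not given in the text); their ratio reproduces 1.16933.
TOTAL_SEATS_1999 = 626
TOTAL_SEATS_2004 = 732


def scale_factor(old_total, new_total):
    """Factor that expresses old seat counts on the scale of the new total."""
    return new_total / old_total


def rescale(seats, factor):
    """Multiply every group's seat count by the given factor."""
    return {group: count * factor for group, count in seats.items()}


# Hypothetical 1999 seat counts, for illustration only.
seats_1999 = {"Group A": 100, "Group B": 50}

factor = scale_factor(TOTAL_SEATS_1999, TOTAL_SEATS_2004)
adjusted_1999 = rescale(seats_1999, factor)

print(round(factor, 5))
for group, count in adjusted_1999.items():
    print(group, round(count, 2))
```

After this adjustment the 1999 and 2004 bars can be drawn side by side on the same scale.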

Pie chart

A pie chart (or a circle graph) is a circular chart divided into sectors, illustrating proportion. In a pie chart, the arc length of each sector (and consequently its central angle and area), is proportional to the quantity it represents. When angles are measured with 1 turn as unit then a number of percent is identified with the same number of centiturns. Together, the sectors create a full disk. It is named for its resemblance to a pie which has been sliced. The earliest known pie chart is generally credited to William Playfair's Statistical Breviary of 1801.[1][2]
The pie chart is perhaps the most ubiquitous statistical chart in the business world and the mass media.[3] However, it has been criticized,[4] and some recommend avoiding it,[5][6][7] pointing out in particular that it is difficult to compare different sections of a given pie chart, or to compare data across different pie charts. Pie charts can be an effective way of displaying information in some cases, in particular if the intent is to compare the size of a slice with the whole pie, rather than comparing the slices with one another.[1] Pie charts work particularly well when the slices represent 25 to 50% of the data,[8] but in general, other plots such as the bar chart or the dot plot, or non-graphical methods such as tables, may be better suited to representing certain information.
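
The proportionality between a quantity and its sector's central angle can be written out as a small calculation. This is a minimal sketch; the function name and the sample values are made up for illustration.

```python
def sector_angles(values):
    """Return the central angle, in degrees, of each pie sector.

    Each angle is the value's share of the total times a full turn (360 degrees).
    """
    total = sum(values)
    if total <= 0:
        raise ValueError("values must sum to a positive number")
    return [360.0 * v / total for v in values]


# Hypothetical shares of 25%, 25% and 50%.
angles = sector_angles([25, 25, 50])
print(angles)
```

Whatever the input values, the angles add up to a full disk of 360 degrees.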

Use, effectiveness and visual perception

Three sets of data plotted using pie charts and bar charts.
Pie charts are common in business and journalism, perhaps because they are perceived as being less "geeky" than other types of graph. However, statisticians generally regard pie charts as a poor method of displaying information, and they are uncommon in scientific literature. One reason is that it is more difficult to compare the sizes of items in a chart when area is used instead of length and when different items are shown as different shapes. Stevens' power law states that visual area is perceived with a power of 0.7, compared to a power of 1.0 for length. This suggests that length is a better scale to use, since perceived differences would be linearly related to actual differences.
Further, in research performed at AT&T Bell Laboratories, it was shown that comparison by angle was less accurate than comparison by length. This can be illustrated with the accompanying figure, showing three pie charts and, below each of them, the corresponding bar chart representing the same data. Most subjects have difficulty ordering the slices in the pie chart by size; when the bar chart is used the comparison is much easier.[9] Similarly, comparisons between data sets are easier using the bar chart. However, if the goal is to compare a given category (a slice of the pie) with the total (the whole pie) in a single chart and the proportion is close to 25 or 50 percent, then a pie chart can often be more effective than a bar graph.
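
The effect of the Stevens' power law exponents quoted above can be seen with a short calculation. This is only a sketch of the power-law relation (perceived ratio = actual ratio raised to the exponent); the function name is an assumption made for illustration.

```python
AREA_EXPONENT = 0.7    # exponent quoted in the text for visual area
LENGTH_EXPONENT = 1.0  # exponent quoted in the text for length


def perceived_ratio(actual_ratio, exponent):
    """Perceived size ratio under a power law: actual_ratio ** exponent."""
    return actual_ratio ** exponent


# A shape with twice the area is perceived as noticeably less than twice as big,
# while a bar twice as long is perceived as twice as long.
print(round(perceived_ratio(2.0, AREA_EXPONENT), 2))
print(perceived_ratio(2.0, LENGTH_EXPONENT))
```

Under these exponents a doubling in area reads as roughly a 1.6-fold increase, which is why length-based charts preserve differences more faithfully.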

Variants and similar charts

"Diagram of the causes of mortality in the army in the East" by Florence Nightingale.

Polar area pie chart

Florence Nightingale is credited with developing a form of the pie chart now known as the polar area diagram, though there are earlier uses. André-Michel Guerry invented the "rose diagram" form, used in an 1829 paper showing frequency of events for cyclic phenomena. Léon Lalanne later used a polar diagram to show the frequency of wind directions around compass points in 1843. The wind rose is still used by meteorologists. The polar area diagram is similar to a usual pie chart, except that the sectors have equal angles and differ instead in how far each sector extends from the center of the circle, enabling multiple comparisons on one diagram.
Nightingale published her rose diagram in 1858. The name "coxcomb" is sometimes used erroneously for the diagrams; it was the name Nightingale used to refer to a book containing the diagrams rather than the diagrams themselves.[10] It has been suggested that most of Nightingale's early reputation was built on her ability to give clear and concise presentations of data.
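
How far each equal-angle sector should extend can be sketched as follows. The sketch assumes that the sector's area (not its radius) is what encodes the value, as the name "polar area diagram" suggests; since a sector of fixed angle has area proportional to the square of its radius, the radius then grows with the square root of the value. The function name and sample values are hypothetical.

```python
import math


def polar_area_radii(values, max_radius=1.0):
    """Radii for equal-angle sectors whose areas are proportional to the values.

    Sector area = angle * radius**2 / 2, so with a fixed angle the radius
    must scale with the square root of the value.
    """
    largest = max(values)
    if largest <= 0:
        raise ValueError("at least one value must be positive")
    return [max_radius * math.sqrt(v / largest) for v in values]


# Hypothetical counts for three sectors.
print(polar_area_radii([1, 4, 16]))
```

Scaling the radius directly with the value instead would exaggerate large values, because the visible area would grow with the square of the value.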

Multi-level pie chart

Ring chart of Linux file system
A multi-level pie chart, also known as a radial tree chart, is used to visualize hierarchical data, depicted by concentric circles.[11] The circle in the centre represents the root node, with the hierarchy moving outward from the centre. A segment of the inner circle bears a hierarchical relationship to those segments of the outer circle which lie within the angular sweep of the parent segment.[12]

Exploded pie chart

A chart with one or more sectors separated from the rest of the disk is known as an exploded pie chart. This effect is used either to highlight a sector or to draw attention to segments that represent small proportions.

3-D pie chart

A perspective (3D) pie chart is used to give the chart a 3D look. Often used for aesthetic reasons, the third dimension does not improve the reading of the data; on the contrary, these plots are difficult to interpret because of the distorting effect of perspective associated with the third dimension. The use of superfluous dimensions not used to display the data of interest is discouraged for charts in general, not only for pie charts.

Tuesday, June 15, 2010

Cross Sectional data


Cross-sectional data or cross section (of a study population) in statistics and econometrics is a type of one-dimensional data set. Cross-sectional data refers to data collected by observing many subjects (such as individuals, firms or countries/regions) at the same point of time, or without regard to differences in time. Analysis of cross-sectional data usually consists of comparing the differences among the subjects.
For example, suppose we want to measure current obesity levels in a population. We could draw a sample of 1,000 people randomly from that population (also known as a cross section of that population), measure their weight and height, and calculate what percentage of that sample is categorized as obese. Suppose 30% of our sample were categorized as obese. This cross-sectional sample provides us with a snapshot of that population at that one point in time. Note that we cannot tell from one cross-sectional sample whether obesity is increasing or decreasing; we can only describe the current proportion.
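
The calculation in the obesity example can be sketched in Python. The text does not say how "obese" is defined, so the sketch assumes a body mass index (weight in kg divided by height in metres squared) of 30 or more; the five measurements are made up and stand in for the 1,000-person sample.

```python
OBESE_BMI = 30.0  # assumed cut-off; the text does not define "obese"


def bmi(weight_kg, height_m):
    """Body mass index: weight divided by height squared."""
    return weight_kg / height_m ** 2


def obesity_share(sample):
    """Fraction of (weight_kg, height_m) pairs at or above the assumed cut-off."""
    if not sample:
        raise ValueError("sample must not be empty")
    obese = sum(1 for weight, height in sample if bmi(weight, height) >= OBESE_BMI)
    return obese / len(sample)


# Hypothetical cross-sectional sample: one (weight, height) pair per person.
sample = [(95.0, 1.70), (70.0, 1.75), (110.0, 1.80), (60.0, 1.65), (82.0, 1.90)]
print(obesity_share(sample))
```

The result describes only this one snapshot; a second sample taken at a later date would be needed to say anything about a trend.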
Cross-sectional data differs from time series data, also known as longitudinal data, which follows one subject's changes over the course of time. Another variant, panel data (or time-series cross-sectional (TSCS) data), combines both and looks at multiple subjects and how they change over the course of time. Panel analysis uses panel data to examine changes in variables over time and differences in variables between subjects.
In a rolling cross-section, both the presence of an individual in the sample and the time at which the individual is included in the sample are determined randomly. For example, a political polling organization may decide to interview 100,000 individuals. It first selects these individuals randomly from the entire population. It then assigns a random date to each individual. This is the date on which that individual will be interviewed, and thus included in the survey.
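
The two random steps of a rolling cross-section (who is sampled, and when) can be sketched as below. The population of numeric IDs, the start date, the 30-day window and the sample of 100 are all assumptions chosen to keep the example small.

```python
import random
from datetime import date, timedelta


def rolling_cross_section(population, sample_size, start, n_days, seed=None):
    """Pick individuals at random, then give each a random interview date."""
    rng = random.Random(seed)
    # Step 1: random selection of individuals, without replacement.
    chosen = rng.sample(population, sample_size)
    # Step 2: an independent random date within the survey window for each one.
    return {person: start + timedelta(days=rng.randrange(n_days)) for person in chosen}


# Hypothetical population of person IDs and a 30-day survey window.
population = list(range(10_000))
schedule = rolling_cross_section(population, 100, date(2010, 6, 1), 30, seed=0)

for person, interview_date in list(schedule.items())[:3]:
    print(person, interview_date)
```

Because the dates are assigned independently of who was chosen, the people interviewed on any given day form a small random sample of the population in their own right.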

Tuesday, June 8, 2010

Secondary data

Secondary data is data collected by someone other than the user. Common sources of secondary data for social science include censuses, surveys, organizational records and data collected through qualitative methodologies or qualitative research. Primary data, by contrast, are collected by the investigator conducting the research.


Secondary data analysis saves time that would otherwise be spent collecting data and, particularly in the case of quantitative data, provides larger and higher-quality databases that would be unfeasible for any individual researcher to collect on their own. In addition, analysts of social and economic change consider secondary data essential, since it is impossible to conduct a new survey that can adequately capture past change and developments.

Sources of secondary data


As is the case in primary research, secondary data can be obtained from two different research strands:

Quantitative: Census, housing, social security as well as electoral statistics and other related databases.

Qualitative: Semi-structured and structured interviews, focus group transcripts, field notes, observation records and other personal, research-related documents.

A clear benefit of using secondary data is that much of the background work needed has already been carried out: for example, literature reviews and case studies might have been carried out, published texts and statistics could have already been used elsewhere, and media promotion and personal contacts may also have been utilized.


This wealth of background work means that secondary data generally have a pre-established degree of validity and reliability which need not be re-examined by the researcher who is re-using such data.

Furthermore, secondary data can also be helpful in the research design of subsequent primary research and can provide a baseline with which the collected primary data results can be compared. Therefore, it is always wise to begin any research activity with a review of the secondary data.


Secondary analysis or re-use of qualitative data

Qualitative data re-use provides a unique opportunity to study the raw materials of the recent or more distant past to gain insights for both methodological and theoretical purposes.


In the secondary analysis of qualitative data, the importance of good documentation cannot be overstated, as it provides necessary background and much-needed context, both of which make re-use a more worthwhile and systematic endeavour. Indeed, one could go as far as to claim that qualitative secondary data analysis “can be understood, not so much as the analysis of pre-existing data; rather as involving a process of re-contextualising, and re-constructing, data”.



Overall challenges of secondary data analysis

There are several things to take into consideration when using pre-existing data. Secondary data does not permit the progression from formulating a research question to designing methods to answer that question. It is also not feasible for a secondary data analyst to engage in the habitual process of making observations and developing concepts. These limitations hinder the ability of the researcher to focus on the original research question.

Data quality is always a concern because its source may not be trusted. Even data from official records may be unreliable because the data is only as good as the records themselves, in terms of methodological validity and reliability.

Furthermore, in the case of qualitative material, primary researchers are often reluctant to share “their less-than-polished early and intermediary materials, not wanting to expose false starts, mistakes, etc.”

So overall, there are six questions that a secondary analyst should be able to answer about the data they wish to analyze.

Case 2:-

Secondary data are data that have already been collected by, and are readily available from, other sources. Such data are cheaper and more quickly obtainable than primary data, and may also be available when primary data cannot be obtained at all.


Advantages of Secondary data

It is economical. It saves effort and expense.

It is time saving.

It helps to make primary data collection more specific, since with the help of secondary data we are able to identify the gaps and deficiencies and what additional information needs to be collected.

It helps to improve the understanding of the problem.

It provides a basis for comparison for the data that is collected by the researcher.

Disadvantages of Secondary Data

Secondary data seldom fits neatly into the framework of the marketing research factors. The reasons it may not fit are:-

Unit of secondary data collection- Suppose you want information on disposable income, but the data is available on gross income. The information may not be the same as what you require.

Class boundaries may be different even when the units are the same.

Before 5 Years     After 5 Years
2500 - 5000        5000 - 6000
5001 - 7500        6001 - 7000
7500 - 10000       7001 - 10000

Thus the data collected earlier is of no use to you.
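
Why differing class boundaries make the earlier data unusable can be sketched as a simple check. The sketch assumes the left column holds the classes of the earlier data and the right column the classes now required, and it simplifies each class to its edges (treating 5000 and 5001 as the same boundary); the function name is hypothetical.

```python
def can_regroup(old_edges, new_edges):
    """True if every new class edge coincides with an old class edge.

    Only then can counts recorded in the old classes be added up
    to give exact counts for the new classes.
    """
    return set(new_edges) <= set(old_edges)


# Class edges taken from the table above (simplified to shared boundaries).
old_edges = [2500, 5000, 7500, 10000]
new_edges = [5000, 6000, 7000, 10000]

# 6000 and 7000 fall inside old classes, so the old counts cannot be split.
print(can_regroup(old_edges, new_edges))  # False
```

If the required classes were simply unions of the earlier ones, the check would pass and the earlier counts could be summed and re-used.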

Accuracy of secondary data is not known.

Data may be outdated.

Evaluation of Secondary Data

Because of the above-mentioned disadvantages, secondary data must be evaluated before it is used. Evaluation means that the following four requirements must be satisfied:-

Availability- It has to be seen whether the kind of data you want is available or not. If it is not available, then you have to go for primary data.

Relevance- The data should meet the requirements of the problem. For this we have two criteria:-

Units of measurement should be the same.

The concepts used must be the same, and the data should be current rather than outdated.

Accuracy- In order to find how accurate the data is, the following points must be considered: -

Specification and methodology used;

Margin of error should be examined;

The dependability of the source must be seen.

Sufficiency- Adequate data should be available.

Robert W. Joselyn has organized the above discussion into a detailed eight-step procedure for evaluating secondary data. The eight steps are grouped into three categories:

Applicability of research objective.

Cost of acquisition.

Accuracy of data.

Saturday, June 5, 2010

Primary data and Methods of collecting primary data

Data means a collection of facts. Statistical data are numerical statements of aggregates. They are obtained through properly organised statistical enquiries conducted by investigators. Data can be collected either from primary sources or from secondary sources; accordingly, they are called primary and secondary data.

Primary data:- Data that are collected originally, at first hand and for the first time, by an investigator or agency for a statistical investigation.

Prof. Werral defines primary data as follows: "Data originally collected in the process of investigation are known as primary data."

Merits:-
* The degree of accuracy is high; the data are reliable and are depicted in great detail.
* Additional information can also be collected if necessary.
* Primary data collection also includes definitions of the terms used, which makes the data more accurate.

Demerits:-
* It requires a lot of time.
* It is expensive.
* It requires a lot of human labour and demands a lot of skill and intelligence on the part of the investigator.

In primary data collection, you collect the data yourself using methods such as interviews and questionnaires. The key point here is that the data you collect is unique to you and your research and, until you publish, no one else has access to it.
Methods of collecting Primary data:-

* Direct investigation/Personal investigation method
* Indirect personal investigation
* Information through correspondents
* Mailed questionnaire
* Telephone, TV, Internet, teleconference, etc.
