**You are designing a direct marketing campaign for an online clothing retailer. As part of your design, you quantify the expected response rates by ethnic group. Your definition of the term "ethnicity" follows that of the U.S. Census Bureau (e.g., Hispanic, Asian, African American, etc.). You want to test your campaign using 1,000 randomly selected households, but you want your sample to mimic the US population in terms of the proportion of different ethnicities (e.g., if Hispanics constitute about 12 percent of the US population, 12 percent of your sample should be Hispanic).**

**Assess the appropriateness of using a simple random sample as the test's sampling plan.** **Evaluate other potential sampling plans and describe why such other sampling plans might be more or less appropriate than using a simple random sample. Support your discussion with relevant examples, research, and rationale.**

**The final paragraph (three or four sentences) of your initial post should summarize the one or two key points that you are making in your initial response.**

**Submission Details**

**Your posting should be the equivalent of 1 to 2 single-spaced pages (500–1000 words) in length.** **Respond to two students' responses.**

## Sampling.html

### Sampling

The majority of business analyses are conducted using sample data. There are two reasons for that. First, it may be impractical or too expensive to measure an entire population. (For example, measuring the tensile strength of an object may destroy the item being tested.) Second, data impurities, such as outlying or miscoded values, can more easily be addressed in a sample than in the population.

Samples can be obtained using a number of different schemes, known as sampling plans. The most common are simple random sampling, stratified random sampling, and cluster sampling plans.

Simple random sampling entails selecting a sample so that every possible sample with the same number of observations has an equal chance of being chosen. This type of sampling plan is the least subject to biases and distortions, and it can be executed at a low cost. However, it is not always feasible or desirable to implement it.

Stratified random samples are obtained by first separating, or stratifying, the population into mutually exclusive sets and then drawing random samples from each set, or stratum. This type of sampling plan can yield additional insights into the population as the strata can be created to yield particular types of information.

Cluster sampling is a simple random sample of groups, or clusters, of elements. Clusters within the population are sampled, rather than individual elements being drawn randomly from the whole population. This is done to reduce costs or to obtain more complete data when the members of the population are widely dispersed.
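To make the difference between simple random and stratified random sampling concrete, the following is a minimal sketch rather than a definitive implementation. It assumes a toy population of 100,000 households split into four groups with made-up shares (the group labels `A` to `D` and their shares are hypothetical, not Census figures) and draws 1,000 households under each plan.

```python
import random
from collections import Counter

# Hypothetical population shares, for illustration only (not Census figures).
SHARES = {"A": 0.60, "B": 0.18, "C": 0.12, "D": 0.10}

def make_population(size, shares, rng):
    """Build a toy population as a list of (household_id, group) tuples."""
    groups = list(shares)
    weights = [shares[g] for g in groups]
    labels = rng.choices(groups, weights=weights, k=size)
    return list(enumerate(labels))

def simple_random_sample(population, n, rng):
    """Every household has the same chance of selection."""
    return rng.sample(population, n)

def proportional_stratified_sample(population, n, shares, rng):
    """Draw round(n * share) households at random from each stratum."""
    sample = []
    for group, share in shares.items():
        stratum = [h for h in population if h[1] == group]
        sample.extend(rng.sample(stratum, round(n * share)))
    return sample

rng = random.Random(42)
population = make_population(100_000, SHARES, rng)

srs = simple_random_sample(population, 1000, rng)
strat = proportional_stratified_sample(population, 1000, SHARES, rng)

srs_counts = Counter(group for _, group in srs)
strat_counts = Counter(group for _, group in strat)
print("Simple random:", dict(srs_counts))
print("Stratified:   ", dict(strat_counts))
```

Under these assumptions the stratified sample reproduces the target shares exactly (600, 180, 120, and 100 households), while the simple random sample only approximates them, with group counts that vary from draw to draw.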

The use of a sample from available data, rather than use of all of the data, should not be viewed as a hindrance. In fact, quite frequently, it can improve the quality of analytic results. This idea will be explored in detail in later sections.


## Sampling vs. Nonsampling Error.html

### Sampling vs. Nonsampling Error

Previously, you learned about types of sampling and focused on the generalizability of conclusions drawn from samples. It might be helpful to think of those considerations in terms of pre-analysis setup. In this section, the focus will be on sampling-related considerations associated with the analysis of sample-based data.

One of the inescapable consequences of sampling is that there will be some degree of difference between the sample and the population. This sample-to-population difference is called sampling error. In practice, however, we rarely know the extent of that deviation, because we rarely know the parameters of the entire population. In business, companies have a lot of data regarding their own operations, practices, and results, but they rarely have the same level of detail for their competitors. Most companies are in a position to estimate the parameters of their own sample (for example, their customers) but not the parameters of the entire market or population. Does this mean that a marketing manager is essentially helpless in an effort to estimate the degree of possible sampling error? No. Instead of trying to measure the sample-to-population differences directly, a marketing manager will likely employ sampling-error mitigation techniques, such as resampling and bootstrapping, to gauge the likely magnitude of the error and limit its effect on conclusions.
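As an illustration of the bootstrapping idea mentioned above, here is a minimal sketch. It assumes a hypothetical test mailing to 1,000 households of which 25 responded; the function name and all figures are illustrative assumptions. The sample is resampled with replacement many times, and the spread of the resampled response rates serves as an estimate of the sampling error.

```python
import random
import statistics

def bootstrap_standard_error(responses, n_resamples, rng):
    """Estimate the standard error of a response rate by resampling
    the observed sample with replacement."""
    n = len(responses)
    rates = []
    for _ in range(n_resamples):
        resample = rng.choices(responses, k=n)  # draw n values with replacement
        rates.append(sum(resample) / n)
    return statistics.stdev(rates)

rng = random.Random(0)

# Hypothetical sample: 1,000 households, 25 responders (1) and 975 non-responders (0).
responses = [1] * 25 + [0] * 975
observed_rate = sum(responses) / len(responses)
se = bootstrap_standard_error(responses, 2000, rng)

print(f"observed rate = {observed_rate:.3f}, bootstrap SE = {se:.4f}")
```

The bootstrap does not remove sampling error. It gives the analyst a data-driven sense of how far the observed 2.5 percent rate could plausibly sit from the population rate.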

#### Nonsampling Error

What about the nonsampling error? The marketing manager would be most concerned with one particular manifestation of that deviation, something known as the self-selection bias. Self-selection bias is yet another reality of applied business analytics, as it is a reflection of a greater inclination on the part of less desirable respondents to respond to a particular promotion.

Consider the coupon redemption example discussed earlier. In general, brands hope to attract (with the help of coupons and other promotions) high-value, profitable customers; that is, those willing to repurchase the brand without incentives. Yet the consumers most likely to respond are the so-called switchers, who generally will not repurchase without incentives. What happens when the lion's share of coupon responders are the undesirable switchers? It creates the possibility of nonsampling bias, which in turn might lead to attributing a switcher-dominated response rate to all consumers, in effect overstating it.
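The arithmetic behind this self-selection effect can be sketched in a few lines. Every number below is a hypothetical assumption chosen for illustration: switchers are taken to be 20 percent of the market but ten times as likely to redeem a coupon as loyal customers.

```python
# All figures are hypothetical assumptions for illustration.
switcher_share = 0.20      # share of switchers in the market
loyal_share = 0.80         # share of loyal, high-value customers
switcher_response = 0.10   # coupon response rate among switchers
loyal_response = 0.01      # coupon response rate among loyal customers

# Overall response rate is the share-weighted average of the two groups.
overall_response = (switcher_share * switcher_response
                    + loyal_share * loyal_response)

# Fraction of all responders who are switchers.
switchers_among_responders = switcher_share * switcher_response / overall_response

print(f"overall response rate: {overall_response:.3f}")
print(f"switchers among responders: {switchers_among_responders:.1%}")
```

Under these assumptions switchers make up about 71 percent of responders while being only 20 percent of the market, so the overall response rate says little about the high-value customers the brand actually wants.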


## Hypothesis Testing.html

### Hypothesis Testing

Progress in science is achieved by providing a tentative explanation—a hypothesis—for a phenomenon and then investigating and testing objective facts to determine if there is support for the proposed hypothesis. In statistics, hypothesis testing evaluates claims made about the value of a population parameter using sample statistics.

To test a hypothesis, it must be set up as a statement that can be determined to be true or not true. Some ideas are simply too vague to be tested and need to be restated in a way that enables them to be empirically validated. The difference between a general idea and a testable hypothesis is that the latter is stated in terms of a specific prediction which can be tested as being supported by data or not. For example, a general idea that promotional activities contribute to product sales can be restated as the following hypothesis:

Hypothesis 1: Purchase rate of Product A will be higher with the use of coupon incentives than without them.

The claim is stated in the form of two statements that are mutually exclusive: the null hypothesis and the alternative hypothesis.

Outcome 1 (null hypothesis): There are no differences in purchase rates with and without the use of coupon incentives.

Outcome 2 (alternative hypothesis): Purchase rates are higher when coupon promotions are used than when they are not used.

Hypothesis testing is one of the most complex statistical methods discussed in this course. The ability to draw statistically valid conclusions is a key goal of this course. There are many forms of hypothesis testing, but all of them share one thing in common. The procedure involves calculating a test statistic, and comparing it to a critical value, which is usually found in the back of the textbook, in order to draw statistically valid conclusions.

There are many other observations that need to be made, including discussion of Type I and Type II errors and the t-test. See the Supplemental Media entitled “More on Hypothesis Testing” to see these topics discussed in greater detail.

#### Additional Materials

## media/transcripts/SUO_MBA5008 W4 L1 More on Hypothesis Testing.pdf

More on Hypothesis Testing Errors and Tests

Consider the following example. A brand manager ran two different marketing campaigns, one of which generated 2.1 percent incremental sales, while the other generated 2.5 percent incremental sales. The manager believes that the 0.4 percentage point differential is large enough to conclude that the second campaign significantly outperformed the first one. However, his boss suspects that the difference is due to chance rather than to a real effect of the independent variable (the campaign) on the dependent variable (incremental sales). To decide which of the two beliefs is correct, an analyst sets up a test.

• Null hypothesis: The results of the two tests are the same (statistically).

H0: Mean1 – Mean2 = 0

• Alternative hypothesis: The results of the second test are better than the results of the first test.

H1: Mean2 – Mean1 > 0

The null hypothesis is always expressed as an equality, and it is always paired with another statement, the alternative hypothesis. It is an accepted convention in science that you begin with the premise that the proposed effect does not exist, that is, that the null hypothesis holds. A test or tests can then be carried out to see if there is convincing evidence to reject the null hypothesis. However, a null hypothesis is not deemed true merely because the evidence fails to compel you to reject it. Failure to reject a null hypothesis means only that you have failed to find support for the alternative hypothesis.

Note that the alternative hypothesis is always stated as an inequality, that is, as a "greater than," "less than," or "not equal to" statement. The structure of the statistical test itself is used to measure the direction of the postulated relationship. For example, a "greater than" relationship is tested by determining whether an observed difference is sufficiently large to warrant that conclusion.

Given the nature of hypothesis testing, there is always a chance that the conclusion arrived at is incorrect. That is, there is a probability that a null hypothesis will be rejected when it should not be rejected (known as a Type I error), or that the null hypothesis will not be rejected when in fact it should be rejected (known as a Type II error).

The probabilities of committing a Type I or Type II error are interdependent. For a fixed sample size, lowering the chance of incorrectly rejecting the null hypothesis (Type I) raises the chance of incorrectly failing to reject it (Type II), and vice versa. In applied business analysis, the focus is primarily on Type I error because these errors tend to have more severe implications than Type II errors.
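The interdependence of the two error probabilities can be sketched numerically. The example below is an illustration under stated assumptions, not a general recipe: it assumes a one-sided test whose test statistic is approximately normal, a true difference of 0.4, and a standard error of 0.2 (both figures hypothetical). The function name `type_ii_error` is also an assumption made for this sketch.

```python
from statistics import NormalDist

def type_ii_error(alpha, effect, standard_error):
    """Probability of failing to reject H0 (beta) in a one-sided z-test
    of H0: difference = 0 against H1: difference > 0, when the true
    difference equals `effect`."""
    z = NormalDist()
    critical = z.inv_cdf(1 - alpha)          # rejection threshold in z units
    return z.cdf(critical - effect / standard_error)

# Hypothetical true difference and standard error.
alphas = (0.10, 0.05, 0.01)
betas = [type_ii_error(a, effect=0.4, standard_error=0.2) for a in alphas]

for a, b in zip(alphas, betas):
    print(f"Type I risk (alpha) = {a:.2f}  ->  Type II risk (beta) = {b:.3f}")
```

With the sample size (and hence the standard error) held fixed, each tightening of the Type I risk produces a larger Type II risk.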

Once again, let's consider the campaign example, in which the first campaign yielded a response rate of 2.1 percent, while the second campaign yielded a response rate of 2.5 percent. Here, Type I error would refer to the possibility that statistically significant differences were detected between the two means (the null hypothesis was rejected), when in fact the differences noted were due purely to random factors. Resources might be redirected from the first campaign to the second campaign, which is erroneously presumed more effective. This may have significant business consequences, which is why Type I error is the focal point in business analytics.

t-Test: Example

Last week, you were introduced to the t-test based on Student's t-distribution. This week, we look at how this test is applied in the context of hypothesis testing.

You are a marketing manager for a brand of ready-to-eat cereal. Your product team just developed a low-sugar version of your product. You need to create a direct marketing campaign to support the sales of the new brand extension. You decide to pretest two different versions of the purchase offer: one featuring a $0.45 discount on the purchase of a single box and the second featuring a $1 discount on the purchase of two boxes. Your pretests suggest a purchase rate of 3.2 percent for the first offer and 3.6 percent for the second. You are inclined to choose the higher-yielding $1 discount offer for the wide-scale national rollout campaign, but you want to ensure that the 0.4 percentage point differential (3.6 percent – 3.2 percent) really does indicate a higher appeal of the $1 discount offer rather than random fluctuations in the sample data.

The null hypothesis is that the means do not differ from each other; the difference between the means is equal to zero. The alternative hypothesis is that the mean of the second offer is greater than the mean of the first offer. That is because the purpose of the pretest suggests that the alternative hypothesis be formulated as a "greater than" conclusion, requiring a one-tailed test. Had the purpose of the pretest been to detect only whether the means differ, in either direction, a two-tailed test would have been appropriate. The latter case would also be used in determining whether a value falls outside previously established norms or ranges.

Use of the t-test is appropriate when the population standard deviation is not known, so that the sample standard deviation must be used as a proxy for it, and when the sample sizes are relatively small. In using the t-test, it is common practice to set the confidence level at 95 percent, which corresponds to a significance level of 5 percent. This means that you are willing to accept a 5 percent chance of rejecting the null hypothesis when it is in fact true. In other words, the probability of a Type I error is 5 percent.

To reiterate, the t-test is useful in comparing the magnitudes of two continuous quantities and drawing conclusions about whether the two quantities are the same or different from each other, or if one is bigger (or smaller) than the other. Being the same means that the feasible ranges of values for the two quantities overlap, and the observed differences are due to sampling fluctuations rather than to fundamental differences in quantity or quality (such as the appeal of an offer in the example above). The t-test considers the difference between the two means relative to the sample size-adjusted variability of the two quantities (i.e., each variance divided by its sample size). A statistical conclusion is arrived at by comparing the test statistic (the t-value) to the critical value from the appropriate t-distribution, generally at the 95 percent confidence level (as discussed above).
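The calculation described above can be sketched as follows. This is a minimal illustration that assumes the unequal-variance (Welch) form of the two-sample t-test, which may differ from the variant used in the course textbook. The data are hypothetical: purchase rates, in percent, from five test markets per offer, chosen so that the means match the 3.2 and 3.6 percent of the cereal example. The critical value is the standard one-tailed t-table entry for a 5 percent significance level and 8 degrees of freedom.

```python
import math
import statistics

def welch_t(sample_1, sample_2):
    """Return (t, df) for testing H1: mean(sample_2) > mean(sample_1),
    using each sample's variance divided by its sample size."""
    n1, n2 = len(sample_1), len(sample_2)
    v1 = statistics.variance(sample_1) / n1   # variance / sample size
    v2 = statistics.variance(sample_2) / n2
    t = (statistics.mean(sample_2) - statistics.mean(sample_1)) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Hypothetical purchase rates (percent) in five test markets per offer.
offer_1 = [3.0, 3.4, 3.1, 3.3, 3.2]   # $0.45 off one box, mean 3.2
offer_2 = [3.5, 3.8, 3.4, 3.7, 3.6]   # $1 off two boxes, mean 3.6

t_value, df = welch_t(offer_1, offer_2)

CRITICAL_T = 1.860  # one-tailed, 5 percent significance, 8 degrees of freedom
reject_null = t_value > CRITICAL_T

print(f"t = {t_value:.2f}, df = {df:.1f}, reject H0: {reject_null}")
```

For these made-up data the t-value is about 4.0 on about 8 degrees of freedom, which exceeds the critical value, so the null hypothesis of equal means would be rejected. With different sample sizes or more variable data, the same 0.4 point gap could easily fail to reach significance.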

Quantitative Analysis and Decision Making

©2017 South University

Though the t-test is a simple method and provides useful insights, it can be applied to only two quantities at a time. When there is a need to compare more than two alternatives (e.g., a pretest of three or more offers), you must either compute multiple t-tests or employ an extension of the t-test to more than two groups, known as analysis of variance (ANOVA). Using multiple t-tests is usually not advisable as it inflates experiment-wide errors and can also be cumbersome. For example, if we use two t-tests, each of which has a 5% chance of a Type I error, our overall experiment (which includes both t-tests) now has more than a 5% chance of a Type I error occurring.
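The inflation of the experiment-wide error can be worked out directly. The sketch below assumes the individual tests are independent, which is a simplification (pairwise comparisons that share a group are not strictly independent), and the function name is an assumption made for this illustration.

```python
import math

def familywise_error(alpha, n_tests):
    """Chance of at least one Type I error across n_tests independent
    tests, each run at significance level alpha."""
    return 1 - (1 - alpha) ** n_tests

two_tests = familywise_error(0.05, 2)

# Comparing four offers pairwise would need C(4, 2) = 6 separate t-tests.
n_pairwise = math.comb(4, 2)
six_tests = familywise_error(0.05, n_pairwise)

print(f"2 tests: {two_tests:.4f}")
print(f"{n_pairwise} tests: {six_tests:.4f}")
```

Two tests at the 5 percent level carry an overall Type I risk of about 9.75 percent, and six tests carry about 26 percent, which is why a single ANOVA is preferred when more than two alternatives are compared.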

ANOVA is used when the intent is to compare results when segments of a population receive different "treatments," that is, are subjected to different conditions. The term "treatments" stems from the initial development of ANOVA techniques to analyze the reaction of plants to different levels of applications of (treatments with) fertilizer.

The t-test and ANOVA are important statistical techniques that are grounded in several assumptions about the underlying shape of the distribution in a population (e.g., normal distribution) and other parameters (e.g., means, standard deviation, etc.). Accordingly, these types of techniques are considered "parametric" tests (derived from the term parameter). However, what happens if the population is not normally distributed? More generally, what if we cannot make any assumptions about such parameters? Fortunately, another set of statistical techniques called "nonparametric" tests allows us to analyze data in those situations.

Definitions of nonparametric (and parametric, for that matter) can be elusive or at least inconsistent among sources. There are several ways we can categorize or provide examples of nonparametric tests. One way is through the type of data most often used when utilizing nonparametric tests. During the Week 1 lecture, we noted that four types of data exist: nominal, ordinal, interval, and ratio. Let's start with ratio data.

Ratio data represents numbers in an ordered relationship with an absolute zero point. For example, when you start counting from 0, 1, 2, 3, 4, 5, and so on, the data has a zero point, because zero means the absence of the thing you are counting. Furthermore, the distance between values is presumed to be equal. In other words the distance between 1 and 2 is the same as between 2 and 3, etc.

Interval data is similar to ratio data as far as the presumed equality of distance between the assigned values. However, the zero point is not absolute. It is rather arbitrary. Take a look at a thermometer; whether Fahrenheit or Centigrade, both scales have an arbitrary zero point. An absolute zero would mean the absolute absence of heat, which is not true in either scale.

When we move to nominal and ordinal data something even more distinctive occurs. In the ordinal scale, the values are presumed to be "in order" but it is unknown whether the distance between the values is equal. Consider for a moment a scale of 1 = some; 2 = a little bit; 3 = not very much; 4 = none. The values seem to be in order, but is the difference between "some" and "a little bit" the same as between "not very much" and "none"? We don't really know.

Nominal data moves even farther away as the values don't even have a clear order. Instead they are assigned arbitrarily. For example, imagine 1 = blue eyes; 2 = brown eyes; 3 = green eyes.

Although it is not the only way to distinguish parametric from nonparametric tests, researchers often turn to nonparametric tools when data is of the nominal or ordinal variety. The chi-squared test is one example of a nonparametric tool.

Let's continue with the example of the marketing manager for a brand of ready-to-eat breakfast cereal. The product team recently completed the development of a low-sugar version of the product. The pretests suggested a purchase rate of 3.2 percent for the first offer (a $0.45 discount on the purchase of a single box) and 3.6 percent for the second offer (a $1 discount on the purchase of two boxes). The pretests also suggested that the second offer yielded a higher trial rate among the company's high-value customers. This is important information to consider in your promotional planning.

As a next step, you decide to investigate whether the second offer (the one with the higher-yielding pretest, 3.6 percent) also had a higher purchase rate among high-value customers. A chi-squared test is the appropriate test to use in comparing the number of high-value customer purchases associated with the two offers. In this case, the comparison will be between the observed numbers of high-value customers who did and did not make a purchase after viewing each offer and the numbers that would be expected if the offer made no difference. Conceptually, the chi-squared and the t-test methods are similar, as both compare two quantities. Unlike the t-test, however, the chi-squared test deals with discrete rather than continuous values, typically frequencies. The chi-squared statistic is the sum, over all categories, of the squared differences between observed and expected frequencies, each divided by the expected frequency. Like the t-test, the chi-squared test results enable you to draw probabilistic conclusions.
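A minimal sketch of the chi-squared calculation for a two-by-two table follows. The counts are hypothetical (1,000 high-value customers shown each offer), the function name is an assumption made for this illustration, and the critical value is the standard chi-squared table entry for a 5 percent significance level and 1 degree of freedom.

```python
def chi_squared_statistic(observed):
    """Sum of (observed - expected)^2 / expected over every cell of a
    contingency table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Expected count if purchasing were unrelated to the offer.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (obs - expected) ** 2 / expected
    return statistic

# Hypothetical counts of high-value customers: [purchased, did not purchase].
observed = [
    [30, 970],   # first offer ($0.45 off one box)
    [50, 950],   # second offer ($1 off two boxes)
]

chi2 = chi_squared_statistic(observed)

CRITICAL_CHI2 = 3.841  # 5 percent significance, 1 degree of freedom
reject_null = chi2 > CRITICAL_CHI2

print(f"chi-squared = {chi2:.3f}, reject H0: {reject_null}")
```

For these made-up counts the statistic is about 5.21, which exceeds the critical value, so the difference in purchase frequency between the two offers would be judged statistically significant at the 5 percent level.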

In this week's lecture on hypothesis testing, sample-to-population differences were discussed. The point was made that the relationship between the population and a sample from the population is one of the key determinants of the quality of insights obtained from the analysis of the sample data. That is because the sample data is generally used to extrapolate conclusions to the population as a whole.

An important aspect of considering sample and population is that the population of interest is user defined. "Population" is the combination of information that users and analysts decide constitutes the population of interest. For example, for an automotive insurance carrier, the population of interest consists of the owners of automobiles, not all individuals. A sample of the population of automobile owners should comprise only automobile owners and not non-owners.

Drawing inferences from a sample is not the same as measuring the population itself. Differences between the two result from sampling error. Samples convey imperfect descriptions of the characteristics of the population. In business analysis, population-derived values are facts, while sample-derived values are approximations of facts. Consider the following example.

Assume a trade association has a complete database of all homeowners in the U.S. The association determines that the average US house occupies a 2,123 square-foot area. This value is a fact as all homes have been accounted for and measured. However, this still leaves an open question: Should the average value include extremely large mansions and microhomes? That is: Do these extreme values skew the overall average up or down?

In contrast, if a builder secured a random sample from the overall homeowners database, she might use this sample to estimate the average home size to be 2,574 square feet. The builder's dataset represents a subset of the entire population; hence, the estimate is not a fact but an approximation. Thus, the builder would be well advised to express the average size as a range, taking into account sampling imperfections (error).
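Expressing a sample average as a range can be sketched as a confidence interval. The example below is an illustration under stated assumptions: the 400 home sizes are simulated rather than real data, the function name is hypothetical, and the sample is large enough that the normal approximation is used in place of the t-distribution.

```python
import math
import random
import statistics
from statistics import NormalDist

def mean_confidence_interval(values, confidence=0.95):
    """Normal-approximation confidence interval for the mean of a sample."""
    n = len(values)
    mean = statistics.mean(values)
    standard_error = statistics.stdev(values) / math.sqrt(n)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)   # about 1.96 for 95 percent
    return mean - z * standard_error, mean + z * standard_error

# Simulated (hypothetical) sample of 400 home sizes in square feet.
rng = random.Random(7)
home_sizes = [rng.gauss(2500, 600) for _ in range(400)]

sample_mean = statistics.mean(home_sizes)
low, high = mean_confidence_interval(home_sizes)

print(f"sample mean = {sample_mean:.0f} sq ft")
print(f"95% interval = {low:.0f} to {high:.0f} sq ft")
```

Reporting the interval rather than the single average makes the sampling error visible: a larger sample narrows the range, while a smaller or more variable sample widens it.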
