When they do not have a census of the points of sale they cover, OpenHealth and its European partners allow their customers to follow modeled data at national level, that is, data extrapolated from a sample of points of sale. While these extrapolated data provide our users with a very solid basis for their market analyses, they nevertheless carry a margin of statistical uncertainty, the magnitude of which depends on several factors detailed below.
Definitions
Confidence interval:
A confidence interval brackets the true value that we seek to estimate from measurements obtained through a random process. This concept makes it possible to define a margin of statistical uncertainty.
Confidence level:
A confidence level represents the degree of certainty and is expressed as a percentage. A 95% confidence level is the one most commonly used in statistical studies.
Factors impacting the size of the interval for a given confidence level
There are 4 factors that determine the size of the confidence interval for a given confidence level:
The size of the sample
The percentage
The size of the population
The time period
The size of the sample
The larger the sample, the more faithfully the results reflect the population. For a given confidence level, a larger sample therefore yields a smaller confidence interval. However, the relationship is not linear: the interval shrinks with the square root of the sample size, so doubling the sample size does not halve the confidence interval.
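The non-linear relationship can be illustrated with a minimal sketch. It assumes the usual normal-approximation margin of error for a proportion, c = Z * sqrt(p * (1 - p) / n); the function name `margin_of_error` and the sample sizes are illustrative assumptions, not a description of the production model.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Half-width of the confidence interval for a proportion
    (normal approximation, assumed here for illustration)."""
    return z * math.sqrt(p * (1 - p) / n)

# Hypothetical sample sizes, chosen only to show the trend.
base = margin_of_error(1000)
doubled = margin_of_error(2000)
quadrupled = margin_of_error(4000)

# Doubling the sample divides the margin by sqrt(2), not by 2;
# the sample has to be quadrupled to halve the margin.
for n, m in ((1000, base), (2000, doubled), (4000, quadrupled)):
    print(f"n = {n:5d}  margin = ±{m:.4f}")
```

Under this assumption, halving the interval requires four times as many points of sale in the sample.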
The percentage
Precision also depends on the percentage of the sample that gives a particular answer. If 99% of the sample answered "Yes" and 1% answered "No", the statistical uncertainty is low for any reasonable sample size. However, if the percentages are 51% and 49%, the statistical uncertainty is much greater. It is easier to be sure of extreme responses than of intermediate ones.
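The effect of the percentage can be sketched as follows, again assuming the normal-approximation margin c = Z * sqrt(p * (1 - p) / n). The fixed sample size of 1000 and the helper name `margin_for_percentage` are hypothetical and serve only as an illustration.

```python
import math

def margin_for_percentage(p, n=1000, z=1.96):
    """Margin of error for an observed proportion p (assumed normal approximation)."""
    return z * math.sqrt(p * (1 - p) / n)

# The term p * (1 - p) peaks at p = 0.5 and vanishes toward 0 and 1,
# so extreme percentages give narrower intervals than intermediate ones.
for p in (0.01, 0.25, 0.50, 0.51, 0.99):
    print(f"p = {p:.2f}  margin = ±{margin_for_percentage(p):.4f}")
```

This is also why p = 0.5 is the conservative choice when estimating the sample size needed: it gives the widest interval.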
The size of the population
Population size only becomes a factor when the population is relatively small, that is, when the sample represents a substantial share of it.
The time period
The selling numeric distribution (DNV) depends on the time period studied. Measured on a daily basis, the DNV is lower, and the uncertainty is therefore greater.
Sample size formula
Z = Z value (e.g., 1.96 for a 95% confidence level)
p = percentage picking a choice, expressed as a decimal (0.5 is used to determine the sample size needed)
c = confidence interval, expressed as a decimal (e.g., 0.04 = ±4 percentage points)
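A common form of the sample size formula that matches these symbols is ss = Z² * p * (1 - p) / c². The sketch below assumes this standard form; the function name `required_sample_size` is hypothetical.

```python
import math

def required_sample_size(c, p=0.5, z=1.96):
    """Sample size for a large (effectively infinite) population,
    assuming the standard form ss = Z^2 * p * (1 - p) / c^2."""
    return (z ** 2) * p * (1 - p) / (c ** 2)

# Example values taken from the definitions: 95% level, p = 0.5, c = 0.04.
ss = required_sample_size(c=0.04)
print(math.ceil(ss))  # → 601
```

The result is rounded up because a fraction of a point of sale cannot be sampled.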
Correction formula for the finite population
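The usual finite population correction is new ss = ss / (1 + (ss - 1) / pop), where ss is the sample size computed for a large population and pop is the population size. The sketch below assumes this standard form; the function name and the population sizes are hypothetical and are not actual counts of points of sale.

```python
def finite_population_correction(ss, pop):
    """Adjust a sample size ss for a finite population of size pop,
    assuming the standard form ss / (1 + (ss - 1) / pop)."""
    return ss / (1 + (ss - 1) / pop)

ss = 600.25  # sample size for c = 0.04, p = 0.5, Z = 1.96 in a large population

# Hypothetical population sizes, chosen only to show the effect.
small_pop = finite_population_correction(ss, pop=2_000)
large_pop = finite_population_correction(ss, pop=1_000_000)

# The correction matters for a small population and becomes
# negligible when the population is very large.
print(f"pop = 2,000      -> {small_pop:.1f}")
print(f"pop = 1,000,000  -> {large_pop:.1f}")
```

With a small population, noticeably fewer points of sale are needed for the same interval, which is why population size only matters in that case.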
Limitations
Confidence interval calculations assume that you have a true random sample of the affected population.
If your sample is not truly random, you cannot trust the intervals.
Illustrations
For mainland France except Corsica:
If my product has a DNV of 100% and extrapolated sales of 100 units, a confidence interval of 0.68% means that there is a 95% chance that my actual sales are between 99.32 units and 100.68 units. The uncertainty is low.
If my product has a DNV of 1% and extrapolated sales of 100 units, a confidence interval of 9.05% means that there is a 95% chance that my actual sales are between 90.95 units and 109.05 units. The uncertainty is greater.
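The bounds in these two illustrations can be reproduced with a minimal sketch. It assumes the confidence interval is expressed as a percentage of the extrapolated sales; the function name `sales_bounds` is hypothetical.

```python
def sales_bounds(extrapolated_sales, interval_pct):
    """Lower and upper bounds of actual sales, assuming the interval
    is a percentage of the extrapolated sales."""
    half_width = extrapolated_sales * interval_pct / 100
    return extrapolated_sales - half_width, extrapolated_sales + half_width

# Figures from the two illustrations above.
for dnv, interval_pct in ((100, 0.68), (1, 9.05)):
    low, high = sales_bounds(100, interval_pct)
    print(f"DNV {dnv}%: between {low:.2f} and {high:.2f} units")
```

The same helper applies to any extrapolated volume: for example, the 9.05% interval applied to 2,000 extrapolated units gives a range of roughly 1,819 to 2,181 units.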