Suppose I'm running an experiment that can have 2 outcomes, and I'm assuming that the underlying "true" distribution of the 2 outcomes is a binomial distribution with parameters $n$ and $p$: ${\rm Binomial}(n, p)$. I can compute the standard error, $SE_X = \frac{\sigma_X}{\sqrt{n}}$, from the form of the variance of ${\rm Binomial}(n, p)$: $$ \sigma^{2}_{X} = npq$$ where $q = 1-p$. So, $\sigma_X=\sqrt{npq}$. For the standard error I get: $SE_X=\sqrt{pq}$, but I've seen somewhere that $SE_X = \sqrt{\frac{pq}{n}}$. What did I do wrong?...
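A minimal sketch of where the two formulas come from, assuming the $\sqrt{pq/n}$ version refers to the sample proportion $X/n$ rather than to the count $X$ itself. The helper names (`binom_pmf`, `sd_of_count_and_proportion`) and the values $n = 50$, $p = 0.3$ are illustrative, not from the question. The standard deviations are computed directly from the probability mass function and compared with the closed forms.

```python
from math import comb, sqrt

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def sd_of_count_and_proportion(n, p):
    """Exact standard deviations of the count X and the proportion X/n."""
    ks = range(n + 1)
    mean_count = sum(k * binom_pmf(k, n, p) for k in ks)
    var_count = sum((k - mean_count) ** 2 * binom_pmf(k, n, p) for k in ks)
    sd_count = sqrt(var_count)
    # Dividing a random variable by n divides its standard deviation by n.
    sd_prop = sd_count / n
    return sd_count, sd_prop

# Hypothetical example values.
n, p = 50, 0.3
sd_count, sd_prop = sd_of_count_and_proportion(n, p)

print("sd of count X:       ", sd_count, "vs sqrt(npq) =", sqrt(n * p * (1 - p)))
print("sd of proportion X/n:", sd_prop, "vs sqrt(pq/n) =", sqrt(p * (1 - p) / n))
```

Under this reading, $\sqrt{npq}$ is already the standard deviation of a sum of $n$ Bernoulli trials, so dividing by $n$ (not $\sqrt{n}$) gives the standard error of the proportion.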

I'm more of a programmer than a statistician, so I hope this question isn't too naive. It happens in sampling program executions at random times. If I take N=10 random-time samples of the program's state, I could see function Foo being executed on, for example, I=3 of those samples. I'm interested in what that tells me about the actual fraction of time F that Foo is in execution. I understand that I is binomially distributed with mean F*N. I also know that, given I and N, F follows a beta distribution. In fact I've verified by program the relatio...

Consider a pair of RVs $X$ and $Y$, with the following conditional distributions: $$X \mid Y=y \sim \operatorname{Binom}(L, y)$$ $$Y \mid X=x \sim \operatorname{Beta}(\alpha + x, \nu)$$ where $L$, $\alpha$, and $\nu$ are all positive ($L$ is an integer, of course). Is there a name for the joint distribution of $(X,Y)$? Or perhaps for the marginal distribution of $Y$? I think that if $x$ is eliminated from the shape "parameter" of the beta distribution, then $X$ is beta-binomial distributed. But in the above bivariate model, $X=x$ affects the shape parameter for the conditiona...

I would like to know the limiting distribution, when $k \uparrow \infty$ and $k/n \rightarrow \lambda$, of $$ \max(X_1, \ldots, X_k), \text{ where $X_i$ are IID $B(n,p)$}.$$ This is most likely a Gumbel distribution. If this is indeed the case, what matters most to me is to know the parameters of this Gumbel as a function of $(k, n, p)$....

This is a follow-up question to "Standard error for the mean of a sample of binomial random variables". The previous answer mentions that the standard error of the sampling distribution is sqrt(kpq/n), where k is the number of trials in each binomial experiment, p is the probability of success, q is (1-p), and n is the number of experiments used to generate the sampling distribution. For the normal approximation of the binomial confidence interval, the standard error is sqrt(pq/n). Does that mean the normal approximation is achieving this approximation b...

I'm looking to find a general formula for the following problem. The events: each event has two possible outcomes (e.g. success and failure), and the events are independent. Scenario: let the number of trials be Y and the number of successes be N. Here's the tough part: over Y trials, what is the probability of there being X successes in a row, given that the total number of successes is N?...

Let's say I have 1 success in 4 Bernoulli trials, and I wish to plot the distribution of the parameter $p$ of the corresponding binomial distribution. I'm using R. The probability of seeing 1 success and 3 failures in 4 tests for $p=0.25$ is, for these parameters:

    > n <- 4
    > p <- 0.25
    > dbinom(1, n, p)
    [1] 0.421875

To get the distribution for the parameter, I use a beta distribution $Beta(k+1, n-k+1)$. But when I try to calculate the value for $p=0.25$, I get a different result:

    > k <- 1
    > dbeta(p, k+1, n-k+1)
    [1] 2.109375

I t...
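The two R results can be reproduced and related with a short sketch, written here in Python with only the standard library. `binom_pmf` and `beta_pdf` are hypothetical helpers standing in for R's `dbinom` and `dbeta`. The point it illustrates: `dbinom` returns a probability mass over $k$ for fixed $p$, while `dbeta` returns a density over $p$, and for $Beta(k+1, n-k+1)$ that density is $(n+1)$ times the binomial likelihood, so the two values are not expected to match.

```python
from math import comb, gamma

def binom_pmf(k, n, p):
    """Probability of k successes in n trials (stand-in for R's dbinom)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def beta_pdf(x, a, b):
    """Density of Beta(a, b) at x (stand-in for R's dbeta)."""
    norm = gamma(a + b) / (gamma(a) * gamma(b))
    return norm * x**(a - 1) * (1 - x)**(b - 1)

# Values taken from the question.
n, k, p = 4, 1, 0.25

likelihood = binom_pmf(k, n, p)            # same quantity as dbinom(1, 4, 0.25)
density = beta_pdf(p, k + 1, n - k + 1)    # same quantity as dbeta(0.25, 2, 4)
ratio = density / likelihood               # equals n + 1 for this beta shape

print(likelihood, density, ratio)
```

A density can exceed 1 because only its integral over $[0, 1]$ has to equal 1.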

This is a pretty basic question, I think, but I'm finding it difficult to locate an answer or decent treatment of it on the web. For a binary dataset (binomial) with some unknown parameter p (probability of "heads" or success or something observed), how does the accuracy of the estimated p depend on the number of trials? I've read a number of ways to calculate confidence intervals around the estimated p, but these seem to all depend on assuming a large number of trials and a normal distribution. I'm interested in particular here with small numbe...

This is a binomial question. I am required to find P(2X>4), and given that X~B(10,0.2), how do I convert it to 2X~B(n,p)? I have tried converting from P(2X>4) to P(X>2), but I understand that it doesn't work that way. Do I have to use integration, or is there a formula I should refer to?...
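As a hedged sketch of the arithmetic involved: $2X$ takes only even values, so it is not itself binomial, but the events $\{2X > 4\}$ and $\{X > 2\}$ contain exactly the same outcomes. The code below (helper name `binom_pmf` is illustrative) computes the probability both ways for $X \sim B(10, 0.2)$ so the two can be compared.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 10, 0.2

# Direct enumeration: add up P(X = k) over every k with 2k > 4.
p_direct = sum(binom_pmf(k, n, p) for k in range(n + 1) if 2 * k > 4)

# Via the complement: P(X > 2) = 1 - P(X <= 2).
p_complement = 1.0 - sum(binom_pmf(k, n, p) for k in range(3))

print(p_direct, p_complement)
```

No integration is needed here, since $X$ is discrete and the probability is a finite sum.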

The CDF for $X \sim \operatorname{Bin}(n,p)$ is $I_{1-p}(n-k,1+k)$. What does the $I$ mean?...
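In this notation $I_x(a, b)$ is conventionally the regularized incomplete beta function, $I_x(a,b) = \frac{1}{B(a,b)}\int_0^x t^{a-1}(1-t)^{b-1}\,dt$. The sketch below checks the stated identity numerically for one hypothetical case ($n=10$, $k=3$, $p=0.2$). The function `reg_inc_beta` is an illustrative Simpson's-rule approximation that assumes $a, b \ge 1$; it is not a library routine.

```python
from math import comb, gamma

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Bin(n, p), by direct summation."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k + 1))

def reg_inc_beta(x, a, b, steps=2000):
    """Approximate I_x(a, b) with composite Simpson's rule.

    Assumes a >= 1 and b >= 1 (so the integrand is bounded) and an even
    number of steps.
    """
    def f(t):
        return t**(a - 1) * (1 - t)**(b - 1)

    h = x / steps
    total = f(0.0) + f(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(i * h)
    integral = total * h / 3
    beta_ab = gamma(a) * gamma(b) / gamma(a + b)  # complete beta function B(a, b)
    return integral / beta_ab

# Hypothetical example values.
n, k, p = 10, 3, 0.2
lhs = binom_cdf(k, n, p)
rhs = reg_inc_beta(1 - p, n - k, k + 1)

print(lhs, rhs)
```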

Using the standard definition of the sample proportion random variable: we first suppose $X \sim \operatorname{Binomial}(n, p)$ is the count of "successes" in $n$ trials, where each trial has probability $p$ of success. The sample proportion random variable is $\hat{p} = X/n$, the proportion of successes. If we are given data on several trial runs, it is possible to talk about the mean and variance of $\hat{p}$. My question is: is there a meaningful way to talk about the mean and variance of $\hat{p}$ if the number of trials $n$ changes from run to run? For example...

I ran scipy.stats.binom.cdf(500006, 1000000, 0.5) and it took less than a millisecond. This is crazy, as the binomial CDF involves summing up a bunch of binomial coefficients. What approximation algorithm is behind the implementation? Thanks!...

If I know the probability of success for an event and the number of times that I want to observe the event at a certain confidence level, how do I calculate the number of trials needed?...
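One possible reading of this question, offered as an assumption rather than the asker's stated intent: find the smallest number of independent trials $n$ such that the probability of at least $m$ successes reaches the desired confidence level. The sketch below does this by direct search. The names `prob_at_least` and `trials_needed` are hypothetical, and it assumes $0 < p \le 1$ and a confidence strictly below 1.

```python
from math import comb

def prob_at_least(m, n, p):
    """P(X >= m) for X ~ Binomial(n, p)."""
    return 1.0 - sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(m))

def trials_needed(m, p, confidence):
    """Smallest n with P(at least m successes in n trials) >= confidence.

    Assumes 0 < p <= 1 and 0 < confidence < 1, so the search terminates.
    """
    n = m  # fewer than m trials can never yield m successes
    while prob_at_least(m, n, p) < confidence:
        n += 1
    return n

# Example: at least one success with p = 0.5 at 95% confidence.
n_needed = trials_needed(1, 0.5, 0.95)
print(n_needed)
```

The linear search is adequate for small problems; for very small $p$ a bisection on $n$ would need fewer evaluations.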

I know that negbin can approximate the betabin distribution, especially when the probability of hitting the max is low (events are more rare). If the offset of a negative binomial regression transforms the outcome to be between 0 and 1, is it substantially different from using beta-binomial regression? For context: My research question is: "Are my predictors associated with the number of risk behaviors (out of a possible 22) that participants reported doing in the last thirty days?" Therefore, my DV is a count/proportion of risk behaviors out of...

The following appeared on an assignment of mine (already turned in). I contend that not enough information is given to provide an answer; it seems pretty clear-cut to me. However, the instructor insisted it's solvable in Minitab. Can you help me figure out what I'm not understanding? How do you solve this without a model of the distribution of weekly demand, or at least an average value to use as a constant approximation? I must be missing something simple. The problem: Consider a service company. 10% of the weekly demand is for a service cate...