CH3. Methods for Generating Random Variables

Abokadoh 2023. 10. 14. 19:31

Statistical Computing with R

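I'm starting from Chapter 3. Let's get studying.

Chapter 3 covers one of the basic tools of statistical computing: simulating random variables from probability distributions.

In the simplest case, simulating a random sample drawn from a finite population requires a way to generate random samples from a discrete uniform distribution. A suitable generator of uniform pseudo-random numbers is therefore essential.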

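R's uniform pseudo-random number generator is runif(n, a, b), where n is the number of draws and a and b are the endpoints of the interval (the min and max arguments, which default to 0 and 1).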

runif(n,a,b)
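As a rough sketch of why a uniform generator is enough for finite-population sampling, the snippet below draws indices from a discrete uniform distribution on 1..N using only runif(). The names population, N, k, idx, and draws are made up for illustration and are not from the book.

```r
# Sketch: sampling with replacement from a finite population via runif().
# `population`, `N`, `k`, `idx`, `draws` are illustrative names only.
set.seed(1)
population <- c("a", "b", "c", "d", "e")
N <- length(population)
k <- 10

# runif(k) returns values strictly between 0 and 1, so
# floor(N * u) + 1 takes each value in 1..N with probability 1/N.
idx <- floor(N * runif(k)) + 1
draws <- population[idx]
draws
```

This mimics sampling with replacement; R's own sample() function is the practical tool for this kind of draw.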

To generate an n x m matrix of random numbers between 0 and 1, code like the following can be used.

matrix(runif(n*m), nrow = n, ncol = m) # the names nrow = and ncol = can be omitted (positional arguments)
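A small runnable version of the calls above, assuming illustrative values n = 3 and m = 4 (these numbers and the variable names u, v, U are my own, not from the text):

```r
# Sketch: runif() with default and custom endpoints, and as a matrix.
set.seed(2)
n <- 3
m <- 4

u <- runif(5)                    # 5 draws from Uniform(0, 1)
v <- runif(5, min = 2, max = 5)  # 5 draws from Uniform(2, 5)

# n x m matrix of Uniform(0, 1) draws, filled column by column
U <- matrix(runif(n * m), nrow = n, ncol = m)
dim(U)
```

Calling set.seed() first makes the pseudo-random draws reproducible from run to run.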

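The examples cover functions that generate random variates from continuous or discrete probability distributions.

Most of the examples compare the distribution of the generated sample with the theoretical distribution.

R is a trustworthy program, so I do wonder whether this much checking is really necessary!

The results are going to be nearly identical or perfect anyway!

Let's compare them using histograms, density curves, QQ plots, and summary statistics (such as sample moments, percentiles, and the empirical distribution).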
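The kinds of comparison listed above might look roughly like this for a Uniform(0, 1) sample. This is only a sketch: the sample size, the probability grid p, and the names Fn, grid, and max_gap are my own choices, and the plots are guarded so they are only drawn in an interactive session.

```r
# Sketch: compare a generated sample with its theoretical distribution.
set.seed(3)
n <- 10000
x <- runif(n)

# Sample moments vs. theoretical moments (mean 1/2, variance 1/12)
c(sample_mean = mean(x), theory_mean = 1 / 2)
c(sample_var = var(x), theory_var = 1 / 12)

# Sample percentiles vs. theoretical quantiles
p <- c(0.1, 0.25, 0.5, 0.75, 0.9)
rbind(sample = quantile(x, p), theory = qunif(p))

# Largest gap between the empirical CDF and the theoretical CDF on a grid
Fn <- ecdf(x)
grid <- seq(0, 1, by = 0.01)
max_gap <- max(abs(Fn(grid) - punif(grid)))
max_gap

if (interactive()) {
  hist(x, freq = FALSE)             # histogram on the density scale
  curve(dunif(x), add = TRUE)       # theoretical density on top
  qqplot(qunif(ppoints(n)), x)      # QQ plot against Uniform(0, 1)
  abline(0, 1)
}
```

The same template should carry over to other distributions by swapping runif, qunif, punif, and dunif for the matching r/q/p/d functions.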

Example 3.1

(Sampling from a finite population). The sample function can be used to sample from a finite population, with or without replacement.

Sampling from a finite population

The sample() function draws samples with or without replacement, depending on whether its replace parameter is TRUE or FALSE.

#toss some coins

sample(0:1, size = 10, replace = TRUE)
# [1] 0 0 1 1 1 0 0 0 1 0
 
#choose some lottery numbers
sample(1:100, size = 6, replace = FALSE)
# [1] 31 78  3 23 91 35

#permutation of letters a-z
sample(letters)
#  [1] "s" "u" "f" "w" "y" "h" "n" "o" "l" "a" "j" "z" "d"
# [14] "g" "i" "p" "t" "e" "m" "k" "b" "c" "v" "x" "q" "r"

#sample from a multinomial dist
x <- sample(1:3, size = 100, replace = TRUE,
			prob = c(0.2, 0.3, 0.5))
table(x)
# x
# 1  2  3 
# 15 23 62 

x
#  [1] 2 3 3 2 3 3 3 2 2 3 2 3 3 3 3 3 3 2 3 3 3 3 3 2 1 3 1
# [28] 2 3 1 3 1 3 1 3 1 3 3 3 1 2 3 3 2 2 2 2 3 3 3 3 1 2 3
# [55] 1 3 3 3 2 3 3 3 3 3 3 3 3 3 3 3 3 3 1 2 3 1 1 3 3 3 3
# [82] 3 2 2 3 1 3 3 2 1 2 3 3 3 2 1 3 2 3 2
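As a quick sanity check on the prob argument, the sketch below repeats the multinomial-style draw with a larger sample and compares the observed relative frequencies with the specified probabilities. The sample size of 10000 and the names probs, rel_freq, and lotto are my own choices.

```r
# Sketch: observed frequencies from sample(..., prob = ) vs. specified probs.
set.seed(4)
probs <- c(0.2, 0.3, 0.5)
x <- sample(1:3, size = 10000, replace = TRUE, prob = probs)

# factor() with fixed levels keeps all three categories in the table
rel_freq <- as.vector(table(factor(x, levels = 1:3))) / length(x)
rbind(specified = probs, observed = rel_freq)

# Without replacement, no value can appear twice
lotto <- sample(1:100, size = 6, replace = FALSE)
anyDuplicated(lotto) == 0
```

With only 100 draws, as in the example above, the observed counts can sit noticeably away from 20/30/50; the gap shrinks as the sample size grows.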

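Let's study how to generate random variates from a specified probability distribution.

First, a summary of some of the probability functions available in R. The commonly used ones are the pmf or pdf, the cdf, the quantile function, and the random generator. For the binomial distribution these are dbinom (pmf), pbinom (cdf), qbinom (quantile function), and rbinom (random generator).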

dbinom(x, size, prob, log = FALSE)
pbinom(q, size, prob, lower.tail = TRUE, log.p = FALSE)
qbinom(p, size, prob, lower.tail = TRUE, log.p = FALSE)
rbinom(n, size, prob)

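There are many other functions besides these, but the same d/p/q/r pattern applies to the other probability distributions (for example dnorm, pnorm, qnorm, and rnorm for the normal distribution).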
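How the four functions relate to one another can be sketched as follows, using size = 10 and prob = 0.5 as illustrative parameters (the values and the names cdf_3, pmf_sum, q_med, x, z are my own):

```r
# Sketch: relationships among the d/p/q/r functions.
size <- 10
prob <- 0.5

# The cdf at 3 is the sum of the pmf over 0..3
cdf_3 <- pbinom(3, size, prob)
pmf_sum <- sum(dbinom(0:3, size, prob))
all.equal(cdf_3, pmf_sum)

# The quantile function returns the smallest q with P(X <= q) >= p
q_med <- qbinom(0.5, size, prob)
c(pbinom(q_med - 1, size, prob), pbinom(q_med, size, prob))

# Random variates: the sample mean should be near size * prob
set.seed(5)
x <- rbinom(10000, size, prob)
mean(x)

# Same naming pattern for the normal distribution
z <- qnorm(pnorm(1.5))   # quantile function inverts the cdf
z
```

For a continuous distribution the quantile function is an exact inverse of the cdf (up to floating-point error), whereas for a discrete one it is the generalized inverse shown in the binomial part.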