### 1. Generating observations from a mixture of 3 one-dimensional Gaussians

• we specify the mean values, standard deviations, and weights for the 3 components:
mu = c(5, 25, 40)           # mean values for 3 components
sigma = c(5, 10, 4)         # standard deviations
alpha = c(1/2, 1/4, 1/4)    # component weights
sum(alpha) == 1             # must be TRUE
## [1] TRUE
• using the built-inrnorm() function, we conduct a two-step experiment to generate sample points from a Gaussian mixture
• first, we randomly choose one of the 3 components
• subsequently, we draw a sample point from the chosen component
• this is realized in the following loop:
• (note that all code snippets are designed for comprehensibility, not for efficiency!)
K = 3       # number of components
N = 1000    # number of observations to generate

x = rep(NA, N)                                  # init vector of observations
cols = rep(NA, N)                               # color recalls which Gaussian was chosen
color_palette = c("red", "darkgreen", "blue")   # different color for each component

for (i in 1:N) {
component = sample(1:K, size = 1, prob = alpha)       # draw a component with probability alpha
x[i] = rnorm(1, mu[component], sigma[component])      # generate one sample point from the chosen Gaussian
cols[i] = color_palette[component]                    # assign a color to the sample point
}
length(x)  # the random sample points are stored in the vector x
## [1] 1000
• to see the empirical distribution of the generated sample, a histogram can be plotted
• we add a small colored tick to the x-axis in order to identify the component each sample point came from
• Important: in practice, we do not know from which component a sample point originated from (the origins are just the “hidden variables”)
• here is a little function that creates the (colored) ticks on the x-axis:
ticks <- function(x, height = max(x)/30, colvec = rep("black", length(x))) {
for (i in 1:length(x)) segments(x0 = x[i], y0 = -height/2, x1 = x[i], y1 = height/2, col = colvec[i])
}
• we need also a function that calculates the probability density of a Gaussian mix:
gaussmix <- function(x, mu, sigma, alpha) {
if(length(mu) != length(sigma)) stop(" 'mu' and 'sigma' must have equal length'! \n")
if(length(mu) != length(alpha)) stop(" 'mu' and 'alpha' must have equal length'! \n")
K = length(mu)
mix = 0
for (k in 1:K) mix = mix + alpha[k]*dnorm(x, mu[k], sigma[k])
return(mix)
}
• we can also add to the histogram the curves representing the “true” probability density, i.e. the curves based on the parameters we have chosen to generate the sample:
• the dashed curves represent the Gaussians based on the 1st guess (multiplied with the corresponding weights)
• the bold green curve is the true solution, i.e. the superposition of the other curves
h <- hist(x, breaks = 25, plot = FALSE)
height = max(h$density)/30 plot(h, main = "Observations", font.main = 1, col = "lightgrey", freq = FALSE) ticks(x, height = height, colvec = cols) xmin = min(h$mids)
xmax = max(h$mids) for (k in 1:K) curve(alpha[k]*dnorm(x, mu[k], sigma[k]), from = xmin, to = xmax, n = 300, col = color_palette[k], lwd = 2, lty = 3, add = TRUE, yaxt = "n") # add superposition: curve(gaussmix(x, mu, sigma, alpha), from = xmin, to = xmax, n = 300, col = "green", lwd = 2, lty = 1, add = TRUE, yaxt = "n") ### 2. Estimation of the parameters from the sample points using EM #### 2.1. Initialization • in practice, the origins of the sample points are not known • that means we have to guess the parameters from a plot of the sample points without knowledge of the hidden variables (without the colored ticks): hist(x, breaks = 25, col = "grey", main = "Observations from Gaussian mixture", font.main = 1, freq = FALSE) • we have to know (or to guess) the number of components K • also, we define how many iterations we want to carry out at most K = 3 # we have to know (or to guess) the number of Gaussians max_iter = 200 # maximum number of iterations • we label the first guess with the additional characters “_t“: mu_t = c(5, 20, 40) # 1st guess for mean values sigma_t = c(10, 10, 10) # 1st guess for standard deviations, hard to guess alpha_t = c(1/3, 1/3, 1/3) # 1st guess for cluster weights sum(alpha_t) == 1 # must be TRUE ## [1] TRUE • it can be interesting to see a histogram of the “observed” sample points together with a density plot of the probability distribution that follows from our first guess (red line) • because we produced the sample points ourselves, and thereby know the true parameters, we can cheat a little bit and add the density plots based on these true parameter values (green line) • this might give us a clue if the guess if fairly well … or completely aside h <- hist(x, breaks = 25, plot = FALSE) height = max(h$density)/30
plot(h, main = "Observations", font.main = 1, col = "lightgrey", freq = FALSE)
xmin = min(h$mids) ; xmax = max(h$mids)
curve(gaussmix(x, mu_t, sigma_t, alpha_t), from = xmin, to = xmax, n = 300, col = "red", lwd = 2, lty = 1, add = TRUE, yaxt = "n") # probability density distribution that follows from the 1st guess
curve(gaussmix(x, mu, sigma, alpha), from = xmin, to = xmax, n = 300, col = "green", lwd = 2, lty = 1, add = TRUE, yaxt = "n")  # true solution (with parameters used for generation of the sample)

• the red density plot stands for the first guess, the green density line shows the true distribution
• the first guess seems to be poor
• we can also calculate the log likelihood of the samples based on the 1st guess:
logL = 0
for (k in 1:K) logL = logL + alpha_t[k] * dnorm(x, mu_t[k], sigma_t[k])  # k-sum
logL = log(logL)
logL = sum(logL)    # i-sum
cat(paste(" Log likelihood after 1st guess:", signif(logL,4), "\n"))
##  Log likelihood after 1st guess: -4119
Lvector = logL      # store the first value, append logL-values during the EM loop below
• reset the parameter vectors (to “NA”):
mu = rep(NA, K)         # K = number of clusters
sigma = rep(NA, K)
alpha = rep(NA, K)         
• we must not forget to define maximum relative errors for the parameters:
e_mu_max = 0.001       # means
e_sigma_max = 0.001    # covariance matrices
e_alpha_max = 0.001    # mixture weights

#### 2.2. Looping through E- and M-steps

• now, we can alternate through the E- and M-step
• during the E-step, we calculate the “membership probabilities” for the sample points, $$\omega_{ik}$$:

$\omega_{ik} = \; \frac{ \alpha_k^t \cdot f_k \left( x_i \mid \mu_k^t, \sigma_k^t \right) }{\sum_{k} \alpha_k^t \cdot f_k \left( x_i \mid \mu_k^t, \sigma_k^t \right)}$

• during the M-step, we update the parameter values (to maximize $$Q_1(\theta, \theta_t$$):

$\mu_m = \frac{\sum_i^N \omega_{im}\cdot x_i}{\sum_i^N \omega_{im}}$

$\sigma_m^2 = \frac{\sum_i \omega_{im}\cdot \left( x_i - \mu_m \right)^2 }{\sum_i \omega_{im}}$

$\alpha_m = \frac{1}{N} \cdot \sum_{i=1}^N \omega_{im}$

• note that all code snippets are designed for comprehensibility, not for efficiency!
for (iter in 1:max_iter) {

# --- E-step
omega = matrix(NA, nrow = N, ncol = K)
for (k in 1:K) omega[,k] = alpha_t[k] * dnorm(x, mu_t[k], sigma_t[k])     # numerator for each k
denominator  = 0; for (k in 1:K) denominator = denominator + omega[,k]    # denominator
for (k in 1:K) omega[,k] = omega[,k] / denominator
if(!all(round(rowSums(omega),4) == 1)) stop("All rowsums of omega must be 1!\n")
if(round(sum(rowSums(omega)), 3) != N)
stop("Sum over all elements of omega must be equal to the number of observations!\n")
if(any(omega < 0)) stop("No element of omega can be negative! \n")

# --- M-step

# update estimates for the parameters:
for (k in 1:K) mu[k] = sum(omega[,k] * x) / sum(omega[,k])
for (k in 1:K) sigma[k] = sqrt(sum(omega[,k] * (x - mu_t[k])^2) / sum(omega[,k]))
for (k in 1:K) alpha[k] = sum(omega[,k]) / N

# new value for log-likelihood:
logL = rep(0 , N)
for (k in 1:K) logL = logL + alpha[k] * dnorm(x, mu[k], sigma[k])  # k-sum
logL = log(logL)
logL = sum(logL) # i-sum
cat(paste("Log likelihood after iteration", iter, ":", signif(logL,4), "\n"))
Lvector = c(Lvector, logL)    # append the new value

# calculate relative errors:
e_mu     = max(abs((mu - mu_t) / (mu_t + 1e-50)))
e_sigma  = max(abs((sigma - sigma_t) / (sigma_t + 1e-50)))
e_alpha  = max(abs((alpha - alpha_t) / (alpha_t + 1e-50)))
cat(paste("Loop", iter, ":  e_mu =", signif(e_mu,4), "e_sigma =", signif(e_sigma,4), "e_alpha =", signif(e_alpha,5), "\n"))

# need another loop?:
cond1 = e_mu <= e_mu_max          # condition 1, for mu
cond2 = e_sigma <= e_sigma_max    # condition 2, for sigma
cond3 = e_alpha <= e_alpha_max    # condition 3, for alpha
if(cond1 & cond2 & cond3) break   # converged!

# otherwise, continue to next loop:
mu_t = mu
sigma_t = sigma
alpha_t = alpha

} # EM-loop
## Log likelihood after iteration 1 : -3917
## Loop 1 :  e_mu = 0.1234 e_sigma = 0.3755 e_alpha = 0.21419
## Log likelihood after iteration 2 : -3879
## Loop 2 :  e_mu = 0.01209 e_sigma = 0.1926 e_alpha = 0.10784
## Log likelihood after iteration 3 : -3868
## Loop 3 :  e_mu = 0.02953 e_sigma = 0.09795 e_alpha = 0.035248
## Log likelihood after iteration 4 : -3863
## Loop 4 :  e_mu = 0.03159 e_sigma = 0.05879 e_alpha = 0.0081374
## Log likelihood after iteration 5 : -3861
## Loop 5 :  e_mu = 0.02685 e_sigma = 0.03562 e_alpha = 0.010089
## Log likelihood after iteration 6 : -3860
## Loop 6 :  e_mu = 0.01998 e_sigma = 0.02167 e_alpha = 0.0082208
## Log likelihood after iteration 7 : -3860
## Loop 7 :  e_mu = 0.01351 e_sigma = 0.01455 e_alpha = 0.0055981
## Log likelihood after iteration 8 : -3859
## Loop 8 :  e_mu = 0.009768 e_sigma = 0.009986 e_alpha = 0.0040304
## Log likelihood after iteration 9 : -3859
## Loop 9 :  e_mu = 0.007952 e_sigma = 0.0071 e_alpha = 0.0028762
## Log likelihood after iteration 10 : -3859
## Loop 10 :  e_mu = 0.00666 e_sigma = 0.005291 e_alpha = 0.0029458
## Log likelihood after iteration 11 : -3859
## Loop 11 :  e_mu = 0.005749 e_sigma = 0.004166 e_alpha = 0.0035253
## Log likelihood after iteration 12 : -3859
## Loop 12 :  e_mu = 0.0051 e_sigma = 0.00347 e_alpha = 0.0040538
## Log likelihood after iteration 13 : -3859
## Loop 13 :  e_mu = 0.004622 e_sigma = 0.003044 e_alpha = 0.0043344
## Log likelihood after iteration 14 : -3859
## Loop 14 :  e_mu = 0.004258 e_sigma = 0.002785 e_alpha = 0.0044582
## Log likelihood after iteration 15 : -3859
## Loop 15 :  e_mu = 0.003967 e_sigma = 0.002629 e_alpha = 0.0044847
## Log likelihood after iteration 16 : -3859
## Loop 16 :  e_mu = 0.003726 e_sigma = 0.002536 e_alpha = 0.0044521
## Log likelihood after iteration 17 : -3859
## Loop 17 :  e_mu = 0.003517 e_sigma = 0.002479 e_alpha = 0.0043843
## Log likelihood after iteration 18 : -3859
## Loop 18 :  e_mu = 0.003333 e_sigma = 0.002445 e_alpha = 0.0042965
## Log likelihood after iteration 19 : -3859
## Loop 19 :  e_mu = 0.003166 e_sigma = 0.002423 e_alpha = 0.0041982
## Log likelihood after iteration 20 : -3859
## Loop 20 :  e_mu = 0.003013 e_sigma = 0.002407 e_alpha = 0.0040951
## Log likelihood after iteration 21 : -3859
## Loop 21 :  e_mu = 0.002871 e_sigma = 0.002394 e_alpha = 0.0039907
## Log likelihood after iteration 22 : -3859
## Loop 22 :  e_mu = 0.002738 e_sigma = 0.002383 e_alpha = 0.0038871
## Log likelihood after iteration 23 : -3859
## Loop 23 :  e_mu = 0.002613 e_sigma = 0.002371 e_alpha = 0.0037855
## Log likelihood after iteration 24 : -3859
## Loop 24 :  e_mu = 0.002495 e_sigma = 0.002359 e_alpha = 0.0036868
## Log likelihood after iteration 25 : -3859
## Loop 25 :  e_mu = 0.002384 e_sigma = 0.002345 e_alpha = 0.0035911
## Log likelihood after iteration 26 : -3859
## Loop 26 :  e_mu = 0.002279 e_sigma = 0.00233 e_alpha = 0.0034987
## Log likelihood after iteration 27 : -3859
## Loop 27 :  e_mu = 0.00218 e_sigma = 0.002314 e_alpha = 0.0034095
## Log likelihood after iteration 28 : -3859
## Loop 28 :  e_mu = 0.002086 e_sigma = 0.002297 e_alpha = 0.0033236
## Log likelihood after iteration 29 : -3859
## Loop 29 :  e_mu = 0.001997 e_sigma = 0.002278 e_alpha = 0.0032409
## Log likelihood after iteration 30 : -3859
## Loop 30 :  e_mu = 0.001912 e_sigma = 0.002259 e_alpha = 0.0031611
## Log likelihood after iteration 31 : -3859
## Loop 31 :  e_mu = 0.001832 e_sigma = 0.002239 e_alpha = 0.0030843
## Log likelihood after iteration 32 : -3859
## Loop 32 :  e_mu = 0.001756 e_sigma = 0.002218 e_alpha = 0.0030103
## Log likelihood after iteration 33 : -3859
## Loop 33 :  e_mu = 0.001684 e_sigma = 0.002197 e_alpha = 0.002939
## Log likelihood after iteration 34 : -3859
## Loop 34 :  e_mu = 0.001616 e_sigma = 0.002175 e_alpha = 0.0028702
## Log likelihood after iteration 35 : -3859
## Loop 35 :  e_mu = 0.001551 e_sigma = 0.002152 e_alpha = 0.0028039
## Log likelihood after iteration 36 : -3859
## Loop 36 :  e_mu = 0.00149 e_sigma = 0.002129 e_alpha = 0.0027399
## Log likelihood after iteration 37 : -3859
## Loop 37 :  e_mu = 0.001431 e_sigma = 0.002106 e_alpha = 0.0026781
## Log likelihood after iteration 38 : -3859
## Loop 38 :  e_mu = 0.001376 e_sigma = 0.002083 e_alpha = 0.0026185
## Log likelihood after iteration 39 : -3859
## Loop 39 :  e_mu = 0.001323 e_sigma = 0.002059 e_alpha = 0.0025609
## Log likelihood after iteration 40 : -3859
## Loop 40 :  e_mu = 0.001273 e_sigma = 0.002035 e_alpha = 0.0025052
## Log likelihood after iteration 41 : -3859
## Loop 41 :  e_mu = 0.001225 e_sigma = 0.002011 e_alpha = 0.0024514
## Log likelihood after iteration 42 : -3859
## Loop 42 :  e_mu = 0.00118 e_sigma = 0.001987 e_alpha = 0.0023993
## Log likelihood after iteration 43 : -3859
## Loop 43 :  e_mu = 0.001137 e_sigma = 0.001964 e_alpha = 0.0023489
## Log likelihood after iteration 44 : -3859
## Loop 44 :  e_mu = 0.001096 e_sigma = 0.00194 e_alpha = 0.0023002
## Log likelihood after iteration 45 : -3859
## Loop 45 :  e_mu = 0.001057 e_sigma = 0.001916 e_alpha = 0.002253
## Log likelihood after iteration 46 : -3859
## Loop 46 :  e_mu = 0.00102 e_sigma = 0.001892 e_alpha = 0.0022073
## Log likelihood after iteration 47 : -3859
## Loop 47 :  e_mu = 0.0009844 e_sigma = 0.001869 e_alpha = 0.002163
## Log likelihood after iteration 48 : -3859
## Loop 48 :  e_mu = 0.0009507 e_sigma = 0.001845 e_alpha = 0.0021201
## Log likelihood after iteration 49 : -3859
## Loop 49 :  e_mu = 0.0009186 e_sigma = 0.001822 e_alpha = 0.0020785
## Log likelihood after iteration 50 : -3859
## Loop 50 :  e_mu = 0.000888 e_sigma = 0.001799 e_alpha = 0.0020381
## Log likelihood after iteration 51 : -3859
## Loop 51 :  e_mu = 0.0008588 e_sigma = 0.001776 e_alpha = 0.001999
## Log likelihood after iteration 52 : -3859
## Loop 52 :  e_mu = 0.0008309 e_sigma = 0.001753 e_alpha = 0.001961
## Log likelihood after iteration 53 : -3859
## Loop 53 :  e_mu = 0.0008043 e_sigma = 0.00173 e_alpha = 0.0019241
## Log likelihood after iteration 54 : -3859
## Loop 54 :  e_mu = 0.0007789 e_sigma = 0.001708 e_alpha = 0.0018882
## Log likelihood after iteration 55 : -3859
## Loop 55 :  e_mu = 0.0007547 e_sigma = 0.001686 e_alpha = 0.0018534
## Log likelihood after iteration 56 : -3859
## Loop 56 :  e_mu = 0.0007315 e_sigma = 0.001664 e_alpha = 0.0018196
## Log likelihood after iteration 57 : -3859
## Loop 57 :  e_mu = 0.0007094 e_sigma = 0.001643 e_alpha = 0.0017867
## Log likelihood after iteration 58 : -3859
## Loop 58 :  e_mu = 0.0006882 e_sigma = 0.001621 e_alpha = 0.0017547
## Log likelihood after iteration 59 : -3859
## Loop 59 :  e_mu = 0.000668 e_sigma = 0.0016 e_alpha = 0.0017236
## Log likelihood after iteration 60 : -3859
## Loop 60 :  e_mu = 0.0006486 e_sigma = 0.00158 e_alpha = 0.0016933
## Log likelihood after iteration 61 : -3859
## Loop 61 :  e_mu = 0.00063 e_sigma = 0.001559 e_alpha = 0.0016638
## Log likelihood after iteration 62 : -3859
## Loop 62 :  e_mu = 0.0006123 e_sigma = 0.001539 e_alpha = 0.0016352
## Log likelihood after iteration 63 : -3859
## Loop 63 :  e_mu = 0.0005952 e_sigma = 0.001519 e_alpha = 0.0016072
## Log likelihood after iteration 64 : -3859
## Loop 64 :  e_mu = 0.0005789 e_sigma = 0.001499 e_alpha = 0.00158
## Log likelihood after iteration 65 : -3859
## Loop 65 :  e_mu = 0.0005633 e_sigma = 0.00148 e_alpha = 0.0015535
## Log likelihood after iteration 66 : -3859
## Loop 66 :  e_mu = 0.0005482 e_sigma = 0.00146 e_alpha = 0.0015276
## Log likelihood after iteration 67 : -3859
## Loop 67 :  e_mu = 0.0005338 e_sigma = 0.001441 e_alpha = 0.0015024
## Log likelihood after iteration 68 : -3859
## Loop 68 :  e_mu = 0.00052 e_sigma = 0.001423 e_alpha = 0.0014778
## Log likelihood after iteration 69 : -3859
## Loop 69 :  e_mu = 0.0005067 e_sigma = 0.001404 e_alpha = 0.0014538
## Log likelihood after iteration 70 : -3859
## Loop 70 :  e_mu = 0.0004939 e_sigma = 0.001386 e_alpha = 0.0014304
## Log likelihood after iteration 71 : -3859
## Loop 71 :  e_mu = 0.0004816 e_sigma = 0.001368 e_alpha = 0.0014076
## Log likelihood after iteration 72 : -3859
## Loop 72 :  e_mu = 0.0004698 e_sigma = 0.001351 e_alpha = 0.0013853
## Log likelihood after iteration 73 : -3859
## Loop 73 :  e_mu = 0.0004584 e_sigma = 0.001333 e_alpha = 0.0013635
## Log likelihood after iteration 74 : -3859
## Loop 74 :  e_mu = 0.0004475 e_sigma = 0.001316 e_alpha = 0.0013422
## Log likelihood after iteration 75 : -3859
## Loop 75 :  e_mu = 0.0004369 e_sigma = 0.0013 e_alpha = 0.0013214
## Log likelihood after iteration 76 : -3859
## Loop 76 :  e_mu = 0.0004268 e_sigma = 0.001283 e_alpha = 0.0013011
## Log likelihood after iteration 77 : -3859
## Loop 77 :  e_mu = 0.000417 e_sigma = 0.001267 e_alpha = 0.0012813
## Log likelihood after iteration 78 : -3859
## Loop 78 :  e_mu = 0.0004075 e_sigma = 0.00125 e_alpha = 0.0012618
## Log likelihood after iteration 79 : -3859
## Loop 79 :  e_mu = 0.0003984 e_sigma = 0.001235 e_alpha = 0.0012428
## Log likelihood after iteration 80 : -3859
## Loop 80 :  e_mu = 0.0003896 e_sigma = 0.001219 e_alpha = 0.0012243
## Log likelihood after iteration 81 : -3859
## Loop 81 :  e_mu = 0.0003811 e_sigma = 0.001204 e_alpha = 0.0012061
## Log likelihood after iteration 82 : -3859
## Loop 82 :  e_mu = 0.0003729 e_sigma = 0.001188 e_alpha = 0.0011883
## Log likelihood after iteration 83 : -3859
## Loop 83 :  e_mu = 0.0003649 e_sigma = 0.001173 e_alpha = 0.0011709
## Log likelihood after iteration 84 : -3859
## Loop 84 :  e_mu = 0.0003572 e_sigma = 0.001159 e_alpha = 0.0011538
## Log likelihood after iteration 85 : -3859
## Loop 85 :  e_mu = 0.0003498 e_sigma = 0.001144 e_alpha = 0.0011371
## Log likelihood after iteration 86 : -3859
## Loop 86 :  e_mu = 0.0003426 e_sigma = 0.00113 e_alpha = 0.0011207
## Log likelihood after iteration 87 : -3859
## Loop 87 :  e_mu = 0.0003357 e_sigma = 0.001116 e_alpha = 0.0011047
## Log likelihood after iteration 88 : -3859
## Loop 88 :  e_mu = 0.0003289 e_sigma = 0.001102 e_alpha = 0.001089
## Log likelihood after iteration 89 : -3859
## Loop 89 :  e_mu = 0.0003224 e_sigma = 0.001088 e_alpha = 0.0010736
## Log likelihood after iteration 90 : -3859
## Loop 90 :  e_mu = 0.000316 e_sigma = 0.001075 e_alpha = 0.0010585
## Log likelihood after iteration 91 : -3859
## Loop 91 :  e_mu = 0.0003099 e_sigma = 0.001061 e_alpha = 0.0010437
## Log likelihood after iteration 92 : -3859
## Loop 92 :  e_mu = 0.0003039 e_sigma = 0.001048 e_alpha = 0.0010292
## Log likelihood after iteration 93 : -3859
## Loop 93 :  e_mu = 0.0002981 e_sigma = 0.001036 e_alpha = 0.0010149
## Log likelihood after iteration 94 : -3859
## Loop 94 :  e_mu = 0.0002925 e_sigma = 0.001023 e_alpha = 0.0010009
## Log likelihood after iteration 95 : -3859
## Loop 95 :  e_mu = 0.0002871 e_sigma = 0.00101 e_alpha = 0.00098723
## Log likelihood after iteration 96 : -3859
## Loop 96 :  e_mu = 0.0002818 e_sigma = 0.0009979 e_alpha = 0.00097378

### 3. Looking at the results

• note that you’ll never get the same results when runs are repeated because the sample points are randomly generated
• the estimated parameter values:
if(iter == max_iter) {
warning("Not converged!\n\n")
} else {
cat("Converged after", iter, "iterations.\n\n")
}
## Converged after 96 iterations.
## List results:
cat(paste("Mean values of the", K, "clusters:\n")) 
## Mean values of the 3 clusters:
for (k in 1:K) cat(paste("  Cluster", k, "  mu =", signif(mu[k],4), "\n"))
##   Cluster 1   mu = 5.046
##   Cluster 2   mu = 21.47
##   Cluster 3   mu = 39.83
cat(paste("Standard deviations and correlation of the", K, "clusters:\n"))
## Standard deviations and correlation of the 3 clusters:
for (k in 1:K) cat(paste("  Cluster", k, "  sigma =", signif(sigma[k],4), "\n"))
##   Cluster 1   sigma = 4.841
##   Cluster 2   sigma = 8.353
##   Cluster 3   sigma = 3.945
cat(paste("Cluster weigths of the", K, "clusters:\n"))
## Cluster weigths of the 3 clusters:
for (k in 1:K) cat(paste("  Cluster", k, "  alpha =", signif(alpha[k], 4), "\n"))
##   Cluster 1   alpha = 0.4847
##   Cluster 2   alpha = 0.2075
##   Cluster 3   alpha = 0.3079
• a plot of the log-likelihood vs. iteration number
• according to the theory, the log-likelihood can never decrease during any iteration
  plot(Lvector, ylab = "log(L)", main = "Log-likelihood", font.main = 1, xlab = "iter", col = "blue", lwd = 1.5 )
mtext(side = 3, "First value based on first guess", col = "blue", cex = 0.9)

• a histogram of observations, together with the density curves based on the parameter estimations
  h <- hist(x, breaks = 25, plot = FALSE)
height = max(h$density)/30 plot(h, main = "Observations", font.main = 1, col = "lightgrey", freq = FALSE) xmin = min(h$mids)  # objekt h from histogram above
xmax = max(h\$mids)  # objekt h from histogram above
curve(gaussmix(x, mu, sigma, alpha), from = xmin, to = xmax, n = 300, col = "green", lwd = 2, add = TRUE, yaxt = "n")`