# Fastest algorithm to find the weights to maximize the log likelihood function

Dear all,

For example, n and K are given integer (you can assume n=K), also given all the values of p_ik, I want to find the weights from w_1 to w_K (the constraint is w1 + w2 + w3 ... + w_K =1), such that the below function f (which is actually the log likelihood function) is maximized,

What is the fastest algorithm (and/or Fortran code) to find these weights w_1, w_2, …, w_K?

PS.

One slow solution could be using the below relation to do iteration and find each w_k,

In the problem, we usually assume n=K. Therefore the above iteration algorithm is slow especially if n is big. It is O( # of Iteration * n^2) algorithm.

I would search the max of A, B, C… and put its corresponding weight to 1, and all the other weights to zero.
But I guess you mean something more complicated in your question…

1 Like

Thank you so much @vmagnin , LOL
Right right right, I just got up then, and that naïve eqution is really naive, LOL.
I updated my question. Thank you very much for pointing that out, you solution for my initial question is absolutely right

1 Like

Less trivial…

My intuition is that some mathematical treatment could help. I would try to take and maximize the exponential of f, which would give in a very simplified notation:

e^f = e^{\sum ln} = \prod e^{ln} = \prod (\sum w_k p_{ik} )

That is a polynomial of K variables. It looks as a more sympathetic starting point…

1 Like

Thank you very much @vmagnin
Yeah you got the point, the f function is actually log likelihood.
The purpose is really to maximize the likelihood which is exactly what you are calculating e^f.

One possible issue of calculating e^f is that, if

is small, say, 10^-10, then if we have 100 product of 10^-10, it may underflow. Similarly, it may also overflow.
While using log likelihood, product become sum, such things could be greatly prevented.

By the way, how did you type Latex here? Thanks

For LaTeX, there is a plugin in the Discourse. Just type something between $and$:

$e^f = ...$


Concerning your problem, you won’t have to compute e^f in the program. You would just have to maximize your polynomial \prod (\sum w_k p_{ik}). The w_k values maximizing it will also maximize f (as exponential is monotonically increasing).
Maybe some more mathematics are possible before looking for the optimizing algorithm?

With you constraint, you have K-1 unknown w_k.

1 Like

Awesome thank you @vmagnin ! You are absolutely right, there are K-1 unknow w_k.
This is an optimization problem, with K-1 unknown parameters(weights) w_k.

In fact there are also parameters in p_{ik}. I mean p_{ik} is actually p_{ik}(\phi_{ik}) where \phi_{ik} are the parameters.

I actually use simulated annealing (SA) to do this problem to maximize function f.
I need to find all the optimum parameters \phi_{ik} in p_{ik}, as well as K-1 unknown weights w_k.
In SA, the penalty is, if for some proposed weight w_l, the last weight which is the K^{th} weight w_K=1-\sum_{k=1}^{K-1} w_k <0 , such a proposed weight w_l will be rejected.

But I was thinking other ways of doing SA. We always need to calculate the objective function f which depend on parameters \phi_{ik} in p_{ik} as well as weights.
Now, instead of varying \phi_{ik} in p_{ik} and the weights together and then calculate the f, I only vary the \phi_{ik} in all the p_{ik}, then based on these parameters (you can think of all the p_{ik} are fixed now) calculate the optimum weights. This is the motivation of this question.

However, for SA, perhaps it is better just to vary the parameters \phi_{ik} in p_{ik} and the weights together, instead of ‘calculate’ the weights. Because it need to vary \phi_{ik} in p_{ik} many many times, and if each time the weights needs to be calcualted, it will be very slow. Perhaps it is better just to vary the parameters \phi_{ik} in p_{ik} and the weights together. But again, like you said, there are K-1 unknow w_k.

Have you tried GAMS (https://gams.nist.gov/)?

A search for “likelihood” produces several hits:
https://gams.nist.gov/cgi-bin/serve.cgi/NamedModules?mode=search&Name=&package=&class=&language=&keyword=likelihood

1 Like

I don’t know which way is the best for your problem, and which optimization method will be the fastest.

If you work from the multivariate polynomial, and supposing the p_{ik} fixed, you know that the partial derivatives can be computed everywhere. That may help in choosing an optimization method.

1 Like