Fastest algorithm to find the weights to maximize the log likelihood function

Dear all,

For example, n and K are given integer (you can assume n=K), also given all the values of p_ik, I want to find the weights from w_1 to w_K (the constraint is w1 + w2 + w3 ... + w_K =1), such that the below function f (which is actually the log likelihood function) is maximized,

What is the fastest algorithm (and/or Fortran code) to find these weights w_1, w_2, …, w_K?

Thanks much in advance!


One slow solution could be using the below relation to do iteration and find each w_k,


In the problem, we usually assume n=K. Therefore the above iteration algorithm is slow especially if n is big. It is O( # of Iteration * n^2) algorithm.

I would search the max of A, B, C… and put its corresponding weight to 1, and all the other weights to zero. :slight_smile:
But I guess you mean something more complicated in your question…

1 Like

Thank you so much @vmagnin , :+1: LOL :rofl:
Right right right, I just got up then, and that naïve eqution is really naive, LOL. :rofl:
I updated my question. Thank you very much for pointing that out, you solution for my initial question is absolutely right :laughing:

1 Like

Less trivial… :grin:

My intuition is that some mathematical treatment could help. I would try to take and maximize the exponential of f, which would give in a very simplified notation:

e^f = e^{\sum ln} = \prod e^{ln} = \prod (\sum w_k p_{ik} )

That is a polynomial of K variables. It looks as a more sympathetic starting point…

1 Like

Thank you very much @vmagnin :laughing:
Yeah you got the point, the f function is actually log likelihood.
The purpose is really to maximize the likelihood which is exactly what you are calculating e^f.

One possible issue of calculating e^f is that, if
is small, say, 10^-10, then if we have 100 product of 10^-10, it may underflow. Similarly, it may also overflow.
While using log likelihood, product become sum, such things could be greatly prevented.

By the way, how did you type Latex here? Thanks :+1:

For LaTeX, there is a plugin in the Discourse. Just type something between $ and $:

$ e^f = ... $

Concerning your problem, you won’t have to compute e^f in the program. You would just have to maximize your polynomial \prod (\sum w_k p_{ik}). The w_k values maximizing it will also maximize f (as exponential is monotonically increasing).
Maybe some more mathematics are possible before looking for the optimizing algorithm?

With you constraint, you have K-1 unknown w_k.

1 Like

Awesome thank you @vmagnin ! You are absolutely right, there are K-1 unknow w_k.
This is an optimization problem, with K-1 unknown parameters(weights) w_k.

In fact there are also parameters in p_{ik}. I mean p_{ik} is actually p_{ik}(\phi_{ik}) where \phi_{ik} are the parameters.

I actually use simulated annealing (SA) to do this problem to maximize function f.
I need to find all the optimum parameters \phi_{ik} in p_{ik}, as well as K-1 unknown weights w_k.
In SA, the penalty is, if for some proposed weight w_l, the last weight which is the K^{th} weight w_K=1-\sum_{k=1}^{K-1} w_k <0 , such a proposed weight w_l will be rejected.

But I was thinking other ways of doing SA. We always need to calculate the objective function f which depend on parameters \phi_{ik} in p_{ik} as well as weights.
Now, instead of varying \phi_{ik} in p_{ik} and the weights together and then calculate the f, I only vary the \phi_{ik} in all the p_{ik}, then based on these parameters (you can think of all the p_{ik} are fixed now) calculate the optimum weights. This is the motivation of this question.

However, for SA, perhaps it is better just to vary the parameters \phi_{ik} in p_{ik} and the weights together, instead of ‘calculate’ the weights. Because it need to vary \phi_{ik} in p_{ik} many many times, and if each time the weights needs to be calcualted, it will be very slow. Perhaps it is better just to vary the parameters \phi_{ik} in p_{ik} and the weights together. But again, like you said, there are K-1 unknow w_k. :+1:

Have you tried GAMS (

A search for “likelihood” produces several hits:

1 Like

I don’t know which way is the best for your problem, and which optimization method will be the fastest.

If you work from the multivariate polynomial, and supposing the p_{ik} fixed, you know that the partial derivatives can be computed everywhere. That may help in choosing an optimization method.

1 Like