




Fortran Programming Language









many "ifs" versus linked list of true cases
I want to solve a large set, say 5000, of numbered, quite similar problems but with different values of certain arguments in each case. Solution of each case is by iteration; the number of iterations required for a given case could be large or small, depending on the values of parameters. The maximum number of necessary iterations is perhaps 50, but only a small proportion of cases are likely to require very many iterations. For particular reasons I want my outer loop to be iteration number; so that on a given iteration, I may or may not need to process a given case, according to whether or not it has been solved. My question is, which is more efficient: on each iteration, to ask in each of 5000 cases whether say "solved(case) == .TRUE.", or to maintain a linked list integer indices of sofarunsolved cases and cycle over that list each time, deleting cases from the list as they are solved? I have a suspicion that "it depends," but if anyone has a sense that one method is likely to be better, I'd like to know about it. There was some suggestion in earlier discussions here that allocations aren't all that cheap, so is the seeming elegance of not always asking "is this case solved?" is outweighed by the expense of that?
Mark Morss wrote: > I want to solve a large set, say 5000, of numbered, quite similar > problems but with different values of certain arguments in each case. > Solution of each case is by iteration; the number of iterations > required for a given case could be large or small, depending on the > values of parameters. The maximum number of necessary iterations is > perhaps 50, but only a small proportion of cases are likely to require > very many iterations. > For particular reasons I want my outer loop to be iteration number; so > that on a given iteration, I may or may not need to process a given > case, according to whether or not it has been solved. > My question is, which is more efficient: on each iteration, to ask in > each of 5000 cases whether say "solved(case) == .TRUE.", or to > maintain a linked list integer indices of sofarunsolved cases and > cycle over that list each time, deleting cases from the list as they > are solved? > I have a suspicion that "it depends," but if anyone has a sense that > one method is likely to be better, I'd like to know about it. There > was some suggestion in earlier discussions here that allocations > aren't all that cheap, so is the seeming elegance of not always asking > "is this case solved?" is outweighed by the expense of that?
The short answer is that 250,000 executions of an "if" statement might not cost you enough to make this worth worrying about. But you probably knew that. If you'd rather do the linked list, you don't have to allocate a node for each case; you allocate one integer array, next_unsolved(0:cases), initialize next_unsolved(i) = i + 1 for i < cases and next_unsolved(cases) = 1. Pretend case 0 is always unsolved. When next_unsolved(0) .eq. 1, you're done. Or something like that. Have fun. Louis
On 21 May, 19:59, Mark Morss <mfmo@aep.com> wrote:
> I want to solve a large set, say 5000, of numbered, quite similar > problems but with different values of certain arguments in each case. > Solution of each case is by iteration; the number of iterations > required for a given case could be large or small, depending on the > values of parameters. The maximum number of necessary iterations is > perhaps 50, but only a small proportion of cases are likely to require > very many iterations. > For particular reasons I want my outer loop to be iteration number; so > that on a given iteration, I may or may not need to process a given > case, according to whether or not it has been solved. > My question is, which is more efficient: on each iteration, to ask in > each of 5000 cases whether say "solved(case) == .TRUE.", or to > maintain a linked list integer indices of sofarunsolved cases and > cycle over that list each time, deleting cases from the list as they > are solved? > I have a suspicion that "it depends," but if anyone has a sense that > one method is likely to be better, I'd like to know about it. There > was some suggestion in earlier discussions here that allocations > aren't all that cheap, so is the seeming elegance of not always asking > "is this case solved?" is outweighed by the expense of that?
Why should it need allocation to ask if a particular case is solved? Is "solved(case)" a function, or just an array element? If it's an array then there need be no allocation in the loop  it's just a simple test on an array element. You can set up the array just once before the iteration starts. So, I'm not 100% clear what you are trying to do  but it seems you are trying to solve ~5000 similar problems all at once in one big iteration loop. Do you have a really good reason to try and do all 5000 cases all at once, instead of doing them one at a time?! In your first iteration, when *all* the problems are unsolved, you may need ~5000 times as much memory if you solve them all at once than if you solve them one after another! Can you keep it simple and save memory by solving them one at a time? Just do all the iterations on problem 1 until it's solved, then the same for problem 2, then 3,... The code might be shorter and simpler to test too. But anyway, let's assume you do *have *to solve them all at once, then: a) Computing 5000 'if' statements don't take long on a modern PC  just seconds or less. So, if checking whether a particular problem is solved with a single 'if' statement is a *substantial* part of the code run time then it seems each problem must run so quickly that it's not worth worrying about run time at all... (unless you're an optimisation fanatic for the love of it, in which case you will try both approaches and 10 others too just for fun :) ) or b) If each problem *does* take a while to solve, then the work required to perform check is likely to be *minuscule* compared to the work done solving the 5000 problems! So I would suggest you concentrate on improving the algorithm used to solve each problem and keep the other code as simple as possible. Linked list codes will be complex to write, test and document. Concentrate on coding and debugging time as well as run time  IMHO it would be very easy to waste days of coding in order to save a few seconds (and life's short enough already)! Good luck, andy
Mark Morss wrote: > I want to solve a large set, say 5000, of numbered, quite similar > problems but with different values of certain arguments in each case. > Solution of each case is by iteration; the number of iterations > required for a given case could be large or small, depending on the > values of parameters. The maximum number of necessary iterations is > perhaps 50, but only a small proportion of cases are likely to require > very many iterations. > For particular reasons I want my outer loop to be iteration number; so > that on a given iteration, I may or may not need to process a given > case, according to whether or not it has been solved. > My question is, which is more efficient: on each iteration, to ask in > each of 5000 cases whether say "solved(case) == .TRUE.", or to > maintain a linked list integer indices of sofarunsolved cases and > cycle over that list each time, deleting cases from the list as they > are solved? > I have a suspicion that "it depends," but if anyone has a sense that > one method is likely to be better, I'd like to know about it. There > was some suggestion in earlier discussions here that allocations > aren't all that cheap, so is the seeming elegance of not always asking > "is this case solved?" is outweighed by the expense of that?
Just to ensure I grok what you mean: You have to solve 5000 similar equations in an iterative loop and you want to check which of those have been solved in each loop trip since the number of iterations required may be slightly different for each equation? What about something like: INTEGER, PARAMETER :: N = 5000, MAX_ITER = 100 LOGICAL :: solved(N) = .FALSE. INTEGER :: unsolved(N) INTEGER :: i, m, iter iter = 0 iteration_loop: DO iter = iter + 1 IF ( ALL(solved) .OR. iter > MAX_ITER ) EXIT iteration_loop m = COUNT(.NOT. solved) unsolved(1:m) = PACK((/(i,i=1,N)/), .not.solved) ! Try to solve remaining unsolved equations unsolved_loop: DO i = 1, m ... solution code ... ! If equation was solved, update logical list ! and move to the next equation in this iteration IF ( ...solution reached... ) THEN solved(unsolved(i)) = .TRUE. CYCLE unsolved_loop END IF ... other stuff, e.g. updating parameters for next ... ... iteration for this as yet unsolved equation? ... END DO unsolved_loop END DO iteration_loop WRITE(*,*)ALL(solved) ! T if all solved, F if max iterations reached ? Of course, you would most likely replace the inner "unsolved_loop" with some fancy matrix manipulation but you get the idea. cheers, paulv  Paul van Delst Ride lots. CIMSS @ NOAA/NCEP/EMC Eddy Merckx
Mark Morss wrote: > I want to solve a large set, say 5000, of numbered, quite similar > problems but with different values of certain arguments in each case. > Solution of each case is by iteration; the number of iterations > required for a given case could be large or small, depending on the > values of parameters. The maximum number of necessary iterations is > perhaps 50, but only a small proportion of cases are likely to require > very many iterations. > For particular reasons I want my outer loop to be iteration number; so > that on a given iteration, I may or may not need to process a given > case, according to whether or not it has been solved. > My question is, which is more efficient: on each iteration, to ask in > each of 5000 cases whether say "solved(case) == .TRUE.", or to > maintain a linked list integer indices of sofarunsolved cases and > cycle over that list each time, deleting cases from the list as they > are solved? > I have a suspicion that "it depends," but if anyone has a sense that > one method is likely to be better, I'd like to know about it. There > was some suggestion in earlier discussions here that allocations > aren't all that cheap, so is the seeming elegance of not always asking > "is this case solved?" is outweighed by the expense of that?
The general rule for determining which method is faster is to try it both ways. If speed really is important the time spent coding will be insignificant. If speed isn't really important why worry about which method to use?
Mark Morss wrote: > My question is, which is more efficient: on each iteration, to ask in > each of 5000 cases whether say "solved(case) == .TRUE.", or to > maintain a linked list integer indices of sofarunsolved cases and > cycle over that list each time, deleting cases from the list as they > are solved? > I have a suspicion that "it depends," but if anyone has a sense that > one method is likely to be better, I'd like to know about it. There > was some suggestion in earlier discussions here that allocations > aren't all that cheap, so is the seeming elegance of not always asking > "is this case solved?" is outweighed by the expense of that?
Asking in 5000 cases whether solved(case) == .TRUE. will take an entirely negligible amount of time. The code for it is one line, perhaps three at the most, and trivally simple. Getting a linkedlist thingy to work will most likely take a nonnegligible amount of programming time, and will likely be buggy on the first draft. It cannot possibly save any significant amount of runtime, because there isn't a significant amount of runtime to be saved. I think that gives you a pretty clear winner. (It may be that the second one could be done fairly simply too, but I think I'd save that for after I had the system working.)  Brooks  The "bmosesnospam" address is valid; no unmunging needed.
On May 21, 2:59 pm, Mark Morss <mfmo@aep.com> wrote:
> I want to solve a large set, say 5000, of numbered, quite similar > problems but with different values of certain arguments in each case. > Solution of each case is by iteration; the number of iterations > required for a given case could be large or small, depending on the > values of parameters. The maximum number of necessary iterations is > perhaps 50, but only a small proportion of cases are likely to require > very many iterations. > For particular reasons I want my outer loop to be iteration number; so > that on a given iteration, I may or may not need to process a given > case, according to whether or not it has been solved. > My question is, which is more efficient: on each iteration, to ask in > each of 5000 cases whether say "solved(case) == .TRUE.", or to > maintain a linked list integer indices of sofarunsolved cases and > cycle over that list each time, deleting cases from the list as they > are solved? > I have a suspicion that "it depends," but if anyone has a sense that > one method is likely to be better, I'd like to know about it. There > was some suggestion in earlier discussions here that allocations > aren't all that cheap, so is the seeming elegance of not always asking > "is this case solved?" is outweighed by the expense of that?
Everyone gave excellent advice; thanks very much to all for taking the time.





