which one runs faster ??
which one runs faster ??

for(i=0;i<100;i++)
  for(j=0;j<10;j++)
    a[i][j]=0;

OR

for(j=0;j<10;j++)
  for(i=0;i<100;i++)
    a[i][j]=0;

"onkar" <onkar.@gmail.com> wrote in message news:1180524223.506558.134800@i38g2000prf.googlegroups.com... > which one runs faster ?? > for(i=0;i<100;i++) > for(j=0;j<10;j++) > a[i][j]=0; > OR > for(j=0;j<10;j++) > for(i=0;i<100;i++) > a[i][j]=0;
Time it and you'll know, for your platform, compiler and optimization options. Bye, Jojo
onkar wrote: > which one runs faster ?? > for(i=0;i<100;i++) > for(j=0;j<10;j++) > a[i][j]=0; > OR > for(j=0;j<10;j++) > for(i=0;i<100;i++) > a[i][j]=0;
Why do you expect there is a difference? The inner loop looks identical. -- Tor <torust [at] online [dot] no>
On 30 May, 12:23, onkar <onkar.@gmail.com> wrote: > which one runs faster ?? > for(i=0;i<100;i++) > for(j=0;j<10;j++) > a[i][j]=0; > OR > for(j=0;j<10;j++) > for(i=0;i<100;i++) > a[i][j]=0;
Short Answer: It Depends.
Less Short Answer: It depends on a bunch of things, none of which is directly related to standard C.
Pragmatic Answer: Try both and see.
What I'd Probably Do<tm>: memset(a, 0x00, sizeof a[0][0] * 1000);
Thought Experiment: What would each of your methods do if every array element happened to be around half a memory page in size? Assuming you are on a system with virtual memory, of course.
On May 30, 12:23 pm, onkar <onkar.@gmail.com> wrote: > which one runs faster ??
Yes. > for(i=0;i<100;i++) > for(j=0;j<10;j++) > a[i][j]=0; > OR > for(j=0;j<10;j++) > for(i=0;i<100;i++) > a[i][j]=0;
1. You haven't told me what a[rows][columns] is or what the cache line size is on your machine. If a is a 2-D array of bytes, and the cache line was > 1000 bytes, then it probably won't matter.
2. This is not really about C as such. However, as C is a Row-Major language (see http://en.wikipedia.org/wiki/Row-major_order), it's normally taken to be better to vary the column index faster than the row index, or more generally to order index variation from the right.
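To make the row-major point concrete, here is a minimal sketch that compares the measured byte offset of a[i][j] with the offset predicted by row-major layout. It assumes a is a true 2-D array of int with dimensions [100][10], which the original post never stated; the function names are made up for illustration.

```c
#include <stddef.h>

#define ROWS 100
#define COLS 10

/* Assumption: a real array of arrays of int (the OP did not say). */
static int a[ROWS][COLS];

/* Byte offset of a[i][j] from the start of the array, measured directly. */
size_t measured_offset(size_t i, size_t j)
{
    return (size_t)((const char *)&a[i][j] - (const char *)a);
}

/* Offset predicted by row-major layout: whole rows are stored one after
   another, so stepping j moves one element and stepping i moves COLS
   elements. */
size_t row_major_offset(size_t i, size_t j)
{
    return (i * COLS + j) * sizeof a[0][0];
}
```

Stepping the right-hand index moves to the adjacent element, while stepping the left-hand index skips a whole row of COLS elements, which is why varying j in the inner loop walks memory sequentially.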
onkar wrote: > which one runs faster ?? > for(i=0;i<100;i++) > for(j=0;j<10;j++) > a[i][j]=0; > OR > for(j=0;j<10;j++) > for(i=0;i<100;i++) > a[i][j]=0;
memset(a,0,1000*sizeof(a[0])); That will be optimized in assembly by the system provider, and it will be faster than your loops.
jacob navia wrote: > onkar wrote: >> which one runs faster ?? >> for(i=0;i<100;i++) >> for(j=0;j<10;j++) >> a[i][j]=0; >> OR >> for(j=0;j<10;j++) >> for(i=0;i<100;i++) >> a[i][j]=0; > memset(a,0,1000*sizeof(a[0])); > That will be optimized in assembly by the > system provider, and it will be faster than > your loops.
It will also be wrong. R-O-N-G, wrong. (Hint: On the assumption that the original loops do not invoke undefined behavior, is sizeof a[0] == sizeof a[0][0] possible?) Even after the obvious repair it could still be wrong. No, I'm not talking about exotic machines where all-bits-zero is not "zero" for the type in question. (Hint: which elements of `unsigned char a[120][150]' are affected by the loops and which are affected by memset?) -- Eric Sosman esos@acm-dot-org.invalid
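Both hints can be checked with a small sketch. It uses the `unsigned char a[120][150]` array from the second hint; the helper names are made up for illustration, and the 0xFF marker fill is only there so that cleared elements stand out.

```c
#include <string.h>

/* The array from the hint: larger than the 100 x 10 region the loops touch. */
static unsigned char a[120][150];

/* Fill every element with a nonzero marker so cleared elements stand out. */
static void mark_all(void)
{
    memset(a, 0xFF, sizeof a);
}

/* The original loops: clear a[0..99][0..9] only. */
void clear_with_loops(void)
{
    mark_all();
    for (int i = 0; i < 100; i++)
        for (int j = 0; j < 10; j++)
            a[i][j] = 0;
}

/* The "repaired" memset: clears the first 1000 bytes of the object. */
void clear_with_memset(void)
{
    mark_all();
    memset(a, 0, 1000 * sizeof a[0][0]);
}

/* Number of zero elements inside the 100 x 10 corner the loops target. */
int zeros_in_corner(void)
{
    int n = 0;
    for (int i = 0; i < 100; i++)
        for (int j = 0; j < 10; j++)
            n += (a[i][j] == 0);
    return n;
}

/* Number of zero elements in the whole array. */
int zeros_total(void)
{
    int n = 0;
    for (int i = 0; i < 120; i++)
        for (int j = 0; j < 150; j++)
            n += (a[i][j] == 0);
    return n;
}
```

Here sizeof a[0] is 150 while sizeof a[0][0] is 1, so the unrepaired call would clear 150 times too many bytes. Even the repaired call clears rows 0 to 5 completely and the first 100 elements of row 6, so only 70 of the 1000 elements the loops target end up zero.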
"onkar" <onkar. @gmail.com> wrote: > which one runs faster ?? > for(i=0;i<100;i++) > for(j=0;j<10;j++) > a[i][j]=0; > OR > for(j=0;j<10;j++) > for(i=0;i<100;i++) > a[i][j]=0;
(The following assumes that "a" has been allocated as a single contiguous block of memory, and that the dimensions of a are [100][10]. You didn't state either. If these assumptions are not true, the following does not hold.) Your first set of nested loops might run faster, because it involves a single traversal of the memory, in sequential increments in the same direction. The second jumps back and forth. A compiler might translate the first to machine language as "rep stosl" (writing all the bytes in a single rapid-fire burst). The second, however, might get translated to nested loops, which would be slower. -- Cheers, Robbie Hatley East Tustin, CA, USA lonewolf aatt well dott com triple-dubya dott tustinfreezone dott org
On May 30, 4:23 pm, onkar <onkar.@gmail.com> wrote: > which one runs faster ?? > for(i=0;i<100;i++) > for(j=0;j<10;j++) > a[i][j]=0; > OR > for(j=0;j<10;j++) > for(i=0;i<100;i++) > a[i][j]=0;
Check it yourself using clock(). For help, see the man page for clock. Bye, Guru Jois
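A minimal timing sketch along those lines, using only the standard clock() function. The element type int, the volatile qualifier and the function names are assumptions of this sketch; volatile is there so the compiler cannot discard the repeated stores, but it also restricts how the loops can be optimized, so the numbers describe this harness rather than the OP's real program.

```c
#include <time.h>

#define ROWS 100
#define COLS 10

/* Assumptions: int elements, and volatile so the stores are not removed. */
static volatile int a[ROWS][COLS];

/* CPU seconds spent running the first loop nest (i outer, j inner) `reps` times. */
double time_rows_outer(long reps)
{
    clock_t start = clock();
    for (long r = 0; r < reps; r++)
        for (int i = 0; i < ROWS; i++)
            for (int j = 0; j < COLS; j++)
                a[i][j] = 0;
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}

/* CPU seconds spent running the second loop nest (j outer, i inner) `reps` times. */
double time_columns_outer(long reps)
{
    clock_t start = clock();
    for (long r = 0; r < reps; r++)
        for (int j = 0; j < COLS; j++)
            for (int i = 0; i < ROWS; i++)
                a[i][j] = 0;
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Call both functions from main with a large repetition count (a single pass over 1000 elements is usually below the resolution of clock()), print the two results, and repeat the run with different compiler options.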
On May 30, 7:23 pm, onkar <onkar.@gmail.com> wrote: > which one runs faster ?? > for(i=0;i<100;i++) > for(j=0;j<10;j++) > a[i][j]=0; > OR > for(j=0;j<10;j++) > for(i=0;i<100;i++) > a[i][j]=0;
I thought they would run in the same time, haha. I am new to C/C++.
onkar said: > which one runs faster ??
T a[100][10] = {0}; -- Richard Heathfield "Usenet is a strange place" - dmr 29/7/1999 http://www.cpax.org.uk email: rjh at the above domain, - www.
mark_blue @pobox.com wrote: > On May 30, 12:23 pm, onkar <onkar. @gmail.com> wrote: >> which one runs faster ?? > Yes. >> for(i=0;i<100;i++) >> for(j=0;j<10;j++) >> a[i][j]=0; >> OR >> for(j=0;j<10;j++) >> for(i=0;i<100;i++) >> a[i][j]=0; > 1. You haven't told me what a[rows][columns] is or what the cache line > size on your machine. If a is a 2-D array of bytes, and the cache line > was > 1000 bytes, then it probably won't matter.
For tuning, yes, cache issues are important. There is a technique called *loop blocking*, which addresses this. For example, consider a more interesting loop:

for (i=0; i<MAX; i++)
  for (j=0; j<MAX; j++)
    A[i][j] = A[i][j] * B[j][i];

Now let us rewrite the loop, and select 'block_size' such that the A and B memory chunks touched by one block fit in the cache:

for (i=0; i<MAX; i+=block_size)
  for (j=0; j<MAX; j+=block_size)
    for (k=i; k<i+block_size; k++)
      for (l=j; l<j+block_size; l++)
        A[k][l] = A[k][l] * B[l][k];

This sort of cache optimization is what an optimizing C compiler might do behind the scenes.

> 2. This is not really about C as such. However, as C is a Row-Major
> language (see http://en.wikipedia.org/wiki/Row-major_order), it's
> normally taken to be better to vary the column index faster than the
> row index, or more generally to order index variation from the right.

If you want to make the job easier for the C optimizer, then yes, the inner index should be 'j'. However, for the OP's example, I would be very surprised if a modern optimizing C compiler really cared in which order the programmer arranges 'i' and 'j'. More important is the *readability* of the code. Since the "standard" way to code such a loop is:

for (i=0; i<ROW_MAX; i++)
  for (j=0; j<COL_MAX; j++)

I like to have a very good reason before doing it the other way around.

-- Tor <torust [at] online [dot] no>
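The loop blocking idea above can be written out as a compilable sketch. MAX, BLOCK_SIZE, the element type double and all function names are assumptions chosen for illustration; MAX must be a multiple of BLOCK_SIZE for the blocked version to be valid, and a good block size depends on the target cache. The last function only checks that both versions compute the same values, it says nothing about speed.

```c
#include <stddef.h>

#define MAX        64   /* assumed size; must be a multiple of BLOCK_SIZE */
#define BLOCK_SIZE 8    /* assumed block size; tune for the target cache  */

/* Straightforward version: B is read column by column. */
void multiply_naive(double A[MAX][MAX], double B[MAX][MAX])
{
    for (size_t i = 0; i < MAX; i++)
        for (size_t j = 0; j < MAX; j++)
            A[i][j] = A[i][j] * B[j][i];
}

/* Blocked version: work on BLOCK_SIZE x BLOCK_SIZE tiles so the parts of
   A and B in use are revisited while they are still likely to be cached. */
void multiply_blocked(double A[MAX][MAX], double B[MAX][MAX])
{
    for (size_t i = 0; i < MAX; i += BLOCK_SIZE)
        for (size_t j = 0; j < MAX; j += BLOCK_SIZE)
            for (size_t k = i; k < i + BLOCK_SIZE; k++)
                for (size_t l = j; l < j + BLOCK_SIZE; l++)
                    A[k][l] = A[k][l] * B[l][k];
}

/* Returns 1 if both versions produce identical results on test data. */
int blocked_matches_naive(void)
{
    static double A1[MAX][MAX], A2[MAX][MAX], B[MAX][MAX];

    for (size_t i = 0; i < MAX; i++)
        for (size_t j = 0; j < MAX; j++) {
            A1[i][j] = A2[i][j] = (double)(i + 2 * j);
            B[i][j] = (double)(3 * i + j + 1);
        }

    multiply_naive(A1, B);
    multiply_blocked(A2, B);

    for (size_t i = 0; i < MAX; i++)
        for (size_t j = 0; j < MAX; j++)
            if (A1[i][j] != A2[i][j])
                return 0;
    return 1;
}
```

Every element of A is updated exactly once in both versions, so only the order of memory accesses changes. Whether the blocked order is actually faster has to be measured on the machine in question.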
Richard Heathfield wrote: > onkar said: >> which one runs faster ?? > T a[100][10] = {0};
Chapter and verse please. :P -- Tor <torust [at] online [dot] no>
Tor Rustad wrote: > Richard Heathfield wrote: > > onkar said: > >> which one runs faster ?? > > T a[100][10] = {0}; > Chapter and verse please. :P
It doesn't matter which one of those two runs faster. They're not the same. This one is a declaration. The other one was a statement. -- pete
On May 30, 12:17 pm, Tor Rustad <tor_rus@hotmail.com> wrote: > Chapter and verse please. :P
Let's make it: static T a[100][10] = {0}; Then we have: "5.1.2 Execution environments 1 Two execution environments are defined: freestanding and hosted. In both cases, program startup occurs when a designated C function is called by the execution environment. All objects with static storage duration shall be initialized (set to their initial values) before program startup. The manner and timing of such initialization are otherwise unspecified. Program termination returns control to the execution environment." Giving a time of zero seconds.
Guru Jois <guru.j @gmail.com> writes: > On May 30, 4:23 pm, onkar <onkar. @gmail.com> wrote: >> which one runs faster ?? >> for(i=0;i<100;i++) >> for(j=0;j<10;j++) >> a[i][j]=0; >> OR >> for(j=0;j<10;j++) >> for(i=0;i<100;i++) >> a[i][j]=0; > check it yourself using clock() > help - use man page for clock
Better yet, use a profiler if your system provides one. (Details of profiler use are off-topic.) -- Keith Thompson (The_Other_Keith) k@mib.org <http://www.ghoti.net/~kst> San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst> "We must do something. This is something. Therefore, we must do this." -- Antony Jay and Jonathan Lynn, "Yes Minister"
On May 30, 12:23 pm, onkar <onkar.@gmail.com> wrote: > which one runs faster ?? > for(i=0;i<100;i++) > for(j=0;j<10;j++) > a[i][j]=0; > OR > for(j=0;j<10;j++) > for(i=0;i<100;i++) > a[i][j]=0;
If you want to actually learn something: Write a program that executes this code and measure the execution time. Then vary the numbers (100 and 10). Study what happens. Check if anything unusual happens. If anything unusual happens, write down exactly what happens, post a complete program, and someone will explain to you _why_ it happens. The code that you posted is useless, because we don't actually know what it is doing. Is a an array of arrays of int, or is it an array of pointers to int, or is it a pointer to arrays of ints, or a pointer to an array of pointers to int? Huge difference.
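As a sketch of why the declaration matters, the four possibilities listed above can be written side by side. The names a1 to a4, setup, clear_all and teardown are hypothetical, as are the dimensions [100][10] and the element type int. The same a[i][j]=0 expression compiles for all four, but the memory behind it is laid out very differently.

```c
#include <stdlib.h>

enum { ROWS = 100, COLS = 10 };

int   a1[ROWS][COLS];  /* array of arrays of int: one contiguous block      */
int  *a2[ROWS];        /* array of pointers to int: each row allocated apart */
int (*a3)[COLS];       /* pointer to arrays of int: contiguous once allocated */
int **a4;              /* pointer to an array of pointers to int             */

/* Allocate storage for a2, a3 and a4. Returns 0 on success, -1 on failure
   (cleanup after a partial failure is omitted to keep the sketch short). */
int setup(void)
{
    a3 = malloc(ROWS * sizeof *a3);
    a4 = malloc(ROWS * sizeof *a4);
    if (a3 == NULL || a4 == NULL)
        return -1;
    for (int i = 0; i < ROWS; i++) {
        a2[i] = malloc(COLS * sizeof *a2[i]);
        a4[i] = malloc(COLS * sizeof *a4[i]);
        if (a2[i] == NULL || a4[i] == NULL)
            return -1;
    }
    return 0;
}

/* The OP's first loop nest, applied to every variant. */
void clear_all(void)
{
    for (int i = 0; i < ROWS; i++)
        for (int j = 0; j < COLS; j++) {
            a1[i][j] = 0;  /* address computed from i and j alone */
            a2[i][j] = 0;  /* loads the row pointer a2[i] first   */
            a3[i][j] = 0;  /* contiguous, reached through a3      */
            a4[i][j] = 0;  /* loads a4, then the row pointer a4[i] */
        }
}

/* Release everything allocated by a successful setup(). */
void teardown(void)
{
    for (int i = 0; i < ROWS; i++) {
        free(a2[i]);
        free(a4[i]);
    }
    free(a3);
    free(a4);
}
```

For a1 and a3 the rows are adjacent in memory, so loop order decides whether the walk is sequential. For a2 and a4 the rows can be anywhere on the heap, and each access goes through an extra pointer load, so the timing picture can change completely.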
Tak <kakat@gmail.com> wrote: > On May 30, 7:23 pm, onkar <onkar. @gmail.com> wrote: >> which one runs faster ?? >> for(i=0;i<100;i++) >> for(j=0;j<10;j++) >> a[i][j]=0; >> OR >> for(j=0;j<10;j++) >> for(i=0;i<100;i++) >> a[i][j]=0; > I thought them run in the same time, haha , I am new at c/c++;
Leaving the topic of clc (that is: speaking OT): in practice, both can differ significantly, but they do not need to. If the compiler translates these "literally", you might even end up thrashing your virtual memory (making this statement even more OT), resulting in factors of 1000 or more between both statements. Of course, for this to happen, it is likely that the values 100 and 10 must be much bigger than in the "samples" above. Ok, now I'll shut my OT mouth. ;) Regards, Spiro. -- Spiro R. Trikaliotis http://opencbm.sf.net/ http://www.trikaliotis.net/ http://www.viceteam.org/