|
|
 |
 |
 |
 |
TCL(Tool Command Language) Scripting
|
 |
 |
 |
 |
 |
 |
 |
 |
Array defaults vs exception overhead
Hi, The following idiom happens often in my code: if {[info exists a($x)]} {do_something $a($x)} Problem: the above does the hash lookup twice. Argh. A natural answer is: catch {do_something $a($x)} But unfortunately, this is even slower (from 7.6 to 8.4.14) due to the exception unwinding overhead. Moreover, it is not strictly equivalent since it loses any exception from [do_something]. To restore equivalence, we'd need pattern matching on the error... even slower ! Any better idiom doing just one hash lookup ? Another approach would be to have "array defaults": an option to substitute a common value to all undefined slots of an array instead of raising an exception. This one would also allow things like a blind "incr a($x)". I remember an old discussion on this. What's the current status ? Yet another approach would be more general, and akin to a Bourne Shell mechanism: "variable defaults". The idea is to have ${foo:-bar} yield $foo or bar if foo is undefined. The natural extension to arrays yields do_something ${a($x):-nope} (here [do_something] is supposed to handle 'nope' innocuously, at least faster than a 2nd hash lookup) Reactions ? -Alex
"Alexandre Ferrieux" <alexandre.ferri @gmail.com> wrote in message news:1178838889.535545.242780@o5g2000hsb.googlegroups.com... > Hi, > The following idiom happens often in my code: > if {[info exists a($x)]} {do_something $a($x)} > Problem: the above does the hash lookup twice. Argh. > A natural answer is: > catch {do_something $a($x)} > But unfortunately, this is even slower (from 7.6 to 8.4.14) due to the > exception unwinding overhead.
It sort of depends on what percentage of time you expect the array slot to be undefined. If slot is usually defined then the catch is overall less overhead, otherwise info exist wins. I have also long wished for a more concise solution to this situation. But speed is usually not an issue as [info exists] is really quite fast. Still you end up coding the array reference twice which can be annoying when the index is built of 3 or 4 substrings. Sometimes I use a macro to automatically test and fetch in one call. I wonder, does the Tcl interp "cache" array lookups so that immediately following identical look ups are faster? Doubtful. > Moreover, it is not strictly equivalent since it loses any exception > from [do_something]. > To restore equivalence, we'd need pattern matching on the error... > even slower ! > Any better idiom doing just one hash lookup ? > Another approach would be to have "array defaults": an option to > substitute a common value to all undefined slots of an array instead > of raising an exception. This one would also allow things like a blind > "incr a($x)". > I remember an old discussion on this. What's the current status ?
Re incr, that issue is addressed in 8.5 (it auto creates the variable). I can't see "array defaults" going anywhere, but who knows. My thought (all of 10 seconds): info value a(x) defaultValue Could have info exists baked-in and if non-existent, just return the supplied default otherwise return the wanted value. > Yet another approach would be more general, and akin to a Bourne Shell > mechanism: "variable defaults". > The idea is to have ${foo:-bar} yield $foo or bar if foo is undefined. > The natural extension to arrays yields > do_something ${a($x):-nope}
Interesting. However I'd don't think I'd enjoy reading Tcl Core comments on a TIP of this sort :^( Roy
> (here [do_something] is supposed to handle 'nope' innocuously, at > least faster than a 2nd hash lookup) > Reactions ? > -Alex
> The following idiom happens often in my code: > if {[info exists a($x)]} {do_something $a($x)} > Problem: the above does the hash lookup twice. Argh. > A natural answer is: > catch {do_something $a($x)} > But unfortunately, this is even slower (from 7.6 to 8.4.14) due to the > exception unwinding overhead. > Moreover, it is not strictly equivalent since it loses any exception > from [do_something]. > To restore equivalence, we'd need pattern matching on the error... > even slower ! > Any better idiom doing just one hash lookup ?
upvar 0 a($x) val if {[info exists val]} {do_something $val} I've not tested whether it's faster or slower... -- Koen
Alexandre Ferrieux wrote: > Hi, > The following idiom happens often in my code: > if {[info exists a($x)]} {do_something $a($x)} > Problem: the above does the hash lookup twice. Argh.
Is this really a problem? If it really is, then you could use [array get]: foreach {_ val} [array get a $x] { do_something $val } Alternatively, if you want a default value, then you can do: do_something [append a($x) ""] do_something [incr a($x)] ;# 8.5 ... > Yet another approach would be more general, and akin to a Bourne Shell > mechanism: "variable defaults". > The idea is to have ${foo:-bar} yield $foo or bar if foo is undefined. > The natural extension to arrays yields > do_something ${a($x):-nope}
Ewwwww. Yuck! :) -- Neil
On May 11, 4:32 pm, Neil Madden <n@cs.nott.ac.uk> wrote: > Alexandre Ferrieux wrote: > > Yet another approach would be more general, and akin to a Bourne Shell > > mechanism: "variable defaults". > > The idea is to have ${foo:-bar} yield $foo or bar if foo is undefined. > > The natural extension to arrays yields > > do_something ${a($x):-nope} > Ewwwww. Yuck! :) > -- Neil
Yucky only if implemented as a syntax extension. It looks clean implemented in tcl: proc get {var {default {}}} { upvar 1 $var v if {[info exists v]} { return $v } return $default } # Usage (as per original example): do_something [get a($x) -nope]
On May 11, 10:32 am, Neil Madden <n@cs.nott.ac.uk> wrote: > Alexandre Ferrieux wrote: > > if {[info exists a($x)]} {do_something $a($x)} > > Problem: the above does the hash lookup twice. Argh. > Is this really a problem? If it really is, then you could use [array get]: > foreach {_ val} [array get a $x] { do_something $val }
Does [array get a $x] optimize for the case where $x has no wildcards ? If yes, you win. If no, then [array get a $x] will replace a hash lookup by a linear search. Which yucks, as you say ;-) > Alternatively, if you want a default value, then you can do: > do_something [append a($x) ""] > do_something [incr a($x)] ;# 8.5
I don't want side-effects populating the array ! The above will fill the hashtable... > > do_something ${a($x):-nope} > Ewwwww. Yuck! :)
Do you Yuck at the precise {:-} syntax, or at the fact that it is a syntax extension ? -Alex
On May 11, 7:05 pm, Alexandre Ferrieux <alexandre.ferri@gmail.com> wrote: > On May 11, 10:32 am, Neil Madden <n @cs.nott.ac.uk> wrote: > > Alexandre Ferrieux wrote: > > > do_something ${a($x):-nope} > > Ewwwww. Yuck! :) > Do you Yuck at the precise {:-} syntax, or at the fact that it is a > syntax extension ?
Don't know about Neil but for me personally YUCK for both. There is a very simple rule in Tcl: {} is non-subst quoting. Even when we use it as a syntax to disambiguate $ substitution like ${x}y the behavior of {} haven't changed, it's still non-subst quoting. Making it behave any other way will lead to quoting hell. I've had enough of that from "" thankyouverymuch, now don't go messing with my {}! Concerning the fact that it's a syntax extension, we can easily achieve the same effect, even preserving the format of the syntax, using a regular proc. Lispers consider doing it with regular proc "syntax extension". Of course, since it looks so ordinary and easy in tcl, Tclers don't consider it a "syntax extension". No, instead every once in a while we get suggestions to introduce even more special characters to tcl. Perhaps we really should be thinking more like the Lisp people and do it with procs whenever we feel like "extending" the syntax. Only in cases where it cannot be implemented as procs/commands should we consider "real" syntax extension.
On May 11, 12:20 pm, "slebet@yahoo.com" <slebet@gmail.com> wrote: > On May 11, 4:32 pm, Neil Madden <n @cs.nott.ac.uk> wrote: > > Alexandre Ferrieux wrote: > > > do_something ${a($x):-nope} > > Ewwwww. Yuck! :) > Yucky only if implemented as a syntax extension. It looks clean > implemented in tcl: > do_something [get a($x) -nope]
The idea of moving out of syntax extensions is good. (I mean, I'd have preferred the {:-} but it seems /bin/sh has fewer fans than I expected). Now, to implement efficiently your [get], even at the C level, we need to cancel the overhead of exception unwinding. I have no definitive explanation, but I suspect it is linked with the buildup of ::errorInfo. So we'd need an internal flag saying "don't build errorInfo when crawling back from exception". And come to think of it, this flag could even be put to use at the tcl level: fastcatch {code} err --> like [catch], but without building errorInfo. Of course this flag would be set/reset in nested fashion by the various [catch] and [fatscatch]: catch { f fastcatch { g catch { h } ;# here I have errorInfo from h } ;# here I have no errorInfo } ;# here I have errorInfo from f but not from g Reactions ? -Alex
slebet @yahoo.com wrote: > Concerning the fact that it's a syntax extension, we can easily > achieve the same effect, even preserving the format of the syntax, > using a regular proc. Lispers consider doing it with regular proc > "syntax extension". Of course, since it looks so ordinary and easy in > tcl, Tclers don't consider it a "syntax extension". No, instead every > once in a while we get suggestions to introduce even more special > characters to tcl. Perhaps we really should be thinking more like the > Lisp people and do it with procs whenever we feel like "extending" the > syntax. Only in cases where it cannot be implemented as procs/commands > should we consider "real" syntax extension. Right! That's why {*} went in. Interpolation of a list as multiple arguments to a command was something that couldn't be neatened up with a proc; anything that was available was really no better than [eval [linsert $args 0 $command]]... with all the dangers of [eval]. Fear not, most of the TCT understand Lisp. Well, a Smug Lisp Weenie would say that if we *really* understood Lisp, we'd abandon Tcl and switch to Lisp. We tend to be more open-minded. (I can just hear the Smug Lisp Weenies snorting and muttering, "empty-headed, rather!") -- 73 de ke9tv/2, Kevin
Alexandre Ferrieux wrote: > On May 11, 10:32 am, Neil Madden <n @cs.nott.ac.uk> wrote: >> Alexandre Ferrieux wrote: >>> if {[info exists a($x)]} {do_something $a($x)} >>> Problem: the above does the hash lookup twice. Argh. >> Is this really a problem? If it really is, then you could use [array get]: >> foreach {_ val} [array get a $x] { do_something $val } > Does [array get a $x] optimize for the case where $x has no > wildcards ? > If yes, you win. > If no, then [array get a $x] will replace a hash lookup by a linear > search. > Which yucks, as you say ;-)
Good point. :-S >> Alternatively, if you want a default value, then you can do: >> do_something [append a($x) ""] >> do_something [incr a($x)] ;# 8.5 > I don't want side-effects populating the array ! The above will fill > the hashtable...
Details, details... >>> do_something ${a($x):-nope} >> Ewwwww. Yuck! :) > Do you Yuck at the precise {:-} syntax, or at the fact that it is a > syntax extension ?
Mostly both. I'm not entirely against syntax extensions (I'd still like syntax for [expr]...), but I don't think this needs it. I also think {:-} is particularly yucky :) There are a number of places where it might be nice to avoid two hash-lookups: array ifpresent $arr $key $cmd; # call $cmd with value if present array default $arr $key $default; # return val or default array replace $arr $key $value; # return old value array setifabsent $arr $key $val array setifpresent $arr $key $val Along with equivalents for [dict]. However, I'm not sure that the cost is worth it. A much better solution would be to figure out if the result of the lookup can be cached in the key Tcl_Obj so that subsequent lookups are faster. I thought dicts already did this, but I can't see anything in the code. Thinking about it, those kinds of atomic operations are also what you'd want in a thread-shared array. I wonder if the Thread extension contains any useful code for this? -- Neil
Neil Madden wrote: >>> do_something [append a($x) ""] >>> do_something [incr a($x)] ;# 8.5 >> I don't want side-effects populating the array ! The above will fill >> the hashtable... > Details, details...
How about: set myArray(some) thing lindex [concat [array get myArray some] default: theThingValue] 1 lindex [concat [array get myArray noSuchIndex] default: theNoneValue] 1 Also a little bit long, but no exception, no array population and no [info exists]... Besides this, I also would like to have defaults for none existing array values, but I also would like to avoid syntax extensions. I'd prefer a [array] option for this, e.g. array get -default "theValue" myArray theIndex should return {theIndex theValue}. Regards Stephan
On May 11, 9:25 pm, Neil Madden <n@cs.nott.ac.uk> wrote: > >> do_something [append a($x) ""] > >> do_something [incr a($x)] ;# 8.5 > > I don't want side-effects populating the array ! The above will fill > > the hashtable... > Details, details...
Well, if the hit-miss ratio is large, that makes a hefty detail... > {:-} is particularly yucky :)
Look at that line ! Two smileys fighting one-on-one ! > There are a number of places where it might be nice to avoid two > hash-lookups: > array ifpresent $arr $key $cmd; # call $cmd with value if present > ... > Along with equivalents for [dict]. However, I'm not sure that the cost > is worth it.
Then, what about the [fastcatch] in my other post ? A single primitive allowing to write all these variants (array and dict as well) in pure Tcl ! > [cached lookups]
Elegant, but hides a few quirks. If you store the key with the pointer to the collision list element, you get extra housekeeping at deletion time (at least); if instead you store it with a bucket index, you still have to scan the collision list... But I'm not saying this is redhibitory; I'm ready to be convinced. Which one is simplest to implement: fatscatch or cached lookups ? -Alex
Stephan Kuhagen wrote: > Neil Madden wrote: >>>> do_something [append a($x) ""] >>>> do_something [incr a($x)] ;# 8.5 >>>I don't want side-effects populating the array ! The above will fill >>>the hashtable... >>Details, details... > How about: > set myArray(some) thing > lindex [concat [array get myArray some] default: theThingValue] 1 > lindex [concat [array get myArray noSuchIndex] default: theNoneValue] 1 > Also a little bit long, but no exception, no array population and no [info > exists]... > Besides this, I also would like to have defaults for none existing array > values, but I also would like to avoid syntax extensions. I'd prefer a > [array] option for this, e.g. > array get -default "theValue" myArray theIndex
the possibility to trace on {element|variable} not found trace add variable notfound ... would be a boon anyways . currently if you want to use a sparse array that is filled on request you have to trace the complete array. uwe
|
 |
 |
 |
 |
|