Perl Programming Language
duplicates
I would like to find and delete duplicates to reduce the number of entries I have for each word. I need your help to achieve that.

Input
dog -> doggy
dog -> dogs
want -> wants
want -> wanting
want -> wanted
eat -> eaten
eat -> eating
eat -> eated

Output
dog -> doggy dogs
want -> wants wanting wanted

Thanks
On May 29, 1:33 pm, julia_2@hotmail.com wrote:
> I would like to find and delete duplicates to reduce the number of
> entries I have for each word. Need your help to achieve that.
> Input
> dog -> doggy
> dog -> dogs
> want -> wants
> want -> wanting
> want -> wanted
> eat -> eaten
> eat -> eating
> eat -> eated
> output
> dog -> doggy dogs
> want -> wants wanting wanted.

What have you tried so far? How did it not meet your expectations?
From where does this data come, and in what structure do you have it stored?

Have you searched the FAQ?

$ perldoc -q duplicate
Found in /opt2/Perl5_8_4/lib/perl5/5.8.4/pod/perlfaq4.pod
    How can I remove duplicate elements from a list or array?

Paul Lalli
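For readers without perldoc at hand, the FAQ entry mentioned above is usually summarized by the `%seen` hash idiom. The following is only a minimal sketch of that idiom; the sub name `uniq_in_order` and the sample word list are made up for illustration and do not come from the thread.

```perl
use strict;
use warnings;

# Return the elements of a list with later repeats dropped,
# keeping the first occurrence of each in its original position.
sub uniq_in_order {
    my %seen;
    return grep { !$seen{$_}++ } @_;
}

my @words  = qw(dogs doggy dogs wants dogs);
my @unique = uniq_in_order(@words);
print "@unique\n";   # dogs doggy wants
```

The post-increment makes `$seen{$_}` false only the first time a value is met, so `grep` lets each value through once.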
On May 29, 10:33 am, julia_2@hotmail.com wrote:
> I would like to find and delete duplicates to reduce the number of
> entries I have for each word. Need your help to achieve that.

How about this:

#!/usr/bin/perl
use strict;
use warnings;

my %word;
while (<DATA>) {
    chomp;
    my ($word1, $word2) = split (/ -> /, $_);
    push @{$word{$word1}}, $word2;
}
print($_," -> ",join(" ",@{$word{$_}}),"\n") for keys %word;

__DATA__
dog -> doggy
dog -> dogs
want -> wants
want -> wanting
want -> wanted
eat -> eaten
eat -> eating
eat -> eated

OUTPUT
eat -> eaten eating eated
want -> wants wanting wanted
dog -> doggy dogs

>> eat -> eated

eated???

--
The best way to get a good answer is to ask a good question.
David Filmer (http://DavidFilmer.com)
In article <1180460033.481270.310@o11g2000prd.googlegroups.com>, <julia_2@hotmail.com> wrote:
> I would like to find and delete duplicates to reduce the number of
> entries I have for each word. Need your help to achieve that.
> Input
> dog -> doggy
> dog -> dogs
> want -> wants
> want -> wanting
> want -> wanted
> eat -> eaten
> eat -> eating
> eat -> eated
> output
> dog -> doggy dogs
> want -> wants wanting wanted.

Here's a fish:

#!/usr/local/bin/perl
#
use warnings;
use strict;

my %hash;
while(<DATA>) {
    chomp;
    my($key,$val) = split(/\s*->\s*/);
    push(@{$hash{$key}},$val);
}

for my $key (sort keys %hash) {
    print "$key -> @{$hash{$key}}\n";
}

__DATA__
dog -> doggy
dog -> dogs
want -> wants
want -> wanting
want -> wanted
eat -> eaten
eat -> eating
eat -> eated

Posted Via Usenet.com Premium Usenet Newsgroup Services
----------------------------------------------------------
** SPEED ** RETENTION ** COMPLETION ** ANONYMITY **
----------------------------------------------------------
http://www.usenet.com
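The programs posted so far group values under each key but would keep a value that appears twice for the same key. If the input can contain such literal repeats (an assumption; the sample data in the thread has none), one possible variation is a second hash that tracks which key/value pairs have already been stored. The sub name `group_pairs` and the repeated `dog -> doggy` line are hypothetical.

```perl
use strict;
use warnings;

# Group "key -> value" lines into a hash of array references,
# skipping values already recorded for that key.
sub group_pairs {
    my @lines = @_;
    my (%group, %seen);
    for my $line (@lines) {
        chomp $line;
        my ($key, $val) = split /\s*->\s*/, $line, 2;
        next unless defined $key && defined $val;   # skip blank or malformed lines
        push @{ $group{$key} }, $val unless $seen{$key}{$val}++;
    }
    return %group;
}

my %group = group_pairs(
    "dog -> doggy", "dog -> dogs", "dog -> doggy",
    "want -> wants", "want -> wanting",
);
print "$_ -> @{ $group{$_} }\n" for sort keys %group;
# dog -> doggy dogs
# want -> wants wanting
```

Values keep their first-seen order within each key, and `sort keys` gives a stable order across keys, which a plain `keys %group` does not.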
On Tue, 29 May 2007 11:38:42 -0700, Jim Gibson <jgib@mail.arc.nasa.gov> wrote:

>Here's a fish:

What's of the elitist clpmisc attitude? ;-)

Michele
--
{$_=pack'B8'x25,unpack'A8'x32,$a^=sub{pop^pop}->(map substr
(($a||=join'',map--$|x$_,(unpack'w',unpack'u','G^<R<Y]*YB='
.'KYU;*EVH[.FHF2W+#"\Z*5TI/ER<Z`S(G.DZZ9OX0Z')=~/./g)x2,$_,
256),7,249);s/[^\w,]/ /g;$ \=/^J/?$/:"\r";print,redo}#JAPH,
On May 29, 4:08 pm, Michele Dondi <bik.m@tiscalinet.it> wrote:
> On Tue, 29 May 2007 11:38:42 -0700, Jim Gibson
> <jgib@mail.arc.nasa.gov> wrote:
> >Here's a fish:
> What's of the elitist clpmisc attitude? ;-)

s/of/with/;

-- elitist (alleged) English speaker

--
Brad
julia_2@hotmail.com wrote:
> I would like to find and delete duplicates to reduce the number of
> entries I have for each word. Need your help to achieve that.
> Input
> dog -> doggy
> dog -> dogs
> want -> wants
> want -> wanting
> want -> wanted
> eat -> eaten
> eat -> eating
> eat -> eated
> output
> dog -> doggy dogs
> want -> wants wanting wanted.
> Thanks

Put the entries into a hash. If the key already exists, append the new word to its value:

    $hash{dog} .= " dogs";
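A minimal sketch of that string-append approach, assuming the same `key -> value` line format as the original post. The sub name `join_pairs` is made up for illustration; each hash value here is a plain space-separated string rather than an array reference.

```perl
use strict;
use warnings;

# Build one space-separated string per key instead of an array reference.
sub join_pairs {
    my %joined;
    for my $line (@_) {
        my ($key, $val) = split /\s*->\s*/, $line, 2;
        next unless defined $val;          # skip blank or malformed lines
        $val =~ s/\s+\z//;                 # drop trailing newline or spaces
        if (exists $joined{$key}) {
            $joined{$key} .= " $val";      # key seen before: append
        }
        else {
            $joined{$key} = $val;          # first value for this key
        }
    }
    return %joined;
}

my %joined = join_pairs("dog -> doggy", "dog -> dogs", "want -> wants");
print "$_ -> $joined{$_}\n" for sort keys %joined;
# dog -> doggy dogs
# want -> wants
```

This is fine when the values only need to be printed; if they must be processed individually later, the hash-of-arrays versions elsewhere in the thread avoid having to split the string again.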
Brad Baxter <baxter.b@gmail.com> wrote:
> On May 29, 4:08 pm, Michele Dondi <bik.m@tiscalinet.it> wrote:
>> On Tue, 29 May 2007 11:38:42 -0700, Jim Gibson
>> <jgib@mail.arc.nasa.gov> wrote:
>> >Here's a fish:
>> What's of the elitist clpmisc attitude? ;-)
> s/of/with/;

Due to the smiley, I think

    s/of/become of/;

was the intent.

> -- elitist (alleged) English speaker

I only speak 'merican though, so I may be way off base...

--
Tad McClellan SGML consulting
t@augustmail.com Perl programming
Fort Worth, Texas
use@DavidFilmer.com <use@DavidFilmer.com> wrote:
> On May 29, 10:33 am, julia_2@hotmail.com wrote:
>>> eat -> eated
> eated???

I think that's when something is warmed up in England...

--
Tad McClellan SGML consulting
t@augustmail.com Perl programming
Fort Worth, Texas
On May 29, 11:28 pm, use@DavidFilmer.com wrote:
> On May 29, 10:33 am, julia_2@hotmail.com wrote:
> > I would like to find and delete duplicates to reduce the number of
> > entries I have for each word. Need your help to achieve that.
> How about this:
> #!/usr/bin/perl
> use strict;
> use warnings;
> my %word;
> while (<DATA>) {
>     chomp;
>     my ($word1, $word2) = split (/ -> /, $_);
>     push @{$word{$word1}}, $word2;
> }
> print($_," -> ",join(" ",@{$word{$_}}),"\n") for keys %word;
> __DATA__
> dog -> doggy
> dog -> dogs
> want -> wants
> want -> wanting
> want -> wanted
> eat -> eaten
> eat -> eating
> eat -> eated
> OUTPUT
> eat -> eaten eating eated
> want -> wants wanting wanted
> dog -> doggy dogs
> >> eat -> eated
> eated???
> --
> The best way to get a good answer is to ask a good question.
> David Filmer (http://DavidFilmer.com)

Hi David. Sorry, but I didn't follow the syntax you used, such as

    push (@{$hash{$key}}, $value);

How does this work? I mean, I can understand when one uses @{$arr_ref} to dereference an array ref, but what does @{$hash{$key}} stand for? If you can point me to some data or link, that will be fine too.
On May 30, 2:55 am, jeevs <jeevan.ing@gmail.com> wrote:
> On May 29, 11:28 pm, use@DavidFilmer.com wrote:
> > my %word;
> > while (<DATA>) {
> >     chomp;
> >     my ($word1, $word2) = split (/ -> /, $_);
> >     push @{$word{$word1}}, $word2;
> > }
> > print($_," -> ",join(" ",@{$word{$_}}),"\n") for keys %word;
> Hi David. Sorry, but I didn't follow the syntax you used,
> such as
> push (@{$hash{$key}}, $value);
> How does this work? I mean, I can understand when one uses @{$arr_ref}
> to dereference an array ref,
> but what does @{$hash{$key}} stand for?

It's exactly the same thing - @{REF} dereferences the array reference REF. In @{$arr_ref}, $arr_ref is the array reference. In @{$hash{$key}}, $hash{$key} is the array reference. That is, the value of %hash whose key is $key is a reference to an array. Hashes and arrays can store *any* kind of scalar value - integers, floats, strings, and references too.

> If you can point me to some
> data or link, that will be fine too.

perldoc perllol
perldoc perldsc

Paul Lalli
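To make the explanation concrete, here is a small sketch that builds one hash entry and then inspects it. The key `dog` and its values are taken from the thread's sample data; everything else is illustrative only.

```perl
use strict;
use warnings;

my %hash;
my $key = 'dog';

# $hash{$key} does not exist yet; push autovivifies an array reference there.
push @{ $hash{$key} }, 'doggy';
push @{ $hash{$key} }, 'dogs';

print ref($hash{$key}), "\n";           # ARRAY
print scalar(@{ $hash{$key} }), "\n";   # 2
print $hash{$key}[1], "\n";             # dogs
print "@{ $hash{$key} }\n";             # doggy dogs
```

`ref` confirms that the hash value is an array reference, and `$hash{$key}[1]` is the usual shorthand for reaching one element through that reference.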
On 29 May 2007 23:55:16 -0700, jeevs <jeevan.ing@gmail.com> wrote:

>push (@{$hash{$key}}, $value);
>How does this work? I mean, I can understand when one uses @{$arr_ref}
>to dereference an array ref,
>but what does @{$hash{$key}} stand for? If you can point me to some

It's the same as

    my $arr_ref=$hash{$key};
    push @{$arr_ref}, $value;

>data or link, that will be fine too.

    perldoc perlref

is the starting point.

Michele
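The two-step form behaves the same as the one-line push whenever `$hash{$key}` already holds an array reference. As a hedged aside, the sketch below shows the one case where they differ: when the slot is still empty, the copied scalar is autovivified on its own and the hash is not updated. The variable names `$copy` and the key `eat` are chosen only for illustration.

```perl
use strict;
use warnings;

my %hash;

# Case 1: the slot already holds an array reference, so the copy in
# $arr_ref points at the same array and the push is visible via the hash.
$hash{dog} = ['doggy'];
my $arr_ref = $hash{dog};
push @{$arr_ref}, 'dogs';
print "@{ $hash{dog} }\n";                                 # doggy dogs

# Case 2: the slot is empty, so $copy starts out undef. The push
# autovivifies a fresh array in $copy only; the hash is left untouched.
my $copy = $hash{eat};
push @{$copy}, 'eaten';
print exists($hash{eat}) ? "stored\n" : "not stored\n";    # not stored
```

This is why the programs in this thread push directly onto `@{ $hash{$key} }`: the first push for a new key creates the array inside the hash itself.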
On Tue, 29 May 2007 20:58:25 -0500, Tad McClellan <t@augustmail.com> wrote:

>>> What's of the elitist clpmisc attitude? ;-)
>> s/of/with/;
>Due to the smiley, I think
>    s/of/become of/;
>was the intent.

Yep, that's what I meant.

>> -- elitist (alleged) English speaker
>I only speak 'merican though, so I may be way off base...

Ah wid huv thought ye wir scattish. (Fae Alba, ken?)

Michele
On May 29, 11:55 pm, jeevs <jeevan.ing@gmail.com> wrote:
> Hi David. Sorry, but I didn't follow the syntax you used,
> such as
> push (@{$hash{$key}}, $value);
> How does this work? I mean, I can understand when one uses @{$arr_ref}
> to dereference an array ref,
> but what does @{$hash{$key}} stand for?

$hash{$key} has a value, which must (if defined) be a scalar. In this case, the scalar value is an array reference, so we are creating a hash of arrays (HoA), or, more specifically (and correctly), a hash of array references.

perldoc perlreftut
perldoc perlref

--
The best way to get a good answer is to ask a good question.
David Filmer (http://DavidFilmer.com)
julia_2@hotmail.com wrote:
>I would like to find and delete duplicates to reduce the number of
>entries I have for each word. Need your help to achieve that.
>Input
>dog -> doggy
>dog -> dogs
>want -> wants
>want -> wanting
>want -> wanted
>eat -> eaten
>eat -> eating
>eat -> eated
>output
>dog -> doggy dogs
>want -> wants wanting wanted.

What you appear to be after is finding the stems of words.

I believe the author of the search engine index module KinoSearch, Marvin Humphrey AKA creamygoodness, also supports a stemmer module on CPAN, which BTW gets used by the indexer, but you can use it independently, too.

Home page: http://www.rectangular.com/kinosearch/
CPAN page: http://search.cpan.org/dist/KinoSearch/

The basic stemmer module is Lingua::Stem::Snowball,
http://search.cpan.org/perldoc?Lingua::Stem::Snowball

--
Bart.
In article <712p53d2kn6l3g3vutfemi4nm7rp2gl@4ax.com>, Michele Dondi <bik.m@tiscalinet.it> wrote:
> On Tue, 29 May 2007 11:38:42 -0700, Jim Gibson
> <jgib@mail.arc.nasa.gov> wrote:
> >Here's a fish:
> What's of the elitist clpmisc attitude? ;-)

I wasn't trying to be elitist. Some regulars on clpm frown on simply providing programs on demand to posters, and Paul Lalli had already responded with an inquiry about what this OP had done on her own behalf.

"Here's a fish" was just shorthand for "you really should follow the guidelines for this group, attempt to solve the problem on your own, and post your program if you run into trouble, but I am feeling generous and bored with what I should be doing and your problem looks like an interesting challenge that I can whip out in a few minutes, so here it is."

My apologies if I offended anybody.

--
Jim Gibson
On Wed, 30 May 2007 15:12:26 -0700, Jim Gibson <jgib@mail.arc.nasa.gov> wrote:

>> >Here's a fish:
>> What's of the elitist clpmisc attitude? ;-)
>I wasn't trying to be elitist. Some regulars on clpm frown on simply

I know. In fact I was "complaining" you weren't. The rationale being that clpmisc regulars do not give fishes, period! ;-)

>My apologies if I offended anybody.

No, you didn't!

Michele
jeevs <jeevan.ing@gmail.com> wrote:
> Sorry, but I didn't follow the syntax you used,
> such as
> push (@{$hash{$key}}, $value);
> How does this work? I mean, I can understand when one uses @{$arr_ref}
> to dereference an array ref,
> but what does @{$hash{$key}} stand for?

It stands for the same thing, only the array ref is stored in $hash{$key} rather than in $arr_ref.

> If you can point me to some
> data or link, that will be fine too.

perldoc perlreftut

--
Tad McClellan SGML consulting
t@augustmail.com Perl programming
Fort Worth, Texas