TCL(Tool Command Language) Scripting

How to remove a single line from a flat file


Hi,
   I want to remove a single line from a flat file using TCL.  My file
looks like this.

123096     Kumar     3
111111     Kiran       4
323456     AAAA      4

If the user gives 123096 as input, the script should remove the
entire line (the one with 123096).  How can I do this?

-Swaroop

Swaroop wrote:
> Hi,
>    I want to remove a single line from a flat file using TCL.  My file
> looks like this.

> 123096     Kumar     3
> 111111     Kiran       4
> 323456     AAAA      4

> If the user gives 123096 as input, the script should remove the
> entire line (the one with 123096).  How can I do this?

> -Swaroop

I assume you want to match the first whitespace-delimited token of each
line. In that case I treat each input line as a list and compare its
first element with the pattern. I do this for every line of the input
file and copy the lines to a separate output file.

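set ifp [open {C:\InputFile.txt} r]
set ofp [open {C:\OutputFile.txt} w]

set pattern 123096

# Read the input one line at a time; gets returns -1 at end of file.
while {[gets $ifp line] >= 0} {

     # As posted, == copies only the matching line to the output file;
     # the follow-up below changes it to != so the match is skipped.
     if {[lindex $line 0] == $pattern} {puts $ofp $line}

}

close $ifp
close $ofp
exit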

Regards - Leo

On May 15, 10:35 am, Leopold Gerlinger <leopold.gerlin@siemens.com>
wrote:

Hi,
    If I am right, doing it like the above will create duplicate
files.  To avoid this, do I need to move the output file over the
input file after the script runs?  Moreover, I guess I should use
" if {[lindex $line 0] != $pattern} {puts $ofp $line} " to skip the
matching line. [I have replaced == with !=]

-Swaroop

On May 15, 1:43 pm, Swaroop <swaroop.t@gmail.com> wrote:

Yes. You can do this from the Tcl script itself, after closing both
files (-force is needed because the input file already exists):

  file rename -force $output_filename $input_filename
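
Putting the two corrections together (skip the matching line, then rename the output over the input), a minimal sketch might look like the following. The proc name removeLine and the .tmp suffix are illustrative assumptions, not something taken from the posts above.

```tcl
# Hypothetical helper: copy every line except those whose first
# whitespace-delimited token equals $key, then replace the original file.
proc removeLine {filename key} {
    set tmpname $filename.tmp          ;# assumed temporary file name
    set ifp [open $filename r]
    set ofp [open $tmpname w]
    while {[gets $ifp line] >= 0} {
        # split turns the line into a proper list, so odd characters
        # in the data cannot make lindex fail
        if {[lindex [split $line] 0] ne $key} {
            puts $ofp $line
        }
    }
    close $ifp
    close $ofp
    # Both channels are closed before the rename; -force overwrites
    # the existing original.
    file rename -force $tmpname $filename
}

# Example call (path is only an illustration):
#   removeLine {C:\InputFile.txt} 123096
```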

> Moreover, I guess I should use " if {[lindex $line
> 0] != $pattern} {puts $ofp $line} " to skip the matching line. [I have
> replaced == with !=]

In this case I guess it's safe to use lindex directly. And I admit
that I often write code that uses lindex directly on input data. But
you should be aware that lindex is sensitive to unbalanced ", { and }.
By sensitive I mean that your program will abort immediately when
lindex throws an error (unless you [catch] it, of course).

If you can't control the input data format then I'd suggest:

 if {[lindex [split $line] 0] != $pattern} {...

or

  if {[regexp -inline {^\d+} $line] != $pattern} {...
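
To illustrate the point about lindex and unbalanced braces, here is a small sketch. The proc name firstToken and the sample line are made up for the example; the idea is simply to pull out the first token without parsing the line as a Tcl list.

```tcl
# Hypothetical helper: return the first whitespace-delimited token of a
# line without treating the line as a Tcl list.
proc firstToken {line} {
    # \S+ matches a run of non-whitespace; the leading \s* tolerates
    # indentation in front of the key.
    if {[regexp {^\s*(\S+)} $line -> token]} {
        return $token
    }
    return ""
}

set bad "123096  \{Kumar  3"          ;# made-up line with an unbalanced brace

# lindex has to parse the whole string as a list, so it raises an error.
if {[catch {lindex $bad 0} msg]} {
    puts "lindex failed: $msg"
}

puts [firstToken $bad]                ;# prints 123096
puts [lindex [split $bad] 0]          ;# prints 123096
```

Both alternatives treat the line as plain text, so they keep working whatever characters appear in the name column.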

Swaroop <swaroop.t@gmail.com> wrote:
>     If I am right, doing it like the above will create duplicate
> files.

If the files are so large that this is a concern, then the
processing is probably already so slow that the whole task is
next to infeasible anyway.   :-)

You could also edit in-place:
  either you just overwrite the portion with dummy-chars, e.g. spaces,
or you shift the whole block of data that follows.

The former is a bit easier, but you need to be extra careful with
positioning for overwrite (seek, tell), and determining the number
of spaces to write (this depends on both the encoding of input file
and the length of the current matched line!)

The latter requires opening the file with "r+", and once the matching
line is found, repeated seek-read-tell-seek-puts-tell. The encoding
*might* be irrelevant there (but no guarantees).
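
A rough sketch of the first variant (overwriting the matched line with spaces) could look like this. The proc name blankLine is hypothetical, and the sketch assumes the key is plain ASCII. Binary translation is used so that string lengths equal byte counts, which sidesteps the encoding concern mentioned above. Note that the line is blanked, not removed, so whatever reads the file later has to tolerate a line of spaces.

```tcl
# Hypothetical helper: blank out (rather than remove) every line whose
# first token equals $key. Assumes the key is plain ASCII.
proc blankLine {filename key} {
    set fp [open $filename r+]
    # Binary mode: one character per byte, so string lengths match the
    # byte offsets used by tell and seek.
    fconfigure $fp -translation binary
    while {1} {
        set start [tell $fp]              ;# offset where this line begins
        if {[gets $fp line] < 0} break
        set next [tell $fp]               ;# offset where the next line begins
        if {[lindex [split $line] 0] eq $key} {
            # Leave a trailing CR (Windows line ending) untouched.
            set n [string length [string trimright $line \r]]
            seek $fp $start
            puts -nonewline $fp [string repeat " " $n]
            seek $fp $next                ;# seek flushes the pending write
        }
    }
    close $fp
}
```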

On 15 Mai, 07:43, Swaroop <swaroop.t@gmail.com> wrote:
>     If I am right, doing it like the above will create duplicate
> files.

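Not necessarily. You could do it like this:
set fp [open $filename]
# -nonewline drops the final newline, so split yields no empty last
# element and no extra blank line is written back
set data [read -nonewline $fp]
close $fp
# Reopening with w truncates the file; the data now lives only in memory
set fp [open $filename w]
foreach line [split $data \n] {
   if {[lindex $line 0] ne $deletekey} {puts $fp $line}
}

close $fp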

Then your data file exists only in one instance - but you must have
enough memory to hold all data...

On May 14, 11:58 pm, Swaroop <swaroop.t@gmail.com> wrote:

>    I want to remove a single line from a flat file using TCL.

Okay, the first thing to realize is that flat files, at least under
Linux, Unix, and Windows, have no special access routines for deleting
part of a file. This means that one has to read in the entire file,
then write out the parts of the file that you want to keep.

Given that there are no silver bullets, there are several ways you
could go at this task:

1. Read the entire file into memory, then write everything out to a
new, temporary file, then rename the original, rename the temporary,
and delete the original. This technique keeps the original around
until the last moment, so that, in case of a power failure or some
other problem, you still have the original data available. You are,
however, left with a brief moment (truly less than a second, assuming
decent access to your files), where there is no file by the original
filename present.  This would be a problem if the file is critical
(say, a password file, etc.)

2. Read the file a line at a time, writing out a line at a time.
Again, you have to deal with the "write to a temporary file" issues,
but if the original file is very large, then you don't take up as much
memory.

3. Open the original file read, read through to the point where you
want to delete, save the offset from the beginning, read the next line
then open the file a second time, in read/write mode, seek to the
saved offset, and write out the next record, and continue reading from
the first descriptor and writing to the second.  WARNING! If you
experience a power outage, program crash, user interference, network
loss, etc. you would end up with an incomplete file. However, the file
does remain in place at all times.

4. You could read in the file, write it out to a database (one record
per line), delete the record required, then read back through the
database, writing out to the original file. This avoids the temporary
file, but you again could end up with a truncated original file in the
case of a power outage, program crash, etc.

Basically, there is no _safe_ way to do this and ensure that what you
want to do gets done completely in the case of extreme problems.  I'd
go with version 1 above, typically.
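
Version 1 might be sketched roughly as follows. The proc name removeLineSafely and the .new and .bak suffixes are assumptions made for the example. The gap between the two renames is the brief moment described in item 1 where no file exists under the original name.

```tcl
# Hypothetical helper following version 1: build the new contents in a
# temporary file, then swap it in, keeping the original until the end.
proc removeLineSafely {filename key} {
    set tmpname $filename.new    ;# assumed names, for illustration only
    set bakname $filename.bak

    # Read the entire file into memory.
    set fp [open $filename r]
    set data [read -nonewline $fp]
    close $fp

    # Write everything except the matching line(s) to the temporary file.
    set out [open $tmpname w]
    foreach line [split $data \n] {
        if {[lindex [split $line] 0] ne $key} {
            puts $out $line
        }
    }
    close $out

    file rename -force $filename $bakname   ;# original kept as a backup
    file rename $tmpname $filename          ;# new file takes its place
    file delete $bakname                    ;# drop the backup last
}
```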

In fact, I'd do such things not in Tcl but with utilities like gawk
(provided you have *n*x or Cygwin):

mv datafile t
gawk '$1!="123096"' t > datafile

On May 15, 8:26 am, suchenwi <richard.suchenwirth-bauersa@siemens.com> wrote:
> In fact, I'd do such things not in Tcl but with utilities like gawk
> (provided you have *n*x or Cygwin):

> mv datafile t
> gawk '$1!="123096"' t > datafile

Which still has the problem of leaving the system without the file for
a period of time. P.S. that can be done on Windows as well - take a
look at any of the Windows Unix-utility suites like UWIN, Cygwin,
Microsoft's Interix/SFU software for Windows XP and Windows Server,
MKS Toolkit and quite a number of other alternatives.  One of these
days, maybe I'll get around to gathering information about all of
these into a page on the wiki...

There aren't many operating systems out there which allow you to just
go into a plain text file to delete lines.

If this is not a one-time affair, but something that you need to do
frequently, you might want to consider changing over to use a database
that permits trivial row deletion (which is, I'd guess, most of
them ;-)

Hi,

Larry W. Virden wrote :

> .....
> 3. Open the original file read, read through to the point where you
> want to delete, save the offset from the beginning, read the next line
> then open the file a second time, in read/write mode, seek to the
> saved offset, and write out the next record, and continue reading from
> the first descriptor and writing to the second.  WARNING! If you
> experience a power outage, program crash, user interference, network
> loss, etc. you would end up with an incomplete file. However, the file
> does remain in place at all times.

However, once the in-place copy is complete, the file has to be
truncated at the current write offset. Truncating a file at an arbitrary
position is a common need (see the POSIX ftruncate() function) and is
supported by most modern operating systems and file systems, but it is
unfortunately still impossible with the current stable Tcl release
(8.4), so this approach is not applicable there.  TIP #208 introduces a
new "chan" command, available in Tcl 8.5, and in particular the
"chan truncate channelId ?length?" subcommand.
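
For Tcl 8.5 or later, the copy-in-place approach with a final truncation might be sketched like this. The proc name removeLineInPlace is hypothetical. For simplicity the sketch rewrites the file from offset 0 and does not save the offset of the matched line as described in item 3; the writer never gets ahead of the reader until the reader has finished. It carries the same risk as item 3: an interruption halfway through leaves a damaged file.

```tcl
# Hypothetical helper, requires Tcl 8.5+ for [chan truncate].
proc removeLineInPlace {filename key} {
    set in  [open $filename r]     ;# read descriptor
    set out [open $filename r+]    ;# write descriptor on the same file
    # Binary mode keeps offsets in bytes and copies any CR of a CRLF
    # line ending through unchanged as part of the line.
    fconfigure $in  -translation binary
    fconfigure $out -translation binary
    while {[gets $in line] >= 0} {
        # Skip the matching line; everything else is written back at
        # the (possibly earlier) write position.
        if {[lindex [split $line] 0] eq $key} continue
        # Note: a final line that lacked a newline gets one here.
        puts $out $line
    }
    close $in
    flush $out
    # Cut off the leftover tail at the current write offset.
    chan truncate $out [tell $out]
    close $out
}
```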

<OT>

Working with very large files (say several GBytes) was probably not very
frequent in 2002 (when Tcl 8.4.0 was released). But with storage getting
less and less expensive, and with most filesystems and OSes supporting
large files, it is quite natural that TIP #206 (later merged into TIP
#208) was proposed some time later (proposed June 2004, accepted
November 2004).

My feeling is that this is just one example, among many others, of Tcl
currently getting little by little out of sync with some developers'
needs. Among all the goodies in Tcl 8.5 (see http://wiki.tcl.tk/10630
), many offer solutions to immediate problems developers are faced
with. Having them still unavailable in the stable (and most widely used)
branch, 3 to 5 years later, doesn't help project a dynamic "brand
image" of Tcl.

I pass no judgment on the 8.5 roadmap; I understand the core developers
are already doing impressive work on it, and I'm not advocating here for
a quick release. I'm only concerned about the opportunity to introduce
new features in Tcl more often than once every 5 years or so, so that a
larger community can view Tcl as an agile language, backed by a dynamic
community, and offering practical solutions to their needs.

No doubt some features introduced in 8.5 need to wait for a major
release, because they break compatibility, need long validation, or
because they imply refactoring of Tcl core code. But others could be
very easily backported to (or even just put in) 8.4. I'm thinking of a
Tcl/TK based on 8.4 for stability, with e.g. new commands and
subcommands like chan, dict, lassign, lrepeat, string reverse, encoding
dirs, binary with new formats, maybe Xft support, etc...

Maybe this possibility has been already discussed among TCT members?

</OT>

Eric

-----
Eric Hassold
Evolane - http://www.evolane.com/

On May 15, 10:25 pm, "Larry W. Virden" <lvir@gmail.com> wrote:

Well.. the only dangerous part is:

  mv datafile t

On most modern filesystems this is fairly atomic and safe, just like a
database, since it only involves changing the file's name. On a
journaled filesystem, if this operation happens to fail then on next
powerup the file name will be restored to its original name.

So if you're worried about this then don't use a filesystem like
FAT32. Instead use NTFS or ext3 or HFS+ (and remember to turn on
journaling for HFS+).
