Python Programming Language

How to parse usenet urls?


I'm trying to parse newsgroup messages, and I need to follow URLs in
this format: news://some.server. I can paste them into a newsreader
with no problem, but I want to do it programmatically.

I can't figure out how to follow these links - anyone have any ideas?

In article <1180573018.786188.220@h2g2000hsg.googlegroups.com>,

 "snewma@gmail.com" <snewma@gmail.com> wrote:
> I'm trying to parse newsgroup messages, and I need to follow URLs in
> this format: news://some.server. I can paste them into a newsreader
> with no problem, but I want to do it programmatically.

> I can't figure out how to follow these links - anyone have any ideas?

Are you aware of nntplib?

http://docs.python.org/lib/module-nntplib.html

--
Philip
http://NikitaTheSpider.com/
Whole-site HTML validation, link checking and more

> Are you aware of nntplib?

> http://docs.python.org/lib/module-nntplib.html

I am, but once I got into the article itself, I couldn't figure out
how to "call" a link inside the resulting message text:

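import nntplib

username = 'my username'  # placeholder, substitute real credentials
password = 'my password'  # placeholder, substitute real credentials
nntp_server = 'newsclip.ap.org'
n = nntplib.NNTP(nntp_server, 119, username, password)
n.group('ap.spanish.online.headlines')

# next() returns (response, article number, message id);
# index 1 is the article number, which article() also accepts
m_id = n.next()[1]
n.article(m_id)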

I get output like this, with a headline and a link to the full story
(truncated for length):

>>> ... 'Castro: Bush desea mi muerte, pero las ideas no se matan', 'news://newsclip.ap.org/D8PE2G6O0@news.ap.org', ...

How can I take the message link
'news://newsclip.ap.org/D8PE2G@news.ap.org' and follow it?
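Before a link can be followed it has to be pulled out of the article text. A minimal sketch of that step, assuming the article body is available as a list of lines (str or bytes); the helper name find_news_urls and the regular expression are illustrative and not part of nntplib:

```python
import re

# Hypothetical pattern: a news:// URL runs until whitespace, a quote,
# or an angle bracket.
NEWS_URL = re.compile(r"news://[^\s'\"<>]+")

def find_news_urls(lines):
    """Return every news:// URL found in an iterable of article lines."""
    urls = []
    for line in lines:
        # Article lines may arrive as bytes; decode them defensively.
        if isinstance(line, bytes):
            line = line.decode("utf-8", errors="replace")
        for url in NEWS_URL.findall(line):
            # Drop punctuation that commonly trails a URL in running text.
            urls.append(url.rstrip(".,;)"))
    return urls
```

Calling find_news_urls on the lines of the article shown above should yield the news:// link that follows the headline.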
In article <1180581992.342613.302@k79g2000hse.googlegroups.com>,

 "snewma@gmail.com" <snewma@gmail.com> wrote:
> > Are you aware of nntplib?

> > http://docs.python.org/lib/module-nntplib.html

> I am, but once I got into the article itself, I couldn't figure out
> how to "call" a link inside the resulting message text:

> >>> ... 'Castro: Bush desea mi muerte, pero las ideas no se matan',
> >>> 'news://newsclip.ap.org/D8PE2G6O0@news.ap.org', ...

> How can I take the message link
> 'news://newsclip.ap.org/D8PE2G@news.ap.org' and follow it?

OK, gotcha. I misunderstood your original question. Perhaps this is just
a synonym for "nntp:"? This sounds like a dangerous assumption and
hopefully someone more knowledgeable will come along and shoot me down.
=) But when I fire up Ethereal and paste that news: URL into my browser,
Firefox launches my newsreader client and Ethereal reports that my
client connects to the remote server at the NNTP port (119), sends an
NNTP LIST command and Ethereal identifies the subsequent conversation as
NNTP.

If I were you I'd try handling news: URLs with nntplib. I bet it will
work.
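
One way to try that suggestion, as a rough sketch rather than a tested recipe: treat the host part of the news: URL as the NNTP server and the path as a Message-ID, then ask nntplib for that article. The helper names parse_news_url and fetch_article are made up for illustration. The assumptions are that the URL has the form news://host[:port]/message-id, that the server accepts an ARTICLE request by Message-ID, and that nntplib is available (it was removed from the standard library in Python 3.13).

```python
from urllib.parse import urlsplit, unquote

def parse_news_url(url):
    """Split news://host[:port]/message-id into (host, port, message_id)."""
    parts = urlsplit(url)
    if parts.scheme != "news":
        raise ValueError("not a news: URL: %r" % (url,))
    host = parts.hostname
    port = parts.port or 119  # 119 is the standard NNTP port
    message_id = unquote(parts.path.lstrip("/"))
    if not host or "@" not in message_id:
        raise ValueError("expected news://host/message-id: %r" % (url,))
    # NNTP commands take the Message-ID wrapped in angle brackets.
    return host, port, "<%s>" % message_id

def fetch_article(url, user=None, password=None):
    """Fetch the article a news: URL points at (hypothetical helper)."""
    import nntplib  # imported lazily; absent from Python 3.13 onward
    host, port, message_id = parse_news_url(url)
    server = nntplib.NNTP(host, port, user, password)
    try:
        return server.article(message_id)
    finally:
        server.quit()
```

With the link from the post above, parse_news_url gives the server name, port 119, and the Message-ID in angle brackets; fetch_article then needs the same username and password used for the original connection.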

Sorry I couldn't provide more than guesses. Good luck!

--
Philip
http://NikitaTheSpider.com/
Whole-site HTML validation, link checking and more
