S.Selvam Siva wrote:
> On Tue, Jan 13, 2009 at 1:50 PM, Chris Rebert <c...@rebertia.com> wrote:
>> On Mon, Jan 12, 2009 at 11:46 PM, S.Selvam Siva <s.selvams...@gmail.com>
>> wrote:
>>> Hi all,
>>>
>>> I need to extract the domain name from a given URL (without sub-domains).
>>> With urlparse, I am able to fetch only the domain name (which includes the
>>> sub-domain also).
>>>
>>> eg:
>>> http://feeds.huffingtonpost.com/posts/ , http://www.huffingtonpost.de/,
>>> .... all must lead to huffingtonpost.com or huffingtonpost.de
>>>
>>> Please suggest some ideas regarding this problem.
>> That would require (pardon the pun) domain-specific logic. For most
>> TLDs (e.g. .com, .org) the domain name is just blah.com, blah.org,
>> etc. But for ccTLDs, often only second-level registrations are
>> allowed, e.g. for www.bbc.co.uk, so the main domain name would be
>> bbc.co.uk. I think a few TLDs have even more complicated rules.
>>
>> I doubt anyone's created a general ready-made solution for this, you'd
>> have to code it yourself.
>> To handle the common case, you can cheat and just .split() at the
>> periods and then slice and rejoin the list of domain parts, ex:
>> '.'.join(domain.split('.')[-2:])
>>
>> Cheers,
>> Chris
>
>
> Thank you Chris Rebert,
> Actually I tried with domain-specific logic, having 200 TLDs like
> .com, co.in, co.uk, and tried to extract the domain name.
> But my boss wants a more reliable solution than this method; anyway I
> will try to find some alternative solution.
>
If you post a good first try, opening the source, I would be surprised
if others do not join your effort to establish suitable rules. This is
something that many people could doubtless use.
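
As a possible starting point for such a first try, here is a minimal
sketch that combines Chris's split-and-rejoin trick with a small table of
multi-label suffixes. The names registered_domain and
MULTI_LABEL_SUFFIXES are made up for illustration, the suffix table is a
tiny hand-picked sample rather than a complete rule set, and the import
assumes Python 3, where the old urlparse module lives in urllib.parse.

```python
from urllib.parse import urlsplit  # Python 3 home of the old urlparse module

# Illustrative, deliberately incomplete sample of multi-label suffixes.
# Assumption: a real table would be far larger and need maintenance.
MULTI_LABEL_SUFFIXES = {"co.uk", "co.in", "com.au"}


def registered_domain(url):
    """Return the host name without sub-domains, e.g. 'bbc.co.uk'.

    Hypothetical helper: it does not special-case IP addresses, and it
    returns '' when the URL has no host part.
    """
    host = urlsplit(url).hostname or ""
    labels = host.lower().rstrip(".").split(".")
    # If the last two labels form a known suffix, keep three labels.
    if len(labels) >= 3 and ".".join(labels[-2:]) in MULTI_LABEL_SUFFIXES:
        return ".".join(labels[-3:])
    # Common case from the thread: keep only the last two labels.
    return ".".join(labels[-2:])


if __name__ == "__main__":
    for url in ("http://feeds.huffingtonpost.com/posts/",
                "http://www.huffingtonpost.de/",
                "http://www.bbc.co.uk/"):
        print(url, "->", registered_domain(url))
```

Keeping the suffixes in a plain set means that establishing suitable
rules only requires adding entries, not changing the logic. Suffixes
with more than two labels would need a longest-match lookup instead.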

regards
 Steve
--
Steve Holden        +1 571 484 6266   +1 800 494 3119
Holden Web LLC              http://www.holdenweb.com/