On Tue, Jan 13, 2009 at 1:50 PM, Chris Rebert <c...@rebertia.com> wrote:
>
> On Mon, Jan 12, 2009 at 11:46 PM, S.Selvam Siva <s.selvams...@gmail.com>
> wrote:
> > Hi all,
> >
> > I need to extract the domain-name from a given url(without sub-domains).
> > With urlparse, i am able to fetch only the domain-name(which includes the
> > sub-domain also).
> >
> > eg:
> > http://feeds.huffingtonpost.com/posts/ , http://www.huffingtonpost.de/,
> > .... all must lead to huffingtonpost.com or huffingtonpost.de
> >
> > Please suggest me some ideas regarding this problem.
>
> That would require (pardon the pun) domain-specific logic. For most
> TLDs (e.g. .com, .org) the domain name is just blah.com, blah.org,
> etc. But for ccTLDs, often only second-level registrations are
> allowed, e.g. for www.bbc.co.uk, so the main domain name would be
> bbc.co.uk. I think a few TLDs have even more complicated rules.
>
> I doubt anyone's created a general ready-made solution for this, you'd
> have to code it yourself.
> To handle the common case, you can cheat and just .split() at the
> periods and then slice and rejoin the list of domain parts, ex:
> '.'.join(domain.split('.')[-2:])
>
> Cheers,
> Chris

Thank you, Chris Rebert.

Actually, I have already tried domain-specific logic: I kept a list of 200
TLDs such as .com, co.in and co.uk, and used it to extract the domain name.
But my boss wants a more reliable solution than this method. Anyway, I will
try to find some alternative solution.

--
Yours,
S.Selvam
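
For reference, the table-driven approach discussed in this thread could look
roughly like the sketch below. It is only an illustration under some
assumptions: it uses Python 3's urllib.parse (the urlparse module named above
is the Python 2 spelling), the suffix set is a tiny made-up subset rather than
the 200-entry list, and the names MULTI_LABEL_SUFFIXES and registered_domain
are hypothetical. It falls back to the "last two labels" trick for everything
not in the table.

```python
from urllib.parse import urlparse

# Assumption: illustrative subset only; a real table needs many more entries.
MULTI_LABEL_SUFFIXES = {"co.uk", "co.in", "org.uk", "com.au"}


def registered_domain(url):
    """Return the domain without sub-domains, using a small suffix table."""
    host = urlparse(url).hostname or ""
    labels = host.lower().rstrip(".").split(".")
    # If the last two labels form a known two-label suffix (e.g. co.uk),
    # the registered domain is the last three labels.
    if len(labels) >= 3 and ".".join(labels[-2:]) in MULTI_LABEL_SUFFIXES:
        return ".".join(labels[-3:])
    # Common case: just keep the last two labels.
    return ".".join(labels[-2:])


if __name__ == "__main__":
    for url in ("http://feeds.huffingtonpost.com/posts/",
                "http://www.huffingtonpost.de/",
                "http://www.bbc.co.uk/"):
        print(url, "->", registered_domain(url))
```

The weak point is the same one raised above: the result is only as reliable as
the suffix table, so any suffix missing from it silently gets the two-label
treatment.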