Hi all,
I need to extract the domain name from a given URL (without sub-domains).
With urlparse, I am able to fetch only the domain name (which includes the
sub-domain also).
e.g.:
http://feeds.huffingtonpost.com/posts/ and http://www.huffingtonpost.de/
must both lead to *huffingtonpost.com or
On Tue, Jan 13, 2009 at 1:50 PM, Chris Rebert wrote:
>
> On Mon, Jan 12, 2009 at 11:46 PM, S.Selvam Siva
> wrote:
> > Hi all,
> >
> > I need to extract the domain name from a given URL (without sub-domains).
> > With urlparse, I am able to fetch only the domain name (which includes the
> > sub-domain also).
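A minimal sketch of the registered-domain extraction asked about above, using only the stdlib (Python 3 spelling; the Python 2 of the original thread would use `from urlparse import urlparse`). The last-two-labels rule is an assumption that breaks on multi-part suffixes such as .co.uk, where a public-suffix-aware package like tldextract is the robust choice:

```python
from urllib.parse import urlparse

def registered_domain(url):
    # naive heuristic: keep the last two labels of the hostname
    # (fails on suffixes like .co.uk -- use a public-suffix list there)
    host = urlparse(url).hostname or ""
    parts = host.split(".")
    return ".".join(parts[-2:]) if len(parts) >= 2 else host

print(registered_domain("http://feeds.huffingtonpost.com/posts/"))  # huffingtonpost.com
print(registered_domain("http://www.huffingtonpost.de/"))           # huffingtonpost.de
```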
Hi all,
I am running a Python script which parses nearly 22,000 locally stored HTML
files using BeautifulSoup.
The problem is that memory usage increases linearly as the files are being
parsed.
By the time the script has parsed around 200 files, it consumes all the
available RAM, and the CPU usage
Hi all,
I have found the actual solution to this problem.
I tried using BeautifulSoup.SoupStrainer() and it improved memory usage
dramatically. Now it uses a maximum of 20 MB (earlier
it was >800 MB on a 1 GB RAM system).
Thanks all.
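For anyone landing here later, a minimal sketch of the SoupStrainer fix (shown with the modern bs4 package name; the 2009 thread used the old `BeautifulSoup` module). The strainer discards everything except matching tags while parsing, which is what keeps the tree, and the memory, small:

```python
from bs4 import BeautifulSoup, SoupStrainer

html = '<html><body><a href="/x">link</a><p>lots of other markup</p></body></html>'

# only <a> tags survive parsing; the rest of the document is never
# built into the tree, so memory stays bounded per file
only_links = SoupStrainer("a")
soup = BeautifulSoup(html, "html.parser", parse_only=only_links)
print([a["href"] for a in soup.find_all("a")])  # ['/x']
```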
--
Yours,
S.Selvam
--
http://mail.python.org/mailman/listinfo/python-list
On Tue, Jan 20, 2009 at 7:27 PM, Tim Arnold wrote:
> I had the same problem you did, but then I changed the code to create a new
> soup object for each file. That drastically increased the speed. I don't
> know why, but it looks like the soup object just keeps getting bigger with
> each feed.
>
>
Hi all,
I am developing a spell checker for my local language (Tamil) using Python.
I need to generate a list of alternative words for a misspelled word from the
dictionary of words. The alternatives must be as close as possible to the
misspelled word. As we know, ordinary string comparison won't work here.
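One stdlib option worth knowing here: `difflib.get_close_matches` ranks candidates by a similarity ratio, no third-party module needed (the word list below is a made-up stand-in for the real dictionary):

```python
import difflib

dictionary_words = ["apple", "ample", "maple", "apply", "angle"]  # hypothetical word list

# rank candidates by difflib's similarity ratio; cutoff drops words
# that are too far from the misspelling
suggestions = difflib.get_close_matches("aple", dictionary_words, n=5, cutoff=0.6)
print(suggestions)
```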
Thank You Gabriel,
On Sun, Jan 25, 2009 at 7:12 AM, Gabriel Genellina
wrote:
> En Sat, 24 Jan 2009 15:08:08 -0200, S.Selvam Siva
> escribió:
>
>
> I am developing a spell checker for my local language (Tamil) using Python.
>> I need to generate a list of alternative words for a misspelled word
Hi all,
I need to parse RSS feeds based on time stamp, but the feeds follow different
date conventions (IST, EST, etc.).
I don't know how to normalize these timestamps. It would be helpful if you
could give me a hint.
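A sketch of one way to normalize such stamps, assuming RFC 2822-style dates: map the ambiguous zone abbreviations you expect to numeric offsets first (here "IST" is assumed to mean India, +05:30), then let the stdlib parser convert to UTC:

```python
from datetime import timezone
from email.utils import parsedate_to_datetime

# assumed mapping -- extend it for the abbreviations your feeds actually use
ZONES = {"IST": "+0530", "EST": "-0500"}

def to_utc(stamp):
    # swap a trailing zone abbreviation for its numeric offset, then parse
    for abbr, offset in ZONES.items():
        if stamp.endswith(abbr):
            stamp = stamp[: -len(abbr)] + offset
            break
    return parsedate_to_datetime(stamp).astimezone(timezone.utc)

print(to_utc("Thu, 29 Jan 2009 14:27:00 IST"))  # 2009-01-29 08:57:00+00:00
```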
On Thu, Jan 29, 2009 at 2:27 PM, M.-A. Lemburg wrote:
> On 2009-01-29 03:38, Gabriel Genellina wrote:
> > En Wed, 28 Jan 2009 18:55:21 -0200, S.Selvam Siva
> > escribió:
> >
> >> I need to parse rss-feeds based on time stamp, but rss-feeds follow
> >> different standards of date (IST, EST etc).
Hi all,
I have a small query.
Consider there is a task A which I want to perform.
To perform it, I have two options:
1) writing a small piece of code (approx. 50 lines) as efficiently as possible;
2) importing a suitable module that performs task A.
I am eager to know which method will produce the best performance
On Mon, Feb 2, 2009 at 3:11 PM, Chris Rebert wrote:
> On Mon, Feb 2, 2009 at 1:29 AM, S.Selvam Siva
> wrote:
> > Hi all,
> > I have a small query,
> > Consider there is a task A which i want to perform.
> >
> > To perform it ,i have two option.
> >
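Rather than guessing, the usual advice for questions like the one above is to measure both options with `timeit`; a sketch, where summing a list is a hypothetical stand-in for "task A":

```python
import timeit

setup = "data = list(range(1000))"

# hypothetical task A: a hand-written loop vs the built-in sum()
hand_rolled = timeit.timeit("total = 0\nfor x in data: total += x",
                            setup=setup, number=1000)
built_in = timeit.timeit("sum(data)", setup=setup, number=1000)
print(hand_rolled, built_in)  # the C-implemented built-in usually wins
```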
Hi all,
I tried to do a string replace as follows,
>>> s="hi & people"
>>> s.replace("&","\&")
'hi \\& people'
>>>
but I was expecting 'hi \& people'. I don't know what is different here
with the escape sequence.
On Thu, Feb 5, 2009 at 5:59 PM, wrote:
> "S.Selvam Siva" wrote:
> > I tried to do a string replace as follows,
> >
> > >>> s="hi & people"
> > >>> s.replace("&","\&")
> > 'hi \\& people'
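The confusion above is just the interactive prompt echoing the repr of the string, in which a backslash is displayed doubled; the string itself contains a single backslash, as printing shows:

```python
s = "hi & people"
t = s.replace("&", "\\&")  # "\&" and "\\&" denote the same string: \& is not a recognized escape
print(t)                # hi \& people  -- one real backslash
print(len(t) - len(s))  # 1  -- exactly one character was added
```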
Hi all,
I need to parse feeds and post the data to SOLR. I want the special
characters (Unicode characters) to be posted in their numerical representation.
For e.g.,
*'* --> ’ (for which the HTML equivalent is &#8217;)
I used BeautifulSoup, which seems to allow conversion from numeric
values to Unicode characters
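For the direction wanted here (Unicode character in, numeric reference out), the stdlib's `xmlcharrefreplace` error handler does exactly this, no BeautifulSoup needed:

```python
text = "it’s"  # contains U+2019, RIGHT SINGLE QUOTATION MARK

# non-ASCII characters become numeric character references on encode
escaped = text.encode("ascii", "xmlcharrefreplace").decode("ascii")
print(escaped)  # it&#8217;s
```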
Hi all,
I need some help.
I tried to find the top n (e.g. 5) similar words for a given word, from a
dictionary of 50,000 words.
I used the python-levenshtein module, and sample code is as follows.
def foo(self, searchword):
    distdict = {}
    for word in self.dictionary_words:
        distance = Levenshtein.distance(searchword, word)
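A self-contained sketch of the full top-n loop, with a plain dynamic-programming edit distance standing in for the C-backed `Levenshtein.distance` (which is much faster for 50,000 words), and `heapq.nsmallest` so the whole dictionary never needs sorting:

```python
import heapq

def edit_distance(a, b):
    # classic row-by-row Levenshtein dynamic programming
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

def top_n(searchword, words, n=5):
    # smallest distance = most similar
    return heapq.nsmallest(n, words, key=lambda w: edit_distance(searchword, w))

print(top_n("kitten", ["sitting", "mitten", "kitchen", "bitten", "smitten"], n=3))
```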
On Sat, Feb 14, 2009 at 3:01 PM, Peter Otten <__pete...@web.de> wrote:
> Gabriel Genellina wrote:
>
> > En Fri, 13 Feb 2009 08:16:00 -0200, S.Selvam Siva <
> s.selvams...@gmail.com>
> > escribió:
> >
> >> I need some help.
> >> I tried to find the top n (e.g. 5) similar words for a given word
I am trying to post a file from Python to PHP using the HTTP POST method. I
tried mechanize but was not able to pass the file object.
from mechanize import Browser
br=Browser()
response=br.open("http://localhost/test.php")
br.select_form('form1')
br['uploadedfile']=open("C:/Documents and Settings/user/Desk...")
On Tue, Dec 2, 2008 at 1:33 PM, S.Selvam Siva <[EMAIL PROTECTED]>wrote:
> I am trying to post a file from python to php using the HTTP POST method. I
> tried mechanize but was not able to pass the file object.
>
Hello,
I am in the process of writing a spell checker for my local language (Tamil).
I wrote a plugin for gedit, with PyGTK for the GUI. Recently I came to know
about pygtkspell, which can be used for spell checking and offering suggestions.
I am a bit confused about it and could not get useful info by
go
I have to parse webpages and fetch URLs. My problem is that many URLs I
need to parse are loaded dynamically using a JavaScript function
(onload()). How can I fetch those links from Python? Thanks in advance.
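Python alone cannot run the JavaScript (that needs a browser engine, for example driven through Selenium); a limited workaround is to scan the raw script source for URL-shaped strings, as in this sketch (the page snippet is hypothetical):

```python
import re

# hypothetical page where the link only appears inside an onload handler
html = '<script>window.onload = function(){ load("http://example.com/a"); }</script>'

# crude pattern: grab anything URL-shaped out of the raw HTML/JS text
links = re.findall(r'https?://[^\s"\'()<>]+', html)
print(links)  # ['http://example.com/a']
```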
Hi all,
I have a dictionary in which each key is associated with a list as its value,
e.g.: *dic={'a':['aa','ant','all']}*
The dictionary contains *1.5 lakh (150,000) keys*.
Now I want to store it in a file, and it needs to be loaded back into the
Python program during execution.
I look forward to your ideas/suggestions.
Note: I think
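A minimal sketch of one common approach: serialize to JSON (or pickle, which also preserves tuples and other Python-only types) and load it back at startup. The tiny dict and the temp path are stand-ins for the real 150,000-key data:

```python
import json
import os
import tempfile

dic = {'a': ['aa', 'ant', 'all']}  # stand-in for the 150,000-key dictionary

path = os.path.join(tempfile.mkdtemp(), "words.json")

# dump once...
with open(path, "w") as f:
    json.dump(dic, f)

# ...and load at program start
with open(path) as f:
    loaded = json.load(f)

print(loaded == dic)  # True
```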
Hi all,
I want to upload a file from Python to a PHP/HTML form using urllib2; my
code is below.
PYTHON CODE:
import urllib
import urllib2,sys,traceback
url='http://localhost/index2.php'
values={}
f=open('addons.xcu','r')
values['datafile']=f.read() #is this correct ?
values['Submit']='True'
data
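Urlencoded posting, as in the code above, sends the file contents as an ordinary form field, not as an upload; PHP's $_FILES only sees multipart/form-data. A hand-rolled multipart sketch in Python 3's urllib.request (the field and file names follow the code above; the payload bytes are a dummy):

```python
import urllib.request
import uuid

def multipart_upload_request(url, field, filename, payload):
    # build a minimal multipart/form-data body by hand; the stdlib has
    # no multipart encoder of its own
    boundary = uuid.uuid4().hex
    head = ("--%s\r\n"
            'Content-Disposition: form-data; name="%s"; filename="%s"\r\n'
            "Content-Type: application/octet-stream\r\n\r\n"
            % (boundary, field, filename)).encode("ascii")
    tail = ("\r\n--%s--\r\n" % boundary).encode("ascii")
    req = urllib.request.Request(url, data=head + payload + tail)
    req.add_header("Content-Type",
                   "multipart/form-data; boundary=%s" % boundary)
    return req

req = multipart_upload_request("http://localhost/index2.php",
                               "datafile", "addons.xcu", b"dummy bytes")
# urllib.request.urlopen(req)  # left commented out: needs the server running
```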