Hala Gamal wrote:
> Thank you :) It worked well for a small file, but when I enter a big file, I
> obtain this error: "Traceback (most recent call last):
> File "D:\Python27\yarab (4).py", line 46, in <module>
> writer.add_document(**doc)
> File "build\bdist.win32\egg\whoosh\filedb\filewriting.py", lin
Thank you :) It worked well for a small file, but when I enter a big file, I obtain
this error:
"Traceback (most recent call last):
  File "D:\Python27\yarab (4).py", line 46, in <module>
    writer.add_document(**doc)
  File "build\bdist.win32\egg\whoosh\filedb\filewriting.py", line 369, in add_document
On 2013-02-22 14:55, Hala Gamal wrote:
My code works well with an English file, but when I use a text file encoded as UTF-8
(my file contains some Arabic letters), it doesn't work.
My code:
# encoding: utf-8
from whoosh import fields, index
import os.path
import re,string
import codecs
from whoosh.qparse
Hala Gamal wrote:
> My code works well with an English file, but when I use a text file encoded
> as UTF-8 (my file contains some Arabic letters), it doesn't work. My
> code:
> with codecs.open("tt.txt",encoding='utf-8') as txtfile:
Try encoding="utf-8-sig" in the above to remove the byte order mark (B
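To illustrate the suggestion, here is a minimal sketch (not the original poster's code) that reads the same file with both codecs. The temporary file, its contents, and the helper read_first_line are made up for the demonstration. io.open is used so the snippet runs on both Python 2.7 and Python 3; codecs.open accepts the same encoding names.

```python
import codecs
import io
import os
import tempfile


def read_first_line(path, encoding):
    """Return the first line of `path`, decoded with `encoding` (hypothetical helper)."""
    with io.open(path, encoding=encoding) as f:
        return f.readline()


# Create a small UTF-8 file that starts with a byte order mark.
# codecs.BOM_UTF8 is the three-byte sequence EF BB BF.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "wb") as f:
    f.write(codecs.BOM_UTF8 + u"123\n".encode("utf-8"))

with_bom = read_first_line(path, "utf-8")         # BOM survives as u'\ufeff'
without_bom = read_first_line(path, "utf-8-sig")  # BOM is stripped while decoding
os.remove(path)
```

With "utf-8" the first line still begins with u'\ufeff'; with "utf-8-sig" it does not. "utf-8-sig" also reads files that have no BOM, so it should be safe to use for both kinds of input.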
On 2/19/2013 8:07 PM, halagamal2...@gmail.com wrote:
UnicodeEncodeError: 'decimal' codec can't encode character u'\ufeff'
in position 0: invalid decimal Unicode string
I believe that is a byte-order mark, which should only be the first 3
bytes of a UTF-8 file and which should be removed if you use
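
A small sketch of how a leading U+FEFF can produce this kind of failure, assuming the first value read from the file ends up being converted to a number (the 'decimal' codec message suggests that, but the thread does not show where the conversion happens). The sample string and the helper parse_int are hypothetical.

```python
def parse_int(text):
    """Convert text to int after dropping any leading U+FEFF (hypothetical helper)."""
    return int(text.lstrip(u"\ufeff"))


# What the first field of the file looks like when the BOM was not removed.
first_field = u"\ufeff42"

try:
    int(first_field)
    bom_rejected = False
except ValueError:
    # UnicodeEncodeError is a subclass of ValueError, so this clause
    # catches the failure on both Python 2.7 and Python 3.
    bom_rejected = True

value = parse_int(first_field)  # succeeds once the BOM is stripped
```

Stripping the character by hand like this is only a fallback for text that has already been decoded; removing the BOM while reading the file is usually the simpler fix.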