Hello everybody,

I fell in love with Django two years ago and have been using it for my 
work projects. In the past I've found very useful information in this group, 
so a big thank you, guys!

I have a quick question.
I have to import about 1,000,000 XML documents into a Django database 
(SQLite for local development, MySQL on the server).

The model class is the following:

from django.db import models

class Doc(models.Model):
    doc_code = models.CharField(max_length=20, unique=True,
                                primary_key=True, db_index=True)
    doc_text = models.TextField(null=True, blank=True)
    related_doc = models.ManyToManyField('self', null=True, blank=True,
                                         db_index=True)

From what I know, bulk insertion is not possible because I have a 
ManyToManyField relation.

So I have this simple loop (in pseudocode):

for each xml:
    extract the data from the xml -> mydoc_code, mydoc_text, myRelated_doc_codes

    myDoc = Doc.objects.get_or_create(doc_code=mydoc_code)[0]
    myDoc.doc_text = mydoc_text

    for reldoc_code in myRelated_doc_codes:
        myRelDoc = Doc.objects.get_or_create(doc_code=reldoc_code)[0]
        myDoc.related_doc.add(myRelDoc)

    myDoc.save()


Am I doing it right? Do you have any suggestions or recommendations? I fear 
that, since I have 1,000,000 docs to import, it will take a lot of time, 
especially during the get_or_create calls.
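
One possible way to avoid a get_or_create call per document is a two-pass 
import: first collect every doc code and every relation pair in plain Python, 
then write them in batches. The following is only an untested sketch of the 
first pass. The XML element names (code, text, related), the sample data and 
the helper names parse_doc, collect and batches are all hypothetical, and it 
assumes everything fits in memory.

```python
import xml.etree.ElementTree as ET
from itertools import islice


def parse_doc(xml_string):
    """Return (code, text, related_codes) for one document.

    The element names <code>, <text> and <related> are hypothetical;
    adapt them to the real XML layout.
    """
    root = ET.fromstring(xml_string)
    code = root.findtext("code")
    text = root.findtext("text")
    related = [el.text for el in root.findall("related")]
    return code, text, related


def collect(xml_strings):
    """First pass: gather every doc and every relation pair in memory."""
    texts = {}     # doc_code -> doc_text (None if only seen as a relation)
    pairs = set()  # (from_code, to_code), deduplicated
    for xml_string in xml_strings:
        code, text, related = parse_doc(xml_string)
        texts[code] = text
        for rel in related:
            texts.setdefault(rel, None)  # placeholder, like get_or_create
            pairs.add((code, rel))
    return texts, pairs


def batches(items, size):
    """Yield successive lists of at most `size` items."""
    it = iter(items)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


# Hypothetical sample input, two tiny documents.
sample = [
    "<doc><code>A1</code><text>first</text><related>B2</related></doc>",
    "<doc><code>B2</code><text>second</text>"
    "<related>A1</related><related>C3</related></doc>",
]
texts, pairs = collect(sample)
pair_batches = list(batches(sorted(pairs), 2))
```

In Django, each batch of docs could then presumably go through 
Doc.objects.bulk_create(), and each batch of pairs through 
Doc.related_doc.through.objects.bulk_create(), since the automatically 
created through table is an ordinary model (with from_doc and to_doc foreign 
keys for a self-referential field, as far as I know). Because the relation is 
symmetrical by default, both (a, b) and (b, a) rows would probably have to be 
inserted to match what add() does.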

Thank you in advance, everybody!

John