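I have a Scrapy project and I am trying to save the scraped items as objects defined by my Django models (I am not using DjangoItem).
I import the Django settings as specified here.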
def setup_django_env(path):
    import imp, os
    from django.core.management import setup_environ

    f, filename, desc = imp.find_module('settings', [path])
    project = imp.load_module('settings', f, filename, desc)

    setup_environ(project)

setup_django_env(PATH_TO_DJANGO_PROJECT)
In my Scrapy project I have a pipeline class that processes every item and saves it to the database:
from my_django_project.apps.my_books.models import Book, Category, Image

class DjangoPipeline(object):

    def process_item(self, item, spider):
        category = Category.objects.get(name='Horror')
        book = Book(name='something', category=category)
        book.save()
        image = Image(name='something', book=book)
        image.save()
        return item
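However, something strange happens: the first item raises an error (see below), while all the remaining items are saved fine. For example, if I have 7 items to save, the first one fails and the other 6 are saved.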
Traceback (most recent call last):
  File "/users/ale/virtualenvs/books/lib/python2.6/site-packages/scrapy/middleware.py", line 54, in _process_chain
    return process_chain(self.methods[methodname], obj, *args)
  File "/users/ale/virtualenvs/books/lib/python2.6/site-packages/scrapy/utils/defer.py", line 65, in process_chain
    d.callback(input)
  File "/System/Library/Frameworks/Python.framework/Versions/2.6/Extras/lib/python/twisted/internet/defer.py", line 243, in callback
    self._startRunCallbacks(result)
  File "/System/Library/Frameworks/Python.framework/Versions/2.6/Extras/lib/python/twisted/internet/defer.py", line 312, in _startRunCallbacks
    self._runCallbacks()
--- <exception caught here> ---
  File "/System/Library/Frameworks/Python.framework/Versions/2.6/Extras/lib/python/twisted/internet/defer.py", line 328, in _runCallbacks
    self.result = callback(self.result, *args, **kw)
  File "/users/ale/djcode/books/lib/scraper/scraper/djangopipeline.py", line 34, in process_item
    selected_category = Category.objects.get(name='Horror')
  File "/users/ale/virtualenvs/books/lib/python2.6/site-packages/django/db/models/manager.py", line 132, in get
    return self.get_query_set().get(*args, **kwargs)
  File "/users/ale/virtualenvs/books/lib/python2.6/site-packages/django/db/models/query.py", line 333, in get
    clone = self.filter(*args, **kwargs)
  File "/users/ale/virtualenvs/books/lib/python2.6/site-packages/django/db/models/query.py", line 550, in filter
    return self._filter_or_exclude(False, *args, **kwargs)
  File "/users/ale/virtualenvs/books/lib/python2.6/site-packages/django/db/models/query.py", line 568, in _filter_or_exclude
    clone.query.add_q(Q(*args, **kwargs))
  File "/users/ale/virtualenvs/books/lib/python2.6/site-packages/django/db/models/sql/query.py", line 1131, in add_q
    can_reuse=used_aliases)
  File "/users/ale/virtualenvs/books/lib/python2.6/site-packages/django/db/models/sql/query.py", line 1026, in add_filter
    negate=negate, process_extras=process_extras)
  File "/users/ale/virtualenvs/books/lib/python2.6/site-packages/django/db/models/sql/query.py", line 1182, in setup_joins
    field, model, direct, m2m = opts.get_field_by_name(name)
  File "/users/ale/virtualenvs/books/lib/python2.6/site-packages/django/db/models/options.py", line 291, in get_field_by_name
    cache = self.init_name_map()
  File "/users/ale/virtualenvs/books/lib/python2.6/site-packages/django/db/models/options.py", line 321, in init_name_map
    for f, model in self.get_all_related_m2m_objects_with_model():
  File "/users/ale/virtualenvs/books/lib/python2.6/site-packages/django/db/models/options.py", line 396, in get_all_related_m2m_objects_with_model
    cache = self._fill_related_many_to_many_cache()
  File "/users/ale/virtualenvs/books/lib/python2.6/site-packages/django/db/models/options.py", line 410, in _fill_related_many_to_many_cache
    for klass in get_models():
  File "/users/ale/virtualenvs/books/lib/python2.6/site-packages/django/db/models/loading.py", line 167, in get_models
    self._populate()
  File "/users/ale/virtualenvs/books/lib/python2.6/site-packages/django/db/models/loading.py", line 61, in _populate
    self.load_app(app_name, True)
  File "/users/ale/virtualenvs/books/lib/python2.6/site-packages/django/db/models/loading.py", line 76, in load_app
    app_module = import_module(app_name)
  File "/users/ale/virtualenvs/books/lib/python2.6/site-packages/django/utils/importlib.py", line 35, in import_module
    __import__(name)
exceptions.ImportError: No module named my_books
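If I do something like the following, where each database call is simply retried once after it fails, all 7 items are saved: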
from my_django_project.apps.my_app.models import Book, Category, Image

class DjangoPipeline(object):

    def process_item(self, item, spider):
        try:
            category = Category.objects.get(name='something')
        except:
            category = Category.objects.get(name='something')

        book = Book(name='something', category=category)
        try:
            book.save()
        except:
            book.save()

        image = Image(name='something', book=book)
        try:
            image.save()
        except:
            image.save()

        return item
I don't know what I am doing wrong. Can anyone help me?
Thanks!
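I ran into the same problem and found a solution. At least, it worked for me.
In my case, the problem was in the Django project's settings.py file: I had added only the short name of my application to the INSTALLED_APPS tuple, not its FQN (fully qualified name).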
Looking at your example, it may be that you added the element my_books to INSTALLED_APPS rather than my_django_project.apps.my_books.
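As a rough sketch of what I mean: the settings entry should use the same dotted path that the pipeline imports from. The helper below, find_unimportable_apps, is a hypothetical name of my own and not part of Django or Scrapy. It assumes every entry is a plain module path, and it simply tries to import each one from the current sys.path, so you can run it from the Scrapy side to see which entries would fail with the same ImportError.

```python
import importlib

# Hypothetical settings.py fragment: the entry mirrors the dotted path used in
# the pipeline import (my_django_project.apps.my_books), not the short name.
INSTALLED_APPS = (
    "my_django_project.apps.my_books",
)


def find_unimportable_apps(installed_apps):
    """Return the entries that cannot be imported from the current sys.path.

    Illustrative helper only; it assumes each entry is a module path.
    """
    missing = []
    for app_name in installed_apps:
        try:
            importlib.import_module(app_name)
        except ImportError:
            # Same failure mode as "No module named my_books" in the traceback.
            missing.append(app_name)
    return missing
```

Calling find_unimportable_apps(INSTALLED_APPS) inside the Scrapy process, after the Django environment has been set up, should return an empty list once the names are fully qualified. If a short name such as my_books shows up in the result, that is the entry to fix.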