[fix] google: avoid unnecessary SearxEngineXPathException errors

Avoid SearxEngineXPathException errors when parsing invalid results (result sections that contain no matching link)::

    .//div[@class="yuRUbf"]//a/@href index 0 not found
    Traceback (most recent call last):
      File "./searx/engines/google.py", line 274, in response
        url = eval_xpath_getindex(result, href_xpath, 0)
      File "./searx/searx/utils.py", line 608, in eval_xpath_getindex
        raise SearxEngineXPathException(xpath_spec, 'index ' + str(index) + ' not found')
    searx.exceptions.SearxEngineXPathException: .//div[@class="yuRUbf"]//a/@href index 0 not found

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
Markus Heiser 4 years ago
parent
commit
7f505bdc6f
1 changed file with 3 additions and 1 deletion

searx/engines/google.py (+3 -1)

@@ -271,7 +271,9 @@ def response(resp):
                 logger.debug('ingoring <div class="g" ../> section: missing title')
                 continue
             title = extract_text(title_tag)
-            url = eval_xpath_getindex(result, href_xpath, 0)
+            url = eval_xpath_getindex(result, href_xpath, 0, None)
+            if url is None:
+                continue
             content = extract_text(eval_xpath_getindex(result, content_xpath, 0, default=None), allow_none=True)
             results.append({
                 'url': url,
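
The change swaps a raising lookup for one with a default, then skips the result when nothing was found. The sketch below illustrates that pattern on plain lists. It is not the searx implementation: `get_index`, `XPathIndexError`, and `collect_urls` are hypothetical stand-ins, and the input data is made up for illustration.

```python
# Hypothetical stand-ins for illustration only; not the searx.utils code.
_NOTSET = object()  # sentinel: tells "no default given" apart from default=None


class XPathIndexError(Exception):
    """Stand-in for the exception raised when an index is missing."""


def get_index(matches, index, default=_NOTSET):
    """Return matches[index]; use the default if given, otherwise raise."""
    if -len(matches) <= index < len(matches):
        return matches[index]
    if default is _NOTSET:
        raise XPathIndexError('index ' + str(index) + ' not found')
    return default


def collect_urls(sections):
    """Keep the first href of each section, skipping sections without one."""
    urls = []
    for hrefs in sections:
        url = get_index(hrefs, 0, None)
        if url is None:
            # Same idea as the patch: no link, so ignore this section.
            continue
        urls.append(url)
    return urls


# Each inner list plays the role of the hrefs matched in one result section.
sections = [
    ['https://example.org/a'],
    [],
    ['https://example.org/b', 'https://example.org/c'],
]
print(collect_urls(sections))
```

With an explicit default, a missing link becomes an ordinary value the loop can test, so one malformed section no longer aborts parsing of the whole response.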