Merge branch 'master' of https://github.com/asciimoo/searx into code_results

Conflicts:
	searx/engines/searchcode_code.py
	searx/engines/searchcode_doc.py
	searx/static/oscar/js/searx.min.js
	searx/templates/oscar/result_templates/default.html
	searx/templates/oscar/result_templates/images.html
	searx/templates/oscar/result_templates/map.html
	searx/templates/oscar/result_templates/torrent.html
	searx/templates/oscar/result_templates/videos.html
Thomas Pointhuber committed 10 years ago
commit 400b54191c

100 changed files with 611 additions and 120 deletions
  1. .travis.yml (+1 -1)
  2. AUTHORS.rst (+3 -0)
  3. CHANGELOG.rst (+27 -0)
  4. Makefile (+9 -9)
  5. README.rst (+16 -18)
  6. searx/__init__.py (+11 -1)
  7. searx/engines/500px.py (+2 -2)
  8. searx/engines/__init__.py (+7 -4)
  9. searx/engines/dailymotion.py (+11 -4)
  10. searx/engines/deezer.py (+61 -0)
  11. searx/engines/digg.py (+70 -0)
  12. searx/engines/duckduckgo_definitions.py (+1 -5)
  13. searx/engines/flickr-noapi.py (+95 -0)
  14. searx/engines/flickr.py (+62 -29)
  15. searx/engines/kickass.py (+3 -2)
  16. searx/engines/searchcode_doc.py (+8 -4)
  17. searx/engines/soundcloud.py (+12 -2)
  18. searx/engines/startpage.py (+4 -1)
  19. searx/engines/subtitleseeker.py (+78 -0)
  20. searx/engines/twitter.py (+19 -8)
  21. searx/engines/vimeo.py (+14 -12)
  22. searx/engines/wikidata.py (+14 -0)
  23. searx/engines/youtube.py (+11 -2)
  24. searx/https_rewrite.py (+5 -3)
  25. searx/search.py (+16 -8)
  26. searx/settings.yml (+49 -2)
  27. searx/static/oscar/js/searx.min.js (+0 -1)
  28. searx/static/themes/courgette/css/style.css (+0 -0)
  29. searx/static/themes/courgette/img/bg-body-index.jpg (+0 -0)
  30. searx/static/themes/courgette/img/favicon.png (+0 -0)
  31. searx/static/themes/courgette/img/github_ribbon.png (+0 -0)
  32. searx/static/themes/courgette/img/icons/icon_dailymotion.ico (+0 -0)
  33. searx/static/themes/courgette/img/icons/icon_deviantart.ico (+0 -0)
  34. searx/static/themes/courgette/img/icons/icon_github.ico (+0 -0)
  35. searx/static/themes/courgette/img/icons/icon_kickass.ico (+0 -0)
  36. searx/static/themes/courgette/img/icons/icon_soundcloud.ico (+0 -0)
  37. searx/static/themes/courgette/img/icons/icon_stackoverflow.ico (+0 -0)
  38. searx/static/themes/courgette/img/icons/icon_twitter.ico (+0 -0)
  39. searx/static/themes/courgette/img/icons/icon_vimeo.ico (+0 -0)
  40. searx/static/themes/courgette/img/icons/icon_wikipedia.ico (+0 -0)
  41. searx/static/themes/courgette/img/icons/icon_youtube.ico (+0 -0)
  42. searx/static/themes/courgette/img/preference-icon.png (+0 -0)
  43. searx/static/themes/courgette/img/search-icon.png (+0 -0)
  44. searx/static/themes/courgette/img/searx-mobile.png (+0 -0)
  45. searx/static/themes/courgette/img/searx.png (+0 -0)
  46. searx/static/themes/courgette/img/searx_logo.svg (+0 -0)
  47. searx/static/themes/courgette/js/mootools-autocompleter-1.1.2-min.js (+0 -0)
  48. searx/static/themes/courgette/js/mootools-core-1.4.5-min.js (+0 -0)
  49. searx/static/themes/courgette/js/searx.js (+0 -0)
  50. searx/static/themes/default/css/style.css (+0 -0)
  51. searx/static/themes/default/img/favicon.png (+0 -0)
  52. searx/static/themes/default/img/github_ribbon.png (+0 -0)
  53. searx/static/themes/default/img/icons/icon_dailymotion.ico (+0 -0)
  54. searx/static/themes/default/img/icons/icon_deviantart.ico (+0 -0)
  55. searx/static/themes/default/img/icons/icon_github.ico (+0 -0)
  56. searx/static/themes/default/img/icons/icon_kickass.ico (+0 -0)
  57. searx/static/themes/default/img/icons/icon_soundcloud.ico (+0 -0)
  58. searx/static/themes/default/img/icons/icon_stackoverflow.ico (+0 -0)
  59. searx/static/themes/default/img/icons/icon_twitter.ico (+0 -0)
  60. searx/static/themes/default/img/icons/icon_vimeo.ico (+0 -0)
  61. searx/static/themes/default/img/icons/icon_wikipedia.ico (+0 -0)
  62. searx/static/themes/default/img/icons/icon_youtube.ico (+0 -0)
  63. searx/static/themes/default/img/preference-icon.png (+0 -0)
  64. searx/static/themes/default/img/search-icon.png (+0 -0)
  65. searx/static/themes/default/img/searx.png (+0 -0)
  66. searx/static/themes/default/img/searx_logo.svg (+0 -0)
  67. searx/static/themes/default/js/mootools-autocompleter-1.1.2-min.js (+0 -0)
  68. searx/static/themes/default/js/mootools-core-1.4.5-min.js (+0 -0)
  69. searx/static/themes/default/js/searx.js (+0 -0)
  70. searx/static/themes/default/less/autocompleter.less (+0 -0)
  71. searx/static/themes/default/less/code.less (+0 -0)
  72. searx/static/themes/default/less/definitions.less (+0 -0)
  73. searx/static/themes/default/less/mixins.less (+0 -0)
  74. searx/static/themes/default/less/search.less (+0 -0)
  75. searx/static/themes/default/less/style.less (+0 -0)
  76. searx/static/themes/oscar/.gitignore (+0 -0)
  77. searx/static/themes/oscar/README.rst (+2 -2)
  78. searx/static/themes/oscar/css/bootstrap.min.css (+0 -0)
  79. searx/static/themes/oscar/css/leaflet.min.css (+0 -0)
  80. searx/static/themes/oscar/css/oscar.min.css (+0 -0)
  81. searx/static/themes/oscar/fonts/glyphicons-halflings-regular.eot (+0 -0)
  82. searx/static/themes/oscar/fonts/glyphicons-halflings-regular.svg (+0 -0)
  83. searx/static/themes/oscar/fonts/glyphicons-halflings-regular.ttf (+0 -0)
  84. searx/static/themes/oscar/fonts/glyphicons-halflings-regular.woff (+0 -0)
  85. searx/static/themes/oscar/gruntfile.js (+0 -0)
  86. searx/static/themes/oscar/img/favicon.png (+0 -0)
  87. searx/static/themes/oscar/img/icons/README.md (+0 -0)
  88. searx/static/themes/oscar/img/icons/amazon.png (+0 -0)
  89. searx/static/themes/oscar/img/icons/dailymotion.png (+0 -0)
  90. searx/static/themes/oscar/img/icons/deviantart.png (+0 -0)
  91. searx/static/themes/oscar/img/icons/facebook.png (+0 -0)
  92. searx/static/themes/oscar/img/icons/flickr.png (+0 -0)
  93. searx/static/themes/oscar/img/icons/github.png (+0 -0)
  94. searx/static/themes/oscar/img/icons/kickass.png (+0 -0)
  95. searx/static/themes/oscar/img/icons/openstreetmap.png (BIN)
  96. searx/static/themes/oscar/img/icons/photon.png (BIN)
  97. searx/static/themes/oscar/img/icons/searchcode code.png (BIN)
  98. searx/static/themes/oscar/img/icons/searchcode doc.png (BIN)
  99. searx/static/themes/oscar/img/icons/soundcloud.png (+0 -0)
  100. searx/static/themes/oscar/img/icons/stackoverflow.png (+0 -0)

+ 1 - 1
.travis.yml

@@ -5,7 +5,7 @@ before_install:
   - "export DISPLAY=:99.0"
   - "export DISPLAY=:99.0"
   - "sh -e /etc/init.d/xvfb start"
   - "sh -e /etc/init.d/xvfb start"
   - npm install -g less grunt-cli
   - npm install -g less grunt-cli
-  - ( cd searx/static/oscar;npm install )
+  - ( cd searx/static/themes/oscar;npm install )
 install:
 install:
   - "make"
   - "make"
   - pip install coveralls
   - pip install coveralls

+ 3 - 0
AUTHORS.rst

@@ -29,3 +29,6 @@ generally made searx better:
 - @kernc
 - @Cqoicebordel
 - @Reventl0v
+- Caner Başaran
+- Benjamin Sonntag
+- @opi

+ 27 - 0
CHANGELOG.rst

@@ -0,0 +1,27 @@
+0.6.0 - 2014.12.25
+==================
+
+- Changelog added
+- New engines
+
+  - Flickr (api)
+  - Subtitleseeker
+  - photon
+  - 500px
+  - Searchcode
+  - Searchcode doc
+  - Kickass torrent
+- Precise search request timeout handling
+- Better favicon support
+- Stricter config parsing
+- Translation updates
+- Multiple ui fixes
+- Flickr (noapi) engine fix
+- Pep8 fixes
+
+
+News
+~~~~
+
+Health status of searx instances and engines: http://stats.searx.oe5tpo.com
+(source: https://github.com/pointhi/searx_stats)

+ 9 - 9
Makefile

@@ -18,10 +18,6 @@ $(python):
 	virtualenv -p python$(version) --no-site-packages .
 	@touch $@
 
-tests: .installed.cfg
-	@bin/test
-	@grunt test --gruntfile searx/static/oscar/gruntfile.js
-
 robot: .installed.cfg
 	@bin/robot
 
@@ -29,6 +25,10 @@ flake8: .installed.cfg
 	@bin/flake8 setup.py
 	@bin/flake8 ./searx/
 
+tests: .installed.cfg flake8
+	@bin/test
+	@grunt test --gruntfile searx/static/themes/oscar/gruntfile.js
+
 coverage: .installed.cfg
 	@bin/coverage run bin/test
 	@bin/coverage report
@@ -45,18 +45,18 @@ minimal: bin/buildout minimal.cfg setup.py
 	bin/buildout -c minimal.cfg $(options)
 
 styles:
-	@lessc -x searx/static/default/less/style.less > searx/static/default/css/style.css
-	@lessc -x searx/static/oscar/less/bootstrap/bootstrap.less > searx/static/oscar/css/bootstrap.min.css
-	@lessc -x searx/static/oscar/less/oscar/oscar.less > searx/static/oscar/css/oscar.min.css
+	@lessc -x searx/static/themes/default/less/style.less > searx/static/themes/default/css/style.css
+	@lessc -x searx/static/themes/oscar/less/bootstrap/bootstrap.less > searx/static/themes/oscar/css/bootstrap.min.css
+	@lessc -x searx/static/themes/oscar/less/oscar/oscar.less > searx/static/themes/oscar/css/oscar.min.css
 
 grunt:
-	@grunt --gruntfile searx/static/oscar/gruntfile.js
+	@grunt --gruntfile searx/static/themes/oscar/gruntfile.js
 
 locales:
 	@pybabel compile -d searx/translations
 
 clean:
 	@rm -rf .installed.cfg .mr.developer.cfg bin parts develop-eggs \
-		searx.egg-info lib include .coverage coverage searx/static/default/css/*.css
+		searx.egg-info lib include .coverage coverage searx/static/themes/default/css/*.css
 
 .PHONY: all tests robot flake8 coverage production minimal styles locales clean

+ 16 - 18
README.rst

@@ -14,16 +14,17 @@ See the `wiki <https://github.com/asciimoo/searx/wiki>`__ for more information.
 Features
 ~~~~~~~~
 
--  Tracking free
--  Supports multiple output formats
-    -  json ``curl https://searx.me/?format=json&q=[query]``
-    -  csv ``curl https://searx.me/?format=csv&q=[query]``
-    -  opensearch/rss ``curl https://searx.me/?format=rss&q=[query]``
--  Opensearch support (you can set as default search engine)
--  Configurable search engines/categories
--  Different search languages
--  Duckduckgo like !bang functionality with engine shortcuts
--  Parallel queries - relatively fast
+- Tracking free
+- Supports multiple output formats
+
+  - json ``curl https://searx.me/?format=json&q=[query]``
+  - csv ``curl https://searx.me/?format=csv&q=[query]``
+  - opensearch/rss ``curl https://searx.me/?format=rss&q=[query]``
+- Opensearch support (you can set as default search engine)
+- Configurable search engines/categories
+- Different search languages
+- Duckduckgo like !bang functionality with engine shortcuts
+- Parallel queries - relatively fast
 
 Installation
 ~~~~~~~~~~~~
@@ -131,14 +132,11 @@ next time you run any other ``make`` command it will rebuild everithing.
 TODO
 ~~~~
 
--  Moar engines
--  Better ui
--  Browser integration
--  Documentation
--  Fix ``flake8`` errors, ``make flake8`` will be merged into
-   ``make tests`` when it does not fail anymore
--  Tests
--  When we have more tests, we can integrate Travis-CI
+- Moar engines
+- Better ui
+- Browser integration
+- Documentation
+- Tests
 
 Bugs
 ~~~~

+ 11 - 1
searx/__init__.py

@@ -15,9 +15,9 @@ along with searx. If not, see < http://www.gnu.org/licenses/ >.
 (C) 2013- by Adam Tauber, <asciimoo@gmail.com>
 '''
 
+import logging
 from os import environ
 from os.path import realpath, dirname, join, abspath
-from searx.https_rewrite import load_https_rules
 try:
     from yaml import load
 except:
@@ -45,7 +45,17 @@ else:
 with open(settings_path) as settings_yaml:
     settings = load(settings_yaml)
 
+if settings.get('server', {}).get('debug'):
+    logging.basicConfig(level=logging.DEBUG)
+else:
+    logging.basicConfig(level=logging.WARNING)
+
+logger = logging.getLogger('searx')
+
 # load https rules only if https rewrite is enabled
 if settings.get('server', {}).get('https_rewrite'):
     # loade https rules
+    from searx.https_rewrite import load_https_rules
     load_https_rules(https_rewrite_path)
+
+logger.info('Initialisation done')

+ 2 - 2
searx/engines/500px.py

@@ -35,9 +35,9 @@ def request(query, params):
 # get response from search-request
 def response(resp):
     results = []
-    
+
     dom = html.fromstring(resp.text)
-    
+
     # parse results
     for result in dom.xpath('//div[@class="photo"]'):
         link = result.xpath('.//a')[0]

+ 7 - 4
searx/engines/__init__.py

@@ -22,6 +22,10 @@ from imp import load_source
 from flask.ext.babel import gettext
 from operator import itemgetter
 from searx import settings
+from searx import logger
+
+
+logger = logger.getChild('engines')
 
 engine_dir = dirname(realpath(__file__))
 
@@ -81,7 +85,7 @@ def load_engine(engine_data):
         if engine_attr.startswith('_'):
             continue
         if getattr(engine, engine_attr) is None:
-            print('[E] Engine config error: Missing attribute "{0}.{1}"'\
+            logger.error('Missing engine config attribute: "{0}.{1}"'
                   .format(engine.name, engine_attr))
             sys.exit(1)
 
@@ -100,9 +104,8 @@ def load_engine(engine_data):
         categories['general'].append(engine)
 
     if engine.shortcut:
-        # TODO check duplications
         if engine.shortcut in engine_shortcuts:
-            print('[E] Engine config error: ambigious shortcut: {0}'\
+            logger.error('Engine config error: ambigious shortcut: {0}'
                   .format(engine.shortcut))
             sys.exit(1)
         engine_shortcuts[engine.shortcut] = engine.name
@@ -199,7 +202,7 @@ def get_engines_stats():
 
 
 if 'engines' not in settings or not settings['engines']:
-    print '[E] Error no engines found. Edit your settings.yml'
+    logger.error('No engines found. Edit your settings.yml')
     exit(2)
 
 for engine_data in settings['engines']:

+ 11 - 4
searx/engines/dailymotion.py

@@ -6,12 +6,14 @@
 # @using-api   yes
 # @results     JSON
 # @stable      yes
-# @parse       url, title, thumbnail
+# @parse       url, title, thumbnail, publishedDate, embedded
 #
 # @todo        set content-parameter with correct data
 
 from urllib import urlencode
 from json import loads
+from cgi import escape
+from datetime import datetime
 
 # engine dependent config
 categories = ['videos']
@@ -20,7 +22,9 @@ language_support = True
 
 # search-url
 # see http://www.dailymotion.com/doc/api/obj-video.html
-search_url = 'https://api.dailymotion.com/videos?fields=title,description,duration,url,thumbnail_360_url&sort=relevance&limit=5&page={pageno}&{query}'  # noqa
+search_url = 'https://api.dailymotion.com/videos?fields=created_time,title,description,duration,url,thumbnail_360_url,id&sort=relevance&limit=5&page={pageno}&{query}'  # noqa
+embedded_url = '<iframe frameborder="0" width="540" height="304" ' +\
+    'data-src="//www.dailymotion.com/embed/video/{videoid}" allowfullscreen></iframe>'
 
 
 # do search-request
@@ -51,14 +55,17 @@ def response(resp):
     for res in search_res['list']:
         title = res['title']
         url = res['url']
-        #content = res['description']
-        content = ''
+        content = escape(res['description'])
         thumbnail = res['thumbnail_360_url']
+        publishedDate = datetime.fromtimestamp(res['created_time'], None)
+        embedded = embedded_url.format(videoid=res['id'])
 
         results.append({'template': 'videos.html',
                         'url': url,
                         'title': title,
                         'content': content,
+                        'publishedDate': publishedDate,
+                        'embedded': embedded,
                         'thumbnail': thumbnail})
 
     # return results

+ 61 - 0
searx/engines/deezer.py

@@ -0,0 +1,61 @@
+## Deezer (Music)
+#
+# @website     https://deezer.com
+# @provide-api yes (http://developers.deezer.com/api/)
+#
+# @using-api   yes
+# @results     JSON
+# @stable      yes
+# @parse       url, title, content, embedded
+
+from json import loads
+from urllib import urlencode
+
+# engine dependent config
+categories = ['music']
+paging = True
+
+# search-url
+url = 'http://api.deezer.com/'
+search_url = url + 'search?{query}&index={offset}'
+
+embedded_url = '<iframe scrolling="no" frameborder="0" allowTransparency="true" ' +\
+    'data-src="http://www.deezer.com/plugins/player?type=tracks&id={audioid}" ' +\
+    'width="540" height="80"></iframe>'
+
+
+# do search-request
+def request(query, params):
+    offset = (params['pageno'] - 1) * 25
+
+    params['url'] = search_url.format(query=urlencode({'q': query}),
+                                      offset=offset)
+
+    return params
+
+
+# get response from search-request
+def response(resp):
+    results = []
+
+    search_res = loads(resp.text)
+
+    # parse results
+    for result in search_res.get('data', []):
+        if result['type'] == 'track':
+            title = result['title']
+            url = result['link']
+            content = result['artist']['name'] +\
+                " &bull; " +\
+                result['album']['title'] +\
+                " &bull; " + result['title']
+            embedded = embedded_url.format(audioid=result['id'])
+
+            # append result
+            results.append({'url': url,
+                            'title': title,
+                            'embedded': embedded,
+                            'content': content})
+
+    # return results
+    return results
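
Note: deezer.py above — like digg.py, flickr-noapi.py and subtitleseeker.py further down — follows the two-hook contract that searx's engine loader expects: request(query, params) fills in params['url'] (and may set method, headers, data or cookies) and returns params, while response(resp) parses the reply into a list of result dicts. A minimal sketch of that contract, assuming a hypothetical JSON API (the engine name, URL and field names below are illustrative, not from this commit):

# example.py - minimal searx engine sketch (illustrative, not part of this commit)
from json import loads
from urllib import urlencode  # Python 2, as used throughout this codebase

# engine dependent config, validated by searx/engines/__init__.py
categories = ['music']
paging = True

search_url = 'https://api.example.com/search?{query}&page={page}'  # hypothetical API


# do search-request: fill in params['url'] and return params
def request(query, params):
    params['url'] = search_url.format(query=urlencode({'q': query}),
                                      page=params['pageno'])
    return params


# get response from search-request: build result dicts from resp.text
def response(resp):
    results = []
    for item in loads(resp.text).get('results', []):
        results.append({'url': item['url'],
                        'title': item['title'],
                        'content': item.get('description', '')})
    return results

Threading, timeouts and result mixing are handled centrally in searx/search.py, so an engine module only implements these two functions plus its module-level config.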

+ 70 - 0
searx/engines/digg.py

@@ -0,0 +1,70 @@
+## Digg (News, Social media)
+#
+# @website     https://digg.com/
+# @provide-api no
+#
+# @using-api   no
+# @results     HTML (using search portal)
+# @stable      no (HTML can change)
+# @parse       url, title, content, publishedDate, thumbnail
+
+from urllib import quote_plus
+from json import loads
+from lxml import html
+from cgi import escape
+from dateutil import parser
+
+# engine dependent config
+categories = ['news', 'social media']
+paging = True
+
+# search-url
+base_url = 'https://digg.com/'
+search_url = base_url+'api/search/{query}.json?position={position}&format=html'
+
+# specific xpath variables
+results_xpath = '//article'
+link_xpath = './/small[@class="time"]//a'
+title_xpath = './/h2//a//text()'
+content_xpath = './/p//text()'
+pubdate_xpath = './/time'
+
+
+# do search-request
+def request(query, params):
+    offset = (params['pageno'] - 1) * 10
+    params['url'] = search_url.format(position=offset,
+                                      query=quote_plus(query))
+    return params
+
+
+# get response from search-request
+def response(resp):
+    results = []
+
+    search_result = loads(resp.text)
+
+    if search_result['html'] == '':
+        return results
+
+    dom = html.fromstring(search_result['html'])
+
+    # parse results
+    for result in dom.xpath(results_xpath):
+        url = result.attrib.get('data-contenturl')
+        thumbnail = result.xpath('.//img')[0].attrib.get('src')
+        title = ''.join(result.xpath(title_xpath))
+        content = escape(''.join(result.xpath(content_xpath)))
+        pubdate = result.xpath(pubdate_xpath)[0].attrib.get('datetime')
+        publishedDate = parser.parse(pubdate)
+
+        # append result
+        results.append({'url': url,
+                        'title': title,
+                        'content': content,
+                        'template': 'videos.html',
+                        'publishedDate': publishedDate,
+                        'thumbnail': thumbnail})
+
+    # return results
+    return results

+ 1 - 5
searx/engines/duckduckgo_definitions.py

@@ -1,6 +1,7 @@
 import json
 from urllib import urlencode
 from lxml import html
+from searx.utils import html_to_text
 from searx.engines.xpath import extract_text
 
 url = 'https://api.duckduckgo.com/'\
@@ -17,11 +18,6 @@ def result_to_text(url, text, htmlResult):
         return text
 
 
-def html_to_text(htmlFragment):
-    dom = html.fromstring(htmlFragment)
-    return extract_text(dom)
-
-
 def request(query, params):
     # TODO add kl={locale}
     params['url'] = url.format(query=urlencode({'q': query}))

+ 95 - 0
searx/engines/flickr-noapi.py

@@ -0,0 +1,95 @@
+#!/usr/bin/env python
+
+#  Flickr (Images)
+#
+# @website     https://www.flickr.com
+# @provide-api yes (https://secure.flickr.com/services/api/flickr.photos.search.html)
+#
+# @using-api   no
+# @results     HTML
+# @stable      no
+# @parse       url, title, thumbnail, img_src
+
+from urllib import urlencode
+from json import loads
+import re
+
+categories = ['images']
+
+url = 'https://secure.flickr.com/'
+search_url = url+'search/?{query}&page={page}'
+photo_url = 'https://www.flickr.com/photos/{userid}/{photoid}'
+regex = re.compile(r"\"search-photos-models\",\"photos\":(.*}),\"totalItems\":", re.DOTALL)
+image_sizes = ('o', 'k', 'h', 'b', 'c', 'z', 'n', 'm', 't', 'q', 's')
+
+paging = True
+
+
+def build_flickr_url(user_id, photo_id):
+    return photo_url.format(userid=user_id, photoid=photo_id)
+
+
+def request(query, params):
+    params['url'] = search_url.format(query=urlencode({'text': query}),
+                                      page=params['pageno'])
+    return params
+
+
+def response(resp):
+    results = []
+
+    matches = regex.search(resp.text)
+
+    if matches is None:
+        return results
+
+    match = matches.group(1)
+    search_results = loads(match)
+
+    if '_data' not in search_results:
+        return []
+
+    photos = search_results['_data']
+
+    for photo in photos:
+
+        # In paged configuration, the first pages' photos
+        # are represented by a None object
+        if photo is None:
+            continue
+
+        img_src = None
+        # From the biggest to the lowest format
+        for image_size in image_sizes:
+            if image_size in photo['sizes']:
+                img_src = photo['sizes'][image_size]['displayUrl']
+                break
+
+        if not img_src:
+            continue
+
+        if 'id' not in photo['owner']:
+            continue
+
+        url = build_flickr_url(photo['owner']['id'], photo['id'])
+
+        title = photo['title']
+
+        content = '<span class="photo-author">' +\
+                  photo['owner']['username'] +\
+                  '</span><br />'
+
+        if 'description' in photo:
+            content = content +\
+                '<span class="description">' +\
+                photo['description'] +\
+                '</span>'
+
+        # append result
+        results.append({'url': url,
+                        'title': title,
+                        'img_src': img_src,
+                        'content': content,
+                        'template': 'images.html'})
+
+    return results

+ 62 - 29
searx/engines/flickr.py

@@ -1,54 +1,87 @@
 #!/usr/bin/env python
 
+## Flickr (Images)
+#
+# @website     https://www.flickr.com
+# @provide-api yes (https://secure.flickr.com/services/api/flickr.photos.search.html)
+#
+# @using-api   yes
+# @results     JSON
+# @stable      yes
+# @parse       url, title, thumbnail, img_src
+#More info on api-key : https://www.flickr.com/services/apps/create/
+
 from urllib import urlencode
-#from json import loads
-from urlparse import urljoin
-from lxml import html
-from time import time
+from json import loads
 
 categories = ['images']
 
-url = 'https://secure.flickr.com/'
-search_url = url+'search/?{query}&page={page}'
-results_xpath = '//div[@class="view display-item-tile"]/figure/div'
+nb_per_page = 15
+paging = True
+api_key = None
+
+
+url = 'https://api.flickr.com/services/rest/?method=flickr.photos.search' +\
+      '&api_key={api_key}&{text}&sort=relevance' +\
+      '&extras=description%2C+owner_name%2C+url_o%2C+url_z' +\
+      '&per_page={nb_per_page}&format=json&nojsoncallback=1&page={page}'
+photo_url = 'https://www.flickr.com/photos/{userid}/{photoid}'
 
 paging = True
 
 
+def build_flickr_url(user_id, photo_id):
+    return photo_url.format(userid=user_id, photoid=photo_id)
+
+
 def request(query, params):
-    params['url'] = search_url.format(query=urlencode({'text': query}),
-                                      page=params['pageno'])
-    time_string = str(int(time())-3)
-    params['cookies']['BX'] = '3oqjr6d9nmpgl&b=3&s=dh'
-    params['cookies']['xb'] = '421409'
-    params['cookies']['localization'] = 'en-us'
-    params['cookies']['flrbp'] = time_string +\
-        '-3a8cdb85a427a33efda421fbda347b2eaf765a54'
-    params['cookies']['flrbs'] = time_string +\
-        '-ed142ae8765ee62c9ec92a9513665e0ee1ba6776'
-    params['cookies']['flrb'] = '9'
+    params['url'] = url.format(text=urlencode({'text': query}),
+                               api_key=api_key,
+                               nb_per_page=nb_per_page,
+                               page=params['pageno'])
     return params
 
 
 def response(resp):
     results = []
-    dom = html.fromstring(resp.text)
-    for result in dom.xpath(results_xpath):
-        img = result.xpath('.//img')
 
-        if not img:
-            continue
+    search_results = loads(resp.text)
 
-        img = img[0]
-        img_src = 'https:'+img.attrib.get('src')
+    # return empty array if there are no results
+    if not 'photos' in search_results:
+        return []
 
-        if not img_src:
+    if not 'photo' in search_results['photos']:
+        return []
+
+    photos = search_results['photos']['photo']
+
+    # parse results
+    for photo in photos:
+        if 'url_o' in photo:
+            img_src = photo['url_o']
+        elif 'url_z' in photo:
+            img_src = photo['url_z']
+        else:
             continue
 
-        href = urljoin(url, result.xpath('.//a')[0].attrib.get('href'))
-        title = img.attrib.get('alt', '')
-        results.append({'url': href,
+        url = build_flickr_url(photo['owner'], photo['id'])
+
+        title = photo['title']
+
+        content = '<span class="photo-author">' +\
+                  photo['ownername'] +\
+                  '</span><br />' +\
+                  '<span class="description">' +\
+                  photo['description']['_content'] +\
+                  '</span>'
+
+        # append result
+        results.append({'url': url,
                         'title': title,
                         'img_src': img_src,
+                        'content': content,
                         'template': 'images.html'})
+
+    # return results
     return results

+ 3 - 2
searx/engines/kickass.py

@@ -24,7 +24,7 @@ search_url = url + 'search/{search_term}/{pageno}/'
 
 # specific xpath variables
 magnet_xpath = './/a[@title="Torrent magnet link"]'
-#content_xpath = './/font[@class="detDesc"]//text()'
+content_xpath = './/span[@class="font11px lightgrey block"]'
 
 
 # do search-request
@@ -56,7 +56,8 @@ def response(resp):
         link = result.xpath('.//a[@class="cellMainLink"]')[0]
         href = urljoin(url, link.attrib['href'])
         title = ' '.join(link.xpath('.//text()'))
-        content = escape(html.tostring(result.xpath('.//span[@class="font11px lightgrey block"]')[0], method="text"))
+        content = escape(html.tostring(result.xpath(content_xpath)[0],
                                        method="text"))
         seed = result.xpath('.//td[contains(@class, "green")]/text()')[0]
         leech = result.xpath('.//td[contains(@class, "red")]/text()')[0]
 

+ 8 - 4
searx/engines/searchcode_doc.py

@@ -38,10 +38,14 @@ def response(resp):
     for result in search_results['results']:
         href = result['url']
         title = "[" + result['type'] + "] " +\
-                result['namespace'] + " " + result['name']
-        content = '<span class="highlight">[' + result['type'] + "] " +\
-                  result['name'] + " " + result['synopsis'] +\
-                  "</span><br />" + result['description']
+                result['namespace'] +\
+                " " + result['name']
+        content = '<span class="highlight">[' +\
+                  result['type'] + "] " +\
+                  result['name'] + " " +\
+                  result['synopsis'] +\
+                  "</span><br />" +\
+                  result['description']
 
         # append result
         results.append({'url': href,

+ 12 - 2
searx/engines/soundcloud.py

@@ -6,10 +6,11 @@
 # @using-api   yes
 # @results     JSON
 # @stable      yes
-# @parse       url, title, content
+# @parse       url, title, content, publishedDate, embedded
 
 from json import loads
-from urllib import urlencode
+from urllib import urlencode, quote_plus
+from dateutil import parser
 
 # engine dependent config
 categories = ['music']
@@ -27,6 +28,10 @@ search_url = url + 'search?{query}'\
                          '&linked_partitioning=1'\
                          '&client_id={client_id}'   # noqa
 
+embedded_url = '<iframe width="100%" height="166" ' +\
+    'scrolling="no" frameborder="no" ' +\
+    'data-src="https://w.soundcloud.com/player/?url={uri}"></iframe>'
+
 
 # do search-request
 def request(query, params):
@@ -50,10 +55,15 @@ def response(resp):
         if result['kind'] in ('track', 'playlist'):
             title = result['title']
             content = result['description']
+            publishedDate = parser.parse(result['last_modified'])
+            uri = quote_plus(result['uri'])
+            embedded = embedded_url.format(uri=uri)
 
             # append result
             results.append({'url': result['permalink_url'],
                             'title': title,
+                            'publishedDate': publishedDate,
+                            'embedded': embedded,
                             'content': content})
 
     # return results

+ 4 - 1
searx/engines/startpage.py

@@ -66,7 +66,10 @@ def response(resp):
             continue
         link = links[0]
         url = link.attrib.get('href')
-        title = escape(link.text_content())
+        try:
+            title = escape(link.text_content())
+        except UnicodeDecodeError:
+            continue
 
         # block google-ad url's
         if re.match("^http(s|)://www.google.[a-z]+/aclk.*$", url):

+ 78 - 0
searx/engines/subtitleseeker.py

@@ -0,0 +1,78 @@
+## Subtitleseeker (Video)
+#
+# @website     http://www.subtitleseeker.com
+# @provide-api no
+#
+# @using-api   no
+# @results     HTML
+# @stable      no (HTML can change)
+# @parse       url, title, content
+
+from cgi import escape
+from urllib import quote_plus
+from lxml import html
+from searx.languages import language_codes
+
+# engine dependent config
+categories = ['videos']
+paging = True
+language = ""
+
+# search-url
+url = 'http://www.subtitleseeker.com/'
+search_url = url+'search/TITLES/{query}&p={pageno}'
+
+# specific xpath variables
+results_xpath = '//div[@class="boxRows"]'
+
+
+# do search-request
+def request(query, params):
+    params['url'] = search_url.format(query=quote_plus(query),
+                                      pageno=params['pageno'])
+    return params
+
+
+# get response from search-request
+def response(resp):
+    results = []
+
+    dom = html.fromstring(resp.text)
+
+    search_lang = ""
+
+    if resp.search_params['language'] != 'all':
+        search_lang = [lc[1]
+                       for lc in language_codes
+                       if lc[0][:2] == resp.search_params['language']][0]
+
+    # parse results
+    for result in dom.xpath(results_xpath):
+        link = result.xpath(".//a")[0]
+        href = link.attrib.get('href')
+
+        if language is not "":
+            href = href + language + '/'
+        elif search_lang:
+            href = href + search_lang + '/'
+
+        title = escape(link.xpath(".//text()")[0])
+
+        content = result.xpath('.//div[contains(@class,"red")]//text()')[0]
+        content = content + " - "
+        text = result.xpath('.//div[contains(@class,"grey-web")]')[0]
+        content = content + html.tostring(text, method='text')
+
+        if result.xpath(".//span") != []:
+            content = content +\
+                " - (" +\
+                result.xpath(".//span//text()")[0].strip() +\
+                ")"
+
+        # append result
+        results.append({'url': href,
+                        'title': title,
+                        'content': escape(content)})
+
+    # return results
+    return results

+ 19 - 8
searx/engines/twitter.py

@@ -1,6 +1,6 @@
 ## Twitter (Social media)
 #
-# @website     https://www.bing.com/news
+# @website     https://twitter.com/
 # @provide-api yes (https://dev.twitter.com/docs/using-search)
 #
 # @using-api   no
@@ -14,6 +14,7 @@ from urlparse import urljoin
 from urllib import urlencode
 from lxml import html
 from cgi import escape
+from datetime import datetime
 
 # engine dependent config
 categories = ['social media']
@@ -27,7 +28,8 @@ search_url = base_url+'search?'
 results_xpath = '//li[@data-item-type="tweet"]'
 link_xpath = './/small[@class="time"]//a'
 title_xpath = './/span[@class="username js-action-profile-name"]//text()'
-content_xpath = './/p[@class="js-tweet-text tweet-text"]//text()'
+content_xpath = './/p[@class="js-tweet-text tweet-text"]'
+timestamp_xpath = './/span[contains(@class,"_timestamp")]'
 
 
 # do search-request
@@ -52,12 +54,21 @@ def response(resp):
         link = tweet.xpath(link_xpath)[0]
         url = urljoin(base_url, link.attrib.get('href'))
         title = ''.join(tweet.xpath(title_xpath))
-        content = escape(''.join(tweet.xpath(content_xpath)))
-
-        # append result
-        results.append({'url': url,
-                        'title': title,
-                        'content': content})
+        content = escape(html.tostring(tweet.xpath(content_xpath)[0], method='text', encoding='UTF-8').decode("utf-8"))
+        pubdate = tweet.xpath(timestamp_xpath)
+        if len(pubdate) > 0:
+            timestamp = float(pubdate[0].attrib.get('data-time'))
+            publishedDate = datetime.fromtimestamp(timestamp, None)
+            # append result
+            results.append({'url': url,
+                            'title': title,
+                            'content': content,
+                            'publishedDate': publishedDate})
+        else:
+            # append result
+            results.append({'url': url,
+                            'title': title,
+                            'content': content})
 
     # return results
     return results

+ 14 - 12
searx/engines/vimeo.py

@@ -1,4 +1,4 @@
-## Vimeo (Videos)
+#  Vimeo (Videos)
 #
 # @website     https://vimeo.com/
 # @provide-api yes (http://developer.vimeo.com/api),
@@ -7,14 +7,14 @@
 # @using-api   no (TODO, rewrite to api)
 # @results     HTML (using search portal)
 # @stable      no (HTML can change)
-# @parse       url, title, publishedDate,  thumbnail
+# @parse       url, title, publishedDate,  thumbnail, embedded
 #
 # @todo        rewrite to api
 # @todo        set content-parameter with correct data
 
 from urllib import urlencode
-from HTMLParser import HTMLParser
 from lxml import html
+from HTMLParser import HTMLParser
 from searx.engines.xpath import extract_text
 from dateutil import parser
 
@@ -23,26 +23,26 @@ categories = ['videos']
 paging = True
 
 # search-url
-base_url = 'https://vimeo.com'
+base_url = 'http://vimeo.com'
 search_url = base_url + '/search/page:{pageno}?{query}'
 
 # specific xpath variables
+results_xpath = '//div[@id="browse_content"]/ol/li'
 url_xpath = './a/@href'
+title_xpath = './a/div[@class="data"]/p[@class="title"]'
 content_xpath = './a/img/@src'
-title_xpath = './a/div[@class="data"]/p[@class="title"]/text()'
-results_xpath = '//div[@id="browse_content"]/ol/li'
 publishedDate_xpath = './/p[@class="meta"]//attribute::datetime'
 
+embedded_url = '<iframe data-src="//player.vimeo.com/video{videoid}" ' +\
+    'width="540" height="304" frameborder="0" ' +\
+    'webkitallowfullscreen mozallowfullscreen allowfullscreen></iframe>'
+
 
 # do search-request
 def request(query, params):
     params['url'] = search_url.format(pageno=params['pageno'],
                                       query=urlencode({'q': query}))
 
-    # TODO required?
-    params['cookies']['__utma'] =\
-        '00000000.000#0000000.0000000000.0000000000.0000000000.0'
-
     return params
 
 
@@ -51,16 +51,17 @@ def response(resp):
     results = []
 
     dom = html.fromstring(resp.text)
-
     p = HTMLParser()
 
     # parse results
     for result in dom.xpath(results_xpath):
-        url = base_url + result.xpath(url_xpath)[0]
+        videoid = result.xpath(url_xpath)[0]
+        url = base_url + videoid
         title = p.unescape(extract_text(result.xpath(title_xpath)))
         thumbnail = extract_text(result.xpath(content_xpath)[0])
         publishedDate = parser.parse(extract_text(
             result.xpath(publishedDate_xpath)[0]))
+        embedded = embedded_url.format(videoid=videoid)
 
         # append result
         results.append({'url': url,
@@ -68,6 +69,7 @@ def response(resp):
                         'content': '',
                         'template': 'videos.html',
                         'publishedDate': publishedDate,
+                        'embedded': embedded,
                         'thumbnail': thumbnail})
 
     # return results

+ 14 - 0
searx/engines/wikidata.py

@@ -1,6 +1,8 @@
 import json
 from requests import get
 from urllib import urlencode
+import locale
+import dateutil.parser
 
 result_count = 1
 wikidata_host = 'https://www.wikidata.org'
@@ -35,6 +37,16 @@ def response(resp):
     language = resp.search_params['language'].split('_')[0]
     if language == 'all':
         language = 'en'
+
+    try:
+        locale.setlocale(locale.LC_ALL, str(resp.search_params['language']))
+    except:
+        try:
+            locale.setlocale(locale.LC_ALL, 'en_US')
+        except:
+            pass
+        pass
+
     url = url_detail.format(query=urlencode({'ids': '|'.join(wikidata_ids),
                                             'languages': language + '|en'}))
 
@@ -164,10 +176,12 @@ def getDetail(jsonresponse, wikidata_id, language):
 
     date_of_birth = get_time(claims, 'P569', None)
     if date_of_birth is not None:
+        date_of_birth = dateutil.parser.parse(date_of_birth[8:]).strftime(locale.nl_langinfo(locale.D_FMT))
         attributes.append({'label': 'Date of birth', 'value': date_of_birth})
 
     date_of_death = get_time(claims, 'P570', None)
     if date_of_death is not None:
+        date_of_death = dateutil.parser.parse(date_of_death[8:]).strftime(locale.nl_langinfo(locale.D_FMT))
         attributes.append({'label': 'Date of death', 'value': date_of_death})
 
     if len(attributes) == 0 and len(urls) == 2 and len(description) == 0:

+ 11 - 2
searx/engines/youtube.py

@@ -6,7 +6,7 @@
 # @using-api   yes
 # @results     JSON
 # @stable      yes
-# @parse       url, title, content, publishedDate, thumbnail
+# @parse       url, title, content, publishedDate, thumbnail, embedded
 
 from json import loads
 from urllib import urlencode
@@ -19,7 +19,11 @@ language_support = True
 
 # search-url
 base_url = 'https://gdata.youtube.com/feeds/api/videos'
-search_url = base_url + '?alt=json&{query}&start-index={index}&max-results=5'  # noqa
+search_url = base_url + '?alt=json&{query}&start-index={index}&max-results=5'
+
+embedded_url = '<iframe width="540" height="304" ' +\
+    'data-src="//www.youtube-nocookie.com/embed/{videoid}" ' +\
+    'frameborder="0" allowfullscreen></iframe>'
 
 
 # do search-request
@@ -60,6 +64,8 @@ def response(resp):
         if url.endswith('&'):
             url = url[:-1]
 
+        videoid = url[32:]
+
         title = result['title']['$t']
         content = ''
         thumbnail = ''
@@ -72,12 +78,15 @@ def response(resp):
 
         content = result['content']['$t']
 
+        embedded = embedded_url.format(videoid=videoid)
+
         # append result
         results.append({'url': url,
                         'title': title,
                         'content': content,
                         'template': 'videos.html',
                         'publishedDate': publishedDate,
+                        'embedded': embedded,
                         'thumbnail': thumbnail})
 
     # return results

+ 5 - 3
searx/https_rewrite.py

@@ -20,8 +20,11 @@ from urlparse import urlparse
 from lxml import etree
 from os import listdir
 from os.path import isfile, isdir, join
+from searx import logger
 
 
+logger = logger.getChild("https_rewrite")
+
 # https://gitweb.torproject.org/\
 # pde/https-everywhere.git/tree/4.0:/src/chrome/content/rules
 
@@ -131,7 +134,7 @@ def load_single_https_ruleset(filepath):
 def load_https_rules(rules_path):
     # check if directory exists
     if not isdir(rules_path):
-        print("[E] directory not found: '" + rules_path + "'")
+        logger.error("directory not found: '" + rules_path + "'")
         return
 
     # search all xml files which are stored in the https rule directory
@@ -151,8 +154,7 @@ def load_https_rules(rules_path):
         # append ruleset
         https_rules.append(ruleset)
 
-    print(' * {n} https-rules loaded'.format(n=len(https_rules)))
-
+    logger.info('{n} rules loaded'.format(n=len(https_rules)))
 
 
 def https_url_rewrite(result):

+ 16 - 8
searx/search.py

@@ -29,21 +29,23 @@ from searx.engines import (
 from searx.languages import language_codes
 from searx.utils import gen_useragent
 from searx.query import Query
+from searx import logger
 
 
+logger = logger.getChild('search')
+
 number_of_searches = 0
 
 
 def search_request_wrapper(fn, url, engine_name, **kwargs):
     try:
         return fn(url, **kwargs)
-    except Exception, e:
+    except:
         # increase errors stats
         engines[engine_name].stats['errors'] += 1
 
         # print engine name and specific error message
-        print('[E] Error with engine "{0}":\n\t{1}'.format(
-            engine_name, str(e)))
+        logger.exception('engine crash: {0}'.format(engine_name))
         return
 
 
@@ -66,14 +68,19 @@ def threaded_requests(requests):
             remaining_time = max(0.0, timeout_limit - (time() - search_start))
             th.join(remaining_time)
             if th.isAlive():
-                print('engine timeout: {0}'.format(th._engine_name))
-
+                logger.warning('engine timeout: {0}'.format(th._engine_name))
 
 
 # get default reqest parameter
 def default_request_params():
     return {
-        'method': 'GET', 'headers': {}, 'data': {}, 'url': '', 'cookies': {}, 'verify': True}
+        'method': 'GET',
+        'headers': {},
+        'data': {},
+        'url': '',
+        'cookies': {},
+        'verify': True
+    }
 
 
 # create a callback wrapper for the search engine results
@@ -487,14 +494,15 @@ class Search(object):
                 continue
 
             # append request to list
-            requests.append((req, request_params['url'], request_args, selected_engine['name']))
+            requests.append((req, request_params['url'],
+                             request_args,
+                             selected_engine['name']))
 
         if not requests:
            return results, suggestions, answers, infoboxes
         # send all search-request
         threaded_requests(requests)
 
-
         while not results_queue.empty():
             engine_name, engine_results = results_queue.get_nowait()
 

+ 49 - 2
searx/settings.yml

@@ -35,6 +35,10 @@ engines:
     engine : currency_convert
     categories : general
     shortcut : cc
+
+  - name : deezer
+    engine : deezer
+    shortcut : dz
 
   - name : deviantart
     engine : deviantart
@@ -44,6 +48,10 @@ engines:
   - name : ddg definitions
     engine : duckduckgo_definitions
     shortcut : ddd
+
+  - name : digg
+    engine : digg
+    shortcut : dg
 
   - name : wikidata
     engine : wikidata
@@ -70,10 +78,14 @@ engines:
     shortcut : px
 
   - name : flickr
-    engine : flickr
     categories : images
     shortcut : fl
-    timeout: 3.0
+# You can use the engine using the official stable API, but you need an API key
+# See : https://www.flickr.com/services/apps/create/
+#    engine : flickr
+#    api_key: 'apikey' # required!
+# Or you can use the html non-stable engine, activated by default
+    engine : flickr-noapi
 
   - name : general-file
     engine : generalfile
@@ -95,6 +107,33 @@ engines:
     engine : google_news
     shortcut : gon
 
+  - name : google play apps
+    engine        : xpath
+    search_url    : https://play.google.com/store/search?q={query}&c=apps
+    url_xpath     : //a[@class="title"]/@href
+    title_xpath   : //a[@class="title"]
+    content_xpath : //a[@class="subtitle"]
+    categories : files
+    shortcut : gpa
+
+  - name : google play movies
+    engine        : xpath
+    search_url    : https://play.google.com/store/search?q={query}&c=movies
+    url_xpath     : //a[@class="title"]/@href
+    title_xpath   : //a[@class="title"]
+    content_xpath : //a[@class="subtitle"]
+    categories : videos
+    shortcut : gpm
+
+  - name : google play music
+    engine        : xpath
+    search_url    : https://play.google.com/store/search?q={query}&c=music
+    url_xpath     : //a[@class="title"]/@href
+    title_xpath   : //a[@class="title"]
+    content_xpath : //a[@class="subtitle"]
+    categories : music
+    shortcut : gps
+
   - name : openstreetmap
     engine : openstreetmap
     shortcut : osm
@@ -127,6 +166,13 @@ engines:
     engine : searchcode_code
     shortcut : scc
 
+  - name : subtitleseeker
+    engine : subtitleseeker
+    shortcut : ss
+# The language is an option. You can put any language written in english
+# Examples : English, French, German, Hungarian, Chinese...
+#    language : English
+
   - name : startpage
     engine : startpage
     shortcut : sp
@@ -194,3 +240,4 @@ locales:
     it : Italiano
     nl : Nederlands
     ja : 日本語 (Japanese)
+    tr : Türkçe
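
Note: as the comments in the flickr entry above describe, switching from the default scraping engine (flickr-noapi) to the official API is a matter of swapping the engine line and supplying a key — a sketch, where 'apikey' is a placeholder for a key created at https://www.flickr.com/services/apps/create/:

  - name : flickr
    categories : images
    shortcut : fl
    engine : flickr
    api_key : 'apikey'  # placeholder - required by searx/engines/flickr.py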

File diff suppressed because it is too large
+ 0 - 1
searx/static/oscar/js/searx.min.js


+ 0 - 0
searx/static/courgette/css/style.css → searx/static/themes/courgette/css/style.css


+ 0 - 0
searx/static/courgette/img/bg-body-index.jpg → searx/static/themes/courgette/img/bg-body-index.jpg


+ 0 - 0
searx/static/courgette/img/favicon.png → searx/static/themes/courgette/img/favicon.png


+ 0 - 0
searx/static/courgette/img/github_ribbon.png → searx/static/themes/courgette/img/github_ribbon.png


+ 0 - 0
searx/static/courgette/img/icon_dailymotion.ico → searx/static/themes/courgette/img/icons/icon_dailymotion.ico


+ 0 - 0
searx/static/courgette/img/icon_deviantart.ico → searx/static/themes/courgette/img/icons/icon_deviantart.ico


+ 0 - 0
searx/static/courgette/img/icon_github.ico → searx/static/themes/courgette/img/icons/icon_github.ico


+ 0 - 0
searx/static/courgette/img/icon_kickass.ico → searx/static/themes/courgette/img/icons/icon_kickass.ico


+ 0 - 0
searx/static/courgette/img/icon_soundcloud.ico → searx/static/themes/courgette/img/icons/icon_soundcloud.ico


+ 0 - 0
searx/static/courgette/img/icon_stackoverflow.ico → searx/static/themes/courgette/img/icons/icon_stackoverflow.ico


+ 0 - 0
searx/static/courgette/img/icon_twitter.ico → searx/static/themes/courgette/img/icons/icon_twitter.ico


+ 0 - 0
searx/static/courgette/img/icon_vimeo.ico → searx/static/themes/courgette/img/icons/icon_vimeo.ico


+ 0 - 0
searx/static/courgette/img/icon_wikipedia.ico → searx/static/themes/courgette/img/icons/icon_wikipedia.ico


+ 0 - 0
searx/static/courgette/img/icon_youtube.ico → searx/static/themes/courgette/img/icons/icon_youtube.ico


+ 0 - 0
searx/static/courgette/img/preference-icon.png → searx/static/themes/courgette/img/preference-icon.png


+ 0 - 0
searx/static/courgette/img/search-icon.png → searx/static/themes/courgette/img/search-icon.png


+ 0 - 0
searx/static/courgette/img/searx-mobile.png → searx/static/themes/courgette/img/searx-mobile.png


+ 0 - 0
searx/static/courgette/img/searx.png → searx/static/themes/courgette/img/searx.png


+ 0 - 0
searx/static/courgette/img/searx_logo.svg → searx/static/themes/courgette/img/searx_logo.svg


+ 0 - 0
searx/static/courgette/js/mootools-autocompleter-1.1.2-min.js → searx/static/themes/courgette/js/mootools-autocompleter-1.1.2-min.js


+ 0 - 0
searx/static/courgette/js/mootools-core-1.4.5-min.js → searx/static/themes/courgette/js/mootools-core-1.4.5-min.js


+ 0 - 0
searx/static/courgette/js/searx.js → searx/static/themes/courgette/js/searx.js


+ 0 - 0
searx/static/default/css/style.css → searx/static/themes/default/css/style.css


+ 0 - 0
searx/static/default/img/favicon.png → searx/static/themes/default/img/favicon.png


+ 0 - 0
searx/static/default/img/github_ribbon.png → searx/static/themes/default/img/github_ribbon.png


+ 0 - 0
searx/static/default/img/icon_dailymotion.ico → searx/static/themes/default/img/icons/icon_dailymotion.ico


+ 0 - 0
searx/static/default/img/icon_deviantart.ico → searx/static/themes/default/img/icons/icon_deviantart.ico


+ 0 - 0
searx/static/default/img/icon_github.ico → searx/static/themes/default/img/icons/icon_github.ico


+ 0 - 0
searx/static/default/img/icon_kickass.ico → searx/static/themes/default/img/icons/icon_kickass.ico


+ 0 - 0
searx/static/default/img/icon_soundcloud.ico → searx/static/themes/default/img/icons/icon_soundcloud.ico


+ 0 - 0
searx/static/default/img/icon_stackoverflow.ico → searx/static/themes/default/img/icons/icon_stackoverflow.ico


+ 0 - 0
searx/static/default/img/icon_twitter.ico → searx/static/themes/default/img/icons/icon_twitter.ico


+ 0 - 0
searx/static/default/img/icon_vimeo.ico → searx/static/themes/default/img/icons/icon_vimeo.ico


+ 0 - 0
searx/static/default/img/icon_wikipedia.ico → searx/static/themes/default/img/icons/icon_wikipedia.ico


+ 0 - 0
searx/static/default/img/icon_youtube.ico → searx/static/themes/default/img/icons/icon_youtube.ico


+ 0 - 0
searx/static/default/img/preference-icon.png → searx/static/themes/default/img/preference-icon.png


+ 0 - 0
searx/static/default/img/search-icon.png → searx/static/themes/default/img/search-icon.png


+ 0 - 0
searx/static/default/img/searx.png → searx/static/themes/default/img/searx.png


+ 0 - 0
searx/static/default/img/searx_logo.svg → searx/static/themes/default/img/searx_logo.svg


+ 0 - 0
searx/static/default/js/mootools-autocompleter-1.1.2-min.js → searx/static/themes/default/js/mootools-autocompleter-1.1.2-min.js


+ 0 - 0
searx/static/default/js/mootools-core-1.4.5-min.js → searx/static/themes/default/js/mootools-core-1.4.5-min.js


+ 0 - 0
searx/static/default/js/searx.js → searx/static/themes/default/js/searx.js


+ 0 - 0
searx/static/default/less/autocompleter.less → searx/static/themes/default/less/autocompleter.less


+ 0 - 0
searx/static/default/less/code.less → searx/static/themes/default/less/code.less


+ 0 - 0
searx/static/default/less/definitions.less → searx/static/themes/default/less/definitions.less


+ 0 - 0
searx/static/default/less/mixins.less → searx/static/themes/default/less/mixins.less


+ 0 - 0
searx/static/default/less/search.less → searx/static/themes/default/less/search.less


+ 0 - 0
searx/static/default/less/style.less → searx/static/themes/default/less/style.less


+ 0 - 0
searx/static/oscar/.gitignore → searx/static/themes/oscar/.gitignore


+ 2 - 2
searx/static/oscar/README.rst → searx/static/themes/oscar/README.rst

@@ -1,14 +1,14 @@
 install dependencies
 ~~~~~~~~~~~~~~~~~~~~

-run this command in the directory ``searx/static/oscar``
+run this command in the directory ``searx/static/themes/oscar``

 ``npm install``

 compile sources
 ~~~~~~~~~~~~~~~

-run this command in the directory ``searx/static/oscar``
+run this command in the directory ``searx/static/themes/oscar``

 ``grunt``
 
 

+ 0 - 0
searx/static/oscar/css/bootstrap.min.css → searx/static/themes/oscar/css/bootstrap.min.css


+ 0 - 0
searx/static/oscar/css/leaflet.min.css → searx/static/themes/oscar/css/leaflet.min.css


+ 0 - 0
searx/static/oscar/css/oscar.min.css → searx/static/themes/oscar/css/oscar.min.css


+ 0 - 0
searx/static/oscar/fonts/glyphicons-halflings-regular.eot → searx/static/themes/oscar/fonts/glyphicons-halflings-regular.eot


+ 0 - 0
searx/static/oscar/fonts/glyphicons-halflings-regular.svg → searx/static/themes/oscar/fonts/glyphicons-halflings-regular.svg


+ 0 - 0
searx/static/oscar/fonts/glyphicons-halflings-regular.ttf → searx/static/themes/oscar/fonts/glyphicons-halflings-regular.ttf


+ 0 - 0
searx/static/oscar/fonts/glyphicons-halflings-regular.woff → searx/static/themes/oscar/fonts/glyphicons-halflings-regular.woff


+ 0 - 0
searx/static/oscar/gruntfile.js → searx/static/themes/oscar/gruntfile.js


+ 0 - 0
searx/static/oscar/img/favicon.png → searx/static/themes/oscar/img/favicon.png


+ 0 - 0
searx/static/oscar/img/icons/README.md → searx/static/themes/oscar/img/icons/README.md


+ 0 - 0
searx/static/oscar/img/icons/amazon.png → searx/static/themes/oscar/img/icons/amazon.png


+ 0 - 0
searx/static/oscar/img/icons/dailymotion.png → searx/static/themes/oscar/img/icons/dailymotion.png


+ 0 - 0
searx/static/oscar/img/icons/deviantart.png → searx/static/themes/oscar/img/icons/deviantart.png


+ 0 - 0
searx/static/oscar/img/icons/facebook.png → searx/static/themes/oscar/img/icons/facebook.png


+ 0 - 0
searx/static/oscar/img/icons/flickr.png → searx/static/themes/oscar/img/icons/flickr.png


+ 0 - 0
searx/static/oscar/img/icons/github.png → searx/static/themes/oscar/img/icons/github.png


+ 0 - 0
searx/static/oscar/img/icons/kickass.png → searx/static/themes/oscar/img/icons/kickass.png


BIN
searx/static/themes/oscar/img/icons/openstreetmap.png


BIN
searx/static/themes/oscar/img/icons/photon.png


BIN
searx/static/themes/oscar/img/icons/searchcode code.png


BIN
searx/static/themes/oscar/img/icons/searchcode doc.png


+ 0 - 0
searx/static/oscar/img/icons/soundcloud.png → searx/static/themes/oscar/img/icons/soundcloud.png


+ 0 - 0
searx/static/oscar/img/icons/stackoverflow.png → searx/static/themes/oscar/img/icons/stackoverflow.png


Some files were not shown because too many files changed in this diff