Browse Source

[doc] engines overview added

Adam Tauber 9 years ago
parent
commit
11d6d80d88
2 changed files with 315 additions and 0 deletions
  1. 314 0
      docs/dev/engine_overview.rst
  2. 1 0
      docs/index.rst

+ 314 - 0
docs/dev/engine_overview.rst

@@ -0,0 +1,314 @@
+Engine overview
+===============
+
+
+searx is a `metasearch-engine <https://en.wikipedia.org/wiki/Metasearch_engine>`__,
+so it is using different search engines to provide better results.
+
+Because there is no general search-api which can be used for every
+search-engine, there must be build an adapter between searx and the
+external search-engine. This adapters are stored in the folder
+`*searx/engines* <https://github.com/asciimoo/searx/tree/master/searx/engines>`__,
+and this site is build to make an general documentation about this
+engines
+
+
+.. contents::
+   :depth: 3
+
+general engine configuration
+----------------------------
+
+It is required to tell searx what results can the engine provide. The
+arguments can be inserted in the engine file, or in the settings file
+(normally ``settings.yml``). The arguments in the settings file override
+the one in the engine file.
+
+Really, it is for most options no difference if there are contained in
+the engine-file or in the settings. But there is a standard where to
+place specific arguments by default.
+
+
+engine-file
+~~~~~~~~~~~
+
++---------------------+-----------+-----------------------------------------+
+| argument            | type      | information                             |
++=====================+===========+=========================================+
+| categories          | list      | pages, in which the engine is working   |
++---------------------+-----------+-----------------------------------------+
+| paging              | boolean   | support multible pages                  |
++---------------------+-----------+-----------------------------------------+
+| language\_support   | boolean   | support language choosing               |
++---------------------+-----------+-----------------------------------------+
+
+settings.yml
+~~~~~~~~~~~~
+
++------------+----------+-----------------------------------------------+
+| argument   | type     | information                                   |
++============+==========+===============================================+
+| name       | string   | name of search-engine                         |
++------------+----------+-----------------------------------------------+
+| engine     | string   | name of searx-engine (filename without .py)   |
++------------+----------+-----------------------------------------------+
+| shortcut   | string   | shortcut of search-engine                     |
++------------+----------+-----------------------------------------------+
+| timeout    | string   | specific timeout for search-engine            |
++------------+----------+-----------------------------------------------+
+
+overrides
+~~~~~~~~~
+
+There are some options, with have default values in the engine, but are
+often overwritten by the settings. If the option is assigned in the
+engine-file with ``None`` it has to be redefined in the settings,
+otherwise searx is not starting with that engine.
+
+The naming of that overrides can be wathever you want. But we recommend
+the using of already used overrides if possible:
+
++-----------------------+----------+--------------------------------------------------------------+
+| argument              | type     | information                                                  |
++=======================+==========+==============================================================+
+| base\_url             | string   | base-url, can be overwrite to use same engine on other url   |
++-----------------------+----------+--------------------------------------------------------------+
+| number\_of\_results   | int      | maximum number of results per request                        |
++-----------------------+----------+--------------------------------------------------------------+
+| language              | string   | ISO code of language and country like en\_US                 |
++-----------------------+----------+--------------------------------------------------------------+
+| api\_key              | string   | api-key if required by engine                                |
++-----------------------+----------+--------------------------------------------------------------+
+
+example-code
+~~~~~~~~~~~~
+
+.. code:: python
+
+    # engine dependent config
+    categories = ['general']
+    paging = True
+    language_support = True
+
+doing request
+-------------
+
+To perform a search you have to specific at least a url on which the
+request is performing
+
+passed arguments
+~~~~~~~~~~~~~~~~
+
+This arguments can be used to calculate the search-query. Furthermore,
+some of that parameters are filled with default values which can be
+changed for special purpose.
+
++----------------------+------------+------------------------------------------------------------------------+
+| argument             | type       | default-value, informations                                            |
++======================+============+========================================================================+
+| url                  | string     | ``''``                                                                 |
++----------------------+------------+------------------------------------------------------------------------+
+| method               | string     | ``'GET'``                                                              |
++----------------------+------------+------------------------------------------------------------------------+
+| headers              | set        | ``{}``                                                                 |
++----------------------+------------+------------------------------------------------------------------------+
+| data                 | set        | ``{}``                                                                 |
++----------------------+------------+------------------------------------------------------------------------+
+| cookies              | set        | ``{}``                                                                 |
++----------------------+------------+------------------------------------------------------------------------+
+| verify               | boolean    | ``True``                                                               |
++----------------------+------------+------------------------------------------------------------------------+
+| headers.User-Agent   | string     | a random User-Agent                                                    |
++----------------------+------------+------------------------------------------------------------------------+
+| category             | string     | current category, like ``'general'``                                   |
++----------------------+------------+------------------------------------------------------------------------+
+| started              | datetime   | current date-time                                                      |
++----------------------+------------+------------------------------------------------------------------------+
+| pageno               | int        | current pagenumber                                                     |
++----------------------+------------+------------------------------------------------------------------------+
+| language             | string     | specific language code like ``'en_US'``, or ``'all'`` if unspecified   |
++----------------------+------------+------------------------------------------------------------------------+
+
+parsed arguments
+~~~~~~~~~~~~~~~~
+
+The function ``def request(query, params):`` is always returning the
+``params`` variable back. Inside searx, the following paramters can be
+used to specific a search-request:
+
++------------+-----------+----------------------------------------------------------+
+| argument   | type      | information                                              |
++============+===========+==========================================================+
+| url        | string    | requested url                                            |
++------------+-----------+----------------------------------------------------------+
+| method     | string    | HTTP request methode                                     |
++------------+-----------+----------------------------------------------------------+
+| headers    | set       | HTTP header informations                                 |
++------------+-----------+----------------------------------------------------------+
+| data       | set       | HTTP data informations (parsed if ``method != 'GET'``)   |
++------------+-----------+----------------------------------------------------------+
+| cookies    | set       | HTTP cookies                                             |
++------------+-----------+----------------------------------------------------------+
+| verify     | boolean   | Performing SSL-Validity check                            |
++------------+-----------+----------------------------------------------------------+
+
+example-code
+~~~~~~~~~~~~
+
+.. code:: python
+
+    # search-url
+    base_url = 'https://example.com/'
+    search_string = 'search?{query}&page={page}'
+
+    # do search-request
+    def request(query, params):
+        search_path = search_string.format(
+            query=urlencode({'q': query}),
+            page=params['pageno'])
+
+        params['url'] = base_url + search_path
+
+        return params
+
+returning results
+-----------------
+
+Searx has the possiblity to return results in different media-types.
+Currently the following media-types are supported:
+
+-  default
+-  images
+-  videos
+-  torrent
+-  map
+
+to set another media-type as default, you must set the parameter
+``template`` to the required type.
+
+default
+~~~~~~~
+
++--------------------+------------------------------------------------------------------------------------------------------------------------------------+
+| result-parameter   | information                                                                                                                        |
++====================+====================================================================================================================================+
+| url                | string, which is representing the url of the result                                                                                |
++--------------------+------------------------------------------------------------------------------------------------------------------------------------+
+| title              | string, which is representing the title of the result                                                                              |
++--------------------+------------------------------------------------------------------------------------------------------------------------------------+
+| content            | string, which is giving a general result-text                                                                                      |
++--------------------+------------------------------------------------------------------------------------------------------------------------------------+
+| publishedDate      | `datetime.datetime <https://docs.python.org/2/library/datetime.html#datetime-objects>`__, represent when the result is published   |
++--------------------+------------------------------------------------------------------------------------------------------------------------------------+
+
+images
+~~~~~~
+
+to use this template, the parameter
+
++--------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------+
+| result-parameter   | information                                                                                                                                               |
++====================+===========================================================================================================================================================+
+| template           | is set to ``images.html``                                                                                                                                 |
++--------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------+
+| url                | string, which is representing the url to the result site                                                                                                  |
++--------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------+
+| title              | string, which is representing the title of the result *(partly implemented)*                                                                              |
++--------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------+
+| content            | *(partly implemented)*                                                                                                                                    |
++--------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------+
+| publishedDate      | `datetime.datetime <https://docs.python.org/2/library/datetime.html#datetime-objects>`__, represent when the result is published *(partly implemented)*   |
++--------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------+
+| img\_src           | string, which is representing the url to the result image                                                                                                 |
++--------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------+
+| thumbnail\_src     | string, which is representing the url to a small-preview image                                                                                            |
++--------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------+
+
+videos
+~~~~~~
+
++--------------------+------------------------------------------------------------------------------------------------------------------------------------+
+| result-parameter   | information                                                                                                                        |
++====================+====================================================================================================================================+
+| template           | is set to ``videos.html``                                                                                                          |
++--------------------+------------------------------------------------------------------------------------------------------------------------------------+
+| url                | string, which is representing the url of the result                                                                                |
++--------------------+------------------------------------------------------------------------------------------------------------------------------------+
+| title              | string, which is representing the title of the result                                                                              |
++--------------------+------------------------------------------------------------------------------------------------------------------------------------+
+| content            | *(not implemented yet)*                                                                                                            |
++--------------------+------------------------------------------------------------------------------------------------------------------------------------+
+| publishedDate      | `datetime.datetime <https://docs.python.org/2/library/datetime.html#datetime-objects>`__, represent when the result is published   |
++--------------------+------------------------------------------------------------------------------------------------------------------------------------+
+| thumbnail          | string, which is representing the url to a small-preview image                                                                     |
++--------------------+------------------------------------------------------------------------------------------------------------------------------------+
+
+torrent
+~~~~~~~
+
++------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------+
+| result-parameter | information                                                                                                                                           |
++------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------+
+| template         | is set to ```torrent.html```                                                                                                                          |
++------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------+
+| url              | string, which is representing the url of the result                                                                                                   |
++------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------+
+| title            | string, which is representing the title of the result                                                                                                 |
++------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------+
+| content          | string, which is giving a general result-text                                                                                                         |
++------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------+
+| publishedDate    | [datetime.datetime](https://docs.python.org/2/library/datetime.html#datetime-objects), represent when the result is published _(not implemented yet)_ |
++------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------+
+| seed             | int, number of seeder                                                                                                                                 |
++------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------+
+| leech            | int, number of leecher                                                                                                                                |
++------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------+
+| filesize         | int, size of file in bytes                                                                                                                            |
++------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------+
+| files            | int, number of files                                                                                                                                  |
++------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------+
+| magnetlink       | string, which is the [magnetlink](https://en.wikipedia.org/wiki/Magnet_URI_scheme) of the result                                                      |
++------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------+
+| torrentfile      | string, which is the torrentfile of the result                                                                                                        |
++------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------+
+
+
+map
+~~~
+
++-------------------------+------------------------------------------------------------------------------------------------------------------------------------+
+| result-parameter        | information                                                                                                                        |
++=========================+====================================================================================================================================+
+| url                     | string, which is representing the url of the result                                                                                |
++-------------------------+------------------------------------------------------------------------------------------------------------------------------------+
+| title                   | string, which is representing the title of the result                                                                              |
++-------------------------+------------------------------------------------------------------------------------------------------------------------------------+
+| content                 | string, which is giving a general result-text                                                                                      |
++-------------------------+------------------------------------------------------------------------------------------------------------------------------------+
+| publishedDate           | `datetime.datetime <https://docs.python.org/2/library/datetime.html#datetime-objects>`__, represent when the result is published   |
++-------------------------+------------------------------------------------------------------------------------------------------------------------------------+
+| latitude                | latitude of result (in decimal format)                                                                                             |
++-------------------------+------------------------------------------------------------------------------------------------------------------------------------+
+| longitude               | longitude of result (in decimal format)                                                                                            |
++-------------------------+------------------------------------------------------------------------------------------------------------------------------------+
+| boundingbox             | boundingbox of result (array of 4. values ``[lat-min, lat-max, lon-min, lon-max]``)                                                |
++-------------------------+------------------------------------------------------------------------------------------------------------------------------------+
+| geojson                 | geojson of result (http://geojson.org)                                                                                             |
++-------------------------+------------------------------------------------------------------------------------------------------------------------------------+
+| osm.type                | type of osm-object (if OSM-Result)                                                                                                 |
++-------------------------+------------------------------------------------------------------------------------------------------------------------------------+
+| osm.id                  | id of osm-object (if OSM-Result)                                                                                                   |
++-------------------------+------------------------------------------------------------------------------------------------------------------------------------+
+| address.name            | name of object                                                                                                                     |
++-------------------------+------------------------------------------------------------------------------------------------------------------------------------+
+| address.road            | street adress of object                                                                                                            |
++-------------------------+------------------------------------------------------------------------------------------------------------------------------------+
+| address.house\_number   | house number of object                                                                                                             |
++-------------------------+------------------------------------------------------------------------------------------------------------------------------------+
+| address.locality        | city, place of object                                                                                                              |
++-------------------------+------------------------------------------------------------------------------------------------------------------------------------+
+| address.postcode        | postcode of object                                                                                                                 |
++-------------------------+------------------------------------------------------------------------------------------------------------------------------------+
+| address.country         | country of object                                                                                                                  |
++-------------------------+------------------------------------------------------------------------------------------------------------------------------------+
+

+ 1 - 0
docs/index.rst

@@ -34,6 +34,7 @@ Developer documentation
 
 
    dev/contribution_guide
    dev/contribution_guide
    dev/install/installation
    dev/install/installation
+   dev/engine_overview
    dev/search_api
    dev/search_api
    dev/plugins
    dev/plugins
    dev/translation
    dev/translation