==========================
How to protect an instance
==========================

.. _filtron: https://github.com/asciimoo/filtron

Searx depends on external search services. To avoid abuse of these services,
it is advised to limit the number of requests processed by searx.

An application firewall, filtron_, solves exactly this problem. Filtron is just
a middleware between your web server (nginx, apache, ...) and searx.


filtron & go
============

.. _Go: https://golang.org/
.. _filtron README: https://github.com/asciimoo/filtron/blob/master/README.md

.. sidebar:: init system

   At the moment, ``filtron.sh`` supports only the systemd init process used by
   Debian, Ubuntu and many other distributions. If you have a working init.d
   file to start/stop the filtron service, please contribute.

Filtron needs Go_ installed. If Go_ is preinstalled, filtron_ can simply be
installed with the ``go get`` package management (see `filtron README`_). If you
use filtron as middleware, a more isolated setup is recommended.


#. Create a separate user account (``filtron``).

#. Download and install the Go_ binary in the user's ``$HOME`` (``~filtron``).

#. Install filtron with the package management of Go_ (``go get -v -u
   github.com/asciimoo/filtron``).

#. Set up a proper rule configuration :origin:`[ref]
   <utils/templates/etc/filtron/rules.json>` (``/etc/filtron/rules.json``).

#. Set up a systemd service unit :origin:`[ref]
   <utils/templates/lib/systemd/system/filtron.service>`
   (``/lib/systemd/system/filtron.service``).


To simplify the installation and maintenance of such a setup, use our script
``utils/filtron.sh``:

.. program-output:: ../utils/filtron.sh --help
   :ellipsis: 0,5


Sample configuration of filtron
===============================

An example configuration can be found below. This configuration limits access
by:

- scripts or applications (roboagent limit)
- webcrawlers (botlimit)
- IPs which send too many requests (IP limit)
- too many json, csv, etc. requests (rss/json limit)
- the same User-Agent if it sends too many requests (useragent limit)


.. code:: json

   [{
       "name":"search request",
       "filters":[
           "Param:q",
           "Path=^(/|/search)$"
       ],
       "interval":"<time-interval-in-sec (int)>",
       "limit":"<max-request-number-in-interval (int)>",
       "subrules":[
           {
               "name":"roboagent limit",
               "interval":"<time-interval-in-sec (int)>",
               "limit":"<max-request-number-in-interval (int)>",
               "filters":[
                   "Header:User-Agent=(curl|cURL|Wget|python-requests|Scrapy|FeedFetcher|Go-http-client)"
               ],
               "actions":[
                   {
                       "name":"block",
                       "params":{
                           "message":"Rate limit exceeded"
                       }
                   }
               ]
           },
           {
               "name":"botlimit",
               "limit":0,
               "stop":true,
               "filters":[
                   "Header:User-Agent=(Googlebot|bingbot|Baiduspider|yacybot|YandexMobileBot|YandexBot|Yahoo! Slurp|MJ12bot|AhrefsBot|archive.org_bot|msnbot|MJ12bot|SeznamBot|linkdexbot|Netvibes|SMTBot|zgrab|James BOT)"
               ],
               "actions":[
                   {
                       "name":"block",
                       "params":{
                           "message":"Rate limit exceeded"
                       }
                   }
               ]
           },
           {
               "name":"IP limit",
               "interval":"<time-interval-in-sec (int)>",
               "limit":"<max-request-number-in-interval (int)>",
               "stop":true,
               "aggregations":[
                   "Header:X-Forwarded-For"
               ],
               "actions":[
                   {
                       "name":"block",
                       "params":{
                           "message":"Rate limit exceeded"
                       }
                   }
               ]
           },
           {
               "name":"rss/json limit",
               "interval":"<time-interval-in-sec (int)>",
               "limit":"<max-request-number-in-interval (int)>",
               "stop":true,
               "filters":[
                   "Param:format=(csv|json|rss)"
               ],
               "actions":[
                   {
                       "name":"block",
                       "params":{
                           "message":"Rate limit exceeded"
                       }
                   }
               ]
           },
           {
               "name":"useragent limit",
               "interval":"<time-interval-in-sec (int)>",
               "limit":"<max-request-number-in-interval (int)>",
               "aggregations":[
                   "Header:User-Agent"
               ],
               "actions":[
                   {
                       "name":"block",
                       "params":{
                           "message":"Rate limit exceeded"
                       }
                   }
               ]
           }
       ]
   }]

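
The ``interval`` and ``limit`` values in the sample are placeholders marked
``(int)`` and have to be replaced by real integers before the file is used. The
following is a minimal sketch of a sanity check for a filled-in rule file, not
part of filtron or searx. The helper name ``find_problems`` is hypothetical, and
the regular expressions are compiled with Python's ``re`` module, which is only
an approximation of the Go regexp syntax filtron itself uses.

.. code:: python

   import json
   import re


   def find_problems(rules, path="rules"):
       """Return a list of human-readable problems found in a rule list."""
       problems = []
       for i, rule in enumerate(rules):
           where = f"{path}[{i}] ({rule.get('name', 'unnamed')})"
           for key in ("interval", "limit"):
               value = rule.get(key)
               if value is None:
                   continue
               # bool is a subclass of int in Python, so exclude it explicitly
               if isinstance(value, bool) or not isinstance(value, int):
                   problems.append(
                       f"{where}: {key!r} should be an integer, got {value!r}"
                   )
           for flt in rule.get("filters", []):
               # a filter looks like "Selector" or "Selector=regex"
               _, sep, pattern = flt.partition("=")
               if sep:
                   try:
                       re.compile(pattern)
                   except re.error as exc:
                       problems.append(f"{where}: bad regex in {flt!r}: {exc}")
           # subrules use the same layout, so check them recursively
           problems.extend(
               find_problems(rule.get("subrules", []), where + ".subrules")
           )
       return problems


   def check_file(filename):
       """Load a JSON rule file and return the problems found in it."""
       with open(filename, encoding="utf-8") as handle:
           return find_problems(json.load(handle))

Calling ``check_file("/etc/filtron/rules.json")`` on an unedited copy of the
sample would report every ``interval`` and ``limit`` that still holds a
placeholder string; an empty list means no problem of this kind was found.
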

Route request through filtron
=============================

Filtron can be started using the following command:

.. code:: sh

   $ filtron -rules rules.json

It listens on ``127.0.0.1:4004`` and forwards filtered requests to
``127.0.0.1:8888`` by default.

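
To verify that both ends of this chain are up, a small probe can try to open a
TCP connection to each port. This is only a sketch under the assumption that
the default addresses above are unchanged; the function name ``is_listening``
is made up for this example and the check says nothing about whether the
service behind the port is actually filtron or searx.

.. code:: python

   import socket


   def is_listening(host, port, timeout=1.0):
       """Return True if a TCP connection to (host, port) can be opened."""
       try:
           with socket.create_connection((host, port), timeout=timeout):
               return True
       except OSError:
           # connection refused, timeout, unreachable host, ...
           return False


   if __name__ == "__main__":
       # default ports taken from the text above
       for label, port in (("filtron", 4004), ("searx", 8888)):
           state = "listening" if is_listening("127.0.0.1", port) else "not reachable"
           print(f"{label} on 127.0.0.1:{port}: {state}")
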

Use it along with ``nginx`` with the following example configuration.

.. code:: nginx

   location / {
       proxy_set_header Host $http_host;
       proxy_set_header X-Real-IP $remote_addr;
       proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
       proxy_set_header X-Scheme $scheme;
       proxy_pass http://127.0.0.1:4004/;
   }


Requests arrive at port 4004, pass through filtron and are then forwarded to
port 8888, where searx is running.
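
One way to observe the filtering is to send a few search requests with a
script-like ``User-Agent`` to the filtron port and look at the status codes
that come back. The sketch below assumes filtron listens on the default
``127.0.0.1:4004`` and that a blocked request is answered with an HTTP error
status; ``build_search_request`` and ``probe`` are hypothetical helper names,
and the exact status code and message depend on your rules.

.. code:: python

   import urllib.error
   import urllib.parse
   import urllib.request


   def build_search_request(base_url, query, user_agent):
       """Build a GET request for /search?q=<query> with a chosen User-Agent."""
       url = base_url.rstrip("/") + "/search?" + urllib.parse.urlencode({"q": query})
       return urllib.request.Request(url, headers={"User-Agent": user_agent})


   def probe(base_url, query="test", user_agent="curl/7.0", count=5):
       """Send ``count`` requests and return the list of HTTP status codes."""
       statuses = []
       for _ in range(count):
           request = build_search_request(base_url, query, user_agent)
           try:
               with urllib.request.urlopen(request, timeout=5) as response:
                   statuses.append(response.status)
           except urllib.error.HTTPError as exc:
               # an error status (for example from a "block" action) ends up here
               statuses.append(exc.code)
       return statuses

Running ``probe("http://127.0.0.1:4004")`` against a live setup returns one
status code per request, so a switch from success codes to error codes shows
the point at which a limit was hit.
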