self_hosted_search_engine

Differences

This shows you the differences between two versions of the page.

self_hosted_search_engine [2016/05/31 21:55]
sgripon [Self hosted search engine]
self_hosted_search_engine [2016/11/06 12:18]
sgripon [Elasticsearch]
Line 55: Line 55:
 After a few seconds, you can check elasticsearch is running by accessing http://localhost:9200:
 
-<code>
+<code javascript>
 {
   "status" : 200,
Line 131: Line 131:
 Here //index_name// and //type// are the default values for Nutch, which we will use later.
  
-Copy //calaca/_site// directory content to apache root directory (e.g. ///var/www/html//.
+Copy //calaca/_site// directory content to apache root directory (e.g. ///var/www/html//).
 + 
 +It may also be necessary to customize the Calaca //index.html// to support the types from Nutch. To do that, modify the article in the section with class "results":
 + 
 +<file html index.html>
 +...
 +<article class='result' ng-repeat='result in results track by $id(result)'>
 +   <h2><a href="{{result.url}}">{{result.title || result.url}}</a></h2>
 +   <p>{{result.content}}</p>
 +</article>
 +...
 +</file>
  
 At this point, you have a working self-hosted search engine at http://localhost/. It is now time to feed it with data.
Line 233: Line 244:
   ​   ​
 That's it, you now have your own self-hosted search engine! Just add the crawl command to a cron job to regularly refresh the pages in the index.
 +
 +===== Next Steps =====
 +
 +  * Use [[https://www.elastic.co/products/kibana|Kibana]] to build KPIs on the data indexed in Elasticsearch
 +  * Index databases (for example MySQL)
 +  * Index data from a REST web API (for example Redmine issues)
 +  * ...
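As a rough illustration of the REST indexing idea in the list above, here is a minimal Python sketch that pushes one record into Elasticsearch over HTTP. It assumes an Elasticsearch version that still accepts the ''/index/type/id'' document URL, and the base URL, index name, type name and issue fields (''id'', ''url'', ''subject'', ''description'') are all hypothetical placeholders, not values from this tutorial. Only the target fields ''url'', ''title'' and ''content'' come from the Calaca template shown earlier.

```python
import json
import urllib.request

# Assumption: Elasticsearch listens on the default local port.
ES_URL = "http://localhost:9200"


def build_index_request(index, doc_type, doc_id, document, base_url=ES_URL):
    """Build a PUT request that stores one JSON document in Elasticsearch."""
    url = "{}/{}/{}/{}".format(base_url.rstrip("/"), index, doc_type, doc_id)
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def issue_to_document(issue):
    """Map a hypothetical issue record to the url/title/content fields
    that the Calaca result template displays."""
    return {
        "url": issue["url"],
        "title": issue.get("subject", ""),
        "content": issue.get("description", ""),
    }


def index_issue(index, doc_type, issue):
    """Send one issue to Elasticsearch and return the parsed JSON reply."""
    request = build_index_request(
        index, doc_type, issue["id"], issue_to_document(issue)
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling ''index_issue("my_index", "my_type", issue)'' for each record fetched from the REST API would make it searchable next to the crawled pages, provided the index and type match the ones Calaca is configured to query.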
   ​   ​
 **Share this page:**
 ~~socialite~~
- 
 ===== Like this tutorial ? =====
  
-See also [[:a_development_chain_to_build_reliable_software|A development chain to build reliable software]].
+See also[[:a_development_chain_to_build_reliable_software|A development chain to build reliable software]].
  
self_hosted_search_engine.txt · Last modified: 2016/11/06 12:18 by sgripon