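Full-text indexing
    Builds an index over the entire content; each full text is called a Document.
Search engine
    Roughly equivalent to a fuzzy query in a database:
    select * from t where field like '%小美%'

Nutch (web crawler, by Doug Cutting)
Lucene (full-text indexing, search engine)              ----- comparable to Servlet
    ---- Solr (full-text indexing, search engine)       ----- comparable to Struts2
        ---- SolrCloud (4.0): distributed and cluster support only arrive here
    ---- Compass (a wrapper around Lucene)              ----- comparable to Spring MVC
        ---- Elasticsearch (distributed, handles massive data volumes)
-----------------------------------------------------------------------
Elasticsearch configuration
    Note: from 2.x onward, Elasticsearch can only be installed and run under a regular (non-root) Linux user.
    Unpack:
        ~]$ unzip soft/elasticsearch-2.3.0.zip
    Rename the extracted directory
    Configure:
        Edit the following settings in elasticsearch.yml:
            network.host: uplooking01
            cluster.name: bigdata
            node.name: hadoop
            path.data: /home/uplooking/data/elasticsearch
            path.logs: /home/uplooking/logs/elasticsearch
    Verify:
        With curl in a Linux terminal:
            curl -XGET http://uplooking01:9200
        In a browser:
            http://uplooking01:9200
        {
          "name" : "hadoop",
          "cluster_name" : "bigdata",
          "version" : {
            "number" : "2.3.0",
            "build_hash" : "8371be8d5fe5df7fb9c0516c474d77b9feddd888",
            "build_timestamp" : "2016-03-29T07:54:48Z",
            "build_snapshot" : false,
            "lucene_version" : "5.5.0"
          },
          "tagline" : "You Know, for Search"
        }
        If this response comes back, installation and configuration succeeded.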
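The verification step above can be scripted. The following is a minimal sketch, not part of the original notes: the helper names es_version and verify_node are hypothetical, and it assumes the node listens on port 9200 as in the curl command above.

```shell
# Extract the "number" field (the Elasticsearch version) from the JSON
# returned by the root endpoint. Reads the JSON on stdin.
es_version() {
  sed -n 's/.*"number"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Hypothetical helper: ask a node for its version and compare it with the
# version we expect. $1 = host name, $2 = expected version.
verify_node() {
  host="$1"
  expected="$2"
  actual=$(curl -s -XGET "http://$host:9200" | es_version)
  if [ "$actual" = "$expected" ]; then
    echo "OK: $host runs Elasticsearch $actual"
  else
    echo "FAIL: expected $expected, got '${actual:-no response}'"
    return 1
  fi
}
```

For the node configured above, this would be called as: verify_node uplooking01 2.3.0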
Create an index
    curl -XPUT uplooking01:9200/bigdata
Index a document
    curl -XPOST uplooking01:9200/bigdata/product/1 -d '{"name":"hadoop", "author":"Doug Couting", "lastest_version":"3.0.0"}'
Query index information
    curl -XGET uplooking01:9200/bigdata                       ==> get the settings of this index
    curl -XGET uplooking01:9200/bigdata/_search               ==> query all documents in the index
    curl -XGET 'uplooking01:9200/bigdata/_search?pretty'      ==> pretty-print the query result
    curl -XGET 'uplooking01:9200/bigdata/product/1?pretty'    ==> fetch exactly the document whose id is 1

    curl -XPOST uplooking01:9200/bigdata/product/2 -d '{"name":"hive", "author":"facebook", "lastest_version":"2.1.0"}'
    curl -XPOST uplooking01:9200/bigdata/product/3 -d '{"name":"hbase", "author":"apache", "lastest_version":"1.1.5"}'
Update
    POST is recommended ==> it updates one specific document
    ==> if the document does not exist it is created; if it exists, its content is overwritten
    curl -XPOST uplooking01:9200/bigdata/product/2 -d '{"name":"hive", "author":"facebook", "lastest_version":"1.2.1"}'
Delete
    curl -XPOST uplooking01:9200/bigdata/product/4 -d '{"name":"storm", "author":"apache", "lastest_version":"1.0.2"}'
    curl -XDELETE uplooking01:9200/bigdata/product/4          ---> delete the document with the given id
Advanced queries
    Fetch only the _source content of one document
        curl -XGET 'uplooking01:9200/bigdata/product/1/_source?pretty'
        {
          "name" : "hadoop",
          "author" : "Doug Couting",
          "lastest_version" : "3.0.0"
        }
    Conditional query: name is hadoop or hbase
        curl -XGET 'uplooking01:9200/bigdata/product/_search?q=name:hadoop,hbase&pretty'
        {
          "took" : 20,
          "timed_out" : false,
          "_shards" : {
            "total" : 5,
            "successful" : 5,
            "failed" : 0
          },
          "hits" : {
            "total" : 2,
            "max_score" : 0.04500804,
            "hits" : [ {
              "_index" : "bigdata",
              "_type" : "product",
              "_id" : "1",
              "_score" : 0.04500804,
              "_source" : {
                "name" : "hadoop",
                "author" : "Doug Couting",
                "lastest_version" : "3.0.0"
              }
            }, {
              "_index" : "bigdata",
              "_type" : "product",
              "_id" : "3",
              "_score" : 0.04500804,
              "_source" : {
                "name" : "hbase",
                "author" : "apache",
                "lastest_version" : "1.1.5"
              }
            } ]
          }
        }
    Query selected fields only
        curl -XGET 'uplooking01:9200/bigdata/product/3?_source=name,author&pretty'
        {
          "_index" : "bigdata",
          "_type" : "product",
          "_id" : "3",
          "_version" : 1,
          "found" : true,
          "_source" : {
            "author" : "apache",
            "name" : "hbase"
          }
        }
    curl 'http://uplooking01:9200/_cat/indices?v'             ---> list all indices
Bulk operations -- bulk

Note: Elasticsearch health status
    Many statistics can be monitored in an Elasticsearch cluster, but the single most important one is cluster health.
    Elasticsearch reports it as one of three colors: green, yellow, red.
        Green: all primary shards and all replica shards are available
        Yellow: all primary shards are available, but not all replica shards are
        Red: not all primary shards are available
Installing a cluster:
    Note: do not copy the data files from the original node (path.data: /home/uplooking/data/elasticsearch) during installation!
    Cluster configuration file: entries in [] are different on every machine
        cluster.name: bigdata
        [node.name: hadoop]
        path.data: /home/uplooking/data/elasticsearch
        path.logs: /home/uplooking/logs/elasticsearch
        [network.host: uplooking01]
        discovery.zen.ping.unicast.hosts: ["uplooking01", "uplooking02", "uplooking03"]
        discovery.zen.ping.multicast.enabled: false
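Once the nodes are configured, the green/yellow/red status described above can be read from the _cluster/health endpoint. The sketch below is an illustration under assumptions: health_status, describe_health and check_cluster are hypothetical helper names, and the default host uplooking01:9200 is taken from the examples in these notes.

```shell
# Extract the "status" field from a _cluster/health JSON response on stdin.
health_status() {
  sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p'
}

# Map a status color to the meaning given in the notes.
describe_health() {
  case "$1" in
    green)  echo "all primary and replica shards are available" ;;
    yellow) echo "all primary shards are available, some replicas are not" ;;
    red)    echo "not all primary shards are available" ;;
    *)      echo "unknown status: $1" ;;
  esac
}

# Hypothetical helper: query one node and print the cluster status.
# $1 = host:port (defaults to the first node used in these notes).
check_cluster() {
  host="${1:-uplooking01:9200}"
  status=$(curl -s -XGET "http://$host/_cluster/health" | health_status)
  echo "$status: $(describe_health "$status")"
}
```

Calling check_cluster uplooking02:9200 queries a different node; any node of the same cluster should report the same status.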
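The notes list bulk operations without an example. A minimal sketch of a bulk request follows; the documents with ids 5 and 6 are made-up sample data, and make_bulk_body and post_bulk are hypothetical helper names. The _bulk endpoint expects one action line followed by one source line per document, each terminated by a newline, which is why the body is sent with --data-binary rather than -d.

```shell
# Build a bulk request body: an action line, then the document source,
# repeated for each document. The final line must also end with a newline,
# which the here-document guarantees.
# NOTE: ids 5 and 6 and their field values are made-up sample data.
make_bulk_body() {
  cat <<'EOF'
{"index":{"_index":"bigdata","_type":"product","_id":"5"}}
{"name":"spark","author":"apache","lastest_version":"2.0.0"}
{"index":{"_index":"bigdata","_type":"product","_id":"6"}}
{"name":"flume","author":"apache","lastest_version":"1.7.0"}
EOF
}

# Send the body to the _bulk endpoint. --data-binary keeps the newlines
# intact (plain -d would strip them); @- reads the body from stdin.
# $1 = host:port (defaults to the node used in these notes).
post_bulk() {
  host="${1:-uplooking01:9200}"
  make_bulk_body | curl -s -XPOST "http://$host/_bulk?pretty" --data-binary @-
}
```

Running post_bulk sends both documents in one request; the response reports the result of each action separately, so a failure of one item does not by itself fail the others.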