Getting Started with Elasticsearch: Basic Use of the Python API

株野 2017-05-25


Python is straightforward, so it is a good place to start.

Installation

I installed Python 3.6 on CentOS 7, using the following command for the installation:

import traceback

from pymongo import MongoClient
from elasticsearch import Elasticsearch

# Connect to MongoDB and select the 'blog' database
_db = MongoClient('mongodb://127.0.0.1:27017')['blog']

# Connect to Elasticsearch
_es = Elasticsearch()

# Initial mappings for the index
_index_mappings = {
    "mappings": {
        "user": {
            "properties": {
                "title": {"type": "text"},
                "name": {"type": "text"},
                "age": {"type": "integer"}
            }
        },
        "blogpost": {
            "properties": {
                "title": {"type": "text"},
                "body": {"type": "text"},
                "user_id": {"type": "keyword"},
                "created": {"type": "date"}
            }
        }
    }
}

# Create the index if it does not exist yet
if not _es.indices.exists(index='blog_index'):
    _es.indices.create(index='blog_index', body=_index_mappings)

# Query the data from MongoDB. Elasticsearch generates its own _id here,
# so exclude _id from the documents that MongoDB returns.
user_cursor = _db.user.find({}, projection={'_id': False})
user_docs = list(user_cursor)

# Number of documents processed so far
processed = 0

# Add each retrieved document to Elasticsearch
for _doc in user_docs:
    try:
        _es.index(index='blog_index', doc_type='user', body=_doc)
        processed += 1
        print('Processed: ' + str(processed), flush=True)
    except Exception:
        # Report the failure and continue with the next document
        traceback.print_exc()

# Query all records
print('Search all...', flush=True)
_query_all = {
    'query': {
        'match_all': {}
    }
}
_searched = _es.search(index='blog_index', doc_type='user', body=_query_all)
print(_searched, flush=True)

# Print the source document of each hit
for hit in _searched['hits']['hits']:
    print(hit['_source'], flush=True)

# Query the records whose name contains jerry
print('Search name contains jerry.', flush=True)
_query_name_contains = {
    'query': {
        'match': {
            'name': 'jerry'
        }
    }
}
_searched = _es.search(index='blog_index', doc_type='user', body=_query_name_contains)
print(_searched, flush=True)
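The indexing loop above sends one request per document. For larger collections, the client library's bulk helper is a common alternative. What follows is a minimal sketch and not part of the original script: `build_actions` and `bulk_index` are hypothetical helper names, and the action format assumes the same index and type (`blog_index` / `user`) used above. `elasticsearch.helpers.bulk` is imported lazily so that the action-building part can be tried without the package or a running cluster.

```python
def build_actions(docs, index='blog_index', doc_type='user'):
    """Yield one bulk action per document (hypothetical helper)."""
    for doc in docs:
        yield {
            '_index': index,
            '_type': doc_type,  # same typed index as in the script above
            '_source': doc,
        }


def bulk_index(es, docs):
    """Index all documents via the bulk helper.

    Returns the helper's result, a (success_count, errors) tuple.
    """
    from elasticsearch import helpers  # requires the elasticsearch package
    return helpers.bulk(es, build_actions(docs))
```

With the names from the script, a call would look like `bulk_index(_es, user_docs)`.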

Run the file above (elasticsearch_trial.py):

`python3 elasticsearch_trial.py`

This produces the following output:

Processed: 1
Processed: 2
Processed: 3
Search all...

{'took': 1, 'timed_out': False, '_shards': {'total': 5, 'successful': 5, 'failed': 0}, 'hits': {'total': 3, 'max_score': 1.0, 'hits': [{'_index': 'blog_index', '_type': 'user', '_id': 'AVn4TrrVXvwnWPWhxu5q', '_score': 1.0, '_source': {'title': 'Manager', 'name': 'Trump Heat', 'age': 67}}, {'_index': 'blog_index', '_type': 'user', '_id': 'AVn4TrscXvwnWPWhxu5s', '_score': 1.0, '_source': {'title': 'Engineer', 'name': 'Tommy Hsu', 'age': 32}}, {'_index': 'blog_index', '_type': 'user', '_id': 'AVn4Trr2XvwnWPWhxu5r', '_score': 1.0, '_source': {'title': 'President', 'name': 'Jerry Jim', 'age': 21}}]}}

{'title': 'Manager', 'name': 'Trump Heat', 'age': 67}
{'title': 'Engineer', 'name': 'Tommy Hsu', 'age': 32}
{'title': 'President', 'name': 'Jerry Jim', 'age': 21}
Search name contains jerry.

{'took': 3, 'timed_out': False, '_shards': {'total': 5, 'successful': 5, 'failed': 0}, 'hits': {'total': 1, 'max_score': 0.25811607, 'hits': [{'_index': 'blog_index', '_type': 'user', '_id': 'AVn4Trr2XvwnWPWhxu5r', '_score': 0.25811607, '_source': {'title': 'President', 'name': 'Jerry Jim', 'age': 21}}]}}
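The lowercase query `jerry` finds `Jerry Jim` because `text` fields are analyzed: with the default standard analyzer, values are split into words and lowercased, and the text of a `match` query is analyzed the same way. The sketch below only imitates that behavior in plain Python on the three sample documents from the output; `simple_analyze` and `match_name` are hypothetical names, and the real analyzer does more (Unicode word segmentation, relevance scoring) than this approximation.

```python
import re

# Sample documents copied from the search output above
DOCS = [
    {'title': 'Manager', 'name': 'Trump Heat', 'age': 67},
    {'title': 'Engineer', 'name': 'Tommy Hsu', 'age': 32},
    {'title': 'President', 'name': 'Jerry Jim', 'age': 21},
]


def simple_analyze(text):
    """Rough stand-in for the standard analyzer: split into words, lowercase."""
    return [token.lower() for token in re.findall(r'\w+', text)]


def match_name(docs, query):
    """Return docs whose analyzed name shares at least one token with the query."""
    query_tokens = set(simple_analyze(query))
    return [d for d in docs if query_tokens & set(simple_analyze(d['name']))]


print(match_name(DOCS, 'jerry'))
```

In this sketch a partial term such as `jer` matches nothing, which mirrors how `match` compares whole tokens rather than substrings.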
