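Preface

Data generally falls into three categories: structured, unstructured, and semi-structured.

  1. Structured data usually lives in a relational database such as MySQL or Oracle.
  2. Unstructured data, such as documents, audio, and video, has no format that a two-dimensional table can store or express. It usually goes into a non-relational store such as MongoDB or Redis (key-value structure).
  3. Semi-structured data, such as HTML and XML, mixes content and markup together. It is usually also kept in a non-relational store such as MongoDB or Redis.

Now consider this requirement: whatever the data type, I want to find records quickly by the keyword I search for.
The difficulty is that we cannot guarantee all the data has a uniform type. Structured data can be searched quickly by building indexes. Unstructured data can be looked up quickly by key. Semi-structured data cannot, because once the key leads you to the value, the value is still markup and content mixed together.

Elasticsearch is a document-oriented database that emerged to solve this problem. It can store structured, unstructured, and semi-structured data and search all of it quickly.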

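What is Elasticsearch

The Elastic Stack consists of Elasticsearch, Kibana, Beats, and Logstash (it grew out of what was known as the ELK Stack: Elasticsearch, Logstash, Kibana). It can reliably and securely take data from any source, in any format, and then search, analyze, and visualize it in real time.
Elasticsearch, ES for short, is an open-source, highly scalable, distributed full-text search engine and the core of the whole Elastic Stack. It stores and retrieves data in near real time, and it scales well: a cluster can grow to hundreds of servers and handle petabytes of data.

Put simply: Elasticsearch is one member of the Elastic Stack family.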

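Elasticsearch basics

  1. Port 9300 is used for communication between the nodes of an Elasticsearch cluster; port 9200 is the HTTP (RESTful) port that browsers and other clients use.
  2. ES is written in Java, so it needs a JDK. The 7.x distributions bundle one; the author prefers pointing ES at a JDK already installed on the server rather than using the bundled one.
  3. ES supports and recommends operating on data over HTTP in a RESTful style.
  4. ES recommends JSON for request and response bodies.
  5. If everything in Java is an object, then everything in Elasticsearch is an index. ES uses an inverted index:
    > 1. Forward index. An example:
    >
    > id    content
    > 1001  zhangsan is my name
    > 1002  lisi is my name
    >
    > I build a primary-key index on id and use id=1001 to find that row. An index that goes from identifier to content is called a forward index.
    > 2. Inverted index. An example:
    >
    > keyword  id
    > name     1001,1002
    > lisi     1002
    >
    > I index the words inside the content and go from content to identifier. This kind of index is called an inverted index.
    > The purpose of the inverted index is to find the identifiers; the forward index is then used to fetch the complete data.
  6. ES treats GET, PUT, and DELETE as idempotent and POST as non-idempotent, so after a successful POST it returns the unique identifier _id of the document that the operation created.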
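The inverted index described in item 5 can be sketched in a few lines of plain Java. This is only an illustration under simple assumptions (lowercasing and splitting on whitespace stand in for a real analyzer); the class name InvertedIndexSketch and its build method are hypothetical and are not part of Elasticsearch.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical illustration class, not an Elasticsearch API.
final class InvertedIndexSketch {

    // Turn a forward view (id -> content) into an inverted view (keyword -> ids).
    static Map<String, SortedSet<Integer>> build(Map<Integer, String> docs) {
        Map<String, SortedSet<Integer>> index = new TreeMap<>();
        for (Map.Entry<Integer, String> entry : docs.entrySet()) {
            // Assumption: whitespace tokenization replaces a real analyzer.
            for (String token : entry.getValue().toLowerCase().split("\\s+")) {
                if (token.isEmpty()) {
                    continue;
                }
                index.computeIfAbsent(token, k -> new TreeSet<>()).add(entry.getKey());
            }
        }
        return index;
    }

    public static void main(String[] args) {
        Map<Integer, String> docs = new LinkedHashMap<>();
        docs.put(1001, "zhangsan is my name");
        docs.put(1002, "lisi is my name");

        Map<String, SortedSet<Integer>> index = build(docs);
        System.out.println(index.get("name")); // [1001, 1002]
        System.out.println(index.get("lisi")); // [1002]
    }
}
```

A search for "lisi" first consults the inverted map to get id 1002, then fetches the full document by id, which is the two-step lookup the list above describes.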

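Elasticsearch terminology

Elasticsearch is a document-oriented database: one record is one document. To make this easier to grasp, we can compare how Elasticsearch stores document data with how the relational database MySQL stores data.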
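An Index in ES can be seen as a database, a Type as a table, and a Document as a row in that table.
The Type concept has been gradually phased out: in Elasticsearch 6.x an index can contain only one type, and in Elasticsearch 7.x types are deprecated and the APIs are typeless by default (_doc takes the place of the type in URLs).

Summary: you can ignore type and focus on index, document, and field.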

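Operating on data over HTTP in RESTful style

Getting, creating, and deleting indices

As mentioned, ES supports and recommends RESTful operations on data. Index operations, for example:
Get an index: GET, http://localhost:9200/{index}
Create an index: PUT, http://localhost:9200/{index}
Delete an index: DELETE, http://localhost:9200/{index}

Note: POST is not allowed on the index URL itself, so an index cannot be modified this way.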

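Document operations

Creating a document

This part is more interesting. The URL is:
> http://localhost:9200/{index}/_doc

Two questions come up:

  1. Is _doc the document name? No. _doc is best read as a fixed path segment meaning "create a document in the given index". With this URL you do not choose the document's identifier yourself.
  2. Is the request a PUT? No, it is a POST. Because you are creating a document without giving it a unique identifier, ES has to generate one and return it to you. To specify the identifier yourself, use:
    > http://localhost:9200/{index}/_doc/{id}

Once the document _id is specified you can send a PUT request. Remember to include a JSON request body, for example: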

{
  "consigneeCardNo": "532233197906090036",
  "consigneeCardType": "01",
  "dataId": "4412578860562"
}
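The two ways of creating a document described above can be sketched with the JDK's own java.net.http API (Java 11+). This is a minimal sketch, not an official client: the class and method names are hypothetical, the index name user and id 1001 are illustrative, and the requests are only built here, not sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper class for illustration; only JDK classes are used.
final class CreateDocumentRequests {
    static final String BASE = "http://localhost:9200";

    // POST /{index}/_doc : no id in the URL, so Elasticsearch generates the _id.
    static HttpRequest createWithGeneratedId(String index, String json) {
        return HttpRequest.newBuilder(URI.create(BASE + "/" + index + "/_doc"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(json))
                .build();
    }

    // PUT /{index}/_doc/{id} : the caller chooses the _id, so the request is idempotent.
    static HttpRequest createWithId(String index, String id, String json) {
        return HttpRequest.newBuilder(URI.create(BASE + "/" + index + "/_doc/" + id))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(json))
                .build();
    }

    public static void main(String[] args) {
        String json = "{\"consigneeCardType\":\"01\",\"dataId\":\"4412578860562\"}";
        HttpRequest post = createWithGeneratedId("user", json);
        HttpRequest put = createWithId("user", "1001", json);
        System.out.println(post.method() + " " + post.uri());
        System.out.println(put.method() + " " + put.uri());
    }
}
```

To actually send one of these, pass it to HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()) against a running node.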

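Querying documents

Query by document ID

> Get a single document: GET, http://localhost:9200/{index}/_doc/{id}
> The response is the document object; its _source property holds the data we indexed.

Query all documents in an index

> Get all documents in an index: GET, http://localhost:9200/{index}/_search
> In the response, hits.hits is an array of documents (by default only the first 10), and each document's _source property holds the data we indexed.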

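Single-condition queries (full-text match and phrase match)

  1. Option 1: append parameters to the URL (not recommended: non-ASCII text such as Chinese must be URL-encoded, otherwise it may be garbled)
    GET, http://localhost:9200/{index}/_search?q=name:张三

  2. Option 2: send a request body (GET can carry a body too, although it is more common with POST)
    GET, http://localhost:9200/{index}/_search
    Request body (match analyzes the query text into terms and returns documents whose name contains any of those terms, so it is not limited to names that are exactly 张三):

{
	"query":{
		"match":{
			"name": "张三"
		}
	}
}
Phrase match (match_phrase requires the terms to appear together and in order, so name must contain 张三):
{
	"query":{
		"match_phrase":{
			"name": "张三"
		}
	}
}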

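Highlighting fields

{
	"query":{
		"match_phrase":{
			"name": "张三"
		}
	},
	"highlight":{
		"fields":{
			"name":{}
		}
	}
}

The highlight section marks the part of name that matches the keyword (wrapped in <em> tags by default).
Note that with a match query and the default standard analyzer, which splits Chinese text into single characters, the results may include name=张三 but also name="张四" or name="赵三", and so on. A match_phrase query returns only names that contain 张三.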

Pagination

GET, http://localhost:9200/{index}/_search
Request body (from and size behave like LIMIT 0,10 in SQL):

{
	"query":{
		"match":{
			"name": "张三"
		}
	},
	"from":0,
	"size":10
}
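Since from is an offset rather than a page number, a small helper is often used to convert one into the other. The sketch below is a hypothetical helper (the names Pagination, from, and body are not from Elasticsearch) assuming 1-based page numbers.

```java
// Hypothetical helper for illustration: converts a 1-based page number to from/size.
final class Pagination {

    // Offset of the first hit on the given page, e.g. page 3 with size 10 starts at 20.
    static int from(int page, int size) {
        if (page < 1 || size < 1) {
            throw new IllegalArgumentException("page and size must be >= 1");
        }
        return (page - 1) * size;
    }

    // The from/size fragment of a _search request body for the given page.
    static String body(int page, int size) {
        return String.format("{\"from\":%d,\"size\":%d}", from(page, size), size);
    }

    public static void main(String[] args) {
        System.out.println(body(1, 10)); // {"from":0,"size":10}
        System.out.println(body(3, 10)); // {"from":20,"size":10}
    }
}
```

With default index settings, from + size may not exceed 10000 (index.max_result_window), so this style of paging suits shallow pages only.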

Returning only selected fields

GET, http://localhost:9200/{index}/_search
Request body (_source lists the fields to return; here only name):

{
	"query":{
		"match":{
			"name": "张三"
		}
	},
	"from":0,
	"size":10,
	"_source":["name"]
}

Sorting by a field

GET, http://localhost:9200/{index}/_search
Request body (sort orders the hits by age, descending):

{
	"query":{
		"match":{
			"name": "张三"
		}
	},
	"from":0,
	"size":10,
	"_source":["name"],
	"sort":{
		"age": {
			"order": "desc"
		}
	}
}

Multi-condition queries

GET, http://localhost:9200/{index}/_search

  1. Request body: name matches 张三 and age = 18 (must clauses are combined with AND)
{
	"query":{
		"bool":{
			"must":[
				{
					"match": {
						"name":"张三"
					}
				},
				{
					"match": {
						"age":18
					}
				}
			]
		}
	}
}
  2. Request body: name matches 张三 or age = 18 (should clauses are combined with OR)
{
	"query":{
		"bool":{
			"should":[
				{
					"match": {
						"name":"张三"
					}
				},
				{
					"match": {
						"age":18
					}
				}
			]
		}
	}
}
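  3. Request body: name matches 张三 and age > 18 (must is used for the name condition: when a bool query also has a filter clause, should clauses become optional and would not restrict the results)
{
	"query":{
		"bool":{
			"must":[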
				{
					"match": {
						"name":"张三"
					}
				}
			],
			"filter":{
				"range":{
					"age":{
						"gt": 18
					}
				}
			}
		}
	}
}

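Aggregation queries

GET, http://localhost:9200/{index}/_search

  1. Grouping (a terms aggregation; name_group is the aggregation name and can be anything. If name is mapped as text, aggregate on its keyword sub-field, for example name.keyword, instead):
{
	"aggs":{
		"name_group":{
			"terms":{
				"field": "name"
			}
		}
	}
}
  2. Average (an avg aggregation needs a numeric field, so age is used here; age_avg is again an arbitrary name):
{
	"aggs":{
		"age_avg":{
			"avg":{
				"field": "age"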
			}
		}
	}
}
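To make the two aggregations concrete, here is what they compute, expressed over an in-memory list with Java streams. This is only an analogy under stated assumptions: the AggregationSketch class and its User type are hypothetical, and Elasticsearch evaluates aggregations on its own index structures, not like this.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical illustration of what terms and avg aggregations compute.
final class AggregationSketch {

    static final class User {
        final String name;
        final int age;

        User(String name, int age) {
            this.name = name;
            this.age = age;
        }
    }

    // Like a terms aggregation on name: one bucket per distinct value, with a document count.
    static Map<String, Long> termsByName(List<User> users) {
        return users.stream()
                .collect(Collectors.groupingBy(u -> u.name, TreeMap::new, Collectors.counting()));
    }

    // Like an avg aggregation on age: the mean of a numeric field.
    static double avgAge(List<User> users) {
        return users.stream().mapToInt(u -> u.age).average().orElse(Double.NaN);
    }

    public static void main(String[] args) {
        List<User> users = Arrays.asList(
                new User("zhangsan", 14), new User("zhangsan", 18), new User("lisi", 16));
        System.out.println(termsByName(users)); // {lisi=1, zhangsan=2}
        System.out.println(avgAge(users));      // 16.0
    }
}
```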

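Updating a document

> PUT: http://localhost:9200/{index}/_doc/{id}. Besides creating a document, this performs a full overwrite: the previous document content is replaced entirely by the request body.
> POST: http://localhost:9200/{index}/_update/{id}. This performs a partial update. The request body must follow a specific format, with the fields to update placed inside doc, for example:

{
	"doc": {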
		"name":"zhangsan",
		"age": 14
	}
}
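The difference between the full overwrite (PUT to _doc) and the partial update (POST to _update) can be pictured with two map operations. This is a simplified sketch: the class UpdateSemantics is hypothetical, and it merges only top-level fields, whereas Elasticsearch also merges nested objects recursively.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of overwrite vs. partial-update semantics.
final class UpdateSemantics {

    // PUT /{index}/_doc/{id}: the stored document is discarded and replaced by the body.
    static Map<String, Object> fullReplace(Map<String, Object> stored, Map<String, Object> body) {
        return new LinkedHashMap<>(body);
    }

    // POST /{index}/_update/{id}: fields inside "doc" are merged into the stored document.
    static Map<String, Object> partialUpdate(Map<String, Object> stored, Map<String, Object> doc) {
        Map<String, Object> merged = new LinkedHashMap<>(stored);
        merged.putAll(doc);
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Object> stored = new LinkedHashMap<>();
        stored.put("name", "zhangsan");
        stored.put("age", 13);

        Map<String, Object> change = new LinkedHashMap<>();
        change.put("age", 14);

        System.out.println(fullReplace(stored, change));   // {age=14}
        System.out.println(partialUpdate(stored, change)); // {name=zhangsan, age=14}
    }
}
```

The first call loses name because the body did not repeat it, which is why a partial update is usually the safer choice when changing a single field.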

Deleting a document

> DELETE: http://localhost:9200/{index}/_doc/{id}

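Operating Elasticsearch with the Java API (without Spring)

Index operations

Creating an index

Create the user index. The snippets in this section are method bodies: they assume the elasticsearch-rest-high-level-client 7.x dependency on the classpath and an enclosing method declared with throws IOException.

        RestHighLevelClient client = new RestHighLevelClient(
                RestClient.builder(new HttpHost("localhost",9200,"http"))
        );
        // get the client for index-level operations
        IndicesClient indicesClient = client.indices();

        CreateIndexResponse response = indicesClient.create(new CreateIndexRequest("user"), RequestOptions.DEFAULT);

        if(response.isAcknowledged()){
            System.out.println("index created");
        }else {
            System.out.println("index creation failed");
        }

        client.close();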

Getting an index

Get the user index:

        RestHighLevelClient client = new RestHighLevelClient(
                RestClient.builder(new HttpHost("localhost",9200,"http"))
        );

        IndicesClient indicesClient = client.indices();

        GetIndexResponse response = indicesClient.get(new GetIndexRequest("user"), RequestOptions.DEFAULT);

        System.out.println(response.getAliases());
        System.out.println(response.getMappings());
        System.out.println(response.getSettings());

        client.close();

Deleting an index

Delete the user index:

        RestHighLevelClient client = new RestHighLevelClient(
                RestClient.builder(new HttpHost("localhost",9200,"http"))
        );

        IndicesClient indicesClient = client.indices();

        AcknowledgedResponse response = indicesClient.delete(new DeleteIndexRequest("user"), RequestOptions.DEFAULT);

        if(response.isAcknowledged()){
            System.out.println("index deleted");
        }else {
            System.out.println("index deletion failed");
        }

        client.close();

Document operations

Indexing a document

        RestHighLevelClient client = new RestHighLevelClient(
                RestClient.builder(new HttpHost("localhost",9200,"http"))
        );

        // index (insert) a document
        IndexRequest indexRequest = new IndexRequest();

        indexRequest.index("user").id("1001"); // target index and document ID

        User user = new User();
        user.setName("张三");
        user.setAge(13);

        // set the document body: the JSON string comes first, then its content type
        indexRequest.source(JSONUtil.toJsonStr(user), XContentType.JSON);

        IndexResponse response = client.index(indexRequest, RequestOptions.DEFAULT); // send the request

        System.out.println(response.getResult());

        client.close();

Bulk indexing

        RestHighLevelClient client = new RestHighLevelClient(
                RestClient.builder(new HttpHost("localhost",9200,"http"))
        );

        BulkRequest bulkRequest = new BulkRequest();

        // each document gets its own ID; reusing one ID would overwrite the same document three times
        bulkRequest.add(new IndexRequest().index("user").id("1001").source(JSONUtil.toJsonStr(new User("zhangsan",14)), XContentType.JSON));
        bulkRequest.add(new IndexRequest().index("user").id("1002").source(JSONUtil.toJsonStr(new User("lisi",14)), XContentType.JSON));
        bulkRequest.add(new IndexRequest().index("user").id("1003").source(JSONUtil.toJsonStr(new User("wangwu",14)), XContentType.JSON));

        // send all index operations in a single bulk request
        BulkResponse response = client.bulk(bulkRequest, RequestOptions.DEFAULT);

        System.out.println(response.getTook());
        System.out.println(response.getItems());
        
        client.close();

Updating a document (partial update)

        RestHighLevelClient client = new RestHighLevelClient(
                RestClient.builder(new HttpHost("localhost",9200,"http"))
        );

        // update a document
        UpdateRequest updateRequest = new UpdateRequest();

        updateRequest.index("user").id("1001"); // target index and document ID

        User user = new User();
        user.setName("张三");
        user.setAge(18);

        // set the fields to update: the JSON string comes first, then its content type
        updateRequest.doc(JSONUtil.toJsonStr(user), XContentType.JSON);

        UpdateResponse response = client.update(updateRequest, RequestOptions.DEFAULT); // send the request

        System.out.println(response.getResult());

        client.close();

Getting a document

        RestHighLevelClient client = new RestHighLevelClient(
                RestClient.builder(new HttpHost("localhost",9200,"http"))
        );

        // get a document by ID
        GetRequest getRequest = new GetRequest();

        getRequest.index("user").id("1001"); // target index and document ID

        GetResponse response = client.get(getRequest, RequestOptions.DEFAULT); // send the request

        System.out.println(response.getSourceAsString()); // the JSON we indexed

        client.close();

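Advanced queries - match all

        RestHighLevelClient client = new RestHighLevelClient(
                RestClient.builder(new HttpHost("localhost",9200,"http"))
        );

        // match-all query
        SearchRequest request = new SearchRequest();

        // search source builder: match every document
        SearchSourceBuilder builder = new SearchSourceBuilder().query(QueryBuilders.matchAllQuery());
        request.indices("user").source(builder);

        SearchResponse response = client.search(request, RequestOptions.DEFAULT);

        System.out.println(response.getTook());
        System.out.println(response.getHits().getTotalHits());

        // iterate over the hits and print each document's JSON source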
        for (SearchHit hit : response.getHits()) {
            System.out.println(hit.getSourceAsString());
        }
        client.close();

Advanced queries - single-condition query

        RestHighLevelClient client = new RestHighLevelClient(
                RestClient.builder(new HttpHost("localhost",9200,"http"))
        );


        // query name = zhangsan, return only the name field, take the first 10 hits, sort by age descending
        SearchRequest request = new SearchRequest();

        // search source builder
        SearchSourceBuilder builder = new SearchSourceBuilder().
                query(QueryBuilders.termQuery("name", "zhangsan")); // name = zhangsan

        String[] includes = {"name"};
        String[] excludes = {};
        builder.fetchSource(includes,excludes); // return only the name field

        builder.from(0);
        builder.size(10); // first 10 hits

        builder.sort("age", SortOrder.DESC); // age descending

        request.indices("user").source(builder);

        SearchResponse response = client.search(request, RequestOptions.DEFAULT);

        System.out.println(response.getTook());
        System.out.println(response.getHits().getTotalHits());

        // iterate over the hits and print each document's JSON source
        for (SearchHit hit : response.getHits()) {
            System.out.println(hit.getSourceAsString());
        }
        client.close();

Advanced queries - multi-condition queries

  1. Query name = zhangsan and age = 13
        RestHighLevelClient client = new RestHighLevelClient(
                RestClient.builder(new HttpHost("localhost",9200,"http"))
        );


        SearchRequest request = new SearchRequest();

        // search source builder
        SearchSourceBuilder builder = new SearchSourceBuilder();

        BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery(); // bool (compound) query builder

        // name = zhangsan AND age = 13
        boolQueryBuilder.must(QueryBuilders.matchQuery("name","zhangsan"));
        boolQueryBuilder.must(QueryBuilders.matchQuery("age",13));

        builder.query(boolQueryBuilder);
        
        request.indices("user").source(builder);

        SearchResponse response = client.search(request, RequestOptions.DEFAULT);

        System.out.println(response.getTook());
        System.out.println(response.getHits().getTotalHits());

        // iterate over the hits and print each document's JSON source
        for (SearchHit hit : response.getHits()) {
            System.out.println(hit.getSourceAsString());
        }
        client.close();
  2. Query age >= 13
        RestHighLevelClient client = new RestHighLevelClient(
                RestClient.builder(new HttpHost("localhost",9200,"http"))
        );


        SearchRequest request = new SearchRequest();

        // search source builder
        SearchSourceBuilder builder = new SearchSourceBuilder();

        RangeQueryBuilder rangeQuery = QueryBuilders.rangeQuery("age"); // range query builder

        rangeQuery.gte(13); // age greater than or equal to 13

        builder.query(rangeQuery);

        request.indices("user").source(builder);

        SearchResponse response = client.search(request, RequestOptions.DEFAULT);

        System.out.println(response.getTook());
        System.out.println(response.getHits().getTotalHits());

        // iterate over the hits and print each document's JSON source
        for (SearchHit hit : response.getHits()) {
            System.out.println(hit.getSourceAsString());
        }
        client.close();

Advanced queries - fuzzy query and highlighting

Run a fuzzy query for name = zhangsan and highlight the matches.

        RestHighLevelClient client = new RestHighLevelClient(
                RestClient.builder(new HttpHost("localhost",9200,"http"))
        );


        SearchRequest request = new SearchRequest();

        // search source builder
        SearchSourceBuilder builder = new SearchSourceBuilder();

        FuzzyQueryBuilder fuzzyQueryBuilder = QueryBuilders.fuzzyQuery("name", "zhangsan"); // fuzzy query builder
        // fuzziness is the maximum edit distance: with 2, terms such as zhangsan22 or 11zhangsan (2 edits) can match, while zhangsan333 (3 edits) cannot
        fuzzyQueryBuilder.fuzziness(Fuzziness.TWO);

        HighlightBuilder highlightBuilder = new HighlightBuilder(); // highlight builder
        highlightBuilder.preTags("<font color=\"red\">");
        highlightBuilder.postTags("</font>");
        highlightBuilder.field("name"); // wrap the matching part of name in the tags above

        builder.query(fuzzyQueryBuilder);
        builder.highlighter(highlightBuilder);

        request.indices("user").source(builder);

        SearchResponse response = client.search(request, RequestOptions.DEFAULT);

        System.out.println(response.getTook());
        System.out.println(response.getHits().getTotalHits());

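        // iterate over the hits; print the JSON source and the highlighted fragments
        for (SearchHit hit : response.getHits()) {
            System.out.println(hit.getSourceAsString());
            System.out.println(hit.getHighlightFields());
        }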
        client.close();
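The fuzziness setting is a bound on edit distance, which the sketch below computes so the examples can be checked by hand. It is a plain Levenshtein distance written for illustration (the class EditDistance is hypothetical); Elasticsearch's fuzzy query additionally counts a transposition of two adjacent characters as a single edit, which this sketch does not.

```java
// Hypothetical illustration: plain Levenshtein distance (insert, delete, substitute).
final class EditDistance {

    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning "" into the first j characters of b takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning the first i characters of a into "" takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        System.out.println(distance("zhangsan", "zhangsan22"));  // 2
        System.out.println(distance("zhangsan", "11zhangsan"));  // 2
        System.out.println(distance("zhangsan", "zhangsan333")); // 3
    }
}
```

With Fuzziness.TWO, the first two terms are within range of zhangsan and the third is not.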

Advanced queries - aggregations

  1. Maximum of age
        RestHighLevelClient client = new RestHighLevelClient(
                RestClient.builder(new HttpHost("localhost",9200,"http"))
        );


        SearchRequest request = new SearchRequest();

        // search source builder
        SearchSourceBuilder builder = new SearchSourceBuilder();

        // aggregation named maxAge: the maximum of age
        MaxAggregationBuilder aggregationBuilder = AggregationBuilders.max("maxAge").field("age");

        builder.aggregation(aggregationBuilder);

        request.indices("user").source(builder);

        SearchResponse response = client.search(request, RequestOptions.DEFAULT);

        System.out.println(response.getTook());
        System.out.println(response.getHits().getTotalHits());

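        // iterate over the hits and print each document's JSON source
        for (SearchHit hit : response.getHits()) {
            System.out.println(hit.getSourceAsString());
        }
        // read the aggregation result by the name given above
        Max maxAge = response.getAggregations().get("maxAge");
        System.out.println(maxAge.getValue());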
        client.close();
  2. Grouping
        RestHighLevelClient client = new RestHighLevelClient(
                RestClient.builder(new HttpHost("localhost",9200,"http"))
        );


        SearchRequest request = new SearchRequest();

        // search source builder
        SearchSourceBuilder builder = new SearchSourceBuilder();

        // aggregation named ageGroup: group by age
        TermsAggregationBuilder termsAggregationBuilder = AggregationBuilders.terms("ageGroup").field("age");

        builder.aggregation(termsAggregationBuilder);

        request.indices("user").source(builder);

        SearchResponse response = client.search(request, RequestOptions.DEFAULT);

        System.out.println(response.getTook());
        System.out.println(response.getHits().getTotalHits());

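        // iterate over the hits and print each document's JSON source
        for (SearchHit hit : response.getHits()) {
            System.out.println(hit.getSourceAsString());
        }
        // print each bucket of the ageGroup aggregation: the age value and its document count
        Terms ageGroup = response.getAggregations().get("ageGroup");
        for (Terms.Bucket bucket : ageGroup.getBuckets()) {
            System.out.println(bucket.getKeyAsString() + ": " + bucket.getDocCount());
        }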
        client.close();

Deleting a document

        RestHighLevelClient client = new RestHighLevelClient(
                RestClient.builder(new HttpHost("localhost",9200,"http"))
        );

        // delete a document
        DeleteRequest deleteRequest = new DeleteRequest();

        deleteRequest.index("user").id("1001"); // target index and document ID

        DeleteResponse response = client.delete(deleteRequest, RequestOptions.DEFAULT); // send the request

        System.out.println(response);

        client.close();

Bulk deletion

        RestHighLevelClient client = new RestHighLevelClient(
                RestClient.builder(new HttpHost("localhost",9200,"http"))
        );

        BulkRequest bulkRequest = new BulkRequest();

        // delete three different documents; the IDs match the ones used for bulk indexing
        bulkRequest.add(new DeleteRequest().index("user").id("1001"));
        bulkRequest.add(new DeleteRequest().index("user").id("1002"));
        bulkRequest.add(new DeleteRequest().index("user").id("1003"));

        // send all delete operations in a single bulk request
        BulkResponse response = client.bulk(bulkRequest, RequestOptions.DEFAULT);

        System.out.println(response.getTook());
        System.out.println(response.getItems());

        client.close();