33 articles in total |
// search for documents whose name contains jack WildcardQueryBuilder queryBuilder2 = QueryBuilders.wildcardQuery("interest", "*read*"); // search for documents whose interest contains read BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery(); // name contains jack or interest contains read, equivalent to OR boolQueryBuilder.should(... Views 76, Reposts 0, Comments 0, Public, 20-01-20 17:07 |
Spark in Depth: Configuration Files and Logs. I. Part One. 1. When Spark 2.1 is integrated with Hadoop 2.7.3 in Spark-on-YARN mode, the following content needs to be added to Hadoop's configuration file yarn-site.xml: spark.yarn.historyServer.address=node2:18080 spark.history.ui.port=18080 spark.eventLog.enabled=true spark.eventLog.dir=hdfs:///tmp/spark/events spark.history.fs.logDirectory=hdfs:///t... Views 99, Reposts 0, Comments 0, Public, 19-04-30 16:41 |
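Migrating from Hive to SparkSQL: Youzan's Big Data Practice. Preface. The migration path from Hive to SparkSQL. At the start of the SparkSQL migration, we chose to follow the 80/20 rule: beginning with the head tasks that consumed the most resources, we moved the Top 100 tasks from Hive to SparkSQL and gradually built up a list of typical errors, including inconsistent behavior between SparkSQL and Hive. One typical problem: when an ORC file is empty, Spark throws a null pointer exception and fails... Views 94, Reposts 0, Comments 0, Public, 19-01-12 19:25 |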
a b a b; 1 1 1 1; 1 2 1 2; 1 3 1 3; 2 3 NULL NULL; 2 4 NULL NULL. However, next I will show an important defect in outer joins in the current version of Hive: select ljn001.*, ljn002.* from ljn001 left outer join ljn002 on (ljn001.a = ljn002.a and ljn001.b = ljn002.b and ... Views 1542, Reposts 0, Comments 0, Public, 18-10-12 18:53 |
with a as (select 1 a from dual union all select 1 a from dual union all select 1 a from dual union all select 2 a from dual union all select 3 a from dual union all select 4 a from dual union all select 4 a from dual union all select 5 a from dual ... Views 2659, Reposts 1, Comments 0, Public, 18-09-19 11:59 |
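Definitions of fact tables and dimension tables in BI. In other words, a fact table is the point where multiple dimension tables intersect. First, consider the star schema in database design: this structure keeps the data in a single fact table at the center, while the remaining dimension data is stored in dimension tables. Each dimension table relates directly to the fact table and is usually joined to it through a key. The fact table is the central table of a data warehouse schema; it holds the numeric measures and the keys that link the facts to the dimension tables. Views 398, Reposts 0, Comments 0, Public, 18-08-23 10:24 |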
col1 string, col2 string, col3 string ) load data local inpath '/home/jiangzl/shell/test.txt' into table tmp_jiangzl_test; select col1, col2, concat_ws(',', collect_set(col3)) from tmp_jiangzl_test group by col1, col2; select col1, col2, col3. // without deduplication SELECT id, name, age1 as age FROM table1 UNION all SELECT id, name, age2 a... Views 357, Reposts 0, Comments 0, Public, 18-07-26 10:31 |
address: name addr; zhangsan beijing; zhangsan shanghai; lisi tianjin; wangwu nanjing. select max(user.id) as id, user.name, concat_ws(",", collect_set(address.addr)) from user join address on user.name = address.name group by user.name order by id; 1 zhangsan shanghai,beijing; 2 lisi tianjin; 3 wangwu nanjing. select col... Views 697, Reposts 1, Comments 0, Public, 18-07-26 10:26 |
hive> create table lxw_dual as select 1 + 1.2 from lxw_dual; hive> create table lxw_dual as select 5.6 - 4 from lxw_dual; hive> create table lxw_dual as select round(9542.158) from lxw_dual; hive> create table lxw_test as select map('100'... Views 12, Reposts 0, Comments 0, Public, 18-07-13 10:42 |
hive> create table lxw_dual as select 1 + 1.2 from lxw_dual; hive> create table lxw_dual as select 5.6 - 4 from lxw_dual; hive> create table lxw_dual as select round(9542.158) from lxw_dual; hive> create table lxw_test as select map('100'... Views 38, Reposts 0, Comments 0, Public, 18-07-12 11:09 |
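The `concat_ws(",", collect_set(...))` queries quoted in the two 18-07-26 entries of this list group rows by a key and join each group's distinct values into one delimited string. As a rough illustration only, the Python sketch below emulates that behavior in memory on the `address` rows shown in the excerpt. The function name `concat_ws_collect_set` is hypothetical and is not part of Hive. The sketch keeps values in first-seen order, whereas Hive does not guarantee the element order of `collect_set` (the excerpt itself shows `shanghai,beijing`).

```python
from collections import OrderedDict


def concat_ws_collect_set(rows, sep=","):
    """Group (key, value) rows by key and join each group's distinct values.

    Hypothetical helper: an in-memory stand-in for
    `select key, concat_ws(sep, collect_set(value)) ... group by key`.
    """
    groups = OrderedDict()
    for key, value in rows:
        values = groups.setdefault(key, [])
        # collect_set drops duplicates; first-seen order is kept here,
        # which is an assumption of this sketch, not a Hive guarantee.
        if value not in values:
            values.append(value)
    return {key: sep.join(values) for key, values in groups.items()}


# Rows of the `address` table as they appear in the excerpt.
address = [
    ("zhangsan", "beijing"),
    ("zhangsan", "shanghai"),
    ("lisi", "tianjin"),
    ("wangwu", "nanjing"),
]

result = concat_ws_collect_set(address)
for name, addrs in result.items():
    print(name, addrs)
```

Passing the same row twice leaves the output unchanged, which is the deduplication that separates `collect_set` from the non-deduplicating variant mentioned in the excerpt.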