2022/05/25 Minutes of the bi-weekly meeting of Apache Linkis (incubating)

1. [Fixed topic] Apache Linkis incubation and version progress sync - Xu Jie
   https://docs.qq.com/sheet/DSFJyTld3Y0JGeU54?tab=uf5xax

2. [Fixed topic] Apache Linkis 1.1.1 progress sync - Xia Chen
   https://docs.qq.com/sheet/DVWZLYlFVTWVrdmlr
   Voting has ended. The release email will be sent on 2022-05-25, and a promotional article will be published on the official account this week. The release is basically complete.

3. [Fixed topic] Apache Linkis 1.1.2 progress sync - Tao Kelu
   Testing and documentation are complete; waiting for the 1.1.1 release to finish.

4. [Fixed topic] Apache Linkis 1.2.0 progress sync - Wang Zhen
   The code has been merged; documentation and test cases are being prepared.
   In addition, the new-feature code for 1.1.3 has been submitted, unit tests have been added, and the documentation is being improved. Led by Sun Shun.

5. [Fixed topic] Apache Linkis community operations progress sync - Li Wen
   The operating indicators of the open source community are growing normally, and growth in core indicators such as committers and PRs is in line with expectations. Judging from the official account posts of the past two weeks, developers care most about technical content and about new releases of Linux- or WDS-related components; activity announcements and promotional articles attract only moderate interest.
   28 people have registered for contributor certificates and desk displays, out of 80 in total; the rest still need to be contacted.
   A video on how to become a developer has been published on Bilibili and other sites. The recommendation is to start with documentation, which is the easier route. Seven volunteers have already started contributing to the community through documentation.
   The Xingce community meetup drew 4000+ viewers but did not bring traffic growth to the community.
   A community meetup will be held on the evening of Thursday, June 9; the bi-weekly meeting is postponed by one week to June 15.

6. [Temporary topic] Test environment whitelist access mechanism - Wang Heping
   For security reasons, the existing test environment cannot be accessed normally. External resources such as Alibaba Cloud need to be purchased for redeployment, and the resource application needs further discussion inside WeBank.

7. [Temporary topic] Apache Linkis 1.1.3 feature introduction (Prometheus monitoring) - Sun Shun
   Mainly introduced the design of the Linkis Prometheus monitoring architecture and the deployment plan. The data monitored so far is mainly JVM and network metrics. If needed, Tianyi Cloud can contribute monitoring for indicators closer to the business, such as the number of successful and failed tasks.

8. [Temporary topic] Linkis containerization kickoff - Tao Kelu
   Development of Linkis containerization has started. Tianyi Cloud can provide some code and ideas to help.

9. [Temporary topic] Adaptation of peripheral components based on Linkis & DSS 1.X - Di Shuai
   Qiang Ge reported on the version plans of each system and the dependencies between versions.

10. [Fixed topic] Host of the next regular meeting, volunteers welcome.
   Postponed by one week to June 15; hosted by Hui Ge.

11. [Temporary topic] Answers to developer questions
   https://docs.qq.com/sheet/DUlJIREJKaHlVVUVU?u=444994d9ce1841d3af5bb0d65efbe4a9&tab=ggfj6i
   Resource usage issues: users are advised to upgrade from 1.0.2 to 1.0.3 or later; 1.1.2 will be the long-term support version.
   Deploying Linkis on k8s: because an overlay network is used, service IPs and ports are random. Linkis stores the Eureka address in its local database, so after a restart it still connects to the previous address. The recommendation is not to keep unnecessary information in the database.
   Data sources: data sources run through the whole data governance process. Configuring them in Linkis makes them harder to extend and use. Linkis data sources are mainly intended for tool development, not as the company-wide unified data source.
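The k8s restart problem discussed in item 11 can be illustrated with a small simulation. This is only a sketch under stated assumptions, not Linkis code: the service name `linkis-eureka`, the addresses, and both helper functions are hypothetical, and a plain dictionary stands in for whatever service discovery the cluster provides.

```python
from typing import Dict, Optional

# Hypothetical stand-in for the cluster: maps a stable service name to the
# address currently assigned by the overlay network.
ServiceTable = Dict[str, str]


def address_from_database(saved: Optional[str]) -> Optional[str]:
    """Problematic pattern: reuse the address that was saved before the restart."""
    return saved


def address_from_lookup(services: ServiceTable, name: str) -> str:
    """Suggested pattern: resolve the stable service name again on every start."""
    return services[name]


def simulate_restart() -> Dict[str, Optional[str]]:
    services: ServiceTable = {"linkis-eureka": "10.244.1.17:20303"}
    # Before the restart the resolved address is written to the database.
    saved = address_from_lookup(services, "linkis-eureka")
    # Restart: the service comes back with a different IP and port.
    services["linkis-eureka"] = "10.244.3.42:31785"
    return {
        "from_database": address_from_database(saved),
        "from_lookup": address_from_lookup(services, "linkis-eureka"),
        "current": services["linkis-eureka"],
    }


result = simulate_restart()
```

In this toy run the value read back from the database no longer matches the current address, while the fresh lookup does, which is the reasoning behind the recommendation not to persist the address.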
   Configuration parameter changes: should an md file recording parameter changes be added to the source repository?

Recent developments

Merged

1. https://github.com/apache/incubator-linkis/pull/2168/files
   FileSource supports variable configuration for file types. Jie Longping. Merged.

2. https://github.com/apache/incubator-linkis/issues/2124
   Optimize the result set path so that it is split by date, to solve the problem of too many subdirectories in a single folder. Result set paths for different dates were in the same folder, for example "/tmp/linkis/hadoop/linkis/20220516_210525/IDE/40099", which can leave too many files in one folder. Merged. HDFS limits the number of directories.

3. https://github.com/apache/incubator-linkis/pull/2109
   Add support for the sqoop engine plugin. Merged.

4. https://github.com/apache/incubator-linkis/issues/2103
   Fixed a bug where the kinit thread was started when executing JDBC engine tasks even though Kerberos was not in use. Merged.

5. https://github.com/apache/incubator-linkis/issues/2110
   Removed the binary file .mvn/wrapper/maven-wrapper.jar from the source code and adjusted the LICENSE notes related to .mvn/*.

6. https://github.com/apache/incubator-linkis/pull/2113
   Upgrade py4j-0.10.7-src.zip to py4j-0.10.9.5-src.zip.

7. https://github.com/apache/incubator-linkis/pull/2116
   The linkis-storage module replaces cglib with the cglib built into Spring.

8. https://github.com/apache/incubator-linkis/pull/2131
   Remove the pandas import to fix the python engine failing to start because of a missing dependency.

9. https://github.com/apache/incubator-linkis/pull/2133
   The temporary storage paths of the kafka and hive data sources now check for the directory and create it automatically.

10. https://github.com/apache/incubator-linkis/pull/2142
   Fix the problem that JDBC Engine console configuration did not take effect immediately after modification (the cache time is now a configuration item).

11. https://github.com/apache/incubator-linkis/pull/2160
   The consumption queue for task submission supports configuring specific high-volume users.

12. https://github.com/apache/incubator-linkis/pull/2161
   Added support for automatic formatting parameters when exporting a result set to an excel file.

To be merged

1. https://github.com/apache/incubator-linkis/pull/2173
   Add support for the presto engine plugin.

2. https://github.com/apache/incubator-linkis/pull/2164
   entrance supports a parameter for the number of task retries - RetryCountLabel. The ASF header is missing.

3. https://github.com/apache/incubator-linkis/pull/2163
   Record which EC executed each task; the EC information is stored in the task's Metrics field.

4. https://github.com/apache/incubator-linkis/pull/2159
   EC logs support rolling by size and time.

5. https://github.com/apache/incubator-linkis/pull/2150
   The common and entrance modules both contain custom variable substitution logic; the optimization consolidates it in the common module.

6. https://github.com/apache/incubator-linkis/pull/2147
   dependabot: gson upgraded from 2.8.5 to 2.8.9.

7. https://github.com/apache/incubator-linkis-website/pull/265
   Supplementary documentation on engine implementation details. To be merged.

Build failed issues

1. https://github.com/apache/incubator-linkis/issues/2144
   Granting +x permission to shell scripts fails when compiling on different systems. Tao Kelu is following up.

2. https://github.com/apache/incubator-linkis/issues/2141
   Change dbcp in the JDBC engine to dbcp2.
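For the result set path change in issue 2124 (merged item 2), the idea of splitting the path by date can be sketched as follows. This is an illustration only: the minutes give just the old example path, so the date-split layout, the function names, and the parameters are all assumptions and may differ from what the merged change actually produces.

```python
from datetime import datetime
from pathlib import PurePosixPath


def flat_result_path(root: str, user: str, creator: str, task_id: int,
                     submitted_at: datetime) -> str:
    """Layout matching the example in the minutes: date and time share one level."""
    stamp = submitted_at.strftime("%Y%m%d_%H%M%S")
    return str(PurePosixPath(root) / user / "linkis" / stamp / creator / str(task_id))


def dated_result_path(root: str, user: str, creator: str, task_id: int,
                      submitted_at: datetime) -> str:
    """Assumed date-split layout: one directory per day, then one per time of day."""
    day = submitted_at.strftime("%Y%m%d")
    time_of_day = submitted_at.strftime("%H%M%S")
    return str(PurePosixPath(root) / user / "linkis" / day / time_of_day
               / creator / str(task_id))


when = datetime(2022, 5, 16, 21, 5, 25)
old_path = flat_result_path("/tmp/linkis", "hadoop", "IDE", 40099, when)
new_path = dated_result_path("/tmp/linkis", "hadoop", "IDE", 40099, when)
```

With the flat layout every task of every day adds a sibling under the same parent, while the date-split layout caps the siblings under each day directory at that day's tasks, which is what keeps a single HDFS folder from growing without bound.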