justinmclean commented on code in PR #5837: URL: https://github.com/apache/gravitino/pull/5837#discussion_r1903699305
##########
docs/glossary.md:
##########
@@ -74,187 +215,135 @@ license: "This software is licensed under the Apache License version 2."

 ## Docker container

-- A lightweight, standalone, executable package that includes everything needed to run a piece of software, including the code, runtime, libraries, and system tools.
+- A lightweight, standalone package that includes everything needed to run a software.
+  A container compiles an application with its dependencies and runtime for distribution.

 ## Docker Hub

-- A cloud-based registry service for Docker containers, allowing users to share and distribute containerized applications.
+- A cloud-based registry service for Docker containers.
+  Users can publish, browse and download containerized software using this service.

 ## Docker image

-- A lightweight, standalone, and executable package that includes everything needed to run a piece of software, including the code, runtime, libraries, and system tools.
+- A lightweight, standalone package that includes everything needed to run a software.
+  A Docker image typically comprises the code, runtime, libraries, and system tools.

-## Docker file
+## Dockerfile

-- A configuration file used to create a Docker image, specifying the base image, dependencies, and commands for building the image.
+- A configuration file for building a Docker image.
+  A Dockerfile contains instructions to build a standard image for distributing a software.

-## Dropwizard Metrics
+## Dropwizard metrics

 - A Java library for measuring the performance of applications and providing support for various metric types.

-## Amazon Elastic Block Store (EBS)
-
-- A scalable block storage service provided by Amazon Web Services.
-
 ## Environment variables

-- Variables used to pass information to running processes.
+- Variables used to customize the runtime configuration for a process.

 ## Geo-distributed

 - The distribution of data or services across multiple geographic locations.

+## Git
+
+- A distributed version control system used for tracking software artifacts.
+
 ## GitHub

-- A web-based platform for version control and collaboration using Git.
+- A web-based platform for version control and community collaboration using Git.

 ## GitHub Actions

-- A continuous integration and continuous deployment (CI/CD) service provided by GitHub, used for automating build, test, and deployment workflows.
+- A continuous integration and continuous deployment (CI/CD) service provided by GitHub.
+  GitHub Actions are used for automating the build, test, and deployment workflows.

 ## GitHub labels

-- Tags assigned to GitHub issues or pull requests for organization, categorization, or workflow automation.
+- Labels assigned to GitHub issues or pull requests for organization or workflow automation.

 ## GitHub pull request

-- A proposed change to a repository submitted by a user through the GitHub platform.
+- A proposed change to a GitHub repository submitted by a user.

 ## GitHub repository

 - The location where GitHub stores a project's source code and related files.

 ## GitHub workflow

-- A series of automated steps defined in a YAML file that runs in response to events on a GitHub repository.
-
-## Git
-
-- A version control system used for tracking changes and collaborating on source code.
-
-## GPG/GnuPG
-
-- Gnu Privacy Guard or GnuPG, an open-source implementation of the OpenPGP standard, used for encrypting and signing files and emails.
+- A series of automated steps that are triggered by certain events on a GitHub repository.

 ## Gradle

-- A build automation tool for building, testing, and deploying projects.
+- A automation tool for building, testing, and deploying projects.

 ## Gradlew

-- A Gradle wrapper script, used for executing Gradle commands without installing Gradle separately.
-
-## Apache Gravitino
-
-- An open-source software platform originally created by Datastrato for high-performance, geo-distributed, and federated metadata lakes. Designed to manage metadata directly in different sources, types, and regions, providing unified metadata access for data and AI assets.
-
-## Apache Gravitino configuration file (gravitino.conf)
-
-- The configuration file for the Gravitino server, located in the `conf` directory. It follows the standard property file format and contains settings for the Gravitino server.
+- A Gradle wrapper script, used for executing Gradle commands.

 ## Hashes

-- Cryptographic hash values generated from the contents of a file, often used for integrity verification.
-
-## HDFS
-
-- **HDFS** (Hadoop Distributed File System) is an open-source, distributed file system and a key component of the Apache Hadoop ecosystem. It is designed to store and process large-scale datasets, providing high reliability, fault tolerance, and performance for distributed storage solutions.
+- Cryptographic hash values generated from some data.
+  A typical use case is to verify the integrity of a file.

 ## Headless

-- A system without a graphical user interface.
-
-## HTTP port
-
-- The port number on which a server listens for incoming connections.
-
-## Apache Iceberg Hive catalog
-
-- The **Iceberg Hive catalog** is a specialized metadata service designed for the Apache Iceberg table format, allowing external systems to interact with Iceberg metadata via a Hive metastore thrift client.
-
-## Apache Iceberg REST catalog
-
-- The **Iceberg REST Catalog** is a specialized metadata service designed for the Apache Iceberg table format, allowing external systems to interact with Iceberg metadata via a RESTful API.
-
-## Apache Iceberg JDBC catalog
-
-- The **Iceberg JDBC Catalog** is a specialized metadata service designed for the Apache Iceberg table format, allowing external systems to interact with Iceberg metadata using JDBC (Java Database Connectivity).
+- A system without a local console.

 ## Identity fields

-- Fields in tables that define the identity of the table, specifying how rows in the table are uniquely identified.
+- Fields in tables that define the identity of the records.
+  In the scope of a table, the identity fields are used as the unique identifier of a row.

 ## Integration tests

-- Tests designed to ensure the correctness and compatibility of software when integrated into a unified system.
-
-## IP address
-
-- Internet Protocol address, a numerical label assigned to each device participating in a computer network.
+- Tests for the correctness and compatibility of a software.
+  It is typically conducted when integrating a component into a larger system.

Review Comment:
   I think this is better: "Tests that ensure software correctness and compatibility when integrating components into a larger system."