Oh my god, 112? :DD I was thinking it would be fewer than 10. Anyway, I think we need to integrate this into some ant target. If you expanded on this, that would be great.
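To make "expanding on this" a bit more concrete, here is a minimal sketch of a check that works in both directions, as discussed further down the thread. It is only an illustration under stated assumptions: the class name YamlConfigCheck, its method names and the sample data in main are all made up, and the property names are passed in as plain strings so that the snippet runs without Cassandra on the classpath. In a real check they would come from reflection over Config, as in the quoted code. The regular expression is a heuristic that only looks at lower-case keys in column 0, so a comment line such as "# note: ..." would be reported as an unknown key.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Cassandra.
public class YamlConfigCheck
{
    // Matches a top-level key in column 0, active or commented out:
    // "name:", "#name:" or "# name:". Indented (nested) keys are ignored.
    private static final Pattern KEY = Pattern.compile("^(#\\s?)?([a-z][a-z0-9_]*):");

    /** Collects all top-level keys found in the yaml lines, commented out or not. */
    public static Set<String> yamlKeys(List<String> yamlLines)
    {
        Set<String> keys = new LinkedHashSet<>();
        for (String line : yamlLines)
        {
            Matcher m = KEY.matcher(line);
            if (m.find())
                keys.add(m.group(2));
        }
        return keys;
    }

    /** Property names that have neither a "name:" nor a "# name:" line in the yaml. */
    public static List<String> missingFromYaml(Collection<String> propertyNames, List<String> yamlLines)
    {
        Set<String> keys = yamlKeys(yamlLines);
        List<String> missing = new ArrayList<>();
        for (String name : propertyNames)
            if (!keys.contains(name))
                missing.add(name);
        return missing;
    }

    /** Yaml keys (active or commented out) that do not match any property name. */
    public static List<String> unknownInYaml(Collection<String> propertyNames, List<String> yamlLines)
    {
        List<String> unknown = new ArrayList<>();
        for (String key : yamlKeys(yamlLines))
            if (!propertyNames.contains(key))
                unknown.add(key);
        return unknown;
    }

    public static void main(String[] args)
    {
        // Made-up sample data standing in for Config fields and cassandra.yaml.
        List<String> names = List.of("cluster_name", "auto_bootstrap", "paxos_variant");
        List<String> yaml = List.of("cluster_name: 'Test Cluster'",
                                    "# paxos_variant: v2",
                                    "# removed_property: 1",
                                    "    nested_key: 5");

        System.out.println("Missing from yaml: " + missingFromYaml(names, yaml));
        System.out.println("Unknown in yaml: " + unknownInYaml(names, yaml));
    }
}
```

For an ant target, main could exit with a non-zero status when either list is non-empty, so that a java task configured to fail on error breaks the build. That wiring is left out here.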
On Fri, Jan 24, 2025 at 4:31 PM Dmitry Konstantinov <netud...@gmail.com> wrote:

> A very primitive implementation of the 1st idea below:
>
> String configUrl =
> "file:///Users/dmitry/IdeaProjects/cassandra-trunk/conf/cassandra.yaml";
> Field[] allFields = Config.class.getFields();
> List<String> topLevelPropertyNames = new ArrayList<>();
> for(Field field : allFields)
> {
>     if (!Modifier.isStatic(field.getModifiers()))
>     {
>         topLevelPropertyNames.add(field.getName());
>     }
> }
>
> URL url = new URL(configUrl);
> List<String> lines = Files.readAllLines(Paths.get(url.toURI()));
>
> int missedCount = 0;
> for (String propertyName : topLevelPropertyNames)
> {
>     boolean found = false;
>     for (String line : lines)
>     {
>         if (line.startsWith(propertyName + ":")
>             || line.startsWith("#" + propertyName + ":")
>             || line.startsWith("# " + propertyName + ":")) {
>             found = true;
>             break;
>         }
>     }
>     if (!found)
>     {
>         missedCount++;
>         System.out.println(propertyName);
>     }
> }
> System.out.println("Total missed:" + missedCount);
>
>
> It prints the following config property names which are defined in
> Config.java but not present as "property" or "# property " in a file:
>
> permissions_cache_max_entries
> roles_cache_max_entries
> credentials_cache_max_entries
> auto_bootstrap
> force_new_prepared_statement_behaviour
> use_deterministic_table_id
> repair_request_timeout
> stream_transfer_task_timeout
> cms_await_timeout
> cms_default_max_retries
> cms_default_retry_backoff
> epoch_aware_debounce_inflight_tracker_max_size
> metadata_snapshot_frequency
> available_processors
> repair_session_max_tree_depth
> use_offheap_merkle_trees
> internode_max_message_size
> native_transport_max_message_size
> native_transport_max_request_data_in_flight_per_ip
> native_transport_max_request_data_in_flight
> native_transport_receive_queue_capacity
> min_free_space_per_drive
> max_space_usable_for_compactions_in_percentage
> reject_repair_compaction_threshold
> concurrent_index_builders
> max_streaming_retries
> commitlog_max_compression_buffers_in_pool
> max_mutation_size
> dynamic_snitch
> failure_detector
> use_creation_time_for_hint_ttl
> key_cache_migrate_during_compaction
> key_cache_invalidate_after_sstable_deletion
> paxos_cache_size
> file_cache_round_up
> disk_optimization_estimate_percentile
> disk_optimization_page_cross_chance
> purgeable_tobmstones_metric_granularity
> windows_timer_interval
> otc_coalescing_strategy
> otc_coalescing_window_us
> otc_coalescing_enough_coalesced_messages
> otc_backlog_expiration_interval_ms
> scripted_user_defined_functions_enabled
> user_defined_functions_threads_enabled
> allow_insecure_udfs
> allow_extra_insecure_udfs
> user_defined_functions_warn_timeout
> user_defined_functions_fail_timeout
> user_function_timeout_policy
> back_pressure_enabled
> back_pressure_strategy
> repair_command_pool_full_strategy
> repair_command_pool_size
> block_for_peers_timeout_in_secs
> block_for_peers_in_remote_dcs
> skip_stream_disk_space_check
> snapshot_on_repaired_data_mismatch
> validation_preview_purge_head_start
> initial_range_tombstone_list_allocation_size
> range_tombstone_list_growth_factor
> snapshot_on_duplicate_row_detection
> check_for_duplicate_rows_during_reads
> check_for_duplicate_rows_during_compaction
> autocompaction_on_startup_enabled
> auto_optimise_inc_repair_streams
> auto_optimise_full_repair_streams
> auto_optimise_preview_repair_streams
> consecutive_message_errors_threshold
> internode_error_reporting_exclusions
> compact_tables_enabled
> vector_type_enabled
> intersect_filtering_query_warned
> intersect_filtering_query_enabled
> streaming_slow_events_log_timeout
> repair_state_expires
> repair_state_size
> paxos_variant
> skip_paxos_repair_on_topology_change
> paxos_purge_grace_period
> paxos_on_linearizability_violations
> paxos_state_purging
> paxos_repair_enabled
> paxos_topology_repair_no_dc_checks
> paxos_topology_repair_strict_each_quorum
> skip_paxos_repair_on_topology_change_keyspaces
> paxos_contention_wait_randomizer
> paxos_contention_min_wait
> paxos_contention_max_wait
> paxos_contention_min_delta
> paxos_repair_parallelism
> sstable_read_rate_persistence_enabled
> client_request_size_metrics_enabled
> max_top_size_partition_count
> max_top_tombstone_partition_count
> min_tracked_partition_size
> min_tracked_partition_tombstone_count
> top_partitions_enabled
> severity_during_decommission
> progress_barrier_min_consistency_level
> progress_barrier_default_consistency_level
> progress_barrier_timeout
> progress_barrier_backoff
> discovery_timeout
> unsafe_tcm_mode
> cql_start_time
> native_transport_throw_on_overload
> native_transport_queue_max_item_age_threshold
> native_transport_min_backoff_on_queue_overload
> native_transport_max_backoff_on_queue_overload
> native_transport_timeout
> enforce_native_deadline_for_hints
> Total missed:112
>
>
> On Fri, 24 Jan 2025 at 15:10, Štefan Miklošovič <smikloso...@apache.org>
> wrote:
>
>> It should also work the other way around. If there is a property which is
>> commented out in yaml and it is not in Config.java, that should fail as
>> well. If it is not commented out and it is not in Config.java, that will
>> fail in runtime as it fails on unrecognized property.
>>
>> This will be used in practice very rarely as we seldom remove the
>> properties in Config but if we do and a property is commented out, we
>> should not ship a dead property name, even commented out.
>>
>> On Fri, Jan 24, 2025 at 3:51 PM Paulo Motta <pa...@apache.org> wrote:
>>
>>> > > If "# my_cool_property: true" is NOT in cassandra.yaml, we might
>>> indeed add it, also commented out. I think it would be quite easy to check
>>> against yaml if there is a line starting on "# my_cool_property" or just on
>>> "my_cool_property". Both cases would satisfy the check.
>>>
>>> Makes sense, I think this would be good to have as a lint or test to
>>> easily catch overlooks during review.
>>>
>>>
>>> On Fri, Jan 24, 2025 at 9:44 AM Štefan Miklošovič <smikloso...@apache.org> wrote:
>>>
>>>>
>>>>
>>>> On Fri, Jan 24, 2025 at 3:27 PM Paulo Motta <pa...@apache.org> wrote:
>>>>
>>>>> > from time to time I see configuration properties in Config.java and
>>>>> they are clearly not in cassandra.yaml. Not every property in Config is in
>>>>> cassandra.yaml. I would like to know if there is some specific reason
>>>>> behind that.
>>>>>
>>>>> I think one of the original reasons was to "hide" advanced configs
>>>>> that are not meant to be updated, unless in very niche circumstances.
>>>>> However I think this has been extrapolated to non-advanced settings.
>>>>>
>>>>> > Question related to that is if we could not have a build-time check
>>>>> that all properties in Config have to be in cassandra.yaml and fail the
>>>>> build if a property in Config does not have its counterpart in yaml.
>>>>>
>>>>> Are you saying every configuration property should be commented-out,
>>>>> or do you think that every Config property should be specified in
>>>>> cassandra.yaml with their default uncommented? One issue with that is that
>>>>> you could cause user confusion if you "reveal" a niche/advanced config
>>>>> that is not meant to be updated. I think this would be addressed by
>>>>> the @HiddenInYaml flag you are proposing in a later post.
>>>>>
>>>>
>>>> Yes, they can stay hidden, but we should annotate it with @Hidden or
>>>> similar. As of now, if that property is not in yaml, we just don't know if
>>>> it was forgotten to be added or if we have not added it on purpose.
>>>>
>>>> They can keep being commented out if they currently are. Imagine a
>>>> property in Config.java
>>>>
>>>> public boolean my_cool_property = true;
>>>>
>>>> and then this in cassandra.yaml
>>>>
>>>> # my_cool_property: true
>>>>
>>>> It is completely ok.
>>>>
>>>> If "# my_cool_property: true" is NOT in cassandra.yaml, we might indeed
>>>> add it, also commented out.
>>>> I think it would be quite easy to check against
>>>> yaml if there is a line starting on "# my_cool_property" or just on
>>>> "my_cool_property". Both cases would satisfy the check.
>>>>
>>>>
>>>>> > There are dozens of properties in Config and I have a strong
>>>>> suspicion that we missed to publish some to yaml so users do not even know
>>>>> such a property exists and as of now we do not even know which they are.
>>>>>
>>>>> I believe this is a problem. I think most properties should be in
>>>>> cassandra.yaml, unless they are very advanced or not meant to be updated.
>>>>>
>>>>> Another tangential issue is that there are features/settings that
>>>>> don't even have a Config entry, but are just controlled by JVM properties.
>>>>>
>>>>> I think that we should attempt to unify Config and jvm properties
>>>>> under a predictable structure. For example, if there is a YAML config
>>>>> enable_user_defined_functions, then there should be a respective JVM flag
>>>>> -Dcassandra.enable_user_defined_functions, and vice versa.
>>>>>
>>>>
>>>> Yeah, good idea.
>>>>
>>>>
>>>>>
>>>>> On Fri, Jan 24, 2025 at 9:16 AM Štefan Miklošovič <smikloso...@apache.org> wrote:
>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> from time to time I see configuration properties in Config.java and
>>>>>> they are clearly not in cassandra.yaml. Not every property in Config is
>>>>>> in cassandra.yaml. I would like to know if there is some specific reason
>>>>>> behind that.
>>>>>>
>>>>>> Question related to that is if we could not have a build-time check
>>>>>> that all properties in Config have to be in cassandra.yaml and fail the
>>>>>> build if a property in Config does not have its counterpart in yaml.
>>>>>>
>>>>>> There are dozens of properties in Config and I have a strong
>>>>>> suspicion that we missed to publish some to yaml so users do not even
>>>>>> know such a property exists and as of now we do not even know which they are.
>>>>>>
>>>>>
>
> --
> Dmitry Konstantinov
>
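
On the @Hidden / @HiddenInYaml idea from the thread above: a minimal sketch of how such a marker could be honoured when collecting property names by reflection. Nothing here exists in Cassandra as far as this thread shows. The annotation, the ExampleConfig stand-in for Config.java and the very_advanced_knob field are all hypothetical and only serve to make the snippet runnable on its own.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

public class HiddenPropertySketch
{
    // Hypothetical marker for properties deliberately left out of cassandra.yaml.
    // RUNTIME retention is required so that reflection can see it.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    public @interface HiddenInYaml {}

    // Hypothetical stand-in for Config.java.
    public static class ExampleConfig
    {
        public static final String IGNORED_CONSTANT = "x";

        public boolean my_cool_property = true;

        @HiddenInYaml
        public int very_advanced_knob = 1;
    }

    /** Public, non-static, non-hidden fields: the names the yaml is expected to mention. */
    public static List<String> propertiesExpectedInYaml(Class<?> configClass)
    {
        List<String> names = new ArrayList<>();
        for (Field field : configClass.getFields())
        {
            if (Modifier.isStatic(field.getModifiers()))
                continue; // constants are not configuration properties
            if (field.isAnnotationPresent(HiddenInYaml.class))
                continue; // intentionally hidden, so its absence from yaml is fine
            names.add(field.getName());
        }
        return names;
    }

    public static void main(String[] args)
    {
        System.out.println(propertiesExpectedInYaml(ExampleConfig.class));
    }
}
```

With a marker like this, a property missing from the yaml is either annotated (left out on purpose) or reported by the check (forgotten).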