Love the suggestion of marking the hidden/advanced configuration properties with annotations. Leaving a “configuration” property out of the main configuration file should be deliberate and well thought and argued. I highly doubt we have 112 “advanced” properties that really need to be hidden to protect users from themselves :-(
Agreed with Ekaterina that it is worth reviewing the current properties and come up with the subset that should stay hidden, and expose the rest on the yaml file. > On Jan 24, 2025, at 8:00 AM, Dmitry Konstantinov <netud...@gmail.com> wrote: > > https://issues.apache.org/jira/browse/CASSANDRA-20249 > > On Fri, 24 Jan 2025 at 15:40, Dmitry Konstantinov <netud...@gmail.com > <mailto:netud...@gmail.com>> wrote: >> Maybe I missed some patterns but it looks like a pretty good estimation, I >> did like 10 random checks manually to verify :-) >> I will try to make an ant target with a similar logic (hopefully, during the >> weekend) >> I will create a ticket to track this activity (to share attachments there to >> not overload the thread with such outputs in future). >> >> On Fri, 24 Jan 2025 at 15:37, Štefan Miklošovič <smikloso...@apache.org >> <mailto:smikloso...@apache.org>> wrote: >>> Oh my god, 112? :DD I was thinking it would be less than 10. >>> >>> Anyway, I think we need to integrate this to some ant target. If you >>> expanded on this, that would be great. >>> >>> On Fri, Jan 24, 2025 at 4:31 PM Dmitry Konstantinov <netud...@gmail.com >>> <mailto:netud...@gmail.com>> wrote: >>>> A very primitive implementation of the 1st idea below: >>>> >>>> String configUrl = >>>> "file:///Users/dmitry/IdeaProjects/cassandra-trunk/conf/cassandra.yaml"; >>>> Field[] allFields = Config.class.getFields(); >>>> List<String> topLevelPropertyNames = new ArrayList<>(); >>>> for(Field field : allFields) >>>> { >>>> if (!Modifier.isStatic(field.getModifiers())) >>>> { >>>> topLevelPropertyNames.add(field.getName()); >>>> } >>>> } >>>> >>>> URL url = new URL(configUrl); >>>> List<String> lines = Files.readAllLines(Paths.get(url.toURI())); >>>> >>>> int missedCount = 0; >>>> for (String propertyName : topLevelPropertyNames) >>>> { >>>> boolean found = false; >>>> for (String line : lines) >>>> { >>>> if (line.startsWith(propertyName + ":") >>>> || line.startsWith("#" + propertyName + ":") >>>> || line.startsWith("# " + propertyName + ":")) { >>>> found = true; >>>> break; >>>> } >>>> } >>>> if (!found) >>>> { >>>> missedCount++; >>>> System.out.println(propertyName); >>>> } >>>> } >>>> System.out.println("Total missed:" + missedCount); >>>> >>>> It prints the following config property names which are defined in >>>> Config.java but not present as "property" or "# property " in a file: >>>> permissions_cache_max_entries >>>> roles_cache_max_entries >>>> credentials_cache_max_entries >>>> auto_bootstrap >>>> force_new_prepared_statement_behaviour >>>> use_deterministic_table_id >>>> repair_request_timeout >>>> stream_transfer_task_timeout >>>> cms_await_timeout >>>> cms_default_max_retries >>>> cms_default_retry_backoff >>>> epoch_aware_debounce_inflight_tracker_max_size >>>> metadata_snapshot_frequency >>>> available_processors >>>> repair_session_max_tree_depth >>>> use_offheap_merkle_trees >>>> internode_max_message_size >>>> native_transport_max_message_size >>>> native_transport_max_request_data_in_flight_per_ip >>>> native_transport_max_request_data_in_flight >>>> native_transport_receive_queue_capacity >>>> min_free_space_per_drive >>>> max_space_usable_for_compactions_in_percentage >>>> reject_repair_compaction_threshold >>>> concurrent_index_builders >>>> max_streaming_retries >>>> commitlog_max_compression_buffers_in_pool >>>> max_mutation_size >>>> dynamic_snitch >>>> failure_detector >>>> use_creation_time_for_hint_ttl >>>> key_cache_migrate_during_compaction >>>> key_cache_invalidate_after_sstable_deletion >>>> paxos_cache_size >>>> file_cache_round_up >>>> disk_optimization_estimate_percentile >>>> disk_optimization_page_cross_chance >>>> purgeable_tobmstones_metric_granularity >>>> windows_timer_interval >>>> otc_coalescing_strategy >>>> otc_coalescing_window_us >>>> otc_coalescing_enough_coalesced_messages >>>> otc_backlog_expiration_interval_ms >>>> scripted_user_defined_functions_enabled >>>> user_defined_functions_threads_enabled >>>> allow_insecure_udfs >>>> allow_extra_insecure_udfs >>>> user_defined_functions_warn_timeout >>>> user_defined_functions_fail_timeout >>>> user_function_timeout_policy >>>> back_pressure_enabled >>>> back_pressure_strategy >>>> repair_command_pool_full_strategy >>>> repair_command_pool_size >>>> block_for_peers_timeout_in_secs >>>> block_for_peers_in_remote_dcs >>>> skip_stream_disk_space_check >>>> snapshot_on_repaired_data_mismatch >>>> validation_preview_purge_head_start >>>> initial_range_tombstone_list_allocation_size >>>> range_tombstone_list_growth_factor >>>> snapshot_on_duplicate_row_detection >>>> check_for_duplicate_rows_during_reads >>>> check_for_duplicate_rows_during_compaction >>>> autocompaction_on_startup_enabled >>>> auto_optimise_inc_repair_streams >>>> auto_optimise_full_repair_streams >>>> auto_optimise_preview_repair_streams >>>> consecutive_message_errors_threshold >>>> internode_error_reporting_exclusions >>>> compact_tables_enabled >>>> vector_type_enabled >>>> intersect_filtering_query_warned >>>> intersect_filtering_query_enabled >>>> streaming_slow_events_log_timeout >>>> repair_state_expires >>>> repair_state_size >>>> paxos_variant >>>> skip_paxos_repair_on_topology_change >>>> paxos_purge_grace_period >>>> paxos_on_linearizability_violations >>>> paxos_state_purging >>>> paxos_repair_enabled >>>> paxos_topology_repair_no_dc_checks >>>> paxos_topology_repair_strict_each_quorum >>>> skip_paxos_repair_on_topology_change_keyspaces >>>> paxos_contention_wait_randomizer >>>> paxos_contention_min_wait >>>> paxos_contention_max_wait >>>> paxos_contention_min_delta >>>> paxos_repair_parallelism >>>> sstable_read_rate_persistence_enabled >>>> client_request_size_metrics_enabled >>>> max_top_size_partition_count >>>> max_top_tombstone_partition_count >>>> min_tracked_partition_size >>>> min_tracked_partition_tombstone_count >>>> top_partitions_enabled >>>> severity_during_decommission >>>> progress_barrier_min_consistency_level >>>> progress_barrier_default_consistency_level >>>> progress_barrier_timeout >>>> progress_barrier_backoff >>>> discovery_timeout >>>> unsafe_tcm_mode >>>> cql_start_time >>>> native_transport_throw_on_overload >>>> native_transport_queue_max_item_age_threshold >>>> native_transport_min_backoff_on_queue_overload >>>> native_transport_max_backoff_on_queue_overload >>>> native_transport_timeout >>>> enforce_native_deadline_for_hints >>>> Total missed:112 >>>> >>>> >>>> On Fri, 24 Jan 2025 at 15:10, Štefan Miklošovič <smikloso...@apache.org >>>> <mailto:smikloso...@apache.org>> wrote: >>>>> It should also work the other way around. If there is a property which is >>>>> commented out in yaml and it is not in Config.java, that should fail as >>>>> well. If it is not commented out and it is not in Config.java, that will >>>>> fail in runtime as it fails on unrecognized property. >>>>> >>>>> This will be used in practice very rarely as we seldom remove the >>>>> properties in Config but if we do and a property is commented out, we >>>>> should not ship a dead property name, even commented out. >>>>> >>>>> On Fri, Jan 24, 2025 at 3:51 PM Paulo Motta <pa...@apache.org >>>>> <mailto:pa...@apache.org>> wrote: >>>>>> > > If "# my_cool_property: true" is NOT in cassandra.yaml, we might >>>>>> > indeed add it, also commented out. I think it would be quite easy to >>>>>> > check against yaml if there is a line starting on "# my_cool_property" >>>>>> > or just on "my_cool_property". Both cases would satisfy the check. >>>>>> >>>>>> Makes sense, I think this would be good to have as a lint or test to >>>>>> easily catch overlooks during review. >>>>>> >>>>>> >>>>>> On Fri, Jan 24, 2025 at 9:44 AM Štefan Miklošovič >>>>>> <smikloso...@apache.org <mailto:smikloso...@apache.org>> wrote: >>>>>>> >>>>>>> >>>>>>> On Fri, Jan 24, 2025 at 3:27 PM Paulo Motta <pa...@apache.org >>>>>>> <mailto:pa...@apache.org>> wrote: >>>>>>>> > from time to time I see configuration properties in Config.java and >>>>>>>> > they are clearly not in cassandra.yaml. Not every property in Config >>>>>>>> > is in cassandra.yaml. I would like to know if there is some specific >>>>>>>> > reason behind that. >>>>>>>> >>>>>>>> I think one of the original reasons was to "hide" advanced configs >>>>>>>> that are not meant to be updated, unless in very niche circumstances. >>>>>>>> However I think this has been extrapolated to non-advanced settings. >>>>>>>> >>>>>>>> > Question related to that is if we could not have a build-time check >>>>>>>> > that all properties in Config have to be in cassandra.yaml and fail >>>>>>>> > the build if a property in Config does not have its counterpart in >>>>>>>> > yaml. >>>>>>>> >>>>>>>> Are you saying every configuration property should be commented-out, >>>>>>>> or do you think that every Config property should be specified in >>>>>>>> cassandra.yaml with their default uncomented ? One issue with that is >>>>>>>> that you could cause user confusion if you "reveal" a niche/advanced >>>>>>>> config that is not meant to be updated. I think this would be >>>>>>>> addressed by the @HiddenInYaml flag you are proposing in a later post. >>>>>>> >>>>>>> Yes, then can stay hidden, but we should annotate it with @Hidden or >>>>>>> similar. As of now, if that property is not in yaml, we just don't know >>>>>>> if it was forgotten to be added or if we have not added it on purpose. >>>>>>> >>>>>>> They can keep being commented out if they currently are. Imagine a >>>>>>> property in Config.java >>>>>>> >>>>>>> public boolean my_cool_property = true; >>>>>>> >>>>>>> and then this in cassandra.yaml >>>>>>> >>>>>>> # my_cool_property: true >>>>>>> >>>>>>> It is completely ok. >>>>>>> >>>>>>> If "# my_cool_property: true" is NOT in cassandra.yaml, we might indeed >>>>>>> add it, also commented out. I think it would be quite easy to check >>>>>>> against yaml if there is a line starting on "# my_cool_property" or >>>>>>> just on "my_cool_property". Both cases would satisfy the check. >>>>>>> >>>>>>> >>>>>>>> > There are dozens of properties in Config and I have a strong >>>>>>>> > suspicion that we missed to publish some to yaml so users do not >>>>>>>> > even know such a property exists and as of now we do not even know >>>>>>>> > which they are. >>>>>>>> >>>>>>>> I believe this is a problem. I think most properties should be in >>>>>>>> cassandra.yaml, unless they are very advanced or not meant to be >>>>>>>> updated. >>>>>>>> >>>>>>>> Another tangential issue is that there are features/settings that >>>>>>>> don't even have a Config entry, but are just controlled by JVM >>>>>>>> properties. >>>>>>>> >>>>>>>> I think that we should attempt to unify Config and jvm properties >>>>>>>> under a predictable structure. For example, if there is a YAML config >>>>>>>> enable_user_defined_functions, then there should be a respective JVM >>>>>>>> flag -Dcassandra.enable_user_defined_functions, and vice versa. >>>>>>> >>>>>>> Yeah, good idea. >>>>>>> >>>>>>>> >>>>>>>> On Fri, Jan 24, 2025 at 9:16 AM Štefan Miklošovič >>>>>>>> <smikloso...@apache.org <mailto:smikloso...@apache.org>> wrote: >>>>>>>>> Hello, >>>>>>>>> >>>>>>>>> from time to time I see configuration properties in Config.java and >>>>>>>>> they are clearly not in cassandra.yaml. Not every property in Config >>>>>>>>> is in cassandra.yaml. I would like to know if there is some specific >>>>>>>>> reason behind that. >>>>>>>>> >>>>>>>>> Question related to that is if we could not have a build-time check >>>>>>>>> that all properties in Config have to be in cassandra.yaml and fail >>>>>>>>> the build if a property in Config does not have its counterpart in >>>>>>>>> yaml. >>>>>>>>> >>>>>>>>> There are dozens of properties in Config and I have a strong >>>>>>>>> suspicion that we missed to publish some to yaml so users do not even >>>>>>>>> know such a property exists and as of now we do not even know which >>>>>>>>> they are. >>>> >>>> >>>> >>>> -- >>>> Dmitry Konstantinov >> >> >> >> -- >> Dmitry Konstantinov > > > > -- > Dmitry Konstantinov