Love the suggestion of marking the hidden/advanced configuration properties 
with annotations. Leaving a “configuration” property out of the main 
configuration file should be deliberate and well thought and argued. I highly 
doubt we have 112 “advanced” properties that really need to be hidden to 
protect users from themselves :-(

Agreed with Ekaterina that it is worth reviewing the current properties and 
come up with the subset that should stay hidden, and expose the rest on the 
yaml file.


> On Jan 24, 2025, at 8:00 AM, Dmitry Konstantinov <netud...@gmail.com> wrote:
> 
> https://issues.apache.org/jira/browse/CASSANDRA-20249
> 
> On Fri, 24 Jan 2025 at 15:40, Dmitry Konstantinov <netud...@gmail.com 
> <mailto:netud...@gmail.com>> wrote:
>> Maybe I missed some patterns but it looks like a pretty good estimation, I 
>> did like 10 random checks manually to verify :-)
>> I will try to make an ant target with a similar logic (hopefully, during the 
>> weekend)
>> I will create a ticket to track this activity (to share attachments there to 
>> not overload the thread with such outputs in future).
>> 
>> On Fri, 24 Jan 2025 at 15:37, Štefan Miklošovič <smikloso...@apache.org 
>> <mailto:smikloso...@apache.org>> wrote:
>>> Oh my god, 112? :DD I was thinking it would be less than 10.
>>> 
>>> Anyway, I think we need to integrate this to some ant target. If you 
>>> expanded on this, that would be great.
>>> 
>>> On Fri, Jan 24, 2025 at 4:31 PM Dmitry Konstantinov <netud...@gmail.com 
>>> <mailto:netud...@gmail.com>> wrote:
>>>> A very primitive implementation of the 1st idea below:
>>>> 
>>>> String configUrl = 
>>>> "file:///Users/dmitry/IdeaProjects/cassandra-trunk/conf/cassandra.yaml";
>>>> Field[] allFields = Config.class.getFields();
>>>> List<String> topLevelPropertyNames = new ArrayList<>();
>>>> for(Field field : allFields)
>>>> {
>>>>     if (!Modifier.isStatic(field.getModifiers()))
>>>>     {
>>>>         topLevelPropertyNames.add(field.getName());
>>>>     }
>>>> }
>>>> 
>>>> URL url = new URL(configUrl);
>>>> List<String> lines = Files.readAllLines(Paths.get(url.toURI()));
>>>> 
>>>> int missedCount = 0;
>>>> for (String propertyName : topLevelPropertyNames)
>>>> {
>>>>     boolean found = false;
>>>>     for (String line : lines)
>>>>     {
>>>>         if (line.startsWith(propertyName + ":")
>>>>             || line.startsWith("#" + propertyName + ":")
>>>>             || line.startsWith("# " + propertyName + ":")) {
>>>>             found = true;
>>>>             break;
>>>>         }
>>>>     }
>>>>     if (!found)
>>>>     {
>>>>         missedCount++;
>>>>         System.out.println(propertyName);
>>>>     }
>>>> }
>>>> System.out.println("Total missed:" + missedCount);
>>>> 
>>>> It prints the following config property names which are defined in 
>>>> Config.java but not present as "property" or "# property " in a file:
>>>> permissions_cache_max_entries
>>>> roles_cache_max_entries
>>>> credentials_cache_max_entries
>>>> auto_bootstrap
>>>> force_new_prepared_statement_behaviour
>>>> use_deterministic_table_id
>>>> repair_request_timeout
>>>> stream_transfer_task_timeout
>>>> cms_await_timeout
>>>> cms_default_max_retries
>>>> cms_default_retry_backoff
>>>> epoch_aware_debounce_inflight_tracker_max_size
>>>> metadata_snapshot_frequency
>>>> available_processors
>>>> repair_session_max_tree_depth
>>>> use_offheap_merkle_trees
>>>> internode_max_message_size
>>>> native_transport_max_message_size
>>>> native_transport_max_request_data_in_flight_per_ip
>>>> native_transport_max_request_data_in_flight
>>>> native_transport_receive_queue_capacity
>>>> min_free_space_per_drive
>>>> max_space_usable_for_compactions_in_percentage
>>>> reject_repair_compaction_threshold
>>>> concurrent_index_builders
>>>> max_streaming_retries
>>>> commitlog_max_compression_buffers_in_pool
>>>> max_mutation_size
>>>> dynamic_snitch
>>>> failure_detector
>>>> use_creation_time_for_hint_ttl
>>>> key_cache_migrate_during_compaction
>>>> key_cache_invalidate_after_sstable_deletion
>>>> paxos_cache_size
>>>> file_cache_round_up
>>>> disk_optimization_estimate_percentile
>>>> disk_optimization_page_cross_chance
>>>> purgeable_tobmstones_metric_granularity
>>>> windows_timer_interval
>>>> otc_coalescing_strategy
>>>> otc_coalescing_window_us
>>>> otc_coalescing_enough_coalesced_messages
>>>> otc_backlog_expiration_interval_ms
>>>> scripted_user_defined_functions_enabled
>>>> user_defined_functions_threads_enabled
>>>> allow_insecure_udfs
>>>> allow_extra_insecure_udfs
>>>> user_defined_functions_warn_timeout
>>>> user_defined_functions_fail_timeout
>>>> user_function_timeout_policy
>>>> back_pressure_enabled
>>>> back_pressure_strategy
>>>> repair_command_pool_full_strategy
>>>> repair_command_pool_size
>>>> block_for_peers_timeout_in_secs
>>>> block_for_peers_in_remote_dcs
>>>> skip_stream_disk_space_check
>>>> snapshot_on_repaired_data_mismatch
>>>> validation_preview_purge_head_start
>>>> initial_range_tombstone_list_allocation_size
>>>> range_tombstone_list_growth_factor
>>>> snapshot_on_duplicate_row_detection
>>>> check_for_duplicate_rows_during_reads
>>>> check_for_duplicate_rows_during_compaction
>>>> autocompaction_on_startup_enabled
>>>> auto_optimise_inc_repair_streams
>>>> auto_optimise_full_repair_streams
>>>> auto_optimise_preview_repair_streams
>>>> consecutive_message_errors_threshold
>>>> internode_error_reporting_exclusions
>>>> compact_tables_enabled
>>>> vector_type_enabled
>>>> intersect_filtering_query_warned
>>>> intersect_filtering_query_enabled
>>>> streaming_slow_events_log_timeout
>>>> repair_state_expires
>>>> repair_state_size
>>>> paxos_variant
>>>> skip_paxos_repair_on_topology_change
>>>> paxos_purge_grace_period
>>>> paxos_on_linearizability_violations
>>>> paxos_state_purging
>>>> paxos_repair_enabled
>>>> paxos_topology_repair_no_dc_checks
>>>> paxos_topology_repair_strict_each_quorum
>>>> skip_paxos_repair_on_topology_change_keyspaces
>>>> paxos_contention_wait_randomizer
>>>> paxos_contention_min_wait
>>>> paxos_contention_max_wait
>>>> paxos_contention_min_delta
>>>> paxos_repair_parallelism
>>>> sstable_read_rate_persistence_enabled
>>>> client_request_size_metrics_enabled
>>>> max_top_size_partition_count
>>>> max_top_tombstone_partition_count
>>>> min_tracked_partition_size
>>>> min_tracked_partition_tombstone_count
>>>> top_partitions_enabled
>>>> severity_during_decommission
>>>> progress_barrier_min_consistency_level
>>>> progress_barrier_default_consistency_level
>>>> progress_barrier_timeout
>>>> progress_barrier_backoff
>>>> discovery_timeout
>>>> unsafe_tcm_mode
>>>> cql_start_time
>>>> native_transport_throw_on_overload
>>>> native_transport_queue_max_item_age_threshold
>>>> native_transport_min_backoff_on_queue_overload
>>>> native_transport_max_backoff_on_queue_overload
>>>> native_transport_timeout
>>>> enforce_native_deadline_for_hints
>>>> Total missed:112
>>>> 
>>>> 
>>>> On Fri, 24 Jan 2025 at 15:10, Štefan Miklošovič <smikloso...@apache.org 
>>>> <mailto:smikloso...@apache.org>> wrote:
>>>>> It should also work the other way around. If there is a property which is 
>>>>> commented out in yaml and it is not in Config.java, that should fail as 
>>>>> well. If it is not commented out and it is not in Config.java, that will 
>>>>> fail in runtime as it fails on unrecognized property.
>>>>> 
>>>>> This will be used in practice very rarely as we seldom remove the 
>>>>> properties in Config but if we do and a property is commented out, we 
>>>>> should not ship a dead property name, even commented out. 
>>>>> 
>>>>> On Fri, Jan 24, 2025 at 3:51 PM Paulo Motta <pa...@apache.org 
>>>>> <mailto:pa...@apache.org>> wrote:
>>>>>> >  >  If "# my_cool_property: true" is NOT in cassandra.yaml, we might 
>>>>>> > indeed add it, also commented out. I think it would be quite easy to 
>>>>>> > check against yaml if there is a line starting on "# my_cool_property" 
>>>>>> > or just on "my_cool_property". Both cases would satisfy the check.
>>>>>> 
>>>>>> Makes sense, I think this would be good to have as a lint or test to 
>>>>>> easily catch overlooks during review.
>>>>>> 
>>>>>> 
>>>>>> On Fri, Jan 24, 2025 at 9:44 AM Štefan Miklošovič 
>>>>>> <smikloso...@apache.org <mailto:smikloso...@apache.org>> wrote:
>>>>>>> 
>>>>>>> 
>>>>>>> On Fri, Jan 24, 2025 at 3:27 PM Paulo Motta <pa...@apache.org 
>>>>>>> <mailto:pa...@apache.org>> wrote:
>>>>>>>> > from time to time I see configuration properties in Config.java and 
>>>>>>>> > they are clearly not in cassandra.yaml. Not every property in Config 
>>>>>>>> > is in cassandra.yaml. I would like to know if there is some specific 
>>>>>>>> > reason behind that.
>>>>>>>> 
>>>>>>>> I think one of the original reasons was to "hide" advanced configs 
>>>>>>>> that are not meant to be updated, unless in very niche circumstances. 
>>>>>>>> However I think this has been extrapolated to non-advanced settings.
>>>>>>>> 
>>>>>>>> > Question related to that is if we could not have a build-time check 
>>>>>>>> > that all properties in Config have to be in cassandra.yaml and fail 
>>>>>>>> > the build if a property in Config does not have its counterpart in 
>>>>>>>> > yaml.
>>>>>>>> 
>>>>>>>> Are you saying every configuration property should be commented-out, 
>>>>>>>> or do you think that every Config property should be specified in 
>>>>>>>> cassandra.yaml with their default uncomented ? One issue with that is 
>>>>>>>> that you could cause user confusion if you "reveal" a niche/advanced 
>>>>>>>> config that is not meant to be updated. I think this would be 
>>>>>>>> addressed by the @HiddenInYaml flag you are proposing in a later post.
>>>>>>> 
>>>>>>> Yes, then can stay hidden, but we should annotate it with @Hidden or 
>>>>>>> similar. As of now, if that property is not in yaml, we just don't know 
>>>>>>> if it was forgotten to be added or if we have not added it on purpose.
>>>>>>> 
>>>>>>> They can keep being commented out if they currently are. Imagine a 
>>>>>>> property in Config.java
>>>>>>> 
>>>>>>> public boolean my_cool_property = true;
>>>>>>> 
>>>>>>> and then this in cassandra.yaml
>>>>>>> 
>>>>>>> # my_cool_property: true
>>>>>>> 
>>>>>>> It is completely ok.
>>>>>>> 
>>>>>>> If "# my_cool_property: true" is NOT in cassandra.yaml, we might indeed 
>>>>>>> add it, also commented out. I think it would be quite easy to check 
>>>>>>> against yaml if there is a line starting on "# my_cool_property" or 
>>>>>>> just on "my_cool_property". Both cases would satisfy the check.
>>>>>>> 
>>>>>>>  
>>>>>>>> > There are dozens of properties in Config and I have a strong 
>>>>>>>> > suspicion that we missed to publish some to yaml so users do not 
>>>>>>>> > even know such a property exists and as of now we do not even know 
>>>>>>>> > which they are.
>>>>>>>> 
>>>>>>>> I believe this is a problem. I think most properties should be in 
>>>>>>>> cassandra.yaml, unless they are very advanced or not meant to be 
>>>>>>>> updated.
>>>>>>>> 
>>>>>>>> Another tangential issue is that there are features/settings that 
>>>>>>>> don't even have a Config entry, but are just controlled by JVM 
>>>>>>>> properties.
>>>>>>>> 
>>>>>>>> I think that we should attempt to unify Config and jvm properties 
>>>>>>>> under a predictable structure. For example, if there is a YAML config 
>>>>>>>> enable_user_defined_functions, then there should be a respective JVM 
>>>>>>>> flag -Dcassandra.enable_user_defined_functions, and vice versa.
>>>>>>> 
>>>>>>> Yeah, good idea.
>>>>>>>  
>>>>>>>> 
>>>>>>>> On Fri, Jan 24, 2025 at 9:16 AM Štefan Miklošovič 
>>>>>>>> <smikloso...@apache.org <mailto:smikloso...@apache.org>> wrote:
>>>>>>>>> Hello,
>>>>>>>>> 
>>>>>>>>> from time to time I see configuration properties in Config.java and 
>>>>>>>>> they are clearly not in cassandra.yaml. Not every property in Config 
>>>>>>>>> is in cassandra.yaml. I would like to know if there is some specific 
>>>>>>>>> reason behind that.
>>>>>>>>> 
>>>>>>>>> Question related to that is if we could not have a build-time check 
>>>>>>>>> that all properties in Config have to be in cassandra.yaml and fail 
>>>>>>>>> the build if a property in Config does not have its counterpart in 
>>>>>>>>> yaml.
>>>>>>>>> 
>>>>>>>>> There are dozens of properties in Config and I have a strong 
>>>>>>>>> suspicion that we missed to publish some to yaml so users do not even 
>>>>>>>>> know such a property exists and as of now we do not even know which 
>>>>>>>>> they are.
>>>> 
>>>> 
>>>> 
>>>> --
>>>> Dmitry Konstantinov
>> 
>> 
>> 
>> --
>> Dmitry Konstantinov
> 
> 
> 
> --
> Dmitry Konstantinov

Reply via email to