syedamisbahh opened a new pull request, #2171:
URL: https://github.com/apache/tika/pull/2171

   This pull request applies three refactorings to improve the readability 
and maintainability of the Apache Tika codebase. The changes are intended to 
preserve existing behavior while making the code more expressive and modular.
   
   Refactorings Applied
   1. Extract Method + Decompose Conditional
   Location: MediaTypeRegistry#getSupertype()
   Replaced deeply nested if-else blocks with helper methods such as 
isXmlSubtype(), isTextType(), and isEmptyType().
   Used early returns to flatten the control flow and reduce nesting.
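
To illustrate the shape of this refactoring, here is a minimal sketch. It is not the actual Tika code: it works on plain strings rather than MediaType objects, the class name SupertypeSketch and the isOctetStream() helper are hypothetical, and the rules shown are a simplified assumption about what getSupertype() checks.

```java
// Hypothetical, simplified stand-in for a supertype lookup.
// Media types are plain strings here; the real method works on MediaType objects.
class SupertypeSketch {
    static final String APPLICATION_XML = "application/xml";
    static final String TEXT_PLAIN = "text/plain";
    static final String OCTET_STREAM = "application/octet-stream";

    // Each condition is a named helper, and each branch returns early,
    // so the method reads as a flat list of rules instead of a nested chain.
    static String getSupertype(String mediaType) {
        if (mediaType == null) {
            return null;
        }
        if (isXmlSubtype(mediaType)) {
            return APPLICATION_XML;
        }
        if (isTextType(mediaType)) {
            return TEXT_PLAIN;
        }
        if (isOctetStream(mediaType)) {
            return null; // the root type has no supertype
        }
        return OCTET_STREAM;
    }

    // True for structured-syntax types such as image/svg+xml.
    static boolean isXmlSubtype(String mediaType) {
        return mediaType.endsWith("+xml");
    }

    // True for any text/* type other than text/plain itself.
    static boolean isTextType(String mediaType) {
        return mediaType.startsWith("text/") && !TEXT_PLAIN.equals(mediaType);
    }

    // Assumed helper name, not taken from the pull request.
    static boolean isOctetStream(String mediaType) {
        return OCTET_STREAM.equals(mediaType);
    }
}
```

With these assumed rules, SupertypeSketch.getSupertype("text/html") yields "text/plain", and "text/plain" in turn falls through to "application/octet-stream".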
   
   2. Rename Variable
   Locations: MediaTypeRegistry.java, JsonPipesIterator.java
   Renamed variables to improve self-documentation:
   type → mediaType
   t → tuple
   r → reader
   
   3. Introduce Explaining Constants
   Location: TextStatistics#looksLikeUTF8()
   Replaced magic numbers (e.g. 0x20, 0x80, 0xc0) with named constants for 
better readability and understanding of UTF-8 byte range logic.
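
As a rough illustration of explaining constants, the sketch below names the UTF-8 byte-range boundaries and uses them in a simple structural check. It is not the TextStatistics implementation: the class name Utf8Sketch, the constant names, and the choice to reject control bytes other than tab, LF, and CR are all assumptions made for the example. The check is also deliberately loose and does not reject overlong encodings or surrogate code points.

```java
// Hypothetical sketch: named constants for UTF-8 byte ranges.
class Utf8Sketch {
    static final int FIRST_PRINTABLE_ASCII = 0x20; // bytes below this are control characters
    static final int FIRST_CONTINUATION = 0x80;    // 10xxxxxx: continuation bytes start here
    static final int FIRST_TWO_BYTE_LEAD = 0xC0;   // 110xxxxx: lead byte of a 2-byte sequence
    static final int FIRST_THREE_BYTE_LEAD = 0xE0; // 1110xxxx: lead byte of a 3-byte sequence
    static final int FIRST_FOUR_BYTE_LEAD = 0xF0;  // 11110xxx: lead byte of a 4-byte sequence
    static final int FIRST_INVALID_LEAD = 0xF8;    // 0xF8 and above never start a sequence

    // Structural check only: every lead byte must be followed by the
    // right number of continuation bytes.
    static boolean looksLikeUtf8(byte[] data) {
        int i = 0;
        while (i < data.length) {
            int b = data[i] & 0xFF;
            int continuationBytes;
            if (b < FIRST_CONTINUATION) {
                if (b < FIRST_PRINTABLE_ASCII && !isCommonWhitespace(b)) {
                    return false; // assumed rule: unexpected control character
                }
                continuationBytes = 0;
            } else if (b < FIRST_TWO_BYTE_LEAD) {
                return false; // continuation byte without a lead byte
            } else if (b < FIRST_THREE_BYTE_LEAD) {
                continuationBytes = 1;
            } else if (b < FIRST_FOUR_BYTE_LEAD) {
                continuationBytes = 2;
            } else if (b < FIRST_INVALID_LEAD) {
                continuationBytes = 3;
            } else {
                return false;
            }
            for (int k = 1; k <= continuationBytes; k++) {
                if (i + k >= data.length || !isContinuation(data[i + k] & 0xFF)) {
                    return false; // truncated or malformed sequence
                }
            }
            i += continuationBytes + 1;
        }
        return true;
    }

    static boolean isContinuation(int b) {
        return b >= FIRST_CONTINUATION && b < FIRST_TWO_BYTE_LEAD;
    }

    // Tab, line feed, and carriage return are treated as ordinary text.
    static boolean isCommonWhitespace(int b) {
        return b == '\t' || b == '\n' || b == '\r';
    }
}
```

Compared with bare comparisons such as b < 0xc0, the named boundaries make each branch state which kind of byte it handles.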
   
   **Note**: I am awaiting access to the Apache Tika Jira issue tracker to file 
a formal issue.
   Once granted access, I will:
   - Create the corresponding [TIKA-XXXX] issue.
   - Update this PR title and description to include the issue reference.

