Lewis John McGibbney created NUTCH-3125:
-------------------------------------------

             Summary: Replace retired MRUnit dependency with Mockito + JUnit 5
                 Key: NUTCH-3125
                 URL: https://issues.apache.org/jira/browse/NUTCH-3125
             Project: Nutch
          Issue Type: Task
          Components: build, dependency, test
            Reporter: Lewis John McGibbney
            Assignee: Lewis John McGibbney
             Fix For: 1.22


[Apache MRUnit|https://mrunit.apache.org/] was a specialized Java library for 
unit testing MapReduce components (mappers, reducers, and drivers) in 
isolation, without needing a full Hadoop cluster. Since its retirement in 2016 
due to inactivity, the community has shifted toward more general-purpose 
testing tools that can cover the same ground, for example by mocking contexts, 
writables, and static methods.

I had forgotten that Nutch depends on MRUnit until I revisited NUTCH-288.

I propose we replace the MRUnit test dependency with 
[Mockito|https://site.mockito.org/], a popular mocking framework that would 
allow us to mock Hadoop's Mapper.Context, Reducer.Context, and other framework 
elements. This replicates MRUnit's ability to test mappers and reducers in 
isolation by simulating input and output without a cluster. Mockito is 
lightweight, actively maintained, and doesn't require Hadoop-specific jars 
beyond the existing project dependencies defined in ivy.xml. Mockito can be 
combined with JUnit 5 for assertions. For static method mocking (e.g., Hadoop 
counters), it can apparently be paired with 
[PowerMock|https://github.com/powermock/powermock].
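
As a rough illustration of the approach, here is a minimal sketch of a mapper test that mocks Mapper.Context with Mockito under JUnit 5. TokenCountMapper and the test class name are hypothetical and exist only for this example; they are not Nutch classes, and the real migration of the tests listed below may look different. The sketch assumes Hadoop, Mockito, and JUnit 5 are on the test classpath.

```java
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.verify;
import static org.mockito.Mockito.verifyNoMoreInteractions;

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.junit.jupiter.api.Test;

class TokenCountMapperTest {

  /** Hypothetical mapper used only for illustration; not part of Nutch. */
  static class TokenCountMapper
      extends Mapper<LongWritable, Text, Text, IntWritable> {
    @Override
    protected void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
      // Emit (token, 1) for every whitespace-separated token in the line.
      for (String token : value.toString().split("\\s+")) {
        if (!token.isEmpty()) {
          // Fresh Writable instances per call, so the mock records distinct values.
          context.write(new Text(token), new IntWritable(1));
        }
      }
    }
  }

  @Test
  @SuppressWarnings("unchecked")
  void emitsOnePairPerToken() throws Exception {
    // Mock the Hadoop context instead of starting a job or a cluster.
    Mapper<LongWritable, Text, Text, IntWritable>.Context context =
        mock(Mapper.Context.class);

    new TokenCountMapper()
        .map(new LongWritable(0), new Text("apache nutch"), context);

    // Text and IntWritable implement equals(), so equal values match.
    verify(context).write(new Text("apache"), new IntWritable(1));
    verify(context).write(new Text("nutch"), new IntWritable(1));
    verifyNoMoreInteractions(context);
  }
}
```

One caveat with this style: Mockito records references to the arguments, so a mapper that reuses a single Writable instance across context.write() calls would probably need an Answer that copies each value before verification.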

Currently, MRUnit is used in the following test classes:

./org/apache/nutch/crawl/CrawlDbUpdateTestDriver.java
./org/apache/nutch/indexer/TestIndexerMapReduce.java



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
