I love the walk-through of things. I would clearly have looked for a wired,
digital method of doing it (printer port or such).
I had a similar problem. I was recovering 4004 code from a printout that
looked like it came from an ASR33. I did it manually. Looking at the data, I
suspect the platen had ruts, as the PDF image had faded columns. Most of the
letter text was for labels or comments, so confusions like P and F or E and B
were easy to patch. The harder one was C and 0. The program mostly used
decimal, but the register data for the SRC instructions and the nibble data
were in hex, so C and 0 were used quite often. I was able to find what I
believe were all the errors by emulating the 4004 code and finding errors in
the operation. I recall the last error I found was in the display output
routine (related to placement of the decimal point). I'd put "00" where the
original code was "CC". More than 99% of what looked like "CC" in the rest of
the code was really "00". Most of the mixed cases were either "0C" or "C0", so
reading it as "00" had seemed justified. It was the only location where "CC"
existed in the entire code.
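
For anyone wanting to mechanize part of that C-versus-0 hunt, here is a rough
sketch in Python, not what I actually did (that was by hand plus an emulator).
It lists every possible reading of a hex token in which each C or 0 could be
either glyph, then keeps the ones that pass a check. The function names and
the check are made up for illustration; in practice the check would be
something like "does the emulated routine behave correctly".

```python
from itertools import product

# Glyph pairs that the faded print made hard to tell apart (assumption:
# only C and 0 are treated as ambiguous here).
AMBIGUOUS = {"C": "C0", "0": "0C"}

def candidate_readings(token):
    """Yield every reading of a hex token, treating each C or 0 as either glyph."""
    options = [AMBIGUOUS.get(ch, ch) for ch in token.upper()]
    for combo in product(*options):
        yield "".join(combo)

def plausible_readings(token, is_plausible):
    """Keep only the readings accepted by a caller-supplied check.

    is_plausible is a hypothetical stand-in for a real test, such as
    running the code in an emulator and looking at the result.
    """
    return [r for r in candidate_readings(token) if is_plausible(r)]

if __name__ == "__main__":
    # Toy check: pretend the value must fit in a single nibble.
    fits_nibble = lambda reading: int(reading, 16) < 0x10
    print(sorted(candidate_readings("CC")))
    print(plausible_readings("CC", fits_nibble))
```

The number of readings doubles with every ambiguous character, so this is
only practical one token at a time, with a check that is cheap to run.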
Even the best OCR could not have done as well as a human who understood what
the intent was. Understanding the redundancy in the code is a valuable
ability that a human has, and one that would be difficult for a learning
program to pick up. I've used similar thinking to fix cassette tape data that
had dropouts. It was BASIC code, although tokenized. The redundancy of the
good parts of the data made filling in the missing parts easier.
Dwight

________________________________
From: cctalk <cctalk-boun...@classiccmp.org> on behalf of Liam Proven via 
cctalk <cctalk@classiccmp.org>
Sent: Thursday, June 27, 2019 4:55 AM
To: Discussion: On-Topic and Off-Topic Posts
Subject: Recovering the ROM of an IBM 5100 using OCR (among other things)

This is *epic*.

https://github.com/stepleton/5100NonExecutableROSDecode/blob/master/WRITEUP.md

--
Liam Proven - Profile: https://about.me/liamproven
Email: lpro...@cix.co.uk - Google Mail/Hangouts/Plus: lpro...@gmail.com
Twitter/Facebook/Flickr: lproven - Skype/LinkedIn: liamproven
UK: +44 7939-087884 - ČR (+ WhatsApp/Telegram/Signal): +420 702 829 053
