Back before inks were invented, engraving was the main form of writing. You formed a tablet out of clay and used a stick to impress something that looked like “<” on the surface, and there you had it: primitive markup. Photography is a modern form of leaving a mark without ink, so today we are going to explore how to create a DITA document using a cell-phone camera as your XML editor.
First, find some printed page that you want to convert into DITA. I chose a type specimen book printed in 1810 by C. Norris & Co., Newburyport, Massachusetts on a Peter Smith Patent hand press. Other forms of historical engraving will probably work: intaglio and offset are good; lithographs and silkscreen not so much. I chose this text as a symbolic Rosetta Stone for this exercise. Set the camera for low resolution and take a picture:
Now get a copy of an OCR program. This post describes how to get a free demo version that does the job: Download ABBYY FineReader 6.0 Spring for Free.
Just open the OCR program and drag your uploaded photo onto the interface. If your karma is good, you should see something like this:
It is almost easy from here on. Just copy the recovered text into Microsoft Word, save as Filtered HTML, and run the h2d converter in DITA Open Toolkit (you get the hang of it after only a few hours of setting your Java classpath and finally typing the commands correctly). It’s not every day that you can take a bunch of free tools and get a DITA result like this other example I just tried, a journal entry by Captain John Smith from 1600 or so:
Tlii Siege if OlnnipiKli; (in e&celieiU stratagem by Smith; another not much worse.Ari’Enthc lossc of Catiku. ilic V’j/i/iT.sivitIiiivputic thou-KUIlI lii’airjlf’l t!l’ Mriin^ ‘[‘(mm-i>l’ Ollllil/lti’.’lt -.0 -.Imi-rln-vi’ i mir.’l.ll /«’,’,! ,’C’V,’ ll!’-”/., ‘•[/l’-‘l:'<. ..ml!’,’,: M.1′ , ,.’u:!’i…..!Barrai SueO, General] >.i rii, \i-],«],ik(- Vnilleiy, lie Udtriustii ill* (.^-iiiiiniir, his inmtiy Iricml. ‘in’li II link1, lliiil he muikl i]iiucr,:ikc in iti.iki’ III.K hiuiv. ;-in tliiti-; [n> intciid-cil, ;tti<[ liau- \:\~ ,I’I-IHT. lu.fld tin i- brine him inn to H>IIU>. LI!,in vih’fHii :iii”lit ninki’rtic fl;imc nf x ‘(‘i.jvh •.[‘cue In t],i- T(.H,,I-; /uVf/ ii]f!,minl MJtli tin, Knu.fii; iiiK’nlio”; \,,,;;/, made i! so |-!:,iiL.’. ihm h.nliKlih li.r gav. l.i.nf.nidrsHlio in the dsiiki1 nc.n hA-j;i-]it iiim In a i>n……iiinc, wln-iv!„• !,lu lU’d thi( i’ ‘I’niTlii – ri|iur.r,i,.ui Ai,.n cllu i . u Lirli ji^milii’Hi’u!. ,,.n! ;M,~HUTC. anniin niiti llmv oiln r iin’s in lih<th(,nSli iistant se ,“! 1…..l.’i…….h. I nill . h.ir-,. (.11 !’,r I .1,1. ;it In Al;i ‘mis,l[,.> you; Ete-sfto^ ansivtred liv v.unl,!. :i.,(hl»is ius done First licviii liK !ni’s^tM(. .is Iw, tk-ii diikU-il l!ii.< Al^lB.ljL-1 in (M
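If the Word-and-h2d detour feels like too much ceremony, the conversion step can be roughed out in a few lines of Python. This is only a minimal sketch, not what the h2d converter actually does: the helper name `text_to_dita_topic`, the topic id, and the title are all made up for illustration, and it assumes blank lines in the OCR output separate paragraphs. The one thing it gets right is escaping all those stray `<` and `&` characters the OCR engine so generously provides.

```python
from xml.etree import ElementTree as ET

# Public identifier for the standard DITA topic DTD.
DOCTYPE = '<!DOCTYPE topic PUBLIC "-//OASIS//DTD DITA Topic//EN" "topic.dtd">'


def text_to_dita_topic(text, topic_id, title):
    """Wrap plain OCR text in a minimal DITA topic (hypothetical helper)."""
    topic = ET.Element("topic", {"id": topic_id})
    ET.SubElement(topic, "title").text = title
    body = ET.SubElement(topic, "body")
    # Assumption: each blank-line-separated chunk becomes one <p>.
    for chunk in text.split("\n\n"):
        chunk = " ".join(chunk.split())  # collapse stray whitespace
        if chunk:
            ET.SubElement(body, "p").text = chunk
    # ElementTree escapes &, < and > in text nodes when serializing.
    xml = ET.tostring(topic, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + DOCTYPE + "\n" + xml


if __name__ == "__main__":
    sample = "Tlii Siege if OlnnipiKli; (in e&celieiU stratagem by Smith"
    print(text_to_dita_topic(sample, "smith_journal", "Journal entry"))
```

The result is well-formed markup wrapped around whatever the OCR produced, which is to say the tool faithfully preserves the quality of its input.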
I was so impressed, I just had to show you on this first day of April. DITA doesn’t have to be hard; we just make it that way if we can. As I say almost too often, “Use this knowledge with care!”