Retrieving (La)TeX from DVI, etc.

The job just can't be done automatically: DVI, PostScript and PDF are "final" formats, supposedly not susceptible to further editing, and information about where things came from has been discarded. So if you've lost your (La)TeX source (or never had the source of a document you need to work on) you've a serious job on your hands. In many circumstances the best strategy is to retype the whole document, though whether that is practical depends on the size of the document and the skills of the potential typists.

If automatic assistance is necessary, it's unlikely that any more than text retrieval is going to be possible; the (La)TeX markup that creates the typographic effects of the document needs to be recreated by editing.

If the file you have is in DVI format, many of the techniques for converting (La)TeX to ASCII are applicable. Consider dvi2tty, crudetype and catdvi. Remember that there are likely to be problems finding included material (such as included PostScript figures, which don't appear in the DVI file itself), and that mathematics is unlikely to convert easily.
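
As an illustration of the DVI route, here is a minimal Python sketch that tries the converters in turn. It assumes that catdvi and dvi2tty, when installed, can be invoked as "tool file.dvi" and write their text to standard output; crudetype is left out because its invocation is not described here. The function names (pick_dvi_tool, dvi_to_text) are hypothetical, chosen only for this example.

```python
import shutil
import subprocess

# Candidate converters in order of preference. Each is assumed to be
# callable as "<tool> file.dvi" and to print plain text on standard output.
DVI_TOOLS = ["catdvi", "dvi2tty"]


def pick_dvi_tool(tools=DVI_TOOLS, which=shutil.which):
    """Return the first converter found on PATH, or None if none is installed."""
    for tool in tools:
        if which(tool):
            return tool
    return None


def dvi_to_text(dvi_path):
    """Run the first available converter on dvi_path and return its output."""
    tool = pick_dvi_tool()
    if tool is None:
        raise RuntimeError("no DVI-to-text converter found on PATH")
    result = subprocess.run([tool, dvi_path], capture_output=True, check=True)
    # Decode leniently: the encoding of the converter's output is not guaranteed.
    return result.stdout.decode("utf-8", errors="replace")
```

Calling dvi_to_text("thesis.dvi") would then return whatever text the chosen tool managed to extract; the (La)TeX markup still has to be put back by hand.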

To retrieve text from PostScript files, the ps2ascii tool (part of the ghostscript distribution) is available. One could try applying this tool to PostScript derived from a PDF file using pdf2ps (also from the ghostscript distribution), or Acrobat Reader itself; an alternative is pdftotext, which is distributed with xpdf.
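
The PostScript and PDF routes above can be sketched in the same way. The Python fragment below only builds the command lines and then runs them in order; it assumes the usual calling conventions "ps2ascii input.ps output.txt", "pdf2ps input.pdf output.ps" and "pdftotext -layout input.pdf output.txt", which may differ between versions of ghostscript and xpdf. The function names and the via_postscript parameter are hypothetical.

```python
import subprocess
from pathlib import Path


def extraction_commands(source, via_postscript=False):
    """Return the commands that would extract text from source into a .txt file."""
    src = Path(source)
    txt = src.with_suffix(".txt")
    suffix = src.suffix.lower()
    if suffix == ".ps":
        return [["ps2ascii", str(src), str(txt)]]
    if suffix == ".pdf" and via_postscript:
        # Two steps: PDF to PostScript with pdf2ps, then PostScript to text.
        ps = src.with_suffix(".ps")
        return [["pdf2ps", str(src), str(ps)],
                ["ps2ascii", str(ps), str(txt)]]
    if suffix == ".pdf":
        # -layout asks pdftotext to keep the physical layout of the page.
        return [["pdftotext", "-layout", str(src), str(txt)]]
    raise ValueError("unsupported file type: " + src.suffix)


def extract_text(source, via_postscript=False):
    """Run each command in turn, stopping at the first failure."""
    for command in extraction_commands(source, via_postscript):
        subprocess.run(command, check=True)
```

For example, extract_text("paper.pdf") would leave the recovered text in paper.txt, provided pdftotext is installed; extract_text("paper.pdf", via_postscript=True) takes the pdf2ps and ps2ascii route instead.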

catdvi
dviware/catdvi.tar.gz
crudetype
dviware/crudetype.tar.gz
dvi2tty
nonfree/dviware/dvi2tty.tar.gz
ghostscript
nonfree/support/ghostscript/
xpdf
support/xpdf/