This post discusses an advanced typesetting problem dealing with the interplay of margin notes with German and Arabic LaTeX packages.
We have published a critical edition of Georg von der Gabelentz’s Die Sprachwissenschaft. This book comprises the text of the first edition from 1891 and the second edition from 1901. Differences between the editions are marked with different colours in the running text. Substitutions are marked in the margin. So far so good. The following image gives an example.
Gabelentz treats many different languages. Among other things, he has passages on Arabic and Hebrew. These also occur with critical annotations.
Arabic script is an abjad. Vowels are normally not written, but they can be added as diacritic marks if required. Texts are typically either fully vocalized or fully unvocalized. The LaTeX packages arabtex and hebtex can be used to typeset this. The vocalized and unvocalized variants are given below. The relevant marks are the fatḥah ( َ) and the sukūn (ـْ). The fatḥah indicates that the next vowel is a short /a/, whereas the sukūn indicates that the following vowel slot is empty. In the example below, we have the consonant sequence “qtlt” and the vowel sequence “aa-a” (indicated by َ ـْ َ َ ), which give qa.ta.l.ta when combined.
There is one instance of incomplete vocalization in the book. The final letter is missing its fatḥah. The first letters of the word qatalta all come with diacritics; it is only the last letter which is bare. It is extremely likely that this is an oversight. Therefore, this is corrected in the running text and the original, erroneous, sequence is given in the margin for reference.
The LaTeX package arabtex uses double quotation marks to suppress diacritics.
Gabelentz’s book is written in German. The option [german] of the LaTeX package babel also uses " to mark German umlauts. One could use "a to write ä, for instance (today one would of course use Unicode, but the option is still there).
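As a small illustration of the shorthand, the following minimal document uses "u and "s instead of typing ü and ß directly. The sample sentences are made up for this sketch and do not come from the edition.

```latex
\documentclass{article}
\usepackage[T1]{fontenc}
\usepackage[german]{babel}
\begin{document}
% "u is babel's shorthand for ü, "s for ß.
Gabelentz schrieb "uber die Sprachwissenschaft. % prints: über
Die Stra"se ist lang.                           % prints: Straße
\end{document}
```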
This presents no problem as long as the Arabic text is in the main text block. Things are different in margin notes and footnotes, however, and it took some time to figure out where the problem lay.
The main problem is that commands like \marginpar (and \footnote) tokenize their argument without interpreting it. The reason for this is that LaTeX cannot typeset margin notes (and footnotes) immediately upon encountering them in the code. It needs to store their content temporarily, then calculate the optimal position for them in the margin (or at the bottom of the page, in the case of footnotes), and finally typeset the content at the calculated position. Storing the content this way has the effect that babel’s definition of the quotation mark " is fixed at the moment the argument is read, in a way that conflicts with the Arabic packages.
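The timing issue can be reproduced without any Arabic at all. The following is a minimal sketch, not taken from the edition, that only assumes babel with the german option: it contrasts a \catcode change made inside the argument of \marginpar with one made before the argument is read.

```latex
\documentclass{article}
\usepackage[T1]{fontenc}
\usepackage[german]{babel}
\begin{document}
Text.%
% Too late: the argument was tokenized while " was still babel's active
% shorthand character, so "a is typeset as ä despite the \catcode change.
\marginpar{\catcode`\"=12 "a}

Text.%
% In time: the \catcode is changed (locally, inside a group) before
% \marginpar reads its argument, so " arrives as an ordinary character.
{\catcode`\"=12 \marginpar{"a}}
\end{document}
```

In the first margin note the quotation mark still behaves as the German shorthand; in the second it is passed through as a plain character, which is what a package with its own use for " needs.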
One solution is to set the \catcode for the quotation mark manually, just for the relevant passage, so that the Arabic packages can process it without problems.
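% Open a group so that the catcode change stays local to this passage.
{\catcode`\"=12% make " an ordinary character before \marginpar reads its argument
  \marginpar{
    \vspace{-5pt}
    \color{black}
    \raggedright
    \scriptsize
    % Original reading with the bare final letter; " suppresses the diacritic.
    \arabictext{qatalt"}\linebreak
    \tiny 1891 und 1901
  }{%
    % Corrected reading in the running text.
    \color{lsMidBlue}
    \arabictext{qatalta}
  }
}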
If there were more than one such passage, one could also disable babel’s special settings for the quotation mark or tell it to use another character that does not conflict with the Arabic packages, for example ‘|’ instead of ‘"’.
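\usepackage[german]{babel}
% Make | available as a shorthand character with the same meaning as ".
\aliasshorthand{"}{|}
% Switch the " shorthand off for the whole document so that " reaches the Arabic packages unchanged.
\AtBeginDocument{\shorthandoff*{"}}
\usepackage{hebtex}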
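A variant for a single passage, given here only as a sketch, is to switch the shorthand off and on again around the margin note with babel's \shorthandoff and \shorthandon. As with \catcode, the switch has to happen before \marginpar reads its argument. The macro \arabictext is the one used in the snippet above and is assumed to be defined elsewhere in the project.

```latex
% \arabictext is the macro used in the snippets of this post;
% it is assumed to be defined elsewhere in the project.
\shorthandoff{"}% from here on, " is an ordinary character
\marginpar{\arabictext{qatalt"}}%
\shorthandon{"}% restore the German shorthands, e.g. "a for ä
```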
This has been a rather technical post describing the problems of critical editions in several scripts with margin notes. The whole project is available on GitHub for others to see the implementation in case similar problems arise elsewhere. Thanks go to Mathias Schenner for identifying the problem and finding the two solutions, as well as for having a look at the LaTeX parts of this post.