Showing posts with label XeTeX. Show all posts

Tuesday, December 16, 2014

GNU Freefont fonts and XeLaTeX

The problem

There's been a long-standing issue with using the GNU FreeFont fonts with XeLaTeX.  The fonts are "Free Serif", "Free Sans" and "Free Mono", and each has normal, italic, bold and bold-italic versions.
These fonts are maintained by Stevan White, who has done a lot of support and maintenance work on them.  
These fonts are of special interest to people who type Indian languages because they include nice and rather complete Devanāgarī character sets, in addition to glyphs for
  • Bengali
  • Gujarati
  • Gurmukhi
  • Oriya
  • Sinhala
  • Tamil
  • Malayalam
The GNU FreeFonts are excellent for an exceptionally wide range of scripts and languages, as well as symbols.  See the coverage chart.

At the time of writing this blog, December 2014, the release version of the fonts is 4-beta, dated May 2012.  This is the release that's distributed with TeXLive 2014, and is generally available with other programs that include or require the FreeFonts.

But the 2012 release of the FreeFonts causes problems with the current versions of XeTeX.  Basically, the Devanagari conjunct consonants in the 2012 fonts are incompatible with the current XeTeX compositing engine. (For the technical: Up to TL 2012 XeTeX used ICU; since TL 2013 it's used HarfBuzz.)
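
A quick way to see which version of the fonts XeTeX is picking up is to typeset a few conjuncts and inspect the output. The following is only a minimal sketch: it assumes the font is visible to XeTeX under the family name FreeSerif, and the conjuncts chosen are arbitrary test cases, not a list of the ones known to break.

```latex
% Compile with xelatex.
% Assumption: the font is installed under the family name "FreeSerif".
\documentclass{article}
\usepackage{fontspec}
% Ask for Devanagari shaping explicitly, as in the other examples on this blog.
\newfontfamily\devfont[Script=Devanagari]{FreeSerif}
\begin{document}
% A few common conjuncts (arbitrary test cases): kṣa, tra, jña, śra
{\devfont\huge क्ष त्र ज्ञ श्र}
\end{document}
```

If the conjuncts come out as separate consonants with visible virāmas, the engine and the installed font probably do not agree.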

In the last couple of years, Stevan has done a great deal of work on the Devanagari parts of the FreeFonts, and he has solved these problems.  But his improvements and developments are only available in the Subversion repository.   For technically-able users, it's not hard to download and compile this pre-release version of the fonts.  But then to make sure that XeTeX calls the right version of the FreeFonts, it's also necessary to weed out the 2012 version of the fonts that's distributed with TeX Live 2014.  And that's a bit hard.  In short, things get fiddly.

Now, Norbert Preining has created a special TeX Live repository for the Subversion version of the FreeFonts.  TeX Live 2014 users can now just invoke that repo and sit back and enjoy the correct Devanagari typesetting.

New warning June 2017: 
the procedure below is no longer supported.  Don't do it.

WARNING
Be warned that the version distributed here is a development version, not meant for production. Expect severe breakage. You need to know what you are doing!
END WARNING

Here follow Norbert's instructions (as of Dec 2014).  Remember to use sudo if you have TeX Live installed system-wide.

The solution: a new TeX Live repository for the pre-release GNU FreeFonts

Norbert says (Dec 2014):

Here we go: Please do:
tlmgr repository add http://www.tug.org/~preining/tlptexlive/ tlptexlive
tlmgr pinning add tlptexlive gnu-freefont
tlmgr install --reinstall gnu-freefont
You should see something like:  
[~] tlmgr install --reinstall gnu-freefont
...
[1/1, ??:??/??:??] reinstall: gnu-freefont @tlptexlive [12311k]
...
Note the
@tlptexlive
After that you can do  
tlmgr info gnu-freefont
and should see: 
Package installed:   Yes
revision:    3007
sizes:       src: 27157k, doc: 961k, run: 19769k
relocatable: No
collection:  collection-fontsextra
Note the
revision: 3007
which corresponds to the freefont subversion revision!!!

From now on, after the pinning action, updates for gnu-freefont will
always be pulled from tlptexlive (see man page of tlmgr).
 

Reverting the change:

In case you ever want to return to the versions as distributed in TeX Live, please do
tlmgr pinning remove tlptexlive gnu-freefont
tlmgr install --reinstall gnu-freefont


Thank you, Norbert!

Wednesday, July 24, 2013

Minimal example of XeLaTeX with Velthuis input mapping



\documentclass{article}

\usepackage{polyglossia}

\newfontfamily
  \sanskritfont [Script=Devanagari,Mapping=velthuis-sanskrit]{Sanskrit 2003}

\begin{document}

\sanskritfont
\noindent\huge

aasiidraajaa nalo naama viirasenasuto balii|\\
upapannairgu.nairi.s.tai ruupavaana"svakovida.h||

\end{document}

Monday, May 27, 2013

XeLaTeX for Sanskrit: update

In a post on 5 July 2010, I gave an example of how to use XeLaTeX with various fonts and various ways of inputting text.   Some time later, the commands in the Fontspec and Polyglossia packages were updated, and my example didn't work as advertised any more.  Here is an update that works again.

Tuesday, February 26, 2013

Converting XeLaTeX into ODT or MS Word

TeX4ht can do a lot of the work of converting from LaTeX to a word-processor format.  But when one adds in UTF-8 characters, multiple scripts, and XeLaTeX, things can get complicated.

C. V. Radhakrishnan today pointed me to this discussion on the TeX4ht mailing list:
What Radhakrishnan says is:
As far as I understand, TeX4ht won't support fontspec or XeLaTeX
technologies of using system fonts that do not have *.tfm's. In effect, by
adopting TeX4ht, one is likely to lose the features brought in by XeTeX.
However, here is another approach.

   1. We translate all the Unicode character representations in the
   document to Unicode code points in 7bit ascii which is very much palatable
   to TeX4ht. A simple perl script, utf2ent.pl in the attached archive does
   the job.
   2. We run TeX4ht on the output of step 1.
   3. Open the *html in a browser, I believe, we get what you wanted. See
   the attached screen shot as it appeared in Firefox in my Linux box.

Here is what I did with your specimen document.

   1. commented out lines that related to fontspec package from your
   sources named as alex.tex.
   2. added four lines of macro code to digest the converted TeX sources
   3. ran the command: perl utf2ent.pl alex.tex > alex-ent.tex
   4. ran the command: htlatex alex-ent "xhtml,charset=utf-8,fn-in" -utf8
   (fn-in option is to keep the footnotes in the same document). I have used a
   local bib file, mn.bib as I didn't have your bib database. biber was also
   run in the meantime to process the bibliography database.
   5. open the output, alex-ent.html in a browser. I got it as you see in
   the attached alex.png.
Radhakrishnan's Perl script utf2ent.pl is:
#!/usr/bin/perl

use strict;
use warnings;

for my $file ( @ARGV ){
  open my $fh, '<:utf8', $file or die "cannot open file: $file";
  while( <$fh> ){
      s/([\x7f-\x{ffffff}])/'\\entity{'.ord($1).'}'/ge;
        print;
  }
}
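
The script replaces every non-ASCII character with \entity{N}, where N is the decimal Unicode code point, so the converted source needs some definition of \entity. Radhakrishnan's four lines of macro code are not reproduced here. The sketch below is a hypothetical stand-in for use under XeLaTeX only, to check that the conversion round-trips; under TeX4ht the macro would instead have to emit the corresponding numeric character reference.

```latex
% Compile with xelatex.
\documentclass{article}
% Hypothetical stand-in, NOT Radhakrishnan's original macro:
% print the character whose decimal code point is #1.
\newcommand{\entity}[1]{\char#1\relax}
\begin{document}
% "āyur" as utf2ent.pl would write it: ā is U+0101, i.e. decimal 257.
\entity{257}yur
\end{document}
```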


For Radhakrishnan's continuing comments on TeX4ht development, see
TeX4ht's homepage:

Tuesday, July 03, 2012

Sanskrit hyphenation list

I'm gradually building up a file of hyphenated Sanskrit words and compounds, written in the Latin alphabet.  The file is called sanskrit-hyphenations.tex, and you are welcome to download it.

It contains hyphenation points for words in English (ayur-veda), and for words in Sanskrit (āyur-veda).
 To use it, do something like this in your style file:
\setotherlanguage{sanskrit}
\newfontfamily\sanskritfont{Sanskrit 2003}
% Define \sansk{} which is the same as \emph{}, except
% that it causes appropriate hyphenation

% for Sanskrit words.  Use \sansk{} for Sanskrit and
% \emph{} for English.

\newcommand{\sansk}[1]{\emph{\textsanskrit{#1}}}
 and \input the sanskrit-hyphenations.tex file after \begin{document}, thus:
\begin{document}
  \input{sanskrit-hyphenations.tex}

...
\end{document}

XeTeX  already has built-in hyphenation rules for Devanāgarī and Romanized Sanskrit. The above file is intended to extend the hyphenation coverage for Romanized words, using etymological and stylistic considerations.
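
For illustration only, entries in a file of this kind could be written as ordinary \hyphenation exceptions. This is an assumption about the format, not an excerpt from sanskrit-hyphenations.tex; the two words are the ones mentioned above, and the sketch assumes English is the main language and that \setotherlanguage{sanskrit} has been given as shown.

```latex
% Hypothetical entries; the real file may be organised differently.
% English exception: applies to the language active at this point.
\hyphenation{ayur-veda}
% Sanskrit exception: switch language inside a group first.
% \hyphenation assignments are global, so the entry survives the group.
{\selectlanguage{sanskrit}\hyphenation{āyur-veda}}
```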

Tuesday, October 04, 2011

Simplest Sanskrit XeLaTeX file

Input:

\documentclass{article}
\usepackage{polyglossia}
\setmainfont[Script=Devanagari]{Nakula}

\begin{document}
Your Devanāgarī looks like this:  आसीद्राजा नलो नाम and your romanized stuff looks like this: āsīd rājā nalo nāma.  
\end{document}

Output:






You can get the Nakula font (and its twin, Sahadeva) from John Smith's website, http://bombay.indology.info

Monday, November 22, 2010

Hyphenating Sanskrit in roman transliteration

%!TeX program = xelatex
%
% Thanks to Yves Codet for the first version of this test file, and to Yves
% and Jonathan Kew for the hyphenation tables
% for Sanskrit (hyph-sa.tex):
%
% This file exemplifies the case where some Sanskrit is embedded in a
% mainly-English document, but the Sanskrit words are appropriately
% hyphenated. The Sanskrit words are in the argument of the
% \textsanskrit{} command.

\documentclass[12pt]{article}

\usepackage{fontspec}
\usepackage{polyglossia}

\setdefaultlanguage{english}
\setmainfont{Charis SIL}

\setotherlanguage{sanskrit}
\newfontfamily\sanskritfont{Charis SIL}

\textwidth=0.5cm
\parindent 0pt

\begin{document}

Sanskrit hyphenation:
\par\smallskip

\textsanskrit{manum ekāgram āsīnam abhigamya maharṣayaḥ |\par}

\bigskip

English hyphenation:
\par\smallskip

manum ekāgram āsīnam abhigamya maharṣayaḥ |

\end{document}

Wednesday, September 01, 2010

XeLaTeX, Velthuis encoding, and palatal nasals

When using the Velthuis input coding for Devanāgarī, and wanting to have it handled by XeLaTeX, one finds the palatal ñ disappears in the Nāgarī.

input: sa~njaya

output: स न्जय


That's because the Velthuis input code for ञ् is ~n, and the "~" is a special code in TeX, meaning "hard space".

Here's the workaround. I define a font-switching command \dev that will turn Velthuis into Devanāgarī. \dev is mostly made up of "\textsanskrit" which is set up using the standard XeLaTeX/polyglossia \newfontfamily commands. \textsanskrit does the work of invoking the mapping-conversion (from XeTeX's velthuis-sanskrit.tec file).

But just before \textsanskrit, we change tilde into a normal character. And after \textsanskrit, we turn tilde back into an "active" hard space. We use the \aftergroup command so that the "active" version of tilde is activated after the closing of the group that contains the Devanāgarī.

Here's the code:


\newfontfamily\textsanskrit [Script=Devanagari,Mapping=velthuis-sanskrit]{Nakula}


% Make the tilde into a normal letter of the alphabet
\def\maketildeletter{\catcode`\~=11 }


% Return tilde to being the default TeX "active" character for hard space
\def\maketildeactive{\catcode`\~=13 }

\def\dev{\maketildeletter\textsanskrit \aftergroup\maketildeactive}


Here's how you use it:

input: {\dev sa~njaya uvaaca}. What did Dr~Sañjaya say?

output: सञ्जय उवाच. What did Dr Sañjaya say?

where that space between "Dr" and "Sañjaya" is hard, and you can't break a line there.

Enjoy.
 

Update 2020:

Using David Carlisle's much better idea from the comments below, here's the new code:
 
\documentclass{article}
\usepackage{fontspec}
\newfontfamily\textsanskrit [Script=Devanagari,Mapping=velthuis-sanskrit]{Nakula}

\def\dev{\edef~{\string~}\textsanskrit }
 
\begin{document}

{\dev sa~njaya uvaaca}. What did Dr~Sañjaya say?

\end{document}

 
 

Tuesday, July 06, 2010

Switching from Devanāgarī to Roman with a single command

I have to admit even I am startled by the success of this.
In the input file below, I changed the single command:
  • \setdefaultlanguage{sanskrit}

to

  • \setdefaultlanguage{english}
and the result was the following:

How do I install RomDev mapping for XeLaTeX (Unicode transliteration -> Devanāgarī)?

[Update, February 2011: Somdev has moved his blog to http://pratibham.blogspot.com/.]

Somdev Vasudev's RomDev mapping is installed as follows:
  1. The actual mapping file is published by Somdev in his blog, here:
    http://sarasvatam.blogspot.com/2010/03/updated-teckit-romdev.html 
    [Update Feb 2011: now at http://pratibham.blogspot.com/2010/03/updated-teckit-romdev.html; update March 2012: now at https://github.com/somadeva/RomDev]
  2. Cut and paste this text, and save it in a Unicode file called RomDev.map.  Save that file in a place which XeTeX can "see," e.g., something like local/texmf/fonts/misc/xetex/fontmapping/
  3. You now need to compile the human-readable *.map file into a binary *.tec file, so that XeTeX can read it directly.  This is done by the program TECkit, which you can get here:
    http://scripts.sil.org/cms/scripts/page.php?site_id=nrsi&item_id=TECkitDownloads
  4. I'm working with Ubuntu GNU/Linux.  For me, the command is,

    teckit_compile RomDev.map -o RomDev.tec

    I'm afraid I don't know the Windows or Mac command invocation.

  5. Now you have a file in a place like
    local/texmf/fonts/misc/xetex/fontmapping/RomDev.tec

  6. Run the command that rebuilds the database of files that TeX knows about.  In Linux it's
    sudo mktexlsr
  7. That's it!  XeTeX and XeLaTeX can now see, and make use of the RomDev mapping, that converts Unicode transliteration into Devanāgarī, as exemplified in my earlier blog posts below. 
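
A small test file is a convenient check that the installation worked. This is only a sketch: it reuses the Mapping=RomDev font declaration from the edition example in this blog, assumes the Sahadeva font is installed, and \devfont is just a name chosen for the test.

```latex
% Compile with xelatex.
% Assumptions: RomDev.tec is installed as described above,
% and the Sahadeva font is available to XeTeX.
\documentclass{article}
\usepackage{fontspec}
% \devfont is a hypothetical name chosen for this test.
\newfontfamily\devfont[Script=Devanagari,Mapping=RomDev]{Sahadeva}
\begin{document}
% Unicode transliteration in, Devanagari out, if the mapping is found.
{\devfont āsīd rājā nalo nāma}
\end{document}
```

If the mapping cannot be found, the text will most likely stay in Roman transliteration, and the log should carry a warning about the mapping file.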

    A minimal edition of a Sanskrit verse, using XeLaTeX and Ledmac


    And here's the input for the above (tested and working in September 2019):


    \documentclass{book}
    % Set up things for XeLaTeX, and Devanagari.
    % Simplified version of http://cikitsa.blogspot.com/2010/07/xelatex-for-sanskrit.html

    \usepackage{polyglossia} % the multilingual support package
    % Next, from the polyglossia manual:
    \setdefaultlanguage{sanskrit} % this is mostly going to be Sanskrit,
    \setotherlanguage{french} % with some French embedded in it,
    \setotherlanguage{english} % and some English.
    % These will call appropriate hyphenation.
    \usepackage{xltxtra} % standard for nearly all XeLaTeX documents
    \defaultfontfeatures{Mapping=tex-text} % ditto
    \setmainfont{Gandhari Unicode} % could be any Unicode font
    % Now define the Devanagari font:
    % John Smith's Sahadeva, input using standard UTF8 transliteration
    \newfontfamily\sanskritfont [Script=Devanagari,Mapping=RomDev]{Sahadeva}

    % Now come the commands for the critical edition formatting:
    \usepackage[noeledmac]{ledmac} %"noeledmac" stops some annoying messages
    % customizations to Ledmac, and macros to make life easier.
    \def\Variant#1{\Afootnote{\relax#1}}
    \def\Lemma#1{\lemma{\relax#1}}
    \let\Reference=\Bfootnote
    \let\Grammatical=\Cfootnote
    \let\Tibetan=\Dfootnote
    % in a real edition, I'd probably also make
    % abbreviations for \textfrench (perhaps \tf) etc.
    \def\Omission#1{$\langle$#1$\rangle$}
    \def\ScribalDeletion#1{{\rm[\kern-.15em[}#1{\rm]\kern-.15em]}}
    \def\hardspace{\texttt{\char`\ }}
    \def\And{{\rm\penalty-1\quad$\mid\mid$~}} % divider between variants to the same lemma
    % more customizations: make the A notes
    % (\Variants and \Lemmas)into two-column format,
    % and make the B notes (\Reference) normal footnotes.
    %
    % changes to stuff cut-and-pasted from ledmac.sty:
    \makeatletter
    \renewcommand*{\twocolfootfmt}[3]{%
    \normal@pars
    % \hsize .45\hsize
    \hsize .49\hsize
    \parindent=0pt
    \tolerance=5000
    \raggedright
    \leavevmode\hangindent1.5em\hangafter1
    \strut{\notenumfont\printlines#1|}\enspace
    {\select@lemmafont#1|#2}\rbracket\enskip
    #3\strut\par\allowbreak}
    \foottwocol{A}
    \renewcommand*{\normalfootfmt}[3]{%
    \normal@pars
    \parindent=0pt \parfillskip=0pt plus 1fil
    \hangindent1.5em\hangafter1
    {\notenumfont\printlines#1|}\strut\enspace
    {\select@lemmafont#1|#2}\rbracket\enskip#3\strut\par}
    \footnormal{B}
    \makeatother
    \firstlinenum{1}
    \linenumincrement{1}


    % and here begins the edition:
    %
    \begin{document}
    \chapter*{yogaśatakam}
    \large


    \section*{\textenglish{The example verse by itself}}

    \textenglish{From \emph{Yogaśataka: Texte m\'edical attribu\'e
    \`a Nāgārjuna\ldots par Jean Filliozat} (Pondich\'ery, 1979), pp.\,1, 59:\par}

    \bigskip

    kṛtsnasya tantrasya gṛhītadhāmna-\\
    ścikitsitādviprasṛtasya dūram|
    vidagdhavaidyapratipūjitasya\\
    kariṣyate yogaśatasya bandhaḥ|| 1||

    \bigskip

    \section*{\textenglish{The example verse, with apparatus}}
    % we could use the \stanza command, but I haven't bothered.

    %
    % I find that the judicious use of indentation
    % and newlines helps enormously to see what's what.
    % Using a good "folding editor" would be even better.
    %

    \begingroup
    \beginnumbering
    \autopar
    \edtext{
    \edtext{kṛtsnasya}{
    \Variant{%
    \textfrench{N1 détruit, C1 }kṛtas tasya,
    \textfrench{C2 }kṛtasya.}
    \Tibetan{\textfrench{T \emph{mth'yas}, ``sans limite, immense''
    traduit }kṛtsnasya.}}
    tantrasya
    \edtext{gṛhītadhāmna-}{
    \Variant{\textfrench{Ca, JK }dhamnā.}}\\
    \edtext{ścikitsitā}{
    \Lemma{cikitsitād} % not ``ścikitsitā'', of course. We're preserving
    % the sandhyakṣaras.
    \Variant{\textfrench{C1, C2 } cikitsitāt.}
    \Tibetan{\textfrench{T \emph{gso-spyad} ``pratique de la
    thérapeutique''. Ordinairement
      \emph{gso spyad} est ``investigation de la th.''}}}% comment sign to stop a break after the conjunct
    \edtext{dviprasṛtasya}{
    \Lemma{viprasṛtasya} % as above with cikitsitād.
    \Variant{\textfrench{Ca} cikitsitārthaprasṛtasya, \textfrench{C1, C2}
    viprasutasya.}}
    \edtext{dūram}{
    \Variant{\textfrench{Ca} dūrāt}}|
    \\ \indent
    %
    % the above line is annoying. Because the whole verse is
    % inside an \edtext{} macro, in order to get the
    % \Grammatical note naming the upajāti verse, we have to
    % avoid having paragraph breaks, which are not allowed
    % inside \edtext{}.
    % instead, we use \\ (newline) and \indent (paragraph indent)
    % to get the same visual effect. A nasty kludge.
    %
    vidagdhavaidyapratipūjitasya\\
    \edtext{kariṣyate}{
    \Variant{\textfrench{N1} karikṣete.}}
    yogaśatasya bandhaḥ|| 1||
    }{\Lemma{}\Grammatical{Upajāti.}}
    \par % necessary to stop \autopar complaining. Thanks to Alessandro Graheli.
    \endgroup
    \end{document}

    Monday, July 05, 2010

    XeLaTeX for Sanskrit

    This example worked well in July 2010, but some TeX packages have since been updated slightly.  See the new, updated version of this example, posted on 27 May 2013.