A Call for Better Indexes

By Nicholas Higham

Indexes. Photo credit: Lendingmemo.
An index is an important component of a book. It provides a view with a much smaller granularity than a table of contents, reveals what is not present as well as what is, and by abstracting concepts can lead the reader to unexpected content.

Most academic books in mathematics are typeset in LaTeX, which has an excellent system for indexing. By inserting \index commands in the source code, and running the MakeIndex program as part of the LaTeXing sequence, an author can iteratively build up an index during the late stages of the writing process, safe in the knowledge that the automatically generated page locators will be correct.
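As a minimal sketch of that workflow, here is a small document with a single index entry. The entry "condition number" and the file name book.tex are hypothetical, chosen only for illustration.

```latex
% Hypothetical file: book.tex
% Typical build sequence (run from a shell):
%   pdflatex book     writes raw index entries to book.idx
%   makeindex book    sorts them and writes book.ind
%   pdflatex book     typesets the index from book.ind
\documentclass{article}
\usepackage{makeidx}  % provides \printindex
\makeindex            % enable writing of the .idx file

\begin{document}
The condition number\index{condition number} of a matrix measures
the sensitivity of a linear system to perturbations.

\printindex           % the sorted index appears here
\end{document}
```

Because the page locators are regenerated on every run, the \index commands can be added or revised at any time without the index going stale.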

One might expect that the quality of indexes would have improved since the pre-LaTeX days when indexes had to be generated by hand. But in my view they have not. Most indexes I see have obvious flaws.

Why is this? I think there are two main reasons. First, most authors write at most one or two books during their careers and so indexing is a task that they rarely carry out and therefore have little chance to practice. Second, indexing tends to be left to the last minute, leaving no time to study how to produce a good index. In the days before LaTeX it was much more likely that a professional indexer would be employed, and a professional with some knowledge of the subject area will do a much better job than a hurried author.

I believe it is well worth authors putting a lot of effort into indexing. The benefit is not just a better index but a better book, because the indexing process forces you to think carefully about structure and content. I am working on a book index right now and in the process of choosing entries and inspecting draft indexes I found that some important definitions were missing, spotted improvements to section titles (whose inaccuracy only became apparent when I tried to use them as index entries), and corrected instances where I had spelt words differently at different places. My experience concurs with that of Don Knuth, who said in his article Mini-Indexes for Literate Programs, “a little extra time spent on indexing generally leads to significant improvements in the text of any book that is being indexed by its author, who has a chance to see the book in a new light”.

How does one produce a good index? How does one know when one has produced a good index? This post is not the place to try to answer these questions. Some thoughts on how to index are given in Section 13.4 of my Handbook of Writing for the Mathematical Sciences (2nd ed., 1998) and I hope to provide further advice in future blog posts.

But I would like to mention one variable that has a correlation with the quality of an index: its length. Many indexes are simply too small for the content of the book.

The book The Indexing Companion by Glenda Browne and Jon Jermey (CUP, 2007) says that the length of an index should range from 2% of the length of the book for a “simple book” to 15% for a “complex book”. I think mathematics texts fall towards the “simple” end of the scale, compared with a biography or historical book, for example.

Here is a table showing the relative size of the indexes of some books that I regard as having good indexes.

Book                                                          Total pages   Index pages   Index percentage
Olver et al., NIST Handbook of Mathematical Functions                 951            65                6.8
Horn and Johnson, Matrix Analysis                                     643          36.5                5.7
Stewart and Sun, Matrix Perturbation Theory                           365            15                4.1
Press et al., Numerical Recipes                                      1235            41                3.3
Knuth, The Art of Computer Programming, Volume 2, 2nd ed.             638            20                3.1
Graham et al., Concrete Mathematics                                   657            20                3.0
Boyd and Vandenberghe, Convex Optimization                            716            16                2.2
Strauss, Partial Differential Equations                               454             9                2.2
Trefethen, Approximation Theory and Approximation Practice            305             6                2.0

My own four books have indexes occupying 2% to 3.7% of the book.

My rule of thumb for a mathematics book is that if your index occupies less than 2% of the book then you should think carefully about whether it needs extending. In particular, ask yourself whether you have indexed items twice when appropriate. For example, “vector space, dimension” and “dimension of vector space” are probably both needed, and likewise “norm, Euclidean” and “Euclidean norm”. Furthermore, you should index synonyms for important concepts, even if they do not appear in the book. For example, if you use the modern term “significand” in floating point arithmetic, you probably need an entry “mantissa, see significand.”
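The entries mentioned above could be produced along the following lines. This is a sketch assuming the standard makeidx package and MakeIndex; the surrounding sentences are invented for illustration.

```latex
\documentclass{article}
\usepackage{makeidx}  % provides \printindex and the \see command
\makeindex

\begin{document}
% Double posting: the concept can be found under either word.
% "!" starts a subentry, so the first command prints as
% "vector space" with "dimension" listed beneath it.
The dimension of a vector space%
\index{vector space!dimension}\index{dimension of vector space}
is the number of vectors in a basis.

The Euclidean norm\index{norm!Euclidean}\index{Euclidean norm}
is the most familiar vector norm.

% Synonym that never appears in the text: a cross-reference
% with no page number, pointing to the preferred term.
The significand\index{significand}%
\index{mantissa|see{significand}}
holds the digits of a floating point number.

\printindex
\end{document}
```

The |see{...} form replaces the page number with a "see" cross-reference, so the reader who looks up the older term is sent to the one the book actually uses.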

Finally, it is worth emphasizing that readers do care about indexes. The Society of Indexers (based in the UK) produces an excellent journal The Indexer, and all articles over three years old are freely accessible. A regular column Indexes Reviewed collects comments on indexes from book reviews, under the headings “praised”, “censured”, “omitted”, and “obiter dicta.” It makes an interesting read. Here are three notable snippets:

This is a book that clatters around in a dark closet of irrelevancies for 450 pages before it bumps accidentally into its index and stops.

It’s a curious production, without an index. A biography without an index is like a wheelbarrow without handles.

The stupidest index I’ve seen was in the manual for a Kia car. Changing the wheel? Don’t look under C for “changing” or S for “spare wheel” or W for “wheel” or J for “jacking” or T for “tyre” or even F for “flat tyre”. Nope, it was listed under H. For “How to change a wheel.”

Nicholas Higham is the Richardson Professor of Applied Mathematics at the University of Manchester. He is President Elect of SIAM. 

Tim Davis
November 24, 2014 
This is an important but neglected topic. I used LaTeX for my book (Direct Methods for Sparse Linear Systems, SIAM), and sweated hard to come up with good index terms. It was no small task, but I think it’s worth the effort. The book is 214 pages (excluding front matter), of which 7 pages are for the index, with 2,165 index items. That’s about 3.2% of the book. It’s particularly important to avoid spurious duplicates in the index (there are 1,772 unique index terms).

I would often iterate over the index and the text. If the index didn’t look right, I would go back and polish the text and/or the \index{…} terms to smooth it out. Of particular importance is sub-indexing, where an index term appears in a hierarchy. A flat index isn’t as useful, I think.

I was motivated in part by a comment from my book editor on an early proposal for the book, whom I quote below:

“You also made a comment about aiming for an index length of 2 pages. My experience is that most indexes are too light, so again I would do what’s right and not aim for a particular length.” [Nick Higham, Oct 4, 2005] 🙂!


Dmitry Savostyanov 
November 26, 2014
I appreciate the importance of an index for a printed hard copy of a book. But for electronic books (in PDF or another ebook format), is the index still essential? Or can we rely completely on Ctrl-F (Cmd-F) and the search tool?


Tim Davis
November 26, 2014
I think the index is still important. First, not all occurrences of a given word are important. Useless ones can get weeded out of a manually-generated index. Second, a search tool will miss related terms (‘matrix’ vs. ‘matrices’, or ‘singular value decomposition’ vs ‘SVD’). Finally, an index can be hierarchical, with nested terms. Some words are too generic. For example, if you are searching my book for a right-looking LU, you don’t want a right-looking QR. I have two ‘right-looking’ terms, one nested under ‘LU’ and the other under ‘QR’.

For an electronic version, it’s helpful if the index has active links that you can click on to go to that particular page. That’s not always the case, but it’s the best of both worlds.
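As a rough sketch of the nested terms and clickable page numbers described above, the following assumes the makeidx and hyperref packages. The heading names "LU factorization" and "QR factorization" are guesses at how such entries might be worded, not the actual entries in the book.

```latex
\documentclass{article}
\usepackage{makeidx}
\usepackage{hyperref}  % its hyperindex option (on by default, as far as
                       % I know) turns index page numbers into links
\makeindex

\begin{document}
% The same generic subentry is nested under two different headings,
% so a reader looking for one method is not sent to the other.
A right-looking LU factorization\index{LU factorization!right-looking}
updates the trailing submatrix at each step.

\newpage
A right-looking QR factorization\index{QR factorization!right-looking}
is organized in a similar way.

\printindex
\end{document}
```

In the typeset index, "right-looking" appears once under each heading with its own page number, which a plain text search for "right-looking" cannot distinguish.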


Nick Higham
November 26, 2014
That’s a very good question. It’s something discussed quite a lot in the indexing literature, not least because professional indexers worry that authors’ potential reliance on their readers using “ctrl-f” will do the indexers out of a job.

However, an index does things that search can’t, such as point to synonyms and related terms (for example, I might look in the index for “principal components analysis” and find “see singular value decomposition”) and identify concepts (for example, various specific methods might be indexed under “Krylov subspace methods” as well as under their own name).


Tom Koornwinder
December 8, 2014
Don’t forget the symbol index. Often, when I only want to read something in the middle of a book, I hit on notation which may have been introduced one hundred pages before.
