Conventionalities in formula writing

Introduction

Chemical formulas, those small icons which chemists are wont to scribble in their notebooks and in odd places, such as the back of an envelope, and which to the general public have become emblems of their profession, are an excellent topic for history. These artefacts remain today tools for communication within the community of chemists. They continue serving as didactic instruments in teaching. The establishment of an individual formula for a chemical compound or a substance chronicles the laboratory methods, both routine and specific, which came into play in order for it to be written down and to assume the status of the analog of a word, to be stored within the growing lexicon of chemistry.

When addressing this topic, the historical narrative, besides its usual needs for accuracy and for an unerring sense of the strange and original taste of the bygone, demands the twin crutches of philosophical and linguistic inquiries. I wish to provide these complements if not in full, at least in a manner suggestive of some of the main issues.

I shall concern myself with the period of consolidation, when formulas entered the language of organic chemistry and started becoming stereotyped, the approximate period 1865-1905. 1 Why choose such a periodization? Because it brackets, approximately, the birth of the modern chemistry journal (JACS was started in 1879) and that of the modern comprehensive repertory of new chemical compounds (Chemical Abstracts was launched in 1907). Kekulé announced in 1865 his cyclic structure of benzene. The Chemical Society published in London, in 1882, Nomenclature and Notation, the first guidelines for establishing systematic and uniform practice. And the American Chemical Society followed suit in 1884 by establishing its Committee on Nomenclature and Notation. The international conference convened in Geneva in 1892 established norms for chemical nomenclature. 2 And Alfred Werner, in 1895, gave a systematic nomenclature for coordination complexes. Key milestones in the history of molecular formulas – so-called “structural formulas”; I favor the adjective “molecular” since the meaning of “structural” has changed considerably over the twentieth century – include the serendipitous synthesis of mauveine (1857), the first synthesis of alizarin (1868) and the identification of ibogaine (1905); Gomberg’s first free radical appeared in print in 1900. The forty years 1865-1905 were thus for molecular formulas of organic compounds those of the rise in their practical use, of their standardization and also of the first challenges to the rules governing them.

As always in the history of science, the risk of Whig history lurks at every corner of the retrodictive narrative. The danger is to read into the structural formulas, as they were used at the end of the nineteenth and the beginning of the twentieth century, meanings which they had yet to acquire in the post-Gilbert N. Lewis and post-Linus C. Pauling eras. Examples of such potential anachronisms are: (i.) viewing benzene rings as ipso facto synonyms of “aromaticity;” (ii.) reading double bonds as implying shorter and stronger interatomic linkages; (iii.) interpreting loss of a water molecule in a dehydration process as a thermodynamic driving force for the observed conversion. The eerie superficial similarity of these late nineteenth-century formulas to our early twenty-first-century formulas can easily become misleading.

1. On the necessity of conventions

Structural formulas came of age in the 1860s. 3 With the advent of graphical representations for chemical compounds came the attendant need for arbitrary conventions in making such graphemes become integral and essential components of publications. One witnesses indeed a stabilization of such graphic notation for organic compounds during the period under consideration, 1865-1905. Moreover, the formulas rapidly settled into a format which would endure spectacularly, to the extent that, more than a century later, present-day formulas continue bearing a strong resemblance to their forebears. Such a success begs for consideration of the means by which it came to be achieved.

These means were, first and foremost, the anchoring in a set of routine laboratory procedures. Justus von Liebig’s Giessen laboratory in its very organization had pioneered such an analytical methodology. 4 The first step was to obtain the analysis of the unknown compound, and to derive from it the elemental composition. The molecular weight was obtained from measurement of the density, and later on from measurement of the melting point depression or of the elevation of the boiling point. These assumed, explicitly or implicitly, the validity of the Avogadro-Ampère hypothesis. Consideration of the molecular weight turned the elemental composition into a compositional formula such as C2H4O2, where the subscripts denoted the numbers of atoms of each element denoted by letters. Establishment of the compositional formula was a well-trodden and established routine. Empirical analytical data were thus converted into a sequence of alphanumerical symbols, the formula. But to what extent did chemists who followed this routine believe in atoms? 5 Were they only paying lip service to the tenets of positivism by denying actual existence to the atoms while their very behavior demonstrated a blind faith in their existence?
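
The routine just described, from elemental analysis and molecular weight to a compositional formula, can be sketched in a few lines of Python. This is an illustration only, not a reconstruction of period practice: the function names, the rounded modern atomic weights and the sample percentages (chosen to resemble acetic acid) are all assumptions introduced here.

```python
# Rounded modern atomic weights (an assumption; nineteenth-century values differed).
ATOMIC_WEIGHTS = {"C": 12.011, "H": 1.008, "O": 15.999, "N": 14.007}


def compositional_formula(mass_percent, molecular_weight):
    """Return atom counts per molecule from mass percentages and a molecular weight."""
    counts = {}
    for element, percent in mass_percent.items():
        # Mass of this element in one mole of compound, divided by its atomic weight.
        atoms = (percent / 100.0) * molecular_weight / ATOMIC_WEIGHTS[element]
        counts[element] = round(atoms)
    return counts


def as_text(counts):
    """Write the counts with C and H first, then other elements alphabetically."""
    order = [e for e in ("C", "H") if e in counts]
    order += sorted(e for e in counts if e not in ("C", "H"))
    # A count of 1 is left implicit, as in ordinary formula writing.
    return "".join(e + (str(counts[e]) if counts[e] != 1 else "") for e in order)


# Hypothetical analytical data: mass percentages and an approximate molecular weight.
analysis = {"C": 40.0, "H": 6.7, "O": 53.3}
print(as_text(compositional_formula(analysis, 60.0)))  # prints C2H4O2
```

Rounding to the nearest integer is what lets imperfect analytical data settle into whole-number subscripts; the molecular weight is needed because the percentages alone fix only the ratios between elements (here CH2O), not the formula itself.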

The establishment of the structural formula was an exercise in representation which cannot be considered outside its historical context. During the period under consideration, the late nineteenth century, the structural (or constitutional) formula was a mapping-out of the relative location of the atoms, based upon a series of reactions the substrate molecule had undergone.

Thus structural formulas postulated the at least empirical validity of atomic theory. 6 They embodied paradoxically reactivity data as a static structure. And the latter was to be considered as a map, 7 rather than as an actual spatial geometry for the molecule depicted. This last point is crucial. Structural formulas during at least the period 1865-1905 were much closer to a topological than to a geometrical representation. I can do no better than to quote to this effect Ira Remsen’s textbook, first published in 1885:

“In studying the chemical conduct of these compounds, their decompositions, and the modes of preparing them, we become familiar with many facts which it is desirable to represent by means of the formulas.” 8

Remsen, at the time one of the best Germany-trained American organic chemists, kept reiterating throughout his textbook the two key concepts in the above definition of the structural formula: (i.) it embodies the familiarity of the chemist with the compound under study; (ii.) such familiarity is gained from application to that compound of well-understood reactions. To quote Remsen again:

“The formulas are but the condensed expressions of the conclusions which are drawn from the reactions.” 9

Thus, to read an actual molecular geometry into a turn-of-the-twentieth-century structural formula would be an anachronism.

The phrase “condensed expressions” in this last quotation of Remsen’s begs for attention. Indeed, understanding of structural formulas is improved when one realizes that chemists were intent upon concision, that they aimed at shorthand notation of the empirical evidence. This explains one of the features in the standardization of the writing of structural formulas, the recourse to compositional formulations in the writing of structural formulas: for instance notations such as C6H5 or C6H4 refer unambiguously to presence of a benzene ring in a formula.

The concept of a formula thus aptly summed-up by Remsen is of a chemical “word” (more on that aspect later) which, somehow, represents the sum total of a set of chemical transformations. The chemical formula is close to being an analogon of the bottom-line in a financial recapitulation from an accountant. But, if indeed it is the sum total of the transformations undergone by a molecule, does it have predictive value: is it at the same time the sum total of all the other reactions, yet to be actualized, a particular molecule is capable of? Such a conceptual leap, viz. reading in a formula both the real and the virtual, with the effect of virtualizing the real and of prognosticating the realizing of the virtual, slowly came to pass.

An epistemic switch indeed took place between 1854 and 1858: the formula turned from a retrodictive device to a predictive tool. A key ingredient was the gradual integration of isomer count as a mapping instrument. The Kekulé benzene formula was swiftly adopted by the chemical industry which very greatly helped its acceptance by the academic community, and this was another key ingredient.

Molecular formulas have become an important part of the language of chemistry but they do not constitute all of it. Truly, they are ideograms in that they may be interpreted directly as pictures or they can be named. 10 I can look at a pair of fused six-membered rings, one of which bears oxygen atoms, and the word “naphthoquinone” jumps to mind. Such a duality of pictures and names is essential to the language of chemistry. It obeys Peirce’s second trichotomy, which delineates iconic, indexical and symbolic relationships between signs and objects.

The concept of semiotic iconicity, 11 which posits resemblance, formal or actual, between sign and object, is relevant here. Was the iconicity of molecular formulas a necessity or a contingency? This is an interesting question for the historian and for the philosopher. It may well have been a fortuitous occurrence. Whatever the case, its existence tends to dehistoricize the devising and the development of molecular formulas, on one hand, and those of the attendant chemical nomenclature, on the other.

2. Historical continuity

I have drawn attention already to the remarkable endurance of formulas for organic molecules. They look remarkably alike at the turns of the twentieth and of the twenty-first centuries. Yet, so many changes befell and nourished chemistry during the intervening twentieth century. The main such change, relevant to our topic, is that structural elucidation underwent during the 1930s and the 1940s a major revamping of its methods, in the process jettisoning “wet chemistry” (chemical reactions performed at the bench) in favor of a gamut of physical methods, consisting at first of molar refraction, parachor, measured dipole moments, followed later on by X-ray diffraction and by the various spectroscopies: infrared, uv-visible, nuclear magnetic resonance, mass spectrometry, circular dichroism, optical rotatory dispersion, etc.

Why did paper representations of the structural formula remain near-invariant during that whole period, when the methodology for determining the structure of an organic molecule underwent such revolutionary changes? The answer to this question involves at least these three factors: the force of convention, already alluded to in the previous section; substantialism, i.e. the implicit notion of a molecule embodying somewhat magically the qualities of the corresponding substance; and the cumulative nature of the programme for chemistry, as it was set in the eighteenth century, at the time of Lavoisier, when Venel wrote his entry “Chymie” for the Encyclopédie.

Structural formulas are written in a highly arbitrary manner, subject to conventions: the student of organic chemistry is first exposed to the paradigmatic teaching of nomenclature; he or she is apprenticed into the correct writing of structural formulas at the same time as basic laboratory skills are inculcated in the novice, such as distillation, melting point determination, running a Grignard reaction, etc. Such an apprenticeship has also changed surprisingly little for the duration of the twentieth century. Hence, there was a need for fixing the conventions in structural representation very early on, because otherwise the language of chemistry would quickly become an unwieldy tool for communication, simply because of the cornucopia of organic molecules to be thus represented. Structural formulas were set early on, already before World War I, into such a lingua franca format, enforced by international conferences and committees, as well as by the house style of publications such as, first and foremost, Beilstein and Chemical Abstracts.

Substantialism was (and may remain to this day) a mythical belief in a full continuity between perceived qualities of substances at the macroscopic scale (such as color, smell, aspect of the crystals, biological activity, …) and at the microscopic scale of the molecule. Such an act of faith in the homogeneity of chemical matter, at whatever scale of spatial coordinates from the centimeter to the nanometer, went hand in hand with naïve realism regarding the molecules of chemistry, to be considered as molecular objects, no different from those in the ordinary, macroscopic world: objects with a given shape, sets of tiny balls connected with springs as in the aptly termed sub-discipline of molecular mechanics. Substantialism was part and parcel of the ideology which claimed semi-complete autonomy of chemistry from physics.

The third factor responsible for the near-invariance of structural representation of organic molecules for the duration of the twentieth century has been the perceived programme of natural products chemistry. As it was conceived at the time of Gabriel Venel, this was to be a natural history of substances extracted from plants predominantly. Each such substance had to be isolated, characterized with physical data, named, and related to kindred substances. Elucidation of its structure was tacked on to the list of its physical characteristics, when it started becoming feasible: it made it easier to thus distinguish a substance S, identified by its structural formula, from the manifold of its innumerable isomers T, U, V, W, X, Y, Z, …

But is it really the case that structural formulas remained invariant throughout the twentieth century? Of course not. Two significant changes, however minute and discreet they may appear, have been the assumption of formulas from their initial textual assimilation to iconic status; and their spatial orientation on the printed page. We shall come to the other, typographical changes in representation at a later stage.

A very important point to note is that there is significant evidence that structural formulas continued to be considered by quite a few chemists, at least during the period 1865-1905, as an integral part of the text of a chemical publication. The typographer would do his/her best to set the formula along the line of text it belonged with. To many organic chemists, the formula was an integral part of the sentence it was included in. Close scrutiny of publications of this period reveals, as the proverbial smoking gun, that very often a structural formula is followed indeed by a punctuation mark, such as a comma or a period (dot, or full stop), as befits a word in a text: the formula was then word-like. Only later, much later, did it wrap itself into blank space and start being released from textual into iconic status.

Such typographic conventions endured during the whole period under consideration. One finds them, whether in an article by Emil Fischer and Otto Fischer in 1879, 12 or still in 1903 in the contribution to the first synthesis of indigo by Bamberger and Eiger. 13

Such a move was of crucial importance. Textual and iconic registers differ markedly. The former partakes of authoritative discourse, the latter provides illustrative value. The former makes the structural formula into a narrative element, the latter presents the structural formula as a piece of evidence for consideration by the reader. The typographic change is symptomatic of the transition which chemical publications undergo at the turn of the twentieth century, when figures and images in the chemical paper start becoming enframed and captioned, from the earlier natural historical narrative to the later brief (almost in the legal sense) presented in front of the court of opinion of the chemical community.

Iconic messages differ from linguistic messages in that the former, even more than the latter, are polysemic, admit of several meanings simultaneously. A written, a textual message guides its reader toward its intended meaning: hence, to follow Roland Barthes, two of the anchoring functions of a textual message are identification and interpretation. 14 By stepping into the void in-between blocks of text, the structural formula would divest itself of such indexing, deictic functions of the text, in favor of the other and quite different attributes of an icon, such as its irrefutability and the admiring stance it implicitly demands from the viewer.

When a molecular formula is part of a text, it is read just like the text, left to right and top to bottom. The sense of reading, hence the way in which the eye-mind grasps it, are enforced rigidly. By climbing into a blank space on the page and being hoisted like a flag, the structural formula ipso facto lost that enforced patterning of its acquisition by the reader. Hence, a new set of conventions became necessary, so that standardization would endure: the formula had to be oriented in space; and, for this purpose, arbitrary rules had to be devised, which could be done by numbering the atoms in the carbon skeleton: for everyone to use the same starting point and the same clockwise or counterclockwise eye motion in the acquisition of the structural formula-as-icon. As an earlier case not yet thus standardized, an 1883 paper in Berichte depicts a methyl piperidine still as a formula embedded in text. 15 During the course of the twentieth century, the same formula would be typeset, as an icon outside of the text and with the nitrogen heteroatom at the bottom of the vertical, conventionally.

Structural formulas of organic molecules have an important and dual epistemic function, akin as islands of semantic stability to that of nouns in a natural language. They ensure that (i.) the position of the experimenter is that of a scientific realist, 16 i.e. of a believer in the reality of the invisible entities in the glassware, since he or she can use those entities in a controlled and predictable manner to make other entities, such as dyes and drugs and materials; and that (ii.) the language in which these representations are grounded remains a self-consistent and an invariable nomenclature.

The fascinating philosophical point is that, while the structural formula of say rosaniline represents the same substance to Robert B. Woodward, say, in 1979 as it did to Emil Fischer in 1879, and even though such a signifier (a little changed, admittedly) pointed to the same signified, nevertheless the entity signified had been enriched in meaning with time.

Not only is there a large measure of commensurability between what rosaniline, in its formulation on paper, meant to Emil Fischer in 1879 and what it meant to R.B. Woodward in 1979 (the year of his death) and thus knowledge about this single signified, rosaniline, could accumulate gradually; but new layers of meaning have deposited themselves during the intervening time, so that the same symbols set down on paper elicit from their modern viewer concepts – on reactivities of various sites in the molecule, on synthetic access routes, etc. – which an Emil Fischer demonstrably could not have thought of during his lifetime. In other words, the structural formula is not only a representation of an invisible entity, it is also, like the phonemes assembled into speech, liable to creative use. This feature, which the language of chemistry shares with natural language, makes it richer than a mere nomenclature. 17

3. Orthography

Formulaic orthography, during the transition period I am focussing on (1865-1905), had to answer a dual need: that for stabilization of chemical graphemes within agreed-upon conventions; and that for observance of chemical syntax, valence theory especially. I wish to focus here on the no less important inter-related issues of heteroatoms in a formula and of the proper orientation of a formula on the printed page.

So-called “heteroatoms” are, as the reader will recall, atoms in an organic molecule other than carbon or hydrogen. The reference here is of course to hydrocarbons as so-called “parent molecules”, with alcohols, amines, halides, etc. being viewed as derivatives. As organic chemistry developed during the nineteenth century and because – this is a crucial point – synthetic organic chemistry was driven then to a very large extent by analytical organic chemistry, the focus was predominantly on carbon-rich and hydrogen-rich molecules.

This was true of artificial molecules too. Even though the dye industry, with its explosive growth in the 1860s, came to be responsible for proliferation of man-made compounds, nevertheless these novel constructs were given compositions similar to those of natural substances. The elements carbon and hydrogen predominated in constitutional formulas. The other elements, starting with nitrogen and oxygen, in general were a lesser part of the composition, whether in number of atoms or in mass. Hence, it became “natural” to view them as special, as “heteroatoms” – historians would be well advised to take a closer look at this concept as it emerged and took root.

As an aside, let us note in passing that, since the valence four of carbon exceeds those of most other elements present in organic molecules (oxygen two, sulfur and selenium likewise, nitrogen three, phosphorus three or only occasionally five, etc.), a logical consequence of valence theory might have been a totally different historical course, with chemists synthesizing carbon-poor rather than carbon-rich molecules. As another short parenthesis, let us also remark that, had chemistry set itself on a purely combinatorial course without constant reference to natural products, then formulas such as C5H13ON, i.e. with one lone nitrogen and one lone oxygen against a total of 18 carbons and hydrogens, would not necessarily have become much more frequent than formulas such as say C3H5O3N3.
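
The contrast between the two kinds of formula can be checked mechanically by tallying carbon and hydrogen against everything else. The following Python sketch is offered only as an illustration; the function names are hypothetical, and the parser assumes flat compositional formulas without parentheses.

```python
import re


def atom_counts(formula):
    """Parse a flat compositional formula such as 'C5H13ON' into element counts."""
    counts = {}
    # Each match is an element symbol followed by an optional subscript.
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)
    return counts


def carbon_hydrogen_vs_hetero(formula):
    """Return (carbons plus hydrogens, heteroatoms) for a compositional formula."""
    counts = atom_counts(formula)
    ch = counts.get("C", 0) + counts.get("H", 0)
    hetero = sum(n for s, n in counts.items() if s not in ("C", "H"))
    return ch, hetero


for formula in ("C5H13ON", "C3H5O3N3"):
    print(formula, carbon_hydrogen_vs_hetero(formula))
```

In the first formula the heteroatoms are a small minority; in the second they nearly match the carbons and hydrogens in number, which is the kind of composition that a purely combinatorial chemistry would have made commonplace.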

In any case, the notion of a heteroelement came to be established early on, implicitly or explicitly. Its presence is a given in formulas, almost from their inception. Let me give an example: on p. 524 of the already referred-to 1883 Berichte paper by Liebermann and Paal, 15 one finds a representation of the transformation of an oxypropylpropylamine into 3-methyl piperidine (in modern terminology). The former molecule can be written H3C-CH2-CH2-NH-CH2-CHOH-CH3 and, of course, the latter molecule is cyclic, with a nitrogen-containing six-membered ring.

How do Liebermann and Paal depict both these molecules? By making sure that the nitrogen heteroatom, i.e. the important visual clue for pattern recognition, be seen as central. Thus, they place it on the horizontal line of text, on a median of their chemical equation. The nitrogen heteroatom thus takes pride of place in both formulas: that of the starting material, as well as that of the product, is set above and below this central nitrogen.

Nowadays, the convention has changed. If one were to find such a piperidine ring in a modern article, published say in JACS or in Angewandte Chemie, it would be printed vertically rather than horizontally, with the nitrogen atom at six o’clock rather than at nine.

Which brings up the related point of the spatial orientation of a formula, also an important arbitrary convention for quick recognition, identification and retrieval. Just as the face of a clock or the cardinal points on a geographic map early on during Modern Times, came to be placed as we know them – North and 12 up, East and 3 to the right, South and 6 down, West and 9 to the left – likewise chemical formulas became typeset in stereotyped manner, making for easier pattern recognition. The important documents to the historian here are the typographical instructions to the printers, internal to printing shops. One may safely surmise, given the leading role of German chemistry, both academic and industrial during the period I am focussing on, that such typographical rules originated in Germany and were then copied by typesetters elsewhere (Britain, France, United States, …).

4. The question of representation

Because of the passing of historical time, and because of the ensuing contextual change and reframing of questions, no longer can we see the formulas of the 1880s as they appeared to contemporaries. An effort of imagination is needed, lest Whig history compel the mind to ignore the paradigmatic shifts in favor of continuity. It would take the deft hand of the professional historian to paint the picture of what a formula meant, say to Emil Fischer, with all of the subtle hues, with all of the erudite documentation too, in order to convey the long bygone concepts.

My first point here will be to reiterate the quotation of Remsen’s, at the beginning of this text: “The formulas are but the condensed expressions of the conclusions which are drawn from the reactions.” A house stands to its builders (architect, contractor, masons, carpenters, electricians, plumbers, painters, etc.) as the enduring memory of a job: a sequence of stages in its construction, varied difficulties and incidents, on-the-spot improvisations and fixes to remedy a host of practical problems that suddenly appeared. In like manner, a structural formula spelled out to its proponent a historical account of how it came to be, of how it had been slowly and carefully wrought. A formula was the sum total of the work, of the practical operations, of the inter-relating to already known compounds, which had gone into its elucidation.

My second point refers to the nature of representation, as embodied in a chemical formula, such as we find for instance in a 1900 Berichte paper. 18 Representation has no existence in the absolute, nor in a void. 19 Any representation is both utilitarian and self-referential. Since any representation is purposeful, to detach it from its original practical use is a methodological bias as reprehensible as removing it from its historical context.

Actually, the original use was indeed part and parcel of the historical context. For instance, the naphthoquinone structures depicted in the 1900 paper by Kehrman and Kramer 10 are quite systematically items in chemical equations. The chemical transformations narrated in this paper obey mass conservation, which the reader can satisfy him/herself about from these formulas at a glance. The other, primary role of these formulas is to indicate the nature of the transformation. For instance, aniline adds to a quinone ring; following which it is transferred to another quinone ring. The formulas condense such event-by-event narratives. And they are tools in the knowledge-building community of chemists. 20

The question of accurate representation, as we say nowadays, was a nonissue. Structural formulas did not purport to be this paradox, images of the microscopic invisible. Their textual status, which has already been emphasized, masked any iconic intent or content. The latter has become evident to us only in hindsight. This explains the seemingly incoherent attitude of chemists in the late nineteenth century, withholding their belief in the existence of atoms while at the same time making routine, daily use of structural formulas.

5. Connectors

During the period under study, 1865-1905, a variety of typographical devices are used to denote what we term nowadays bonding. The presence of a connection between atoms in a molecule can be indicated with, rather rarely, a solid line; most often, with a hyphenated line or with a dot. The latter usage endured in the publications of the British Royal Society of Chemistry for a good part of the twentieth century.

We witness today a rather similar situation, where the URLs for Web pages also use the hyphen and the dot as connectors, with the two complementary roles of union and separation.

To return to late nineteenth-century chemical formulas, one should resist jumping to the simplistic conclusion of equating such connectors with a chemical bond. Firstly, and in rather obvious manner, recourse to dots and hyphens had in a paper such as the 1879 contribution by Fischer and Fischer 12 or the 1883 publication by Liebermann and Paal 15 the role of a rhetorical disclaimer: “this is merely a symbolical role, it should not be construed literally as having an iconic role,” these tell us.

Secondly, these connectors originate in and derive from typographical textual markers. Thus, their meaning is akin to that of their textual relatives, i.e. each such connector stands for both a separation and a linkage of the two symbols thus assembled. When we see for instance the segment CH2 – . – CH3, the reading demanded by such a practice at the end of the nineteenth century was “we believe to have established the joint presence of a methylene group and of a methyl group, i.e. of an ethyl radical, similar to that in ethyl alcohol.” This is not to be confused with the (present, year 2000) meaning “the methylene and methyl groups are chemically bonded to one another.”

Thirdly, and in consonance with their textual typographical origin, these connectors are primarily links between units of discourse. They serve to put together the subject matter of what is being talked about, in a sequence reflecting the tale unfolding in the text proper.

Which is not to deny the existence of a very strong historical continuity. Yes, such connectors are the ancestors to chemical bonds. Yes, organic chemists in the 1870s and 1880s conceptualized (because they had a need to) the first rudiments of covalent bonding between atoms which physicists were able to explain satisfactorily to themselves only much later, in the 1930s with the advent of quantum theory. Yes, chemical science is cumulative to some extent and each stage in representation serves in turn as a platform for another representation, as an instance of bootstrapping.

6. Groups

In pattern recognition, the mind is quick to recognize familiar elements. Thus, we see a face in any oval shape inscribed with a pair of horizontal lines in the middle (eyes), another horizontal line at the bottom (mouth), and a vertical line in-between (nose). Likewise, structural formulas presented chemists with recurring groups of atoms.

It soon became apparent that not only were such groups – like methyl CH3, allyl CH2-CH-CH2, phenyl C6H5, benzyl CH2C6H5 – occurring frequently in molecules but also that, to at least a first approximation, they were invariant in their properties. Hence, groups of atoms were providing chemical science in the 1860s and 1870s with equivalents to the invariants which physics was endowing itself with during the same period (such as energy, mass, electric charge, …), and which the new science of thermodynamics had made into signs of a mature science. 21 As a consequence, groups of atoms in the structural formulas of chemistry became quickly labeled with shorthand notations such as C6H5, C6H4, …, standing in these cases for a benzene ring with a given substitution pattern.

A chemistry article such as the already referred-to Fischer and Fischer 1879 article on the rosaniline dye 12 (a component of fuchsine) shows on p. 2347 monosubstituted benzene rings as C6H5, disubstituted benzene rings as C6H4, dimethylamino groups as N(CH3)2. Furthermore, its authors refer explicitly to Atomgruppe, i.e. to groups of atoms. In this respect, structural formulas, with their protracted appearance in the 1850s and 1860s, recapitulated and embodied the theory of radicals and its earlier triumph: 22 one could associate groups of atoms with a function, or functionality, i.e. with a set of predictable observables. In other words, a group of atoms could be taken as the signature for membership of a new compound within an already existing class of organic molecules. The hydroxyl group OH characterized alcohols, the CHO group characterized aldehydes and the carboxyl group CO.OH or COOH characterized organic acids, whereas the phenyl group C6H5 was typical of a benzene derivative. 23

A shorthand notation is at once a convenience, an economy in ink,24 and an encouragement to lazy thinking. Indeed any such shorthand notation is pregnant with the future raising of scientific questions. The phenyl rings, which Emil Fischer saw and treated in 1879 as building blocks in the molecular architecture of organic molecules, these syllables in chemical words, would ultimately pose the question of their raison d’être, which Linus Pauling, Erich Hückel and a few others during the 1930s identified as stemming from the admittedly vague and hard-to-define notion of an aromatic stabilization. Likewise, the CH2 methylene groups, which to nineteenth-century organic chemists were mere, rather dull modules in organic molecules, would raise in the 1940s and 1950s, for biochemists at first, the question of the possible nonequivalence of the two hydrogen atoms, with all the attendant consequences for advances in the understanding of both chirality and enzymatic activity.

Often, our predecessors during the period 1865-1905 would not bother to indicate double bonds explicitly. Aromatic rings are drawn as regular hexagons (only later did they mistakenly come to be typeset as elongated hexagons) with no indication of the three oscillating double bonds. An example is the already referred-to work on transamination of naphthoquinones. 10

Conclusion

This contribution has provided support, in the form of empirical evidence, for my earlier equating, in La parole des choses, 15 of a structural formula with a word. Chemical formulas during the first decades of their existence in print were indeed often treated as text: they were typeset as if they were an integral part of sentences, and they were often both preceded and followed by punctuation marks such as commas and periods. Only later and gradually did they assume iconic status and move, illustration-like, into blank spaces on the page. But a bit of careful scrutiny reveals that these formulas, far from having been excised from the text – as later became modern usage – still bear in the 1870s and 1880s punctuation marks, as a sort of tattoo from their textuality, still integral and very much alive at the turn of the twentieth century.

An irony of history is the occasional rewinding of the reel, when history reverses itself. In 1949, i.e. in the heyday of pictorial molecular formulas, William J. Wiswesser answered the needs of the then-nascent computer science and turned back the clock, returning formulas to being sets of alphanumerical symbols aligned in sequence, i.e. textual pieces. His linear notation was very influential, but only for a relatively short time. 25

A related point, also made here, is that during the period 1865-1905, which can be viewed as one of consolidation, a structural formula was a map, before it became the representation of a shape. In spite of Ampère having introduced as early as 1814, in an important letter to Berthollet, the seminal notion of the “representative shape” of a molecule,26 this concept was very slow both in being accepted and in becoming associated with the structural formula for a molecule. This helps to explain the indifference with which the papers by Sachse 27,28 and by Mohr 29,30 on the chair and boat forms of cyclohexane were greeted, at the end of the nineteenth and the beginning of the twentieth century. Geometry was very slow in its colonization of structural formulas. Topology antedated it.

Like any other map, the molecular formula recapitulates the process of its own discovery and establishment. I have not discussed the complementary issue of the history of flow charts in chemical publications, which I shall take up elsewhere. Suffice it to say that the two main types of chemical maps are flow charts and molecular formulas. To assert that the former are narratives while the latter are ideograms is to overstate their contrast. As we saw, molecular formulas are also a kind of narration. Furthermore, a hybrid of the two types of maps appeared much later, starting in the 1940s and 1950s: insertion of curved arrows in a molecular formula, as pioneered by Robinson and by Ingold for depiction of a plausible reaction mechanism by denoting the electronic flow and redistribution, would serve to endow static molecular formulas with the Time Arrow which vectorializes flow charts – themselves a generalization of the chemical equation.

Chemical texts, truly, are akin to a slide show as presented by art historians. A chemistry article is a linear discourse/argument punctuated with seemingly iconic representations, whose purpose is one of both illustration and synonymy. The molecular formula has an indexical role; it also has the recapitulative function of summarizing earlier work. Insertion of a formula ipso facto historicizes the text, places it in the context of a collective striving towards some achievement: to complete the first synthesis of indigo; to elucidate the constitution of fuchsine; to investigate the quinone-hydroquinone interconversion, as in some of the instances we referred to.

As a direction for future historical investigation, I submit that looking for parallels to the detextualization-retextualization (Wiswesser) of molecular formulas in the advertising industry (slogans and logos), in movies (use of titles in silent films, invention of animated cartoons), and in the publishing industry (pioneering cartoons such as Georges Colomb’s Le sapeur Camember (1890-1896) and L’idée fixe du savant Cosinus (1893-1899), under the pseudonym Christophe) would contribute both to the history of printing and to the sociology of knowledge.

References

1 Nye, Mary Jo. Before Big Science. The Pursuit of Modern Chemistry and Physics 1800-1940. New York and London: Twayne-Prentice Hall, 1996.
2 Verkade, Pieter E. A history of the nomenclature of Organic Chemistry. Delft: Delft University Press, 1985.
3 Brock, William H. The Norton history of chemistry. New York: W. W. Norton, 1992.
4 Brock, William H. Justus von Liebig: The Chemical Gatekeeper. Cambridge: Cambridge University Press, 1997.
5 Rocke, Alan J. Chemical atomism in the nineteenth century: From Dalton to Cannizzaro. Columbus OH: Ohio State University, 1984.
6 Wurtz, Charles-Adolphe. La théorie atomique. Paris: G. Baillière, 1879.
7 Wood, Denis. The Power of Maps. New York: The Guilford Press, 1992.
8 Remsen, Ira. Organic Chemistry. Fifth revision ed., Boston: D.C. Heath, 1909, p. 15.
9 Remsen, Ira. Organic Chemistry. Fifth revision ed., Boston: D.C. Heath, 1909, p. 17.
10 Kroeber, Alfred. Anthropology. New York: Harcourt, Brace & World, 1948, p. 510.
11 Napoli, Donna Jo. Linguistics. New York: Oxford University Press, 1996.
12 Fischer, Emil and Otto Fischer. Berichte 12 (1879): 2344-2353.
13 Bamberger, E. and F. Eiger. Berichte der Deutschen Chemischen Gesellschaft 36 (1903): 1611-1625.
14 Barthes, Roland. “Rhétorique de l’image.” In L’obvie et l’obtus. Essais critiques III, ed. Roland Barthes. pp. 31-32. Paris: Le Seuil, 1982.
15 Liebermann, C. and C. Paal. Berichte 16 (1 1883): 523-534.
16 Hacking, Ian. “Experimentation and Scientific Realism.” In Scientific Realism, ed. Jarrett Leplin. pp. 154-172. Berkeley: University of California Press, 1984.
17 Laszlo, Pierre. La parole des choses. Collection Savoir: Sciences, Paris: Hermann, 1993.
18 Kehrmann, F. and Otto Kramer. Berichte der Deutschen Chemischen Gesellschaft 33 (3 1900): 3074-3086.
19 Hoffmann, Roald and Pierre Laszlo. “Representation in Chemistry.” Angew. Chem. Int. Ed. Engl. 30 (1991): 1-16.
20 Kozma, Robert, Elaine Chin, Joel Russell, and Nancy Marx. “The Roles of Representations and Tools in the Chemistry Laboratory and Their Implications for Chemistry Learning.” Journal of the Learning Sciences 9 (2 2000): 105-143.
21 Brush, Stephen G. The Kind of Motion We Call Heat: A History of the Kinetic Theory of Gases in the 19th Century. Dordrecht: North-Holland, 1976.
22 Brooke, John Hedley. Thinking About Matter. Studies in the History of Chemical Philosophy. Aldershot, Hampshire: Variorum, 1995.
23 Rocke, Alan J. The Quiet Revolution: Hermann Kolbe and the Science of Organic Chemistry. California Studies in the History of Science No. 11, Berkeley CA: University of California Press, 1993.
24 Tufte, E.R. The Visual Display of Quantitative Information. Cheshire CT: Graphics Press, 1983.
25 Smith, E. G. The Wiswesser Line-Formula Notation. New York: McGraw-Hill, 1968.
26 Ampère, André-Marie. “Lettre de M. Ampère à M. le comte Berthollet, sur la détermination des proportions dans lesquelles les corps se combinent d’après le nombre et les dispositions respectives des molécules dont leurs parties intégrantes sont composées.” Ann. de Chim. et de Phys. XC (1814): 43.
27 Sachse, H. “Über die geometrischen Isomerien der Hexamethylenderivate.” Berichte der Deutschen Chemischen Gesellschaft 23 (1890): 1363.
28 Sachse, Hermann. “Über die Konfigurationen der Polymethylenringe.” Zeitschrift für physikalische Chemie 10 (1892): 203-241.
29 Mohr, Ernst. “Die Baeyersche Spannungstheorie und die Struktur des Diamanten.” Journal für Praktische Chemie 98 (2 1918): 315-353.
30 Mohr, Ernst. “Zur Theorie der cis-trans-Isomerie des Dekahydro-naphthalins.” Berichte der Deutschen Chemischen Gesellschaft 55 (1922): 230-231.