XSLT with HTML5, CSS and Javascript: transforming Herodotus’ The Histories into HTML

Building on previous posts in the XSLT for the Modern Web series, this post takes a more detailed look at how to use XSLT, Javascript, HTML5 and CSS to produce a basic HTML5 rendering of Herodotus’ The Histories.

XSLT for the Modern Web
May 26, 2024

This text will build upon previous posts on the range of Javascript methods to load XML files and apply a transformation with the Javascript XSLTProcessor API.

I’ve updated this article on 22/06/2024. (It’s always great to revisit and clarify! — I’ve added some CSS styling using Bootstrap 5.2.3 and shown how to use a Google Font in a webpage)

If you’ve not yet seen my previous posts which explore using XSLT for the Modern Web, check these out.

XSLT for the Modern Web

Some of the most interesting XML documents to use with XSLT are Humanities and Historical texts. XML markup and computing methods provide many ways to examine these in terms of their structure, features and text content. For example:

  • Opportunity to link between sections of the text to allow easier navigation between sections and references between sections.
  • Generating simple lists (and not so simple lists) based on instances of persons, places, details
  • Marking connections between persons, places and details
  • Marking up sections of text that provide interesting anecdotes.
  • Generating timelines of events — using mention of dates or key events.
  • Extracting data that can be modelled in a completely different way — for example, mapping places and potentially geographical boundaries.
  • Reforming the text — breaking away from linear reading of a text to exploring it through instances of particular individuals or extracting and filtering the text.

Adding the user interface afforded by HTML5, CSS and modern Javascript Web APIs provides a rich layer for exploring these texts in non-linear ways: through linking, applying different structures and visual formatting.

Humanities texts are often encoded in a markup vocabulary provided by the TEI (Text Encoding Initiative), which seeks to provide a standardised XML vocabulary for marking up Humanities-based textual resources.

Using the TEI XML vocabulary can become quite an academic exercise, since its purpose is to fill a need in Digital Humanities and academic fields. In simple terms, however, just as HTML provides a range of elements to mark up specific structural sections of a webpage (e.g. <h1>, <main>, <section>, <ul>, <i>), so the TEI vocabulary provides similar elements for marking up specific structural sections and features of a Humanities-based text.

To learn all about the TEI, check out:
Learn the TEI

The TEI vocabulary is quite verbose and its usage in different contexts can be subjective, but essentially it reduces the XML markup to a set of elements that describe the metadata and the structure of the text (i.e. sections, paragraphs), and that allow for markup of features of the text itself, which can be linguistic but can also be places, people and events.

Some interesting Humanities XML files that can be used to experiment with XSLT can be found via the Perseus Digital Library website.

A key text that is made available from the Perseus Digital Library is The Histories by Herodotus.

Perseus makes this text available under a Creative Commons Attribution-ShareAlike 3.0 United States License.

This version of The Histories is an English translation by A. D. Godley. Cambridge. Harvard University Press. 1920. The Histories is a good example to use as it includes mention of a range of persons, events, places, notes and anecdotes. The edition by Godley also includes many notes which add to the richness of a read.

The Herodotus XML document in question is marked up as TEI P4, which was introduced in June 2002. The current version of TEI is P5, which was released in November 2007. It’s not always necessary to update markup as long as the XML works for what you need to do and is applied consistently to the XML document. In this case, we only want to make a simple rendering of the text, and given that the structure of the document is simple, whether the markup is P4 or P5 is irrelevant. If, on the other hand, we wanted to apply an XSLT stylesheet which specifically expected P5 elements to be present, this would be a problem, as some elements of P5 would not be present in the XML document.

(I just mention this in case you view any other TEI XML document and note slight differences — if the transition from P4 to P5 interests you, here is a link to some explanation on migration. Note that as well as TEI P5 there is also a more condensed version of P5 called TEI Lite)
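If you are unsure which generation of TEI a file uses, the root element is usually a good hint. The following is a minimal sketch, not part of the original page script: the function name guessTeiVersion is my own, and it assumes that a P4 file has a <TEI.2> root with no namespace, while a P5 file has a <TEI> root in the http://www.tei-c.org/ns/1.0 namespace.

```javascript
// Sketch: guess the TEI generation from the name and namespace of the root element.
// Assumption: P4 roots are <TEI.2> (no namespace); P5 roots are <TEI> in the TEI namespace.
const TEI_P5_NAMESPACE = "http://www.tei-c.org/ns/1.0";

function guessTeiVersion(rootName, namespaceURI) {
  if (rootName === "TEI.2") return "P4";
  if (rootName === "TEI" && namespaceURI === TEI_P5_NAMESPACE) return "P5";
  return "unknown";
}

// In a browser, once the XML has been parsed with DOMParser:
//   const root = xml.documentElement;
//   guessTeiVersion(root.localName, root.namespaceURI);
console.log(guessTeiVersion("TEI.2", null)); // prints "P4"
```

A check like this could be used to warn early when a stylesheet written for one generation is pointed at a document from the other.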

For the purpose of this post I’ll provide an overview of the structure of the XML document made available. To view the XML itself, you’ll have to follow the links provided above.

Summary of the structure of the Herodotus The Histories TEI markup

<TEI.2>

The top most element is <TEI.2> (this is a P4 element, but as mentioned, it doesn’t matter that we are using an older version of TEI in this context).

<teiHeader>

The <teiHeader> element contains a variety of TEI elements to record the metadata to describe the document, editors, authorship, usage etc.

<text><body>

In this specific example, next come the <text> and <body> elements, which are used to enclose the text content.

For usage in TEI P5 see <text>
For usage in TEI P5 see <body>

<div1>

To provide structure to contain each book, a <div1> element is used, with the attribute type set to Book and the attribute n denoting the book number.

<div1 type="Book" n="1" org="uniform" sample="complete">

For usage in TEI P5 see <div1>

<p>

Within this is a <p> element. To me this does not seem to be completely correct usage, but for the purpose of this example it does not matter.

For usage in TEI P5 see <p>


<milestone>

Milestone elements are used to mark chapters, sections and paragraphs.

<milestone n="1" unit="chapter" />
<milestone n="0" unit="section" />
<milestone unit="para" />

Note that in this example these elements are empty: they mark a point within the surrounding ‘mixed’ content rather than enclosing text content, as an element normally would in HTML. The use case in this example is like so:

<milestone unit="para" /> [ text content ]

NOT like this:

<milestone unit="para"> [ text content ] </milestone>

I mention this point now because whether or not text content is enclosed by elements matters to how the XSLT stylesheet is prepared.
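To illustrate why this matters, here is a small sketch of what any processor has to do when a marker does not enclose its text: walk the siblings in order and attach each run of text to the most recent chapter milestone. The plain objects and the groupByChapter name are hypothetical stand-ins for the child nodes of a TEI <p>, not the DOM API or anything from the Perseus file.

```javascript
// Sketch: group text under the chapter milestone that precedes it.
// Each object stands in for one child node of a TEI <p> (illustrative only).
function groupByChapter(nodes) {
  const chapters = [];
  let current = null;
  for (const node of nodes) {
    if (node.kind === "milestone" && node.unit === "chapter") {
      // An empty milestone starts a new group; it holds no text itself.
      current = { n: node.n, text: "" };
      chapters.push(current);
    } else if (node.kind === "text" && current) {
      // Text belongs to whichever chapter milestone came before it.
      current.text += node.value;
    }
    // Other milestones (section, para) are ignored in this sketch.
  }
  return chapters;
}

const sample = [
  { kind: "milestone", unit: "chapter", n: "1" },
  { kind: "milestone", unit: "para" },
  { kind: "text", value: "[ text of chapter 1 ]" },
  { kind: "milestone", unit: "chapter", n: "2" },
  { kind: "text", value: "[ text of chapter 2 ]" },
];
console.log(groupByChapter(sample));
```

Had the milestones enclosed their text, no such bookkeeping would be needed: matching the element would be enough to get its content.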

For usage in TEI P5 see <milestone>

<name>

After the <milestone unit="para"/> you’ll find mixed content XML. Here, as well as text content, there are <name> elements, both for persons and for tribes or other ethnic groups which are mentioned in the text. The use of the attribute type with a value of “pers” is non-standard, but regardless it has been applied consistently through the document. (In TEI P5 you’ll find “pers” is actually now defined as “person”).

<name type="pers">[text]</name>
<name type="ethnic">[text]</name>

For usage in TEI P5 see <name>

<placeName>

Where place names are mentioned, a <placeName> element is used within a <name> element.

<name>
<placeName>[text]</placeName>
</name>

For usage in TEI P5 see <placeName>

<note>

Lastly, there are a couple of instances of <note> where the editors have provided elaboration on specific parts of the text.

<note>[text]</note>

For usage in TEI P5 see <note>

Structure at a glance

In summary, then, the XML document takes the below structure.

<TEI.2>
<teiHeader>
...[metadata]
</teiHeader>
<text>
<body>
<div1> - book
<p>
<milestone n="1" unit="chapter" /> - chapter
<milestone n="0" unit="section" /> - section
<milestone unit="para" /> - paragraph
...[mixed content text, <name>, <note>]
</p>
</div1>
</body>
</text>
</TEI.2>

Making an XSLT document

The first step in making an XSLT stylesheet is to consider both the XML text you want to transform (i.e. the source document) and the output you want to obtain. In this example, the focus is on using XSLT within a website context, so we want to achieve an HTML-based output.

As we have seen, the XML document divides The Histories into books, sections and paragraphs. The most natural transformation would be to keep these divisions in place and to provide an HTML output with a heading for each division, using the HTML5 <main>, <section> and <article> elements to display the text content of the books. For now though, as a simple example, let’s just use the trusty HTML <div>.

The Histories is full of people <name type="pers">, ethnic groups <name type="ethnic"> and places <placeName>, so we want to find a way to make these stand out to the reader.

Update 22/06/2024: In this example we will use Bootstrap (v.5.2.3) CSS framework to provide styling for the HTML output. Bootstrap 5.2.3 can easily be included in your HTML page with a simple <link> within your HTML <head>.

...
<link href="https://cdn.jsdelivr.net/npm/bootstrap@5.2.3/dist/css/bootstrap.min.css" rel="stylesheet" integrity="sha384-rbsA2VBKQhggwzxH7pPCaAqO46MgnOM80zW1RWuH61DGLwZJEdK2Kadq2F9CUG65" crossorigin="anonymous">
...

In the <head> I have also decided to bring in a nice font from the Google Fonts collection — in this case I have chosen EB Garamond 400. Picking a nice font can really transform a document and help with the readability of the text.

...
<!-- EB Garamond 400 -->
<link rel="preconnect" href="https://fonts.googleapis.com">
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
<link href="https://fonts.googleapis.com/css2?family=EB+Garamond:ital@0;1&display=swap" rel="stylesheet">
...

XSLT stylesheet

Matching on each instance of a book is easy. We’ve seen that the <div1> element has been used to separate the books, so matching this element provides an easy way to break the HTML output into books. We can use the XPath "div1[@type='Book']" to perform the matching.

<xsl:template match="div1[@type='Book']">
<div class="row">
<div class="col-md-12 px-md-5">
<h2>Book:<xsl:value-of select="@n"/></h2>
<xsl:apply-templates/>
</div>
</div>
</xsl:template>

<p>

The <p> element encloses the text, so we will match against this first. (As noted, the <milestone> elements do not enclose text, so we cannot use these.)

<xsl:template match="p">
<p>
<xsl:apply-templates/>
</p>
</xsl:template>

Chapters: milestone[@unit='chapter']

<xsl:template match="milestone[@unit='chapter']">
<xsl:variable name="n" select="@n"/>
<span class="chapter" data-chapter="{$n}">
Chapter: <xsl:value-of select="$n"/>
</span>
</xsl:template>

Persons and places — HTML5 <mark>

Now let’s consider how to bring out persons and places. We can easily use the HTML element <mark> to highlight these.

<xsl:template match="name[@type='pers']">
<mark class="person">
<xsl:apply-templates/>
</mark>
</xsl:template>

<xsl:template match="name[@type='ethnic']">
<mark class="ethnic">
<xsl:apply-templates/>
</mark>
</xsl:template>
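Because every name ends up in a <mark> with a predictable class, the transformed page can also be queried afterwards, for example to build one of the simple lists of persons mentioned at the start of this post. The sketch below is an illustration only: tallyNames is a hypothetical helper, and the sample names are placeholders rather than counts taken from the text.

```javascript
// Sketch: count how often each name occurs and sort the result.
function tallyNames(names) {
  const counts = new Map();
  for (const raw of names) {
    // Normalise whitespace so " Cyrus " and "Cyrus" count as the same name.
    const name = raw.trim().replace(/\s+/g, " ");
    if (name === "") continue;
    counts.set(name, (counts.get(name) || 0) + 1);
  }
  // Most frequent first; ties in alphabetical order.
  return [...counts.entries()].sort(
    (a, b) => b[1] - a[1] || a[0].localeCompare(b[0])
  );
}

// In the browser, the input could be collected from the transformed output:
//   const names = Array.from(document.querySelectorAll("mark.person"),
//                            (el) => el.textContent);
console.log(tallyNames(["Croesus", " Cyrus ", "Croesus"]));
```

The same function would work for mark.ethnic by changing the selector.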

Notes- HTML5 <mark>

Notes are just that — notes which explain the text, added by the editor.

<xsl:template match="note">
<mark class="note">
<xsl:apply-templates/>
</mark>
</xsl:template>

XSLT — herodotus.xsl

Let’s put this all together now into a single XSLT stylesheet.

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="html" encoding="UTF-8" indent="yes"/>

<xsl:template match="/">
<div class="container-fluid">
<div class="row ps-md-5">
<h1>The Histories, Herodotus</h1>
</div>
<xsl:apply-templates/>
</div>
</xsl:template>

<xsl:template match="teiHeader"/>

<xsl:template match="div1[@type='Book']">
<div class="row">
<div class="col-md-12 px-md-5">
<h2>Book:<xsl:value-of select="@n"/></h2>
<xsl:apply-templates/>
</div>
</div>
</xsl:template>

<xsl:template match="p">
<p>
<xsl:apply-templates/>
</p>
</xsl:template>

<xsl:template match="milestone[@unit='chapter']">
<xsl:variable name="n" select="@n"/>
<span class="chapter" data-chapter="{$n}">
Chapter: <xsl:value-of select="$n"/>
</span>
</xsl:template>

<xsl:template match="name[@type='pers']">
<mark class="person">
<xsl:apply-templates/>
</mark>
</xsl:template>

<xsl:template match="name[@type='ethnic']">
<mark class="ethnic">
<xsl:apply-templates/>
</mark>
</xsl:template>

<xsl:template match="note">
<mark class="note">
<xsl:apply-templates/>
</mark>
</xsl:template>

</xsl:stylesheet>

CSS

Let’s look at the CSS now. As explained, we have used the Bootstrap (v.5.2.3) CSS framework to provide styling for the HTML output, so there is really quite little to add. In the example the Bootstrap CSS is applied first, so there are just a few tweaks to the headings, and I’ve applied background colours and bottom borders to the mark elements.

I’ve also used the :hover pseudo-class to set the background colours on the mark elements when the reader moves their cursor over them.

Note that the colours (the custom properties starting with ‘--bs-’) are all built-in Bootstrap colours.

body{
font-family: "EB Garamond", serif;
font-size: 1.6em;
line-height:1.6em;
}

h1{
display:block;
margin-top:20px;
padding-left:0px;
font-weight:700;
}

h2{
display:block;
margin-top:20px;
}

span.chapter{
font-weight:600;
display:block;
margin-top:20px;
}

mark.person{
border-bottom: 2px solid var(--bs-green);
}

mark.person:hover{
background-color: var(--bs-green);
color:#FFFFFF;
}

mark.ethnic{
border-bottom: 2px solid var(--bs-orange);
}

mark.ethnic:hover{
background-color: var(--bs-orange);
color:#FFFFFF;
}

mark.note{
border-bottom: 2px solid var(--bs-red);
}

mark.note:hover{
background-color: var(--bs-red);
color:#FFFFFF;
}

Full HTML page

To apply the XSLT transformation I have used the below HTML document with included Javascript utilising the Fetch API and the relatively new async/await features of modern Javascript.

This Javascript was shown in a previous post on using XSLT for the Modern Web.
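The page below checks for three browser APIs one at a time before doing any work. As a compact sketch of the same idea (the names REQUIRED_APIS and missingApis are my own and do not appear in the page script), the checks can be expressed as a single list:

```javascript
// Sketch: report which of the required browser APIs are absent from a scope.
// REQUIRED_APIS mirrors the three checks made in the page script.
const REQUIRED_APIS = ["fetch", "XSLTProcessor", "DOMParser"];

// `scope` defaults to the global object (window in a browser).
function missingApis(scope = globalThis) {
  return REQUIRED_APIS.filter((name) => !(name in scope));
}

const missing = missingApis();
if (missing.length > 0) {
  console.log("Not available in this environment: " + missing.join(", "));
}
```

Passing the scope as a parameter is only a convenience for trying the function outside a browser; in the page itself the default is enough.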

<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>XSLT</title>
<meta name="viewport" content="width=device-width, initial-scale=1">

<!-- Bootstrap 5.2.3 -->
<link href="https://cdn.jsdelivr.net/npm/bootstrap@5.2.3/dist/css/bootstrap.min.css" rel="stylesheet" integrity="sha384-rbsA2VBKQhggwzxH7pPCaAqO46MgnOM80zW1RWuH61DGLwZJEdK2Kadq2F9CUG65" crossorigin="anonymous">

<!-- EB Garamond 400 -->
<link rel="preconnect" href="https://fonts.googleapis.com">
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
<link href="https://fonts.googleapis.com/css2?family=EB+Garamond:ital@0;1&display=swap" rel="stylesheet">
<style type="text/css">

body{
font-family: "EB Garamond", serif;
font-size: 1.6em;
line-height:1.6em;
}

h1{
display:block;
margin-top:20px;
padding:0px;
font-weight:700;
}

h2{
display:block;
margin-top:20px;
}

span.chapter{
font-weight:600;
display:block;
margin-top:20px;
}

mark.person{
border-bottom: 2px solid var(--bs-green);
}

mark.person:hover{
background-color: var(--bs-green);
color:#FFFFFF;
}

mark.ethnic{
border-bottom: 2px solid var(--bs-orange);
}

mark.ethnic:hover{
background-color: var(--bs-orange);
color:#FFFFFF;
}

mark.note{
border-bottom: 2px solid var(--bs-red);
}


mark.note:hover{
background-color: var(--bs-red);
color:#FFFFFF;
}

</style>
</head>
<body>

<script type="text/javascript">

(function(){

function fetchLoad(){

if(!('fetch' in window)) {
console.log('Fetch does not appear to be available in this browser. Please try another.');
return;
}

if(!('XSLTProcessor' in window)) {
console.log('XSLTProcessor does not appear to be available in this browser. Please try another.');
return;
}

if(!('DOMParser' in window)){
console.log('DOMParser does not appear to be available in this browser. Please try another.');
return;
}

const xsltProcessor = new XSLTProcessor();
const parser = new DOMParser();

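// Load and import the stylesheet first, then load and transform the XML.
loadFile("data/herodotus.xsl").then(data => {
const xsl = parser.parseFromString(data, "application/xml");
xsltProcessor.importStylesheet(xsl);
// Return the next promise so the XML is only handled once the stylesheet is imported.
return loadFile("data/Perseus_text_1999.01.0126.xml");
}).then(data => {
const xml = parser.parseFromString(data, "application/xml");
const fragment = xsltProcessor.transformToFragment(xml, document);
document.body.appendChild(fragment);
}).catch(error => {
console.log('Looks like there was a problem: ', error);
});

}

async function loadFile(filepath){

const response = await fetch(filepath);
if(!response.ok){
// Stop here rather than passing an error page on to the parser.
throw new Error('Could not load ' + filepath + ' (status ' + response.status + ')');
}
const text = await response.text();
return text;
}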

window.addEventListener("load", fetchLoad, false);
})();

</script>
</body>
</html>
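One possible variation on the loading step, offered as a sketch rather than a replacement for the page above, is to request the stylesheet and the XML in parallel and only transform once both have arrived. The names loadTexts and render are hypothetical, and the fetchFn parameter exists only so the loader can be tried outside a browser.

```javascript
// Sketch: fetch several text files in parallel and fail loudly on HTTP errors.
async function loadTexts(paths, fetchFn = globalThis.fetch) {
  const responses = await Promise.all(paths.map((path) => fetchFn(path)));
  return Promise.all(
    responses.map((response, i) => {
      if (!response.ok) {
        throw new Error("Could not load " + paths[i] + ": " + response.status);
      }
      return response.text();
    })
  );
}

// Browser-only part: transform once both texts are available.
// The file paths are the same ones used in the page above.
async function render() {
  const [xslText, xmlText] = await loadTexts([
    "data/herodotus.xsl",
    "data/Perseus_text_1999.01.0126.xml",
  ]);
  const parser = new DOMParser();
  const processor = new XSLTProcessor();
  processor.importStylesheet(parser.parseFromString(xslText, "application/xml"));
  const xml = parser.parseFromString(xmlText, "application/xml");
  document.body.appendChild(processor.transformToFragment(xml, document));
}

// Only attempt the transformation where the browser APIs exist.
if (typeof XSLTProcessor !== "undefined" && typeof document !== "undefined") {
  render().catch((error) => console.log("Looks like there was a problem: ", error));
}
```

The trade-off is small for two files, but the pattern keeps the ordering explicit: nothing is transformed until every file has loaded successfully.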

Finally, the Output!

Sample of webpage output from Herodotus’ The Histories

The Histories by Herodotus is available via the Perseus Digital Library.

Perseus makes this text available under a Creative Commons Attribution-ShareAlike 3.0 United States License.

I hope you enjoyed this article and feel inspired to play with XSLT to make your own simple XML to HTML transformations!

If you’ve not yet seen my previous posts, check these out.
