Army Committee warrants added

I’m still quite busy with paid work (you can see some of the petitions I’ve been transcribing at British History Online) but I’ve just found time to update the wiki this week. To test the data structures for manuscript texts, I’ve imported a few hundred pay warrants and receipts created by the Army Committee in 1645 and 1646. The Army Committee was a committee of MPs chaired by Robert Scawen, which handled administration and supply for the New Model Army. The warrants and receipts that I’ve imported today are all for buying horses, saddles, and harness. The data originally came from my PhD research but I’ve checked everything against the original documents at Kew and corrected some errors (although I was relieved to find that most of my notes were accurate).

Some examples:

I think I’m now satisfied with the data structures for manuscript texts. I still need to test manuscripts that are divided into sections. After that I want to finish importing authors so I can test books, articles, journals and theses at a bigger scale.

Brief update

This is just a quick roundup of changes to the wiki since the last post in August.

  • there are now pages for every regiment of the New Model Army in the First Civil War.
  • there are pages for every meeting of the Short Parliament, each linked to the location where it was held and to the proceedings at British History Online.
  • you can now search events by date again. It should work properly now, and there’s an option to limit it to specific types of event.
  • search suggestions in the main search box (at the top of every page) now have accent folding as well as case folding, so if you type a character without an accent, such as e, it will also match accented versions of that character, such as é. This makes Gaelic and Welsh names easier to find. For example, if you type ‘sir fon’ it will match ‘Sir Fôn’. To do this I had to hack the TitleKey extension myself, but it was easier than installing ElasticSearch.
  • some more properties have been removed to simplify the data structures:
    • ‘Addressed from’ because there are many documents it doesn’t apply to, the way I tried to use it was too inconsistent, and ‘Mentions’ is good enough for record linkage.
    • ‘Received on date’ as it’s only known in a minority of cases.
    • ‘Has ARCHON ID’ because Wikidata ID and a link to an archive’s own website do everything that is needed.
  • there are pages for a couple of particularly useful books. If you drill down from work level there are links to scans at the Internet Archive.
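
For anyone curious about the accent folding mentioned in the list above, here is a minimal Python sketch of the general idea. It is not the code in my modified TitleKey extension (which is PHP); the function names `fold` and `suggest` and the sample titles are made up for illustration. It decomposes each character, drops the combining accent marks, and lowercases what is left, so that the typed text and the page title are compared in the same folded form.

```python
import unicodedata


def fold(text: str) -> str:
    """Return a case-folded, accent-free key for comparing titles."""
    # NFD splits an accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Keep only the characters that are not combining marks.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def suggest(prefix: str, titles: list[str]) -> list[str]:
    """Return the titles whose folded form starts with the folded prefix."""
    key = fold(prefix)
    return [title for title in titles if fold(title).startswith(key)]


# Hypothetical page titles, used only to illustrate the matching.
titles = ["Sir Fôn", "Sir Gaerfyrddin", "Sir Fflint"]
print(suggest("sir fon", titles))
```

Folding both sides means the original titles can stay untouched, and only the lookup key changes.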

The next thing I want to do is test manuscript texts on a bigger scale. I have some data from my PhD research for warrants paying for horses and saddles for the New Model Army, but I found some anomalies in the data that will need checking against the originals next time I’m at Kew (probably next week). Once I’ve done that, I should be satisfied enough with the data structures that I can start really big imports. I’m already working on data for about 1,000 authors and 1,600 peers and MPs. The method that I used for meetings of the Short Parliament will scale up to the Long Parliament and Protectorate Parliaments quite easily, so I may as well get that done as soon as I can. In practice I might not be able to start these big imports until the end of the year because I’m likely to be busy with paid work, but the wiki should move up to another level and become much more useful next year.

More changes

This week I’ve finished a big overhaul of the wiki. Changes include:

  • there’s now an external identifier for The Scotland, Scandinavia and Northern European Biographical Database (SSNE). This free-to-view database, created by Steve Murdoch and Alexia Grosjean, is a very important source for the earlier careers of many civil war officers.
  • there’s a simple entity to represent blank pages in a manuscript. This is easier to use and less cumbersome than using the full data structures designed for manuscript texts or sections.
  • behind the scenes, some of the properties have been simplified. This won’t make any noticeable difference at the front end but it does affect the RDF output and writing custom queries.
  • documentation on property pages should all be up to date now.
  • battles and sieges have been overhauled yet again, and I think I’m finally satisfied. More details of this below.

Cataloguing SP 28

Lots of people who have researched the British Civil Wars will know of SP 28, also known as the Commonwealth Exchequer Papers, in The UK National Archives. It’s a very important but mostly poorly catalogued collection of financial records of the parliamentarian and Protectorate war effort. One of the main aims of this project is to gradually catalogue and index SP 28. I’ve now started importing catalogue data into the wiki.

Battles and sieges overhauled

In the last post, I said that I was planning to change the data structures and page layouts for battles and sieges. I’ve now done that, which means you can:

  • view a map of all battles and sieges that have pages on the wiki and coordinates entered on their pages.

[Edit 15 August 2019: the form for searching by date has been temporarily removed because it didn’t work properly and it now needs updating to include other types of event. Most of this post is now obsolete because I’ve merged combat events into a new form/template that can also cover other types of event.]

[Edit 20 September 2019: the search events by date form is back and working properly. It now includes an option to limit the search to certain types of event, as well as searching all events.]

Below is more information about the changes I’ve made.

Progress update

This is a quick post about what I’ve been doing recently and what I’m going to do next.

First, I’ve provisionally finalised the data structures and page layouts, but feedback is still welcome (you can comment on the data structure documentation at Google Docs).

The page layouts have changed a bit since the first launch of the wiki. The old ‘Semantic search’ heading has gone, and the links that were under it have been integrated into other sections, so for example everything about sources is under the ‘Sources’ heading, including the link to the query for linked sources. Now that I’ve had a chance to experiment with caches on the live wiki, I’ve found that it can be more efficient to embed query results and maps in a page, which also means you can see them without having to click a link.

Now that the data structures and page layouts are stable (for now) I can start importing batches of data. I’ve finished and tested the Python scripts for generating wiki XML files from CSV. These are based on what I did for Linking Experiences of World War One, but are simpler and more flexible.
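
To give a rough idea of what a CSV-to-wiki-XML script involves, here is a simplified sketch. It is not my actual script: the column names, the `Author` template name and the placeholder ID are all assumptions for illustration, and a real MediaWiki import file also needs the schema attributes on the root element, which are left out here. Each CSV row becomes one page whose text is a single template call.

```python
import csv
import io
import xml.etree.ElementTree as ET


def row_to_wikitext(row: dict, template: str = "Author") -> str:
    """Build a template call from one CSV row (template name is hypothetical)."""
    lines = ["{{" + template]
    for field, value in row.items():
        # The title column names the page, and empty cells are skipped.
        if field != "title" and value:
            lines.append(f"|{field}={value}")
    lines.append("}}")
    return "\n".join(lines)


def csv_to_wiki_xml(csv_text: str) -> str:
    """Convert CSV text into a simplified MediaWiki-style XML string."""
    root = ET.Element("mediawiki")
    for row in csv.DictReader(io.StringIO(csv_text)):
        page = ET.SubElement(root, "page")
        ET.SubElement(page, "title").text = row["title"]
        revision = ET.SubElement(page, "revision")
        ET.SubElement(revision, "text").text = row_to_wikitext(row)
    return ET.tostring(root, encoding="unicode")


# Made-up sample data: "Q0" is a placeholder, not a real Wikidata ID.
sample = "title,wikidata_id,occupation\nExample Author,Q0,\n"
print(csv_to_wiki_xml(sample))
```

Keeping the row-to-wikitext step in its own function means the same loop can be reused for other kinds of page by swapping in a different template.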

The first successful import is a batch of authors. You can see from the authors category that there are now 54 pages for authors. All of these are linked to Wikidata IDs but most of them are not yet linked to editions of their works because I haven’t imported any books yet. Only three of these authors are women. The ratio will improve as I import more authors but it will still necessarily reflect the historical under-representation of women: although there are now lots of women publishing about the civil wars, the field used to be very male-dominated. The first batch might look like an eccentric selection because my priorities for authors are based on two main factors:

  • how much of their work is relevant to this project and has specific named entities as a main subject, or is otherwise particularly important for what I’m doing
  • whether they already have a Wikidata ID

As well as building up more authors and books, early priorities for imports include:

  • historical people who are subjects of published biographies
  • battles and sieges
  • places where battles and sieges happened
  • armies that took part in battles and sieges
  • historic counties of Great Britain