About Bunadas

Bunadas is being developed gradually on a part-time basis by Caoimhín Ó Donnaíle at Sabhal Mòr Ostaig, the Gaelic-medium higher-education college in Skye in Scotland.

History and Acknowledgments

Bunadas originated from the ideas, diligence and enthusiasm of a student from Wales, Dilys Powell, who completed an honours BA at SMO between 1999 and 2003. Dilys first collected in a spreadsheet nearly 400 Welsh words together with what she judged to be their Scottish Gaelic cognates. By working through Cornish and Breton dictionaries, she added to this about 1000 Cornish-Welsh cognate pairs (both Kernewek Kemmyn and Kernowek Ünys) and about 650 Breton-Welsh cognate pairs. Caoimhín Ó Donnaíle formed these into an online database, the Stòr-fhaclan Co-dhàimheil Ceilteach or “Celtic Cognates Database”. Brian Stowell kindly donated a list of 1100 Manx words together with their English meanings, and these too were added to the database after matching them with their Gaelic cognates.

The later stages of the work were helped by a £1000 grant from Colmcille, and Dilys was tasked with keying in all 12,000 Modern-Irish–Old-Irish wordpairs listed in the Innéacs Nua-Ghaeilge don Dictionary of the Irish Language by Tomás de Bhaldraithe (1981), which had the potential to provide a bridge from the modern languages to Old Irish. Unfortunately, the primitive, over-simple structure of the Celtic Cognates Database, a single rectangular table of cognate words in many languages, made it very difficult to merge in any new material. Not only was the Innéacs Nua-Ghaeilge data never added to the Celtic Cognates Database, but the database remained almost moribund for over ten years. It was certainly a useful online facility, and the interface was improved in various ways, notably by linking words to online dictionaries via Multidict, but almost no new material was ever added to it.

From about March 2016 the project was revived and rejuvenated with an entirely new underlying structure, and it was given a new and snappier name, “Bunadas”. The new structure was inspired by the programming work Caoimhín Ó Donnaíle had started six months earlier on An Sruth, Rody Gorman’s online database of Scottish Gaelic, Irish Gaelic and English idioms and expressions, funded by Colmcille. The new structure broke away entirely from the old idea of a rectangular table to show the correspondence between expressions (in the case of An Sruth) or words (in the case of Bunadas), different languages being in different columns. Instead the basic units were the expressions or words themselves, each individually labelled by language, and these could be linked with each other to form a network database. The closest links were represented by clusters (of expressions/words), and the computer could “walk” the network to find more distant relatives. Importantly, this new structure allows links to be formed between expressions or words within a language as well as links between languages.
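The idea of words as individually labelled units, grouped into clusters that the computer can “walk”, can be illustrated with a minimal Python sketch. It is not the actual Bunadas schema or code: the class name CognateNetwork, the methods add_cluster and walk, the language codes and the choice of a breadth-first walk are all assumptions made for illustration, and the three clusters are invented sample data built from well-known Celtic words for “sea”.

```python
from collections import defaultdict, deque

# Hypothetical model, not the Bunadas schema: a word is a
# (language code, written form) pair, and a cluster is a set of
# words judged to be closely related.
Word = tuple[str, str]


class CognateNetwork:
    def __init__(self) -> None:
        self.clusters: list[set[Word]] = []
        # For each word, the ids of the clusters it belongs to.
        self.member_of: dict[Word, list[int]] = defaultdict(list)

    def add_cluster(self, *words: Word) -> int:
        """Record a close link between the given words; return the cluster id."""
        cluster_id = len(self.clusters)
        self.clusters.append(set(words))
        for word in words:
            self.member_of[word].append(cluster_id)
        return cluster_id

    def walk(self, start: Word) -> dict[Word, int]:
        """Breadth-first walk from start.

        Returns every reachable word mapped to the number of clusters
        that had to be crossed to reach it.
        """
        distance = {start: 0}
        queue = deque([start])
        while queue:
            word = queue.popleft()
            for cluster_id in self.member_of.get(word, []):
                for neighbour in self.clusters[cluster_id]:
                    if neighbour not in distance:
                        distance[neighbour] = distance[word] + 1
                        queue.append(neighbour)
        return distance


# Illustrative sample data only.
net = CognateNetwork()
net.add_cluster(("cy", "môr"), ("br", "mor"), ("kw", "mor"))  # Brythonic
net.add_cluster(("cy", "môr"), ("gd", "muir"))                # Welsh and Scottish Gaelic
net.add_cluster(("gd", "muir"), ("sga", "muir"))              # Scottish Gaelic and Old Irish

relatives = net.walk(("br", "mor"))
```

Starting from the Breton word, the walk reaches the Welsh and Cornish words through one cluster, the Scottish Gaelic word through two, and the Old Irish word through three. Because a cluster is just a set of labelled words, nothing prevents two words of the same language from sharing a cluster, which is the kind of within-language link a rectangular table with one column per language cannot express.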

The new structure has proved wonderfully flexible and usable. The 12,000 wordpairs from the Innéacs Nua-Ghaeilge were finally merged in, after a lot of work beforehand cleaning up the data. Kevin Scannell independently scanned in the Innéacs Nua-Ghaeilge and did a huge amount of work cleaning up the data, putting it online as Drochaid DIL. Although this duplicated work in a sense, it proved very useful, because the two datasets could be compared to eliminate mistakes in transcribing this hugely important dataset, and Kevin’s work has been of great use to Bunadas. Work has continued ever since, adding new material to Bunadas and expanding the number of languages it covers. It currently (2019-09-28) contains 67,800 words, compared to 6600 in the old Celtic Cognates Database. Wiktionary has been the biggest source of new material. All 630 Proto-Indo-European roots in Wiktionary have been added by hand to Bunadas, together with any derivative words in the Celtic languages, and usually also with many examples of derivative words in Proto-Germanic, German, Old English, Modern English, Latin, French, Ancient Greek, etc. All Proto-Celtic lemmas in Wiktionary have been added, as have all Brythonic lemmas. Sometimes words and etymologies have been gleaned from online fora such as Guto Rhys’s CELTIC LINGUISTICS group on Facebook, and ChronHib posts and blogs. Ken George kindly donated a spreadsheet with 1100 Old Cornish words, together with their Kernewek Kemmyn reflexes and English meanings, and these have all been merged into Bunadas.

Many people have helped in various other ways. Eric Daoudal corrected many mistakes in Breton words. Steve Hewitt advised regarding “retro-endonyms” for extinct languages, such as “Henbrethonic” for Old Breton. Bunadas now has a multilingual interface using a facility originally developed for An Sruth. Pierre Morvan has kindly translated the interface into both Breton and French, and Karin Colsman has translated it into German.

The future

Send comments, ideas and questions to Caoimhín Ó Donnaíle, caoimhin@smo.uhi.ac.uk. I am very open to the idea of cooperation with other individuals and other projects. If anyone would like an SQL dump of the database, just let me know.

2019-10-21 CPD