Lisa Otty, EDINA

Over the last few months, I’ve been rushing home from work each Friday afternoon to take part in the online Programming for Humanists course run by Texas A&M University. I was delighted to have the opportunity to take this course (thanks, CAHSS!) because, although my current work is focused on supporting digital scholarship, my academic background is in a traditionally humanistic, non-technical discipline.

Having taught myself a bit about computing and various digital methods, I was keen to get stuck into some programming with guidance from experts. Taught by TAMU’s world-renowned DH team, who know exactly where humanists are coming from, the course was pitched at beginners and focused on introducing the Python programming language and a variety of libraries that offer tools for text manipulation and analysis.

Running for 14 weeks, the course was taught using Bluejeans online meeting software to stream weekly two-hour webinars and PythonAnywhere, a cloud-based Python coding environment, to develop, run and share code. Both worked impressively smoothly, and having everyone use the same coding environment, which our tutor could also access, neatly avoided the cross-platform niggles that often take up valuable time in technical workshops.

Each week began with a short introductory talk, usually explaining background concepts and aspects of how computers operate and communicate with one another (e.g. operating systems, RAM, APIs), followed by an hour and a half or so of coding practice. At first these sessions demonstrated some fundamental programming concepts such as variables, loops, if-statements, lists and dictionaries, but we quickly moved on to importing libraries to perform more complex tasks.

For a book historian like me, the examples and tasks we worked on were perfect: we spent time extracting sections (chapters) and information (titles) from large text files (novels from Project Gutenberg), which helped us to prepare our data for subsequent sessions on topic modelling using NLTK and gensim, OCR using Tesseract, and text analysis using NLTK. Along the way we learned about conventions for writing decent code, good practice in documenting our work, and how to keep a rein on larger and more complex projects by breaking our code into different files that we could import into our scripts.
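To give a flavour of this kind of task, here is a minimal sketch of pulling a title and chapters out of a plain-text novel. It is not the code we wrote on the course: the sample text, the function names and the two regular expressions are my own illustrative assumptions, and real Project Gutenberg files vary enough in their headings that the patterns would probably need adjusting for each book.

```python
import re

# A tiny invented stand-in for a plain-text novel file.
SAMPLE = """Title: An Example Novel
Author: A. N. Author

CHAPTER I

It was a quiet morning.

CHAPTER II

The afternoon was louder.
"""

# Assumption: the title sits on a header line beginning "Title:".
TITLE_RE = re.compile(r"^Title:\s*(.+)$", re.MULTILINE)

# Assumption: each chapter heading is "CHAPTER" plus a Roman or Arabic
# numeral, alone on its own line.
CHAPTER_RE = re.compile(r"^CHAPTER[ \t]+([IVXLC]+|\d+)\.?[ \t]*$", re.MULTILINE)


def extract_title(text):
    """Return the title from the header line, or None if there isn't one."""
    match = TITLE_RE.search(text)
    return match.group(1).strip() if match else None


def split_chapters(text):
    """Return a list of (heading, body) pairs, one per chapter heading found."""
    matches = list(CHAPTER_RE.finditer(text))
    chapters = []
    for i, match in enumerate(matches):
        body_start = match.end()
        # A chapter runs until the next heading, or to the end of the text.
        body_end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        chapters.append((match.group(0).strip(), text[body_start:body_end].strip()))
    return chapters


title = extract_title(SAMPLE)
chapters = split_chapters(SAMPLE)

print(title)
for heading, body in chapters:
    print(heading, "->", len(body.split()), "words")
```

Splitting a novel into (heading, body) pairs like this is the sort of preparation that makes later steps, such as counting words per chapter or feeding chapters to a topic model, much easier to handle.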

The combination of talks, tailored examples, and hands-on programming was really rewarding: I definitely gained a deeper understanding of the machine in front of me, how good code is structured and how programs are built. The pace of the individual sessions was well-judged, but the learning curve over the course was steep and the webinars themselves were quite demanding: trying to follow what someone is doing, reproduce it and troubleshoot when you can’t get something working, all without losing track of the ongoing lesson, required intense concentration!

While we watched our instructor, Bryan, write code, the course often felt more like a broadcast I was following than an interactive webinar. That said, Bryan was excellent and encouraged us to contact him between sessions with any issues or queries. In addition, the course videos were quickly made available along with Bryan’s ‘model’ code, so you could spend time during the week going over things, unpicking problems and comparing your scripts with his.

The next step for me is working out how to apply what I have learnt to my own projects, and getting used to working outside the controlled environment of PythonAnywhere. While seeing how a programmer works and trying to reproduce his code has given me lots of useful insights, I’m looking forward to developing my skills further by writing some scripts of my own!

Resources

Programming for Humanists at TAMU

PythonAnywhere

Bluejeans