Transcription as a ‘moment of contact’ with qualitative data

The process of undertaking stage 1 of the analysis of Mass Observation writers’ responses to the Life Lines directive has led me to reflect on something I often say to participants in my analysis courses – the various ‘moments of contact’ we have with our data during qualitative analysis.
Revisiting data at different times, from different perspectives and for different purposes is characteristic of the iterative nature of qualitative analysis, and must be prized as essential in the development of a valid interpretation. Transcription is a key ‘moment of contact’ that shouldn’t be undervalued.
We all know that transcription is time-consuming. The speed with which we can type, the technology used to facilitate the process and the form of transcript being generated all affect the amount of time it takes, but it’s typically suggested that an hour’s worth of audio recording – say from an open-ended interview – takes between 8 and 10 hours to transcribe. If you’re working with video, perhaps transcribing interactions between participants and the characteristics of the setting, as well as the content of what is being said, then it takes even longer. Voice-recognition software is sometimes thought to offer a solution, but this is unlikely to be the case – see here for the reason why!
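To put that rule of thumb in concrete terms, here’s a tiny sketch. The 8–10× multipliers come from the estimate above; the twelve-interview study is an invented example:

```python
# Rough transcription-time estimator using the rule of thumb quoted
# above: one hour of interview audio takes roughly 8-10 hours to
# transcribe. The multipliers are the estimate from the text, not
# measured values; real times vary with typing speed, technology
# and the level of detail being transcribed.

def transcription_hours(audio_hours, low=8, high=10):
    """Return the estimated (minimum, maximum) transcription hours."""
    return audio_hours * low, audio_hours * high

# e.g. a hypothetical study with twelve one-hour interviews
lo, hi = transcription_hours(12)
print(f"Estimated transcription time: {lo}-{hi} hours")  # 96-120 hours
```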
Several CAQDAS packages, such as ATLAS.ti, MAXQDA, NVivo, QDA Miner, HyperRESEARCH, Qualrus, and Transana (see CAQDAS Networking Project for reviews of these products) provide the ability to analyse audio, video and image data ‘directly’ – i.e. without the need for a written transcript that represents the original data. There are many analytic reasons why this might be useful.
But this technical ability also brings with it the danger of enabling lazy researchers to be lazier. Students often get very excited when they realise that the CAQDAS package they have chosen enables direct analysis of audio-visual material. It’s almost as if you can see them thinking “wow, I don’t have to do that boring transcription any more”.
(Note of clarification here, by the way – I’m NOT saying the technology encourages laziness. Technical affordances do not – and cannot – in and of themselves encourage us into certain practices, because we, as humans, as researchers, are always in control of deciding what purpose to put a technological feature to).
My response to this is: why would you think transcription is boring? Just because something takes time doesn’t mean it’s boring, surely?! You designed and undertook the interviews or focus-groups, observed the settings, designed the open-ended survey questions, whatever. Therefore how can you be bored by transcribing, formatting and preparing the data? It’s the basis of how you will go about the analysis – in fact, it’s an integral part of analysis.
I’m currently teaching my daughter to read. She’s four. It’s taking a while. It’s a process that involves a huge amount of repetition. I’ve spent approximately an hour a day for the past several months on this. Just like when I transcribe I’m faced with having to write out the same or similar passages several times – because research participants often say similar things in response to our interview questions – my daughter and I read the same books several times before moving on to a new one. We do that to consolidate her learning. True, after the third time she’s usually had enough of the story (you could perhaps say she’s ‘bored’ with it), but she’s also pretty chuffed with herself because she realises she can read it more easily and with more fluency the third time than she did the first. She’s ready to progress. My daughter has an older brother, he’s 8. I absolutely love the fact that he is now an accomplished reader and that he voluntarily spends time reading a range of fiction (currently he is reading Harry Potter), and non-fiction (historic, geographic and wildlife encyclopaedias are his favourite) and these days would rather read on his own than to or with me. But his reading skills are in large part a result of the time we spent together repeatedly practicing the core elements of reading that he now unconsciously exercises independently.
Why am I telling you anecdotes about my children’s experiences of learning to read, you may be wondering? Because their experiences are a useful analogy for what we need to do as qualitative researchers with our materials. Just as my son and daughter have – and need – repeated ‘moments of contact’ with phonetic letter sounds, words, sentences, paragraphs, chapters and books to consolidate their reading expertise, so we, as qualitative researchers, have – and need – repeated ‘moments of contact’ with our data. We need those moments to achieve the deep level of contact and understanding that leads to an authoritative and valid interpretation. This is true whatever our research objectives, methodologies and analytic strategies.
The number and types of ‘moments of contact’ we have depend on the project’s characteristics – including research questions, type and amount of data, analytic approach, levels of analysis and types of output – and on the way software tools are harnessed. For Defining Mass Observation, the analysts did not have the benefit of transcription as a ‘moment of contact’. For various practical reasons, others were employed to transcribe the hundreds of hand-written narratives. As they were doing so, we realised that the process was providing them with not only an overview of the breadth of content contained within the materials but also valuable insights that could inform our analysis. We therefore took the opportunity of interviewing them towards the end of the process and have taken their thoughts into account in designing and undertaking the analysis. They had the overview of content that the three qualitative analysts didn’t have, which, as discussed in this blog post, was a key factor in shaping our analytic design.
During the analytic planning stage, we undertook a pilot analysis of a sub-sample of responses to both the “My Life Line” and “Social Divisions” Mass Observation Project (MOP) Directives. This involved several tasks which entailed repeated ‘moments of contact’ with the data, including the following:
– identifying, defining and representing concepts
– familiarising with the data by exploring content at a detailed level
– experimenting with different conceptualisation strategies (open-coding for thematic content, coding for tone of expressions, capturing the chronology of events, etc.)
– interrogating the occurrence of different types of codes in the data and in relation to writers with different characteristics
This pilot work essentially involved undertaking a whole mini-analysis of the sub-sample of data, experimenting with different ways of undertaking analysis and evaluating the extent to which these would enable us to answer our research questions. This resulted in designing an overall analytic plan, which we are now in the process of undertaking.
For the “My Life Line” Directive, we are just completing Stage 1: High-Level Semantic Content Mapping, which has involved seven “Phases of Action” comprising various analytic tasks. I’ll discuss those in a different blog post. The point I want to make now is that the analysis plan was designed to overcome the lack of an overview, on the part of the analysts, of the content of the extensive material as a whole. We needed to design a process that enabled us to gain this overview quickly and comprehensively. Although we gained a lot from interviewing the transcribers, we couldn’t rely solely on their insights, as they had not been asked to think about the data they were transcribing in relation to our research questions. In addition, because there are three qualitative researchers working on the analysis, we needed to design a process that ensured consistency and equivalence without each of us having to engage to the same level with all the transcripts.
However, as I have been undertaking stage 1, I’ve been thinking about how we would have designed the analysis differently if we had participated in the transcription process. Would undertaking transcription have meant a different analytic plan? Would we have had to go through the extensive pilot planning stage at all? At the very least, the plan would have been different because it would have been more pointedly informed: we would have made notes as we were transcribing, and these would have shaped the design. The lack of a comprehensive overview of content was a key factor underlying our design, so it stands to reason that, had we had that overview, we would have undertaken the analysis somewhat differently. We would still have had to undertake the high-level semantic content mapping, because in order to answer our research questions we need to consistently map out the topics discussed and the ways in which they are discussed. But certain areas of data conceptualisation (commonly called ‘coding’) would perhaps have become focused more quickly had the analysts been involved in the transcription, and some of the dilemmas we encountered about how to code certain aspects would have been pre-empted.
All projects are different and researchers have to respond to their characteristics in order to enable systematic and high-quality analysis. It’s always a balance between practical and analytic needs. I’m not saying that our analysis would be better if we had done the transcribing ourselves, and within the parameters of the funding for this project, that wouldn’t have been possible anyway. But the way we approached the analysis would certainly have been different. We have had to build in certain steps to overcome the lack of overview of content that would either not have been required, or would have been different.
So, the point about transcription that the DMO project underlines is that transcription is an analytic act. This is not a new idea, but it is one that is often overlooked or suppressed. What you decide to transcribe and how you decide to format your transcriptions affect how you can go about analysis. Transcription therefore shouldn’t be undervalued as a process. It’s probably true that as researchers progress through their careers they become less likely to be the ones undertaking transcription. It’s very common for transcription to be contracted out, and there are many professional services for doing so. In funded projects like DMO, contracting out transcription is a practical issue, as it’s often just too expensive for professional researchers to undertake transcription within tight budgets. This doesn’t have to be a problem – as our experience shows, analysis can be designed to overcome what is lost by not transcribing oneself.
However, don’t undervalue transcription. If you’re a student you have the luxury that you may never get again to engage with your data during this important process. Thinking of transcription as a ‘moment of contact’ with data, during which you can take notes about content and potential avenues for analysis, rather than a boring task you just want to finish, will free you to make the most of your data.
Christina Silver

Qualitative Analytic Design #2: Phase One – High-level mapping of semantic content


Following on from my discussion of the factors informing the design of our analysis, I promised to outline each phase of our analytic plan. Here’s the first.

First, though, it’s useful to illustrate the analytic plan in its entirety, because in undertaking any phase of analysis it is always crucial to build on what has gone before, and anticipate what will happen as a result. That’s what makes analysis focused.

The diagram below shows the four phases of the analytic plan as it currently stands. You’ll notice that in my last blog post I said our plan had three phases. Since then we’ve got further into the analysis and are now thinking about the phases slightly differently. That’s the nature of qualitative research design: it develops as the project proceeds.

[Figure: the four phases of the analytic plan]

It’s important to note that this diagram relates specifically to the qualitative analysis of the MOP writers’ narratives, but this is only one part of the Defining Mass Observation project. How the quantitative analysis of writers’ characteristics integrates with the work I’m discussing here will be the topic of a separate blog post later on.

The first phase takes the form of mapping out the content of the materials at a high level. This is necessary because we didn’t have an overview of the materials at the outset. See here for a discussion of why this is. 

There are three elements to this mapping process, which are undertaken in parallel:
– indexing the narratives according to semantic content (via descriptive coding)
– capturing the emotional tone of the writings (coding for expressions)
– reflecting on and summarising each writer’s narrative in relation to the Research Questions


Indexing semantic content
Our pilot analysis identified key areas that we need to know about if we are to be able to answer our research questions. For the starting research questions for My Life Line and Social Divisions see here.  

There are several things we need to do in order to be able to answer these questions. First, we need to know which events writers discuss in their responses and how they express themselves when writing about them. For example, with respect to the My Life Lines Directive, we cannot analyse how and why certain events are significant, meaningful or how they structure writers’ lives unless we first know what events are reported and how they are discussed.

These factors underlie the need for a multi-staged approach and our focus on mapping out semantic content at a high-level first.

Some areas that we need to index are specific to the Social Divisions and My Life Lines Directives, although, as shown below, there are many overlaps. The overlaps are one of the ways that we will be able to integrate analysis of the two directives later on.

[Figure: areas indexed for each Directive, showing the overlaps between them]


So what do we mean by ‘semantic content’ and how do we go about ‘high-level mapping’ in MAXQDA?

To quote Braun & Clarke (2006: 13): “With a semantic approach, the themes are identified within the explicit or surface meanings of the data and the analyst is not looking for anything beyond what a participant has said or what has been written.” What this means for us is that we are capturing – through coding – the content of the material in our key areas (the concepts in the diagram above) from the ‘surface level’ of what is written. The codes we’re using during this stage are therefore not ‘themes’. They are areas of interest that we need to index in order to be able to answer our research questions; during this stage their purpose is essentially descriptive – to map out the content of the data at the ‘surface’ or ‘semantic’ level.

Capturing the emotional tone of the writings
In addition to mapping out the semantic content of writers’ responses to the Directives, we also need to capture the way in which they write, the ways they present their accounts. The research team had extensive discussions about the best way to go about doing this, discussions that were driven by the overarching objectives of the study, but also informed by what we know is possible using our CAQDAS package of choice for this project, MAXQDA. See here for a discussion of the factors informing our choice of software. Any qualitative text can be read at different levels, for example, what is explicit in the text, what is implied and what can be inferred through interpretation.

Given that this phase is generally about mapping out the semantic content of the material, our aim in capturing the emotional tone of the writings is to interrogate whether certain topics and events are written about in a generally positive, negative or neutral way. Coding for expressions beyond what is explicitly present in the writing – for example because we know from earlier statements a writer’s feelings about certain topics – would mean losing the fact that they have recorded an event or experience in a particular way. For example, in the My Life Line material, coding the straightforward statement “my cat died” as ‘negative’ because we know from earlier comments that the writer loved her cat, took her cat everywhere with her, perhaps partly as a result of having few close family members or friends, would lose the fact that she records the event of her cat dying in a neutral, matter-of-fact way. The decision was therefore made to index every statement to one of the following codes: positive, negative, neutral, mixed. Doing so sets up the possibility of interrogating, in the next phase of our analysis, whether certain events, experiences and topics are generally discussed in different ways.
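The kind of interrogation this sets up amounts to a cross-tabulation of tone codes against topics. In practice this is done with MAXQDA’s query and matrix tools rather than hand-written code; the coded segments below are invented purely to show the logic:

```python
from collections import Counter

# Invented coded segments: each pairs a semantic topic code with one
# of the four tone codes (positive, negative, neutral, mixed) described
# above. These are illustrative placeholders, not project data.
segments = [
    {"topic": "bereavement", "tone": "neutral"},
    {"topic": "bereavement", "tone": "negative"},
    {"topic": "education", "tone": "positive"},
    {"topic": "education", "tone": "positive"},
    {"topic": "education", "tone": "mixed"},
]

# Tally how often each topic co-occurs with each tone code
tone_by_topic = Counter((s["topic"], s["tone"]) for s in segments)

for (topic, tone), n in sorted(tone_by_topic.items()):
    print(f"{topic:<12} {tone:<9} {n}")
```

A table like this is what lets the next phase ask whether, say, educational events are generally written about more positively than bereavements.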

Reflecting on and summarising each writers’ narrative in relation to the Research Questions
So, in indexing the semantic content and emotional tone of the writings, at this stage we’re not focusing on capturing our interpretations as analysts within the coding. This will of course need to be done, given the interpretive paradigm within which thematic analysis resides. And we’ll do it later on – but only once we’ve identified and prioritised core areas, which of course we cannot do until we have mapped them out.

However, whilst we go about this mapping process we do, of course, have thoughts and insights, make connections and interpretations. And we don’t want to lose them. Amongst the things that are core to what we all do in qualitative data analysis – whatever our objectives, methodologies, analytic strategies or tactics – is to reflect. We reflect on everything, all the time, we can’t help it. And nor should we. A key benefit of using a CAQDAS package, whichever one you choose, is that the thoughts you have can be captured within the software project, at the time you have them and, crucially, be linked to the data that prompted them.

We therefore developed a template for capturing these reflections during the process of indexing the semantic content and capturing the explicit emotional tone of the writings. It included the following elements: Opening, Topics, Language use, Classifications, Writing styles. As each writer’s response to the My Life Line and Social Divisions directives was indexed, the template was filled in by the analyst. Each analyst had the freedom to make notes about these different elements in the ways that seemed appropriate for the individual response, but we each did so in relation to the overarching starting research questions for each directive, and the consistent template ensured the focus of our reflections was consistent. In particular, it meant that we were capturing the thoughts and insights we had about each MOP writer as we were reading and indexing their responses.

[Figure: the structured reflection template]

One of the issues with taking a high level semantic indexing approach is that it can be quite difficult to only think at this level when reading directive responses. For the reasons discussed above and in the blog post about the factors informing the design of our analytic approach, it was important to work initially at this level. However, we didn’t want to lose the more in-depth interpretive thoughts we had about writers whilst undertaking the indexing of semantic content and explicit emotional tone. The use of our structured template for reflections ensured that we could keep the coding at the right level, whilst capturing our interpretive thoughts. Both will be of use for the later stages of our analysis.

My next blog post will discuss the second phase of our analysis, analytic prioritisation.

Christina Silver


Qualitative Analytic Design #1: Factors underlying our approach

In her recent post, Rose commented on the variety in the responses to the 2008 Your Life Line Directive. This variety has shaped the way we are approaching the qualitative analysis of this and the 1990 Social Divisions Directives. So I thought I’d outline our analytic design and share how we are implementing it within MAXQDA (see here for an explanation of our choice of CAQDAS package). I’m doing that in a series of four posts; this is the first. Check back over the next few weeks and months, as our analysis proceeds, for the next three posts, which detail the way we are going about each analytic phase.

Framed by the project’s overall objectives, research questions and methodology, our analytic design evolved out of a pilot analysis phase in which different approaches were trialled on a sub-sample of narratives from both Directives.

The resulting design involves three phases: i) high-level mapping of semantic content, ii) thematic prioritisation, and iii) in-depth latent thematic analysis. Each phase will be the subject of separate blog posts.

Here, though, I’m briefly discussing four factors that underlie this approach:

1) the nature of the data

2) the need to both keep separate and to integrate the analysis of the two Directives

3) the practicalities of the project

4) the need to develop a transparent and transferable process


The nature of writing for the Mass Observation Project (MOP) and the data that is generated

Because the writings of Mass Observation volunteers are only loosely guided by the questions in the Directive, we did not have an overview of the general content of the material at the start. This is symptomatic of this type of secondary qualitative data analysis. The narratives were not generated for the purpose of this study and therefore we have had no influence on the nature or content of the material we are analysing.

This is very different to the type of situation where researchers design and undertake interviews or focus-groups, or observe naturally occurring settings or events. Had we been involved in designing the Directive questions for a specific substantive purpose we might have had some ‘control’ but even then, the very nature of the MOP results in very varied responses to Directives. Some are short, others much longer; some specifically seek to answer Directive questions, others attend only very loosely to the Directive questions; some are written in a quite structured form, for example using bullet points, others are longer more free-flowing, discursive-style narratives; some are written in the first person and reveal detailed insights into personal experiences and opinions, others are more cursory, brief descriptions that upon first reading appear to reveal little about the feelings and opinions of the writers.

This varied nature means we have a very rich set of materials – just what qualitative data analysts love – but when we started out we had no idea of the content of this large body of varied writings. We therefore needed to design an approach that first provides us with an overview, so that we can evaluate the extent to which our research questions are answerable by the data. We could have achieved this by first reading all the Directive responses, but with almost 600 Social Divisions and almost 200 My Life Lines responses, some of which are many pages long, and a short time-frame, we couldn’t do this. We needed to be coding whilst reading. Had the analytic team done the transcribing, we would have had the broad content view we were looking for from that process, but again, the project resources didn’t allow for that and temporary typists were employed to transcribe the responses. We did informally ‘interview’ the transcribers about their impressions of the MOP writers when they had finished, though, and this informed our thinking. But we could not rely solely on their opinions.

Analysing different sets of responses separately then integrating our analyses

Initially our intention had been to analyse responses to both Directives together, because one of our objectives is to explore the extent to which perceptions and lives have changed between 1990 and 2008 amongst writers who responded to both Directives (whom we call ‘serial responders’). However, the pilot demonstrated that, despite the need to analyse the ‘serial responders’ and despite the synergies across the two Directives, starting out with all the data together would be impractical and would affect our ability to maintain focus whilst coding each Directive.

In addition, it became clear in our pilot coding that we cannot know at the outset which areas of our substantive interest offer potentials for looking across the two Directives. There are many potential synergies, and the longitudinal element of exploring the identities of writers that have responded to both Directives is an important part of our work. However, the Directives are very different and the context of the time in which these were responded to is important.

The practicalities of the project – number of coders and time-frame

Any team project needs to ensure that coding is consistent. The involvement of three coders and the short time-scales mean that we needed to design an approach that maximises consistency from the outset. We could have mapped out the content of the Directive responses through an “open” coding exercise as a means of initial theme generation; this would have been similar to the usual first stage of Grounded Theory-informed projects. Indeed we did this in our pilot work, and it informed the focus of phase one. However, the time available and the uncertainty of content mean it is more systematic to focus first on the descriptive content and undertake more interpretive work once we have a clear idea of content.

The need to develop a transparent and transferable process 

One of the objectives of this project is to open up possibilities for using MOP as a source of secondary longitudinal qualitative data. This means that verified and assured processes for analysing MOP data that can be adopted or adapted by other researchers are amongst the project’s outputs. The three-phased approach not only serves our analytic purposes but also offers a method that can be easily documented and illustrated.


Like qualitative research design in general, ours is iterative and emergent – we expect to need to refine our initial research questions as we progress – in light of our growing understandings. I will outline the three phases of the design (high-level mapping of semantic content, thematic prioritisation, and in-depth latent thematic analysis) and how they are being implemented in the CAQDAS package MAXQDA, in future posts.



Our choice of qualitative software explained

In this blogpost I’m sharing the process we went through in choosing to use the software package MAXQDA to analyse the writings of volunteers in relation to the MOP Directives “Social Divisions” and “Life Lines”. I hope this will help others make their own informed choice between the range of software available.

Software packages designed to allow researchers to manage and analyse qualitative data are collectively known as CAQDAS packages. CAQDAS stands for Computer Assisted Qualitative Data AnalysiS. These packages have a range of features including data management and organisation, content searching, code and retrieve, writing and visualisation, querying (e.g. Boolean, proximity) and output. For more information see the CAQDAS Networking Project.

There are many different qualitative software packages to choose between (see here for some reviews). There is no “best” package; each has its pros and cons and therefore the decision-making process is best made in relation to the characteristics of particular projects. For some it might not be a given to use software at all. If that’s the case, I’d still recommend going through a process like the one described here because it will help rationalise whether using one of these software packages would be useful.

How to choose

The first step is to list and prioritise the needs of the analysis, then compare products in terms of how their features can be harnessed for those needs. The aim is to choose the program that will enable you to fulfil your prioritised needs in the most streamlined way. There are usually some compromises – because not all the programs have the same features. Prioritizing the project needs will help you see which compromises you can live with and which you cannot.

The starting point is to be clear about the broad objectives of the study. For the Defining Mass Observation project we have a number of overarching objectives relating to the whole project which were important to consider in choosing a CAQDAS package:

Overarching project objectives

a) to illustrate the value of MOP data for social scientific enquiry;
b) to facilitate its opening up for the purposes of secondary analysis; and
c) to contribute to the representativeness debate around the use of MOP data.

You can read more about these objectives here.

In addition, there are the specific research questions we are starting out with. For the purposes of choosing a CAQDAS package, these relate to analysing the writings provided by volunteers in response to the two MOP Directives we are focusing on in this part of the project. There will be a blogpost about the research design of the project as a whole later on.

It’s worth saying at this point that qualitative research design is emergent. That means the research questions you start out with will likely change as the analysis proceeds, which is why those that informed our choice of software are called “initial research questions”. We have three, two of which have sub-questions:

Initial research questions

1. How do writers describe their perceptions of their own and others’ identities, in relation to class, politics, race and gender?
a. how do writers perceive their educational trajectory to have influenced their class status and social mobility?
b. do the ways MOP writers reflect on identities reflect the socio-economic classifications used by social researchers?

2. How do writers perceive the structure of their lives?
a. how and why are certain events significant?
b. how and why are meanings attached to events and decisions?

3. Of the writers who have responded to both Directives, how have their perceptions of their identities and their lives changed between 1990 (Social Divisions Directive) and 2008 (Life Lines Directive)?

In order to fulfil the overarching project objectives and the research questions, we listed our key analytic and practical needs:

Analytic needs

– we need to be able to combine numeric information about MOP writers’ characteristics with their actual writings;

– we need to undertake an ‘inductive thematic analysis’ of the writings (see Braun & Clarke, 2006 for a nice overview of the different types of thematic analysis);

– we need to be able to track the contributions of writers who responded to both the Directives;

– we need to be able to access keywords and phrases used by MOP writers that indicate the concepts we are interested in;

– we need to be able to map concepts within writings in response to each Directive in isolation, as well as compare across both Directives;

– we need to be able to interrogate the ways writers discuss Directive themes according to their characteristics.

Practical needs

– at least 3 researchers need to be able to work with the qualitative software;

– the software needs to be intuitive and easy to learn;

– we need the software project we create to be accessible to non-expert users.

Having thought these needs through in more detail, we prioritised them, putting them into a table in order of importance, as shown below. As a result we chose to use MAXQDA. You can find out more about MAXQDA from their website.

Note that the table shows the results of our evaluation of MAXQDA only. I’m not showing in this blogpost the comparison with other CAQDAS packages, because the point is to outline the nature of our decision-making process rather than to compare products. In reality we constructed a table with several other columns – one for each of the programs we looked at. I find it useful when doing this to colour the cells to show visually on which dimensions the different programs are evaluated positively. Here I’ve made the text bold to highlight the reasons why MAXQDA was seen as a good choice. In the larger comparison chart, cells were highlighted green for positive evaluations, red where a particular program does not enable a requirement, and left white where all of the programs had the required features – resulting in a heat-map-type matrix of requirements by software program features.

Each requirement is listed below in priority order, followed by the detail of the need and an evaluation of MAXQDA’s features against it.

1. Team-working
Detail: There are three researchers on the project who will contribute to the analysis and use the software. They have different responsibilities and roles, and therefore we need the ability to isolate and integrate aspects of the work at different times.
Evaluation: Although MAXQDA does not allow concurrent work by multiple analysts, whereas some other packages do, its teamwork import/export features are sufficient for our needs. Two of the researchers have prior experience of working with MAXQDA in team projects and are therefore familiar with the protocols we need to put in place to enable systematic and streamlined team-working in MAXQDA.

2. Intuitive and easy to learn
Detail: This project runs over 15 months and therefore timescales are tight. When making the decision we did not know whether the researcher we were recruiting would have experience of using any package, so a program that could be learnt quickly was important. The overarching objectives of illustrating the value of MOP for scientific enquiry and opening it up for secondary analysis also necessitate an accessible means of communicating our analytic process and findings to other researchers who may not be conversant with CAQDAS packages.
Evaluation: The intuitiveness of software, and the ease with which individuals learn to harness it for sophisticated analysis, are subjective. We therefore used our experience of teaching CAQDAS packages as a proxy measure for this requirement: MAXQDA is amongst the most straightforward to teach.
3. Accessible to non-expert software users
Detail: An intended output is to make the software project we develop available for other researchers to access.
Evaluation: That MAXQDA has a free Reader version that enables those without software licences to open projects and view the analysis was an important factor in our choice, as it will enable us to share our database with others easily.
4. Linking of quantitative information with qualitative texts
Detail: This is a key analytic priority of the project. Quantitative information about writers’ characteristics is being collated and cleaned for statistical analysis, in order to contribute to the debate around the representativeness of MOP writers and the value of these materials for social science research.
Evaluation: All of the leading CAQDAS packages allow numeric information to be imported and linked with the qualitative materials to which it corresponds; although this is a fundamental need of our research, it was therefore not a determining factor in the choice of software.
5. Data-driven qualitative analysis
Detail: We are adopting an inductive thematic analysis of MOP writers’ responses to two Directives and therefore need software features that we can use for this purpose.
Evaluation: Any of the CAQDAS packages would enable in-depth qualitative analysis, so this was the least important requirement in our prioritised ranking and not a factor that influenced our choice.
6. Track responses of writers contributing to both Directives
Detail: An important aspect of opening up MOP materials for research purposes is to illustrate how the writings of individuals can be tracked over time.
Evaluation: All of the leading CAQDAS packages allow such tracking, via combining or grouping data contributed by the same individuals, so this was not a determining factor in our choice. However, MAXQDA’s Document Comparison Chart provides an additional way of comparing the application of (groups of) codes at the level of data files that is not available in other packages. We considered this would be of benefit when examining whether and how individuals’ perceptions differ when responding to different MOP Directives.

7. Map concepts within and between Directives
Detail: The nature of the data (i.e. that it was generated in response to open-ended Directive questions) means we have no overview of content at the outset, as there is when generating data through customary qualitative data collection methods such as interviews or focus-group discussions. An initial analytic task will therefore be to map the content of the writings at a high level, in order to identify the prevalence of concepts and focus the analysis.
Evaluation: All the leading CAQDAS packages enable high-level mapping of the application of codes in data files in tabular (numeric) format, with access to the corresponding texts for qualitative interpretation. MAXQDA’s Code Matrix Browser, however, is particularly easy to generate and provides clear and accessible visualisations, which makes it attractive in light of requirements 2 and 3.

8. Interrogate Directive themes by the characteristics of writers
Detail: Exploring the extent to which the perceptions and experiences of MOP writers differ according to their characteristics is one way in which we hope to contribute to the debate about their representativeness. This also means we need an accessible way of showing other researchers, who may not be familiar with the software, the results of our work.
Evaluation: There are several ‘mixed methods’ features in MAXQDA that we can use to present joint displays of dimensions in the data. Those we evaluated as being particularly useful for this project are Crosstabs, Configuration Tables, turning codes into categorical variables, and constructing Typology Tables. The Quote Matrix feature enables a joint display of quantitative characteristics and qualitative texts to be output directly, which will allow us to share findings easily. Udo Kuckartz’s (2012) paper on mixed methods in MAXQDA gives a clear overview of these features.

9. Access and auto-code keywords and phrases
Detail: This is particularly important for the Social Divisions Directive, because we want to consider the extent to which established socio-economic classifications are understood by, and reflected in, writers’ accounts, and to focus on evocative language used in the texts.
Evaluation: All of the CAQDAS packages have word-search tools that we could use for this purpose. In MAXQDA we can also create and save our own dictionaries to locate multiple keywords and phrases, which is attractive but was not a major aspect of our decision.

Choice of software is always a balance between analytic and practical priorities. The outcome of a process like the one discussed here might prioritise analytic needs over practical ones. For Defining Mass Observation the practical needs outweighed the analytic because of the overarching project-level objectives.

It’s important to say that we are not claiming that this project could not be undertaken using a different CAQDAS package; far from it.

But when in the position of being able to make a choice for a specific project, it is important that the choice is guided by the needs of that project. This requires both an understanding of the differences between programs and clarity about the overarching project objectives and the specific research questions.



Braun, V., Clarke, V., 2006. Using thematic analysis in psychology. Qualitative Research in Psychology 3, 77–101. doi:10.1191/1478088706qp063oa

Kuckartz, U., 2012. Realizing Mixed Methods Approaches with MAXQDA.