Modelling process workflow for thesis writing

Recently I’ve been finding that whenever I’m stuck in my odyssey towards writing up my dissertation, modelling my process flow in a concept-mapping software (such as VUE) usually helps. In this (hopefully) final stage of my PhD project there are so many resources scattered around in various software and folders on my computer that I need a formal “concept map” (if that’s the right term) to pull them all together and work out the relationships and interactions between them.

Here, for example, is the latest concept map, which I knocked up when I was unsure how to proceed with writing up the first four chapters of my dissertation. There is nothing particularly scientific about this map and it probably doesn’t follow any of the conventions of process workflow modelling. But who cares: it did the trick and allowed me to plan out the next stages of what I need to do.

Actually at least 2 or 3 days of deliberation are captured in this chart. First, I needed to decide whether I was going to use ConnectedText or something else for doing the actual writing. Through trial and error I established that it’s better to use a different program because, however much I love working in CT, it does have some limitations. One of them is that you can only have one instance of CT running and only one edit/view window open. Since I’ve decided to use CT as the database for my reading notes, I need a second program, so that I can write in one application on one monitor while referring to the CT notes on the other. Also, there isn’t an easy way to track the word count of your document while writing in CT.

I had considered WhizFolders briefly as an alternative, but I find its interface too busy to be able to concentrate on the actual writing. So I settled on Scrivener for Windows, which works well both as a two-pane outliner and as a writing tool with decent word-count tracking.

As the sequence of the process flow is not apparent from the chart, let me describe it briefly. I start with importing my master outline with inline notes from Outline 4D (via Word). The reason I created my outline in Outline 4D is that it is a single-pane outliner that allows you to have inline notes, which you can also view in an index card view on a corkboard. Then I use Scrivener to break up the imported document into a 2-pane outline using Scrivener’s handy “Split with Selection as Title” command. As I start writing the actual text (I’m working on the first 4 chapters of my thesis, namely the Introduction, the Literature Review, the Conceptual Framework, and the Methodology, which need to be contextualised within their respective literatures), I begin to review my existing reading notes.

Over the years I have read all kinds of things that are no longer relevant. Therefore I need to deploy some kind of a filtering process to select the most important notes, as well as any new reading that still needs to be done. To consolidate my final reading list (a list of bibliographic references), I use a Natara Bonsai outline. First I import into Bonsai an existing outline document that contains some of my selected references that I have kept on my iPod/iPad in CarbonFin Outliner. Then I go through my old conference papers and other writings to extract references that are still relevant and which are kept in Word files and an old Scrivener project.

In parallel with this process I have also designed a ConnectedText project for keeping my final reading notes and quotes, using a similar model to the one I have developed for my empirical analysis. As my old reading notes and quotes are kept in a WhizFolders database, I will need to review those and transfer them one-by-one to the CT database (I deliberately don’t want to import them en masse, as I need to separate the wheat from the chaff). I will also use the CT project for recording any new reading I still need to do. I am not designing this CT database just for this writing project. Very likely it will become my main database for all my future readings for years to come. This is just an opportune moment to get started with it, as I no longer want to use WhizFolders for this.

Getting back to the chart, there are basically two important elements to it: 1) the big blue Scrivener rectangle which represents my writing, and 2) the big green rectangle below it which represents the CT reading notes database. If we look at the arrows pointing to the latter, we see mostly the data that needs to be transferred (by carefully sifting through it) from my old files, as well as new reading notes that will be created on my iPad.

As for the arrows coming in or out of the Scrivener project, those have to do mostly with referring to external sources. In the end I won’t need Excel for planning the word count because Scrivener has good enough tools for that. I will also use Dragon NaturallySpeaking for dictating, whenever I feel the need. Sometimes it’s easier to write without it, other times it speeds things up. As for EndNote, it is simply the central database for my references, which are linked to the PDFs that may need to be read for the first time or reviewed.

But my main point here is that it was the creation of this concept map that was crucial for getting me started with the whole writing stage. Without it I would probably have sat in front of a blank page with writer’s block for days. Now I feel fairly confident that I know what I need to do next.

Hi Steve,

Great review! Thanks for following up on your promise :)

A few comments. Perhaps it’s worth pointing out that in effect you had given examples of how CT can be used both as a single-pane outliner and as a two-pane (or even a hybrid three-pane) outliner.

I would argue that there is even a third way (and possibly even more) in which CT can be used as an outliner, if we take “outlining” in a broader sense as building hierarchies. E.g. simply by having your obligatory home page and having outbound links from this home page you in effect are creating an outline, which can be viewed in the Navigator.

The limitation of the Navigator view is that you can’t really manipulate the order of the child items displayed, as they are automatically displayed in the chronological order in which they have been created or modified.

Regarding the TOC version of outlining in CT I’d add that it has the added benefit of being able to do real-time or reverse outlining, as the items (headings) being added to either a developing or finished text are displayed in real time in the TOC pane, as one types them.

As for not being able to drag headings around in the TOC, I believe that is going to change in v. 6, as I’ve seen it in the beta already. Not only will you be able to move around headings (dragging with the mouse or via keyboard shortcuts), but the associated text under the heading will also move, so the TOC will turn into a true two-pane outliner, where you can use the TOC to restructure even a completed text.

Regarding the Freemind export, I’m not sure what has gone wrong. However, I can say that exporting into Freeplane (which for me is an improved version of Freemind) works beautifully, and as a bonus, the links to topics are preserved, so by clicking on them, Freeplane will launch CT and open the corresponding topic. The only thing is that you must have a single top-level item in your original CT outline to which all sub-levels belong because Freeplane can only have one central starting node (a limitation on the part of Freeplane, not CT).

Thanks again!

Welcome to Sherwood

This is the long-awaited review of the outliner in ConnectedText. I’m not going to actually make this part of the OneNote Smackdown, because it has been too long since I was in that mind frame and I can’t reproduce it well enough to do an accurate comparison. But outlining in CT is interesting because there are some unique wrinkles, so here we go.

Really there are two ways to outline in CT. I’m going to start with and concentrate on the dedicated outlining window, but I’ll cover the second way within this context toward the end of this review.

Open the outlining window (via the View menu) and you’re presented with this unassuming little window:

The window can be free floating or docked to the main window and opens in the position it was in when you last closed it.

Use the CTRL-Enter key combination to create your first item…


Abstraction through extraction

If you’re wondering, I’m towards the end of the qualitative data analysis process that I’ve been describing in my ConnectedText (CT) tutorials (and in particular in this chart on the right). More specifically, I’m working on my “Findings” topic (a ‘topic’ is a document in CT’s lingo), and just today I have finally managed to complete the analysis and evaluation of all the underlying topic levels. This means that the “Findings” topic has now collected (by way of CT’s magical “include” markup) all the =Final findings= sections of its child-topics, gathering all the findings of my empirical research on one page.

This is obviously an important moment for my research project, as this will be the first time that I will be able to survey all the disparate conclusions I have drawn from nine case studies. The primary data that I have imported into CT amounts to around 800,000 words. The secondary material that I have externally linked to and which I have also reviewed could probably double that figure. This material and its associated analysis are contained in exactly 560 topics in my CT database as of today. My “Findings” topic is pulling the analysis from all these topics together into a single topic. Under the =Summary of final findings= I have now a structured list of conclusions, with several levels of headings.

The text of the “included” findings amounts to 2,834 words. This is the output of what I half-jokingly referred to as my “idea-sausage machine,” which had allowed me to process and reduce 1 million+ words into 2,834 words. In a sense this model was really a kind of machine, as part of the process involved mechanical extraction of text from one document and incorporating it into another document, over and over again. However the other half of the machine was my brain, where the “theory filters” had been applied to the data and where abstraction was carried out.

Nevertheless, both processes depended on each other: the mechanical extraction was part of the mental process of abstraction, but so was abstraction part of extraction. Tool and thought were very much co-dependent and they needed each other to produce the result: 2,834 words, which hopefully are transporting some interesting messages that can help answer my original research question.

So what next? Tomorrow I will need to reduce these 2,834 words further, possibly to a single (thesis) sentence of 15 or so words, which will be the short answer to my research question, and then to a 150-word abstract and also some more verbose formulations of my findings. How will I do that? Well, as my attached chart shows, I have been through the hoops and loops of “abstraction by way of extraction” a few times, as I gradually climbed my way up the daisy chain of my CT model.

Basically, I will need to use some tools in association with some brain cells once more. The key process involves classifying the remaining text on the basis of the themes and arguments it contains, and organising these into a hierarchy, where the more general points (conclusions) rise to the top of the hierarchy, while the specific points (examples, supporting evidence, details) are relegated to lower levels of the hierarchy, and superfluous points are relegated to the bottom of the list or deleted altogether. This task calls for some kind of outliner software.

Now, CT is fairly well equipped to provide you with tools to guide you through this entire process. If the text to be analysed is simple enough, you could do the above analysis in the body of a CT topic itself, or in CT’s Notes pane, or in its dedicated Outliner tool, which can be docked or undocked. Indeed, I have constructed some enormous outlines with CT’s outliner, e.g. one with over 1,200 items. However, when it comes to the very last stages of drawing conclusions and making sense of complex outlines with important information, I like to switch to my favourite outliner, Natara Bonsai (Desktop Edition – see my mini-review of it here).

CT allows you to export its outline as an OPML file, which then can be imported into Bonsai (if you install this OPML filter here). However, if you don’t have a CT outline, you can just as well copy your text in CT’s view mode and paste it into the body of a new Bonsai outline, and it will look decent enough. Bonsai’s killer features for the type of analysis I need are the following:

1) Its ability to choose different colours for different levels of the outline hierarchy. This just makes the analysis so much easier, especially if you end up staring at an outline with a thousand items for several hours.

2) One-click collapsing of levels, so you can choose to see only level 1 items (which at the end of the analysis will be your main findings, as they will have risen to the top of the hierarchy), or also level 2, 3, or 4 items, or have all items expanded. This allows you to toggle on and off layers of different degrees of detail (with each layer or level being a different colour).

3) Its feature to zoom in and out of a branch of an item (also called “hoisting” in other outliners) with one click, which again makes it very quick to shut out the noise and focus on analysing just a single theme or issue.
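To make the collapsing and hoisting ideas concrete, here is a minimal Python sketch. It has nothing to do with how Bonsai works internally: the `(level, text)` representation, the function names `collapse` and `hoist`, and the sample items are all assumptions made up for illustration.

```python
def collapse(outline, max_level):
    """Keep only the items at or above `max_level` (1 = top level)."""
    return [(level, text) for level, text in outline if level <= max_level]


def hoist(outline, index):
    """Return the item at `index` together with all of its descendants."""
    root_level = outline[index][0]
    branch = [outline[index]]
    for level, text in outline[index + 1:]:
        if level <= root_level:
            break  # reached the next sibling (or an ancestor's sibling)
        branch.append((level, text))
    return branch


# Hypothetical outline as (level, text) pairs, in display order.
outline = [
    (1, "Main finding A"),
    (2, "Supporting argument"),
    (3, "Example from case study"),
    (1, "Main finding B"),
    (2, "Supporting evidence"),
]

# Show only the top level, i.e. the findings that have risen to the top.
for level, text in collapse(outline, 1):
    print(text)
```

Calling `hoist(outline, 0)` would return "Main finding A" with its two descendants, which is the "zoom into one branch" view described above.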

When I’m done with my analysis and my most important findings have been promoted to the top of the outline hierarchy in Bonsai, I could just export the outline as OPML and import it back into CT as an outline. Although it’s easy enough to do, there is a quicker way. I could just directly copy and paste my new outline into the body of a CT topic. The slight problem with that is that if you select all the top level outline items in a collapsed view, you will still copy and paste the underlying levels of the hierarchy as well, which you may no longer need (as we are after abstraction here).
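For readers who have not looked inside an OPML file, the following Python sketch builds one from a small outline using only the standard library. It is an illustration of the general format, not of what CT or Bonsai actually write: the `to_opml` helper, the sample items, the choice of version "2.0" and the bare `text` attribute are all assumptions, and the sketch expects the outline not to skip levels.

```python
import xml.etree.ElementTree as ET


def to_opml(title, outline):
    """Build an OPML document from (level, text) pairs given in display order."""
    opml = ET.Element("opml", version="2.0")
    head = ET.SubElement(opml, "head")
    ET.SubElement(head, "title").text = title
    body = ET.SubElement(opml, "body")
    parents = {0: body}  # most recent element seen at each level
    for level, text in outline:
        # Attach the item to the latest element one level above it.
        node = ET.SubElement(parents[level - 1], "outline", text=text)
        parents[level] = node
    return ET.tostring(opml, encoding="unicode")


# Hypothetical findings outline.
findings = [
    (1, "Main finding A"),
    (2, "Supporting argument"),
    (1, "Main finding B"),
]
print(to_opml("Findings", findings))
```

Passing only the level 1 items to `to_opml` would give an export without the underlying levels, which is the abstraction step that plain copy and paste cannot do.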

This is where a wonderful little tool comes in very handy: ABBYY Screenshot Reader. It just sits in my Windows toolbar at the bottom right, and when I click on it, it allows me to select any area of my screen, take a screenshot of it and via OCR extract the text and copy it into my clipboard. It is literally two clicks, select the area, and CTRL+V to paste it. All I need to do is use ABBYY to read my screenshot of the collapsed top-level hierarchy of findings in Bonsai, and presto, my abstractions are extracted and pasted into CT. Abstraction through extraction…

By the way, you can also do this ABBYY trick (i.e. extract the top level of a hierarchy) with BrainStorm, which is another VERY interesting tool I’ve been recently playing around with to carry out this final sorting of lists of findings. (See Manfred Kuehn’s post on how he uses BrainStorm with CT.)

Make your own research tool

I have described the chart below that depicts my use of ConnectedText (CT) for qualitative data analysis as representing a “conceptual model” and “a process flow.” But these terms don’t quite get the idea across that in fact CT had allowed me to construct my very own data analysis machine.

Once you have the basic structure and the logic of this system set up in CT, it works almost like a “sausage machine” with some filters put in. All you need to do is start pumping your empirical data in at Step 1, and as long as you follow the procedure and apply your theoretical filters during the abstraction process, the machine guides you through the production of some “truths,” i.e. your qualitative findings, the answers to your research question.

This brings me back to my earlier points (here and here) about why I prefer to do my qualitative analysis in CT, rather than in Atlas.ti or NVivo. There is no question that those two dedicated CAQDAS packages have more data analysis features and capabilities than CT. However, CT trumps them in one regard hands down: rather than just allowing you to analyse your data, it in effect allows you to create and operate your very own research tools, such as my “idea-sausage machine” below.

Check out my tutorial here, if you are interested in creating your own research tool.

CAQDAS model for ConnectedText

Below is a generalised conceptual model and process flow for organising qualitative data analysis in ConnectedText. It is intended to illustrate further – but in more general terms – the qualitative data analysis process I have described in my tutorials on how to use CT as a CAQDAS. An implementation of this model with a concrete example can be found here.

Here is how to read this chart. Each white box represents a ConnectedText topic (a single text document). “Research project dashboard” is the home page of the wiki, sitting at the top of the parent-child hierarchy and at the centre of the network of topics it links together (click here for a complete illustration of such a hierarchy). There are five levels of hierarchy represented in this image, with “Research project dashboard” being level 1 and “Case study 1 interview 1.1” being level 5.

The qualitative analysis (coding) process begins with Step 1 in the “Case study 1 interview 1” topic. In the first instance this is the topic that contains your interview transcript (or any other type of empirical material that you want to analyse). Using the coding process I described here and here, use the headings markup (=Heading 1= etc.) to annotate and code the material. In Step 2, use the “cut to new topic” command to move completed sections of the text into their own sub-topics, which will leave a link to them in the parent-topic.

In Step 3, review the codes for your coded material and aggregate and organise them under the heading =Conclusions= at the bottom of the topic. This is also a conceptual process of analysis, evaluation, and abstraction. It is up to you whether you just collect your codes here or subject them to further processing and abstraction.

Step 4 utilises ConnectedText’s “include part of other topics” markup, which uses the notation

((Topic name==Heading name))

to include a section of text under a particular heading in one topic in the body of another topic. This operation also creates a type of link between the two topics, by leaving an “Edit” button in the latter topic to lead you back to the source topic. This type of link will allow you to trace your steps back through the hierarchy and the “daisy chain” of “includes” and abstraction all the way from the home page at level 1 to the bottom at level 5.

Step 5 and any subsequent steps repeat the sequence of Steps 3-4-3, by drawing conclusions and findings from previous conclusions and findings, which then get “included” in the next level up in the hierarchy, all the way until you reach the top. I like to compare this process of abstraction rising through the levels to bubbles rising to the surface of water or cream rising to the top of milk. The idea is that good stuff (meaning) finds its way to the top, which in our case is represented by the project dashboard. The project dashboard contains the research question. Ideally at the end of this process the answer to your research question should rise to the top and meet with it there, thus completing the search.

I’ve tried to use colours consistently to represent the various steps and commands described above. The “cut to new topic” command is represented by the blue arrows and the blue links left behind in the original topic (“Case study 1 interview 1”). Red arrows represent the process of inclusion and connect the source topics with target topics. The “include” markups are also in red in the target topics. The green arrow represents the process of abstraction that takes place within a topic, by either drawing conclusions and formulating findings from the coded empirical material (at level 5) or drawing conclusions and findings from “included” conclusions and findings (at all the other levels).

“Included” content is marked in the same background colour (of its box) in the target topic as in the source topic, to indicate that the actual text content is identical and to also make it easier to follow the relationships between the inclusions.

I hope this clarifies further the concrete case study I worked through in my previous post.

P.S. If you’re wondering, I used SmartDraw 2012 (which is on my list of favourite software) to create the above chart.

Summary and example of coding in ConnectedText

In this post I would like to summarise how I use ConnectedText as a CAQDAS for coding qualitative research material, and illustrate it with examples and screenshots. This current post should be read in tandem with the previous post that had laid out the coding process flow and discussed the main markups and commands involved. I will continue using the same example called “DRA case study.” If you would like to reproduce the following steps in your own copy of CT, I recommend you read my post on “Preparing for coding in ConnectedText” first.

To remind ourselves, here is a visual representation of the coding process using the DRA case study example:

Steps of the coding process in CT:

  1. Import your document into CT. Give it an appropriate name. (I will call mine “DRA1 Why CT”.) Link it to the appropriate topic to place it in the hierarchy in the system. (I will put it under the “DRA case study”, which in turn belongs to the “empirical data” topic.) All these relationships are created by simply typing the name of the topic in double square brackets. Entering [[DRA1 Why CT]] inside the “DRA case study” topic will create a link to the “DRA1 Why CT” topic and establish a parent-child relationship (if viewed in the Navigator pane).
  2. Open the topic you want to code (I will open “DRA1 Why CT”) in edit mode. Have the Table of Contents pane open on the left, and the Notes pane open on the right.
  3. Start annotating the content by using the headings markup (e.g. =Heading 1=) to record your codes (or observations). Note how the headings appear simultaneously in the Table of Contents.
  4. When a large enough section of the document has been coded and a clear enough thematic group has emerged (under a top-level heading), use the “cut to new topic” command to relegate that chunk of text from the current topic into a topic of its own. Use an alphanumerical system to name the sub-topic in such a way that it will show up under the parent-topic in the Topic list window. (I will name my sub-topics DRA1.1, DRA1.2 and DRA1.3 in my example.) In the Navigator you can see how the newly created topic is linked to its parent topic. By the end of the coding all the content in the parent topic (DRA1 Why CT) should be cut away, so that only links to the child-topics remain.
  5. Add an “include part of other topics” markup at the bottom of the parent topic (DRA1 Why CT) under a =Summary of conclusions= heading like this: ((DRA1.1==Conclusions)), with one such line for each child topic (DRA1.2, DRA1.3 and so on).
  6. Go to each of the child topics (DRA1.1 etc.), switch to view mode, copy the contents of the Table of Contents box in the topic pane (not in the Table of Contents pane), and paste it at the bottom of the topic under the heading =Conclusions=. You can also highlight the most important finding(s) in colour (e.g. yellow). You may want to apply bullet points or numbering to your list. You can also edit and reduce your list to the most important findings. Now if you go back to the parent topic (DRA1 Why CT) and switch to view mode, you will see that the “include” markup ((DRA1.1==Conclusions)) has now pulled in the text from DRA1.1 =Conclusions=.
  7. Repeat this process for each of the child topics, until all the text has been evaluated and all the conclusions have been included in the parent topic (DRA1 Why CT).
  8. Open “DRA1 Why CT” in view mode and review the included conclusions from its child-topics. Record your comments in the Notes pane on the right. Then copy them, switch to edit mode, and paste them under a new heading called =Final conclusions= at the bottom of the topic. Use colour highlighting to mark out the most important finding(s).
  9. Go to “DRA case study” (one level up in the hierarchy) and add “include” markups to collect all the =Final conclusions= from the documents at the level below (DRA1 Why CT). Add a =Findings= heading below them, then review and draw conclusions as described in the previous point.
  10. Continue with this daisy-chaining process of aggregating and abstracting findings until you rise through all the levels of your hierarchy and arrive at your “Findings” topic, which should contain all the top level abstracted findings from your entire research study. The “include” markup leaves an “Edit” button in the topic in which it had been included, which means that by clicking on it you will be able to trace back where a particular finding came from, if necessary all the way back to the bottom of the hierarchy, the actual empirical evidence. And this is it. You have completed the coding of your material and have arrived at your findings, which hopefully will answer your research question.
  11. When you have finished coding a document and ended up with a number of sub-topics, you may want to take the opportunity to add “categories” to each topic, which is another way of classifying them and finding them in the future. “Attributes” and “properties” are additional advanced features for classifying and finding topics. Learn about them in CT’s Welcome project [2.9MB].
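The “cut to new topic” operation in step 4 can be pictured with a small Python model. This is a toy sketch and not CT’s actual behaviour: the `topics` dictionary, the `cut_to_new_topic` function and the sample interview text are hypothetical, and whether CT keeps the heading line inside the new topic is an assumption made here for simplicity.

```python
import re

# A heading line is text wrapped in the same number of "=" on both sides.
HEADING = re.compile(r"^(=+)\s*(.+?)\s*\1$")


def cut_to_new_topic(topics, parent, heading, new_name):
    """Move the section titled `heading` out of `parent` into a topic called
    `new_name`, leaving a [[link]] to the new topic in its place."""
    kept, cut = [], []
    level = None      # heading level of the section being cut
    cutting = False
    for line in topics[parent].splitlines():
        m = HEADING.match(line)
        if cutting and m and len(m.group(1)) <= level:
            cutting = False  # a heading of the same or higher level ends the section
        if not cutting and level is None and m and m.group(2) == heading:
            cutting, level = True, len(m.group(1))
            kept.append(f"[[{new_name}]]")  # link left behind in the parent
        (cut if cutting else kept).append(line)
    topics[parent] = "\n".join(kept)
    topics[new_name] = "\n".join(cut)


# Hypothetical coded transcript with two top-level codes.
topics = {
    "DRA1 Why CT": "=Why CT?=\nanswer one\n==Detail==\nmore\n=Other tools=\nanswer two",
}
cut_to_new_topic(topics, "DRA1 Why CT", "Why CT?", "DRA1.1")
print(topics["DRA1 Why CT"])
```

After the call, the parent topic holds a [[DRA1.1]] link where the section used to be, and the section itself (with its sub-heading) lives in the new DRA1.1 topic.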

If you have any questions about this or suggestions to improve this process, feel free to comment below or email me using the Contact page.

Coding process flow in ConnectedText

In this post I will describe the process flow of how ConnectedText can be used as a CAQDAS for coding qualitative data (see my caveats and qualifications in the previous post). Unfortunately I can’t use my own CT project as an example because the information in it is confidential and it would just take too long for me to anonymise it. Instead, I will use the silly little example I came up with in a previous post.

Imagine you are conducting a research study into how people use technology. As part of your research, you have decided to interview me about my unorthodox use of CT as a CAQDAS tool. You have recorded and transcribed the interview. You have followed the procedure recommended in an earlier post to import the document into CT and use the suggested naming convention to entitle the topic containing the interview as “DRA1 Why CT.”

The “Why CT” bit is a reference to the main interview question: “Why do you use ConnectedText for qualitative data analysis?” This was the first interview in a series of interviews with Dr Andus, and it was “filed under” the Dr Andus case study, which in your CT is represented by a topic called “DRA case study.” We thus have a case study topic (the parent topic) and several interview topics linked from that topic as child-topics.

Using this example, I will now describe the overall coding work flow and the structure of the CT project before and after the coding, so you can have an overview before we delve into the details. The whole thing might not immediately make sense because you may also need to see how this was actually implemented in CT (which I will show you partly below, and partly in my next post). I strongly recommend that you try to replicate this in your CT test project, so you can experience how these various commands and markups work.

Here is a visual representation (created in VUE) of the coding process flow and the resulting structure of topic relationships. Please note that this follows the same overall CT project design I had suggested and described in this post. This chart just adds our concrete example of the DRA case study to the previous general model (this time only focusing on the “empirical data” and “findings” topic branches). It is a hierarchical structure with horizontal levels, where the Home page is level 1, “Empirical data” etc. is level 2, “DRA case study” etc. is level 3, “DRA1 Why CT” is level 4, and DRA1.1 etc. is level 5 (and it could go on as far down as you need it to):

The other thing to note about this chart is that it shows the end result of the coding process. Initially when you first import your document to code, there would only be level 1, 2, 3, and 4 topics. Level 5 topics (DRA1.1 etc.) grew out of the coding process. While you could set up the “DRA case study findings” topic in advance (I usually do), initially it would be empty (except for some “include” markup which sits there patiently until the topics they point to get gradually filled up with content) [more about the “include” markup below].

Let me now explain what we are seeing in this chart in full detail. As I said, it is a mixture of a process flow chart (follow the arrows) and an organisational structure chart (top-down hierarchy), which shows the elements of the CT project structure. All the boxes represent individual CT topics. Black arrows represent the parent-child relationships between the topics (links emanate from the parent topic to the child topic).

Red arrows represent the direction and operation of the “include part of other topics” markup, which is a critical command for this whole system to work. The “include” operation generally flows upwards from child topic to parent topic (remember my analogy of meaning emerging like “bubbles or cream rising to the top”?), except in the case of the relationship between “DRA case study” and “DRA case study findings”, in which case they are “siblings”, sitting at the same level of the hierarchy.

Let me explain how the “include” markup works because it is such an important part of this system. When you add the following markup to the body of a topic in CT, it includes (pulls) the contents of the target topic in the current topic:

((Topic name))

More importantly (and this is the killer feature for this project design and process flow), you can also specify to only include text that has been entered under a specific heading, by adding the title of that heading to the “include” markup like this:

((Topic name==Heading name))

To use a concrete example from our chart above, we will need to add the following markup to the bottom of the text in the topic “DRA1 Why CT”, if we want it to collect the conclusions that have been derived from and gathered at the bottom of topics “DRA1.1”, etc.:

((DRA1.1==Conclusions))
((DRA1.2==Conclusions))
((DRA1.3==Conclusions))

As if by magic, the text contained under the heading “Conclusions” in these sub-topics will instantly appear in “DRA1 Why CT”. Moreover, should you make any changes to that content in DRA1.1 for instance, the changes will be immediately updated in the “collector” topic as well. If you now take another look at the chart above, you will see that the red arrows effectively describe a daisy chain structure by which elements of some topics are included in the next level of topics until we reach the top level. Here is the whole hierarchy of “include” commands for the structure in the chart, cross-referenced with the topics they would need to be added to:

((DRA case study findings==Final findings)) - to be included in "Findings"
((DRA case study==Findings)) - to be included in "DRA case study findings"
((DRA1 Why CT==Final conclusions)) - to be included in "DRA case study"
((DRA1.1==Conclusions)) - to be included in "DRA1 Why CT"
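
The daisy chain above can be modelled in a few lines of Python. This is only a toy sketch of the idea, not how ConnectedText is implemented: the `topics` dictionary, the `section` and `render` helpers and the sample text are all hypothetical, and only the ((Topic==Heading)) notation comes from this post.

```python
import re

# Hypothetical mini-wiki: topic name -> raw text using CT-style markup.
topics = {
    "DRA1.1": "=Coded text=\nInterview passage...\n=Conclusions=\nNVivo felt too hierarchical.",
    "DRA1 Why CT": (
        "[[DRA1.1]]\n=Summary of conclusions=\n((DRA1.1==Conclusions))\n"
        "=Final conclusions=\nCT keeps codes in context."
    ),
    "DRA case study": (
        "[[DRA1 Why CT]]\n((DRA1 Why CT==Final conclusions))\n"
        "=Findings=\nContext matters more than features."
    ),
}

INCLUDE = re.compile(r"\(\((.+?)==(.+?)\)\)")
HEADING = re.compile(r"^(=+)\s*(.+?)\s*\1$")


def section(text, heading):
    """Return the lines under `heading`, up to the next heading of the same or a higher level."""
    out, level = [], None
    for line in text.splitlines():
        m = HEADING.match(line)
        if level is None:
            if m and m.group(2) == heading:
                level = len(m.group(1))  # found the heading; start collecting
            continue
        if m and len(m.group(1)) <= level:
            break
        out.append(line)
    return "\n".join(out)


def render(name):
    """Expand every include in a topic, recursively, so that findings bubble upwards."""
    def expand(match):
        source, heading = match.group(1), match.group(2)
        return section(render(source), heading)
    return INCLUDE.sub(expand, topics[name])


print(render("DRA case study"))
```

Because `render` re-reads the source topics each time it is called, editing the =Conclusions= text in "DRA1.1" changes what the collector topics show on the next render, which mirrors the live updating described above.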

You should also consider that this is not simply a mechanical operation but also a conceptual one. Each new inclusion at the level above includes content that has been extracted and abstracted from the topics at the level below. Meaning is becoming purified as it rises to the top, while retaining physical linkages (references) that allow you to trace your findings and conclusions all the way back to the ground level of the empirical material. You have to agree that that’s a brilliant functionality!

Let’s now turn to the other two features mentioned in the chart: the “headings” markup and the “Cut to new topic” command. The “headings” markup is also very important for this system because it is actually the notation by which the qualitative coding can be carried out the most easily.

[There are some other, more sophisticated ways of annotating a few words and aggregating them in another topic (such as “properties” and “attributes” – you can read up on those in the Help file or see them in action in this post by Steve Zeoli) but not of larger chunks of text (although this is likely to change, as CT’s developer is actively considering introducing a special command for marking up and collecting larger passages which would work very much like a standard CAQDAS feature).]

The “headings” markup is our best option now for coding, partly because it’s very quick and the results can be seen in real time in the Table of Contents pane, but also because the text under a given heading can be included in other topics, as I have explained above. Here is what you need to type to define a headings hierarchy of up to 5 levels (alternatively, select the text in edit mode, right-click, and pick the heading level from the context menu):

=Heading 1=
==Heading 2== 
===Heading 3===
====Heading 4====
=====Heading 5=====

When I talk of “coding”, what I mean is adding your “codes” (annotations) as headings above the sections of the text that you are analysing. E.g. =Heading 1= could be your interview question. Then underneath you could add sub-headings, in effect annotating the interviewee’s responses. The 5-deep hierarchy allows you to structure the interview text in a meaningful way, by bringing out its implicit logical outline, which gets gathered and displayed in the Table of Contents pane.

=Why do you use ConnectedText for qualitative data analysis?=
==Because he didn't like the hierarchical nature of NVivo==
===NVivo separated the codes out of its original context===
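
If it helps to see how such headings turn into an outline, here is a small Python sketch that mimics what the Table of Contents pane shows. It is only a model of the idea: the function names and the sample transcript line are my own inventions, and CT’s actual parsing may differ in its details.

```python
import re

# One to five "=" signs on each side, matching the 5-level limit mentioned above.
HEADING = re.compile(r"^(={1,5})\s*(.+?)\s*\1\s*$")


def table_of_contents(text):
    """Return (level, title) pairs for every heading line, in document order."""
    toc = []
    for line in text.splitlines():
        m = HEADING.match(line)
        if m:
            toc.append((len(m.group(1)), m.group(2)))
    return toc


def as_outline(toc, indent="  "):
    """Indent each title by its heading level to show the implicit outline."""
    return "\n".join(indent * (level - 1) + title for level, title in toc)


coded_text = "\n".join([
    "=Why do you use ConnectedText for qualitative data analysis?=",
    "==Because he didn't like the hierarchical nature of NVivo==",
    "===NVivo separated the codes out of its original context===",
    "(the interviewee's actual words would sit here, under the codes)",
])

toc = table_of_contents(coded_text)
print(as_outline(toc))
```

The transcript text itself is ignored; only the codes (headings) are collected, which is why the outline stays readable even for a long interview.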

Now, if you are doing this kind of coding on a 20,000-word document, sooner or later the Table of Contents might fill up the entire page in its pane. While it can be collapsed, the better option in fact might be to remove the coded content from the workspace altogether, partly to make it less cluttered but also in order to organise the content into thematic groups.

This is where the very handy “Cut to new topic (CTRL+ALT+N)” command comes in. Just highlight the section you have already coded that constitutes a coherent thematic group (e.g. the answers given to question 1, 2, 3 etc., if you have numbered your interview questions), right-click, choose “Cut to new topic” and voilà! CT packs the selected text away into a new topic, leaving a link to it in the original topic. The Navigator window is useful here too, as it lets you watch the sub-topics that you have coded and cut away accumulate one level below the original topic.
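
For those who like to see the mechanics spelled out, the effect of the command can be modelled in a few lines of Python. This is just a sketch of the outcome, not CT’s own code: `cut_to_new_topic`, the line-based selection and the sample text are all hypothetical.

```python
def cut_to_new_topic(topics, parent, start, end, new_name):
    """Move lines[start:end] of `parent` into a new topic called `new_name`
    and leave a wiki link to it where the text used to be."""
    if new_name in topics:
        raise KeyError(f"topic already exists: {new_name}")
    lines = topics[parent].splitlines()
    topics[new_name] = "\n".join(lines[start:end])
    lines[start:end] = [f"[[{new_name}]]"]
    topics[parent] = "\n".join(lines)


topics = {
    "DRA1 Why CT": "=Question 1=\nanswer one\n=Question 2=\nanswer two",
}

# Cut the first coded section (lines 0-1), then the second one,
# which now sits on lines 1-2 because a link replaced the first section.
cut_to_new_topic(topics, "DRA1 Why CT", 0, 2, "DRA1.1")
cut_to_new_topic(topics, "DRA1 Why CT", 1, 3, "DRA1.2")

print(topics["DRA1 Why CT"])
```

After both cuts, the parent topic holds nothing but links, while each coded section lives on in its own sub-topic.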

When you are finished coding (marking up your topic with headings and sub-headings) and you have cut all of them away, all you should end up with in the body of your original document (“DRA1 Why CT” in our case) is a bunch of links to all the sub-topics. The double square brackets signify that these are internal wiki links, and you can of course give the topics more descriptive names after the initial code, such as “DRA1.2 about NVivo”:

[[DRA1.1]]
[[DRA1.2]]
[[DRA1.3]] etc.

Let’s say you started with a 20,000-word document (a 2-hour interview transcript). After coding, you might have ended up with 15 sub-topics of about 1,300 words each on average. Now you can go into each of those sub-topics (DRA1.1 etc.) and complete your evaluation of the codes and the content there.

The simplest way of doing this is to copy the contents of the “Table of Contents” of this sub-topic (not from the TOC pane but from the view mode of the topic itself) and then either paste it into the Notes pane for some further evaluation, organisation or pruning, or paste it directly at the bottom of the topic after a heading called =Conclusions=. Here you can carry out additional operations, such as turning this into a numbered or bulleted list or using highlighting to mark out especially important information. (You can also add a “Category” to the sub-topic itself, to have yet another way of classifying the content. See the Help file.)

Remember that only content that is under the =Conclusions= heading in the DRA1.1 sub-topic will be pulled into the parent topic, “DRA1 Why CT.” For this to work, in this parent topic you will need to include the markup ((DRA1.1==Conclusions)). It is best to collect these =Conclusions= from the sub-topics under a heading such as =Summary of conclusions=. Once all your coding is finished, all the conclusions from the sub-topics will be listed in the body of the parent-topic, “DRA1 Why CT.”

This is where the “reverse cascading” of the abstraction process begins, as now you will be ready to evaluate these collected conclusions from level 5 and record your conclusions from these conclusions, e.g. by initially writing them in the Notes pane, and then pasting them under a new heading at the bottom of the “DRA1 Why CT” topic, called =Final conclusions=. These “final conclusions” then get pulled into the “DRA case study” topic, where you can repeat the whole process, draw conclusions from the conclusions of the conclusions and call them “findings” this time, under a heading called =Findings=. This in turn will be included in the “DRA case study findings” topic, the =Final findings= of which will eventually be included in the “Findings” topic itself. And that’s it. The meaning (bubble or cream) has reached the surface. Hurray!

I realise it might be a bit heavy to follow all this without seeing what the actual topics look like in CT. However, this post is getting quite long already, so I will summarise all the steps involved in this process in a final post, illustrated with screenshots.

Preparing for coding in ConnectedText

We are nearing the holy grail of this tutorial series, namely the post on how to code qualitative data in ConnectedText as a CAQDAS. However, before we get there, there are a few more things to clarify. I should add a caveat that those who expect a full replica of the coding functionalities of NVivo or Atlas.ti will be disappointed. CT hasn’t been designed as a dedicated CAQDAS. What I’m going to show you here is how you can model a qualitative analysis process in CT that could replace some forms of coding in NVivo or Atlas.ti.

I would say that the process I am going to describe is a form of “soft coding,” if by “hard coding” we mean a grounded theory-type implementation of qualitative analysis. The difference between “soft” and “hard coding” lies in your interpretation of whether qualitative analysis is more of an interpretive art or a scientific process modelled on the natural sciences. If you want to turn qualitative analysis into a routinised activity in order to adhere to positivist ideals of science, then you are better off with NVivo or Atlas.ti. However, if you approach qualitative analysis as more of a poetic, interpretive process where you would like to focus on the development of the emerging meaning rather than on the mechanics and accuracy of coding, then CT might be for you.

How does “soft coding” work in CT? Essentially what you will be doing is annotating (coding) qualitative material, aggregating these codes, and evaluating them, all within their “natural habitat”: the document and the case study from which they emerged. This is in contrast to, let’s say, NVivo, where the codes are aggregated by separating them from their documentary and case study context. CT instead allows you to trace the whole trajectory of how an interpretation emerges from the original document and rises through several levels of aggregation and abstraction until it finds its place as a distilled finding in the final draft.

What I will describe here is a “reverse cascade” (from bottom to top) of abstraction. We will start out with the empirical data at the bottom rung, and interpretation and meaning will be gradually abstracted through several layers of filtering and reduction, rising to the top like bubbles or cream.

Before I launch into my demonstration and explanation of how “soft coding” works in CT, let me run through a checklist that I recommend you go through if you want to try to model this process in your own copy of ConnectedText.

Before you start coding:

  • set up your CT desktop layout as suggested here.
    • Have the Table of Contents docked on the left, edit/view window in the middle, and the Notes pane, Topic list, and Categories docked together in the right pane;
    • display the Navigator in a second monitor, if you have one. If not, just open it when needed, or dock it in the right pane.
  • customise your markup colours, as described here.
  • set up your overall CT project design and create the required topics as described here.
  • import your document using an appropriate method, as suggested here;
    • slot it into the hierarchy by linking it to where it belongs;
  • familiarise yourself with how to do the following (by reading the relevant topics in CT’s Welcome project [2.9MB] and practising on a test project):
    • add headings (by using the context menu or typing markup)
    • highlight text (using the button)
    • create numbered or bulleted lists
    • use the “cut to new topic” (CTRL+ALT+N) command in the context menu
    • use the “include part of other topics” markup.

If you are comfortable with all that, you are ready to be introduced to the coding process flow in my next post.

Importing your data into ConnectedText

As part of this tutorial series on how to use ConnectedText for qualitative data analysis, in the previous post I suggested that it might be a good idea to design and set up your research project and work flow in CT before importing the data that you want to analyse. Now we are ready to discuss how to import your qualitative research data into CT.

Please note that I won’t go into all the possible ins and outs of importing stuff into CT. You can read all about that in CT’s Welcome project (Help file), which is available here [2.9MB]. The topics of interest are “Importing text,” “Images,” “Movies,” “Embed Youtube video in a topic,” “Files,” “URL,” and “Application button.” Instead, here I want to focus on the organising of the import process, with some import tips that may not be found in CT’s Welcome project.

As I suggested in the previous post, I prefer to conduct the import process as part of the analytical filtering process of sorting out what’s important and what’s not. Importing stuff into CT is an opportunity to execute one operation of reduction on the long road of reduction and abstraction that leads to the producing of your research report or dissertation.

After all, the entire qualitative research process is about reducing things meaningfully. You start out with collecting possibly millions of words (and hundreds or thousands of media and other types of files). Then the challenge is to reduce these millions of words (captured in your interviews, participant observations, collected materials etc.) into an 80,000-word dissertation, and beyond that, into 10,000-word journal articles.

For this reason I recommend importing material incrementally, as and when needed for the analytical process. Import stuff for one of your case studies or import one type of data (such as all the interview transcripts) and then organise and possibly even analyse (code) them before importing the next batch. During the analysis of the first batch you will get new ideas about how to organise the entire project, so that the next batch can be imported into a more organised space.

As you are importing stuff, you can use the Navigator to see which topic to link the newly imported material to. Effectively you are building a meaningful hierarchy, so that it can be easily navigated by just simply following logical links from the Home (dashboard) page to where you need to go.

The Topic list pane can also be used for introducing some order. For example, when you import a series of interviews belonging to the same case study, you may want to name the topics according to a convention using the same starting letters and/or numbers, so that they appear in a particular (alphanumerical) order in the Topic list.

E.g. let’s say you are conducting an interview series with me about my use of CT. After you have transcribed your recorded interviews in MS Word, you could start importing the Word files and naming the resulting topics as “DRA1 Why CT,” “DRA2 Getting started” etc. (DRA being the code for the “Dr Andus” case study and the number signifying the chronological order of the interviews).

Then when you are analysing “DRA1 Why CT” and you want to break it up into smaller chunks (new topics) using the very helpful “Cut to new topic (CTRL+ALT+N)” context menu command in CT (which will leave a link to the newly “cut away” topic in the parent topic), you can name the new topics “DRA1.1”, “DRA1.2” etc. This will keep all the topics belonging to the same case study next to each other in the Topic list, making it easier to find them (especially after you end up with thousands of topics a few months or years down the road).
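
One thing worth checking with any such naming convention is how the names will actually sort once you get past nine sub-topics. The Python sketch below compares a plain character-by-character sort with a “natural” sort that treats runs of digits as numbers. I have not verified which of the two the Topic list uses, so treat this purely as a way of testing a convention before committing to it; `natural_key` and the sample names are my own.

```python
import re


def natural_key(name):
    """Split a topic name into text and number chunks so that
    'DRA1.10' is ordered after 'DRA1.2' rather than before it."""
    return [int(part) if part.isdigit() else part.lower()
            for part in re.split(r"(\d+)", name)]


names = ["DRA1.10", "DRA2 Getting started", "DRA1.2", "DRA1 Why CT", "DRA1.1"]

plain_order = sorted(names)                     # character-by-character
natural_order = sorted(names, key=natural_key)  # digits compared as numbers

print(plain_order)
print(natural_order)
```

If a list turns out to sort character by character, zero-padding the numbers (“DRA1.01”, “DRA1.02” and so on) is one way to keep the sub-topics in chronological order.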

Now, back to some of the mechanics of importing data into CT. “Importing” may not in fact be the right term, if by “importing” you would expect files and their contents to be all included in CT’s project file, as happens in NVivo for instance. As CT is a wiki (and works as a website), mostly it is only text, markups (e.g. from HTML imports) and scripts that actually get imported into a CT topic (which are all some form of text anyway). Beyond that, what I mean by “importing” is simply creating links in a CT topic to external files, such as image files, PDF, PowerPoint, Excel and even programme files. Even though the image will be displayed inside your topic, it is effectively linked in from outside of CT’s topic and overall project file.

Remember, CT is called “ConnectedText,” and there must be a reason for this: it mostly works by connecting textual information (although numerical information is also a form of text for CT’s purposes, as long as we’re talking about the contents of a topic). Why is this important? So you understand that when you “import” a file that is other than text, such as an image file or a PDF, those files continue to sit in a folder deep inside your PC and not inside CT. If you move them or delete them, it might break the file association and they won’t display or launch from your CT topic. One way to avoid this problem is to copy these files into CT’s dedicated folder in its own folder structure.

But I find that too much hassle and I don’t want to duplicate hundreds or thousands of files. Instead, I consider the importing/linking of an external file as a part of the analysis process. Once I have decided to review a folder on my PC and decided what to import from there into CT, I am done with that folder for ever. While I will leave it where it is for archival purposes and in order not to break the link path to the CT topic, I never intend to go back there and duplicate my effort. The whole point of importing stuff into CT is to continue the rest of the analysis and processing of that information inside CT, as the main tool.

As I said earlier, for me importing is part of the analytical/filtering process. Before I decide to link to a file, I open it, read it, analyse it and extract only the most important information to be copied and pasted into the body of a CT topic (see how I use NoteTab to clean formatted text and to repair broken lines in text copied from a PDF here). I do include a link to the given file at the bottom of the topic but not because I ever want to return to it to analyse it again but only as a reference, so I know where that information comes from.

This process can work particularly well when you are reading for a literature review. You only want to import the summary and some quotes from a 20-page PDF article and not all of its 10,000 words. And also, you probably wouldn’t want to have to do that process all over again a year later, when you want to remind yourself of that article. You won’t have to because the summary and the key quotes are now in your CT topic, with a link to the original file just in case.

Let’s turn now to the specific mechanics of importing some important file types. Importing text files is easy, and copying and pasting might be the quickest way of doing that. However, most people would probably keep their interview transcripts and participant observation notes in some kind of a word-processor file such as Word. If it’s unformatted text, a simple copy and paste will do. However, if you already have rich text formatting applied, such as headings and font formatting (bold, italics), and some images embedded, you will need to use an import procedure.

Although CT’s import tool (Project > Import…) includes Rich Text Format (.RTF) as one of the options, I wouldn’t recommend it as it hasn’t produced good results for me. Instead (and this was a tip from a CT forum user), save your MS Word document as “Web Page, Filtered” and import it as HTML using CT’s import tool. This is what you need to do in Word and then in CT:

  1. Open your Word document.
  2. Go to File > Save As.
  3. In the “Save as type” pull-down menu (right under “File name”) select “Web Page, Filtered.”
  4. Change the name of the file to the intended topic name according to the naming convention I suggested above.
  5. Click “Save.”

  1. Open your CT project.
  2. Go to Project > Import…
  3. In the import wizard select HTML for “Files to be considered.”
  4. Under “Source file” navigate to the folder where you had saved the converted Word file (now it’s an HTM file) and select it.
  5. Under the option “If topic exists” choose “Create a new one”, to make sure you don’t accidentally overwrite a topic that you may have already imported and edited before.
  6. Click “OK.”
  7. If you hadn’t done so in Word yet (which might be preferable), then after the import you can rename the imported topic according to the naming convention I suggested above.

As for “importing” (i.e. linking) content that’s other than text, it’s a simple “drag and drop” of the file from Windows Explorer (or your favourite file browser such as Directory Opus) straight into a CT topic that is open in edit mode. CT will recognise the type of file and will insert the necessary markup automatically. If you drag an image file, it will add an image markup so that the image can be displayed inside the topic in view mode. If you drag some other file type (PDF, Excel, PowerPoint etc.), then CT will create a file link, and clicking on that link will open the file in its respective application.

If you drag a URL straight from a web browser window, CT will convert it into a URL markup and create an external link that can be launched either in CT’s browser or an external browser like Firefox. Finally, if you drag a .exe file, CT will create a button in view mode, on the clicking of which the associated programme will launch. Neat or what? There are some other possibilities as well, but for those you will need to read CT’s Welcome project (Help file).

The main thing to note here is that CT can become the main depository of all the information related to your research project because you can either import it in text form or create a link to it. CT will become the brain and the central nervous system of your research project. But it also has some other internal organs to help you distil the essence of your project, which can emulate the CAQDAS process known as “coding your material.” But more about that in my next post.

Designing your QDA project for ConnectedText

If you have completed the steps suggested in the previous posts (here and here) for this tutorial on how to use ConnectedText for qualitative data analysis (QDA) (or if you are already a CT pro), then you are ready to move on to the next stage of the CT QDA process, which has to do with designing your QDA project for CT. I am suggesting that before you import and dump all your qualitative data and other notes in CT, it might be a good idea to come up with an overall shape for your project and work flow. It is certainly possible to ignore this advice and dump all your data in CT first and worry about organising them later. However, I found that being methodical and designing the project first and then importing data (strategically and incrementally) has helped me keep my head above the water – the data ocean, so to speak.

Let me first present you with my concept map (created in VUE) for the design of my PhD research project in CT, and I will explain how it works below.
Essentially what you are seeing here is a mixture between a top-down hierarchical model for organising topics in ConnectedText and a flow chart indicating a process flow (roughly from left to right) for the qualitative analysis of data and the production of the eventual report, in this case a PhD dissertation. At the very top sits the “project dashboard,” which is the tip of the data iceberg, if represented in this hierarchical model, or the central node of a flat network, if you try to visualise it as the home page of your personal Intranet system.

At the second (horizontal) level of the hierarchy (which are the topics that are linked to from the project dashboard/home page) you will find the main elements of the project. Let’s tackle these one by one.

“Meta project considerations” is the topic that contains or links to information that pertains to the overall organisation of the CT project, the work flow, the project plan and related tasks. This is the place to collect those thoughts and materials that are looking at the project from the outside or from above and are concerned with the overall design and operation of the system as a whole. (For example me reflecting on the design of my PhD project right now is an instance of such a meta-consideration. I will be including the above concept map under this main topic in my own CT project.)

The second main topic, “empirical data (case studies),” is the heart of the project and will contain the bulk of the material. It contains all the empirical data that I have collected as part of my research. It is organised into individual case studies, which contain such material as interview transcripts, participant observation notes and collected files (such as emails, PDFs, MS Office files or even URLs). The red arrows indicate the flow of the qualitative data analysis process that I will be focusing on in future posts, showing how CT can be used as a CAQDAS. The main objective of the analysis is to extract findings from each case study, which will eventually be aggregated and evaluated in the fourth major topic, “findings.”

I have skipped over the third topic, “theory notes.” These include notes of all such reflections or interpretations that I have produced myself but which are not strictly speaking part of the empirical materials. It is debatable whether these observations should be included in the empirical data, if they were triggered by – or during – the data collection process. But I prefer to separate out material that was more part of the interpretation of the data than the data itself. Nevertheless, note the dashed lines which indicate that these “theory notes” are closely related to “empirical data” and feed into the “findings.”

While you are analysing your empirical data and evaluating your findings, simultaneously you will start having some ideas about the significance of these findings and how they should be presented later on. You might even want to select quotes to be included and discussed in the final draft itself. So the next two topic areas, “outlines” and “draft” are very closely related to the analytical process (from empirical data to findings) and start developing simultaneously with it. Hence the dashed lines coming out of “case study 1 findings”, which start to inform the outlining and drafting (writing) processes.

You might be in the middle of analysing an interview and have a sudden insight into how this material might fit into the overall or individual chapter outline, and you might even want to engage in some ad hoc writing and type up some paragraphs in the corresponding draft chapter topic. Nevertheless, outlining and drafting/writing will emerge as important processes in their own right, once the coding and analysis of the empirical material has concluded.

The final topic is an “inbox” for uncategorised and unprocessed material that has been imported into CT but has not been allocated to any of the aforementioned topics. I would generally advise against importing too much of such material, as it will just sit in CT and clutter the workspace. Regarding importing material, I found it helpful to work on one case study at a time and only import materials that relate to that case study. Also, I have treated the importing of material as a filtering process and a quality control process. After all, what’s the point of importing stuff that turns out to be utterly useless? It would just end up sitting in CT as dead weight.

Now, “inbox” needs to be understood metaphorically here, as CT is a wiki and therefore there are no folders or boxes into which you can drop stuff. Nevertheless, you can emulate an inbox in two ways: 1) by creating a category label called “inbox” or “uncategorised” and appending it to all new topics that need to be put into this virtual inbox, or 2) by using a topic as an inbox and dropping text, links to files and URLs into that topic. The same is true for all the other topic “areas”: they are not so much areas or folders as local networks of interlinked topics, for which the top level topic acts as the central node.

As for the overall project, it is essentially a process of trying to find an answer to a question. You can include the research question in your project dashboard, interrogate your empirical data with your chosen conceptual tools (theories), develop your findings, develop outlines to organise your argument, and write the eventual draft, which should hopefully provide an answer to your original research question. The great thing about CT is that it has tools for conducting this entire process in one place, within one software.

There are a few glaring omissions in my model above: a literature review topic, a conceptual framework (theoretical lens) topic, and a methodology topic. You could certainly include them here. I had worked on those phases of my PhD before discovering CT, so I haven’t had a need to include them in my CT project yet. However, as I will be moving onto writing up my dissertation, it is very likely that I will add in those topics and import the related material into CT as well. For the purposes of this blog and this CT tutorial I have decided to focus the above model on the qualitative data analysis process. However, it is easy for you to include those elements. All you need to do is type [[literature review]], [[conceptual framework]], and [[methodology]] into the body of the dashboard topic, and these topics will be automatically created for you and linked to the dashboard.
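
As a rough illustration of what typing those links amounts to, here is a Python sketch that scans a dashboard text for [[...]] links and reports which linked topics do not exist yet. It only models the idea: the function names, the sample dashboard text and the set of existing topics are hypothetical, and the pattern deliberately ignores any special link forms CT may support.

```python
import re

# A plain internal wiki link: [[topic name]].
LINK = re.compile(r"\[\[([^\[\]]+)\]\]")


def linked_topics(text):
    """Return the topic names linked from `text`, in order, without repeats."""
    seen = []
    for name in LINK.findall(text):
        name = name.strip()
        if name not in seen:
            seen.append(name)
    return seen


def missing_topics(text, existing):
    """Return linked topic names that are not among the `existing` topics."""
    return [name for name in linked_topics(text) if name not in existing]


dashboard = (
    "[[meta considerations]]\n"
    "[[literature review]], [[conceptual framework]], and [[methodology]]"
)
existing = {"meta considerations"}

to_create = missing_topics(dashboard, existing)
print(to_create)
```

In this toy example the three new links are exactly the topics that would need to come into being for the dashboard to lead somewhere.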

I have found writing this blog post a very useful exercise. This reflection allowed me to improve my model, as up until this morning it looked a lot less organised in fact. I didn’t have my meta considerations included in an organised way, nor did I have an inbox for uncategorised data. This type of meta-reflection on the design of your project and your work flow can be an important quality process in the ongoing development and improvement of your overall system.

Finally, let me include a couple of screenshots of the above model as implemented in CT. First, the edit mode (I have used a vertical tree view in the Navigator instead of the horizontal tree view of my concept map, so that it could fit into the left-hand-side pane. However, there is also a horizontal tree option in the Navigator, if you prefer that):

And then in view mode:

Please note that in the “PhD project dashboard” (or Home) topic only the words in blue are active links (e.g. “meta considerations”). The bullet-pointed text in black underneath (“CT project design” etc.) is only there to remind me what is inside that top-level topic (or link). Links to “CT project design” etc. are inside the “meta considerations” topic, which is currently not visible in the view window (as we are editing/viewing the “Home” topic). However, the relationship can be seen in the Navigator pane on the left. Had I made those links live in the “Home” (PhD project dashboard) topic as well, it would have resulted in a messier Navigator picture, as they would have also shown up as part of the level 2 hierarchy.

The Topic List pane on the right simply displays all topics in alphabetical order.