Phase one: Developing empirical generalizations
Between 2001 and 2004, data collected in qualitative studies of healthcare work and organization was subjected to secondary analyses, and sets of empirical generalizations were derived from these. These related to four domains of research: the normalization of telemedicine systems [16, 17]; professional-patient interaction and the organization of healthcare work in chronic illness [18, 19]; and the social production and operationalization of evidence in the clinical encounter. These treated 'normalization' as the endpoint of an implementation process in which some new technology came to be routinely employed in service. The comparative synthetic methods used to generate empirical generalizations about telemedicine implementation processes were then used to perform a similar analysis on data collected in the other studies. The methods by which secondary analyses were carried out have been described in detail elsewhere.
The empirical generalizations produced by these processes were general conclusions about regularities in the data, and were framed as formal propositions. They are given in Appendix 1. They did not in themselves make a theory because they were specific to particular contexts (i.e., although they were generalizations, they were not necessarily generalizable), and were not linked together by some account of causal relations, generative mechanisms, or organizing principles. In other words, they were observational rather than explanatory.
Phase two: Building an applied theoretical model
Between 2003 and 2007, using grounded theory-building techniques [21, 22], an applied theoretical model of normalization processes (the Normalization Process Model, or NPM) was derived from earlier empirical generalizations across all four domains of study. It was framed as a set of analytic propositions (see Appendix 2) that were supported by rigorous data analysis. The aim was to develop what Stinchcombe has called an applied theoretical model of the factors that promote or inhibit the work of routine embedding of some new health technology in practice. The propositions were first subjected to critical review by a large group of researchers, to whom manuscripts of different iterations of the model were informally circulated, and were discussed at a series of seminars.
The purpose of the NPM was to identify and explain those factors that promoted or inhibited collective action that led to the routine embedding of complex healthcare interventions in service settings. We identified four such factors (interactional workability, relational integration, skill-set workability, and contextual integration) and defined them as the constructs of the NPM. At this stage, the NPM synthesized empirical generalizations from groups of related studies, and produced taxonomies, maps of relations between concepts, and generalizations. These were linked together by sociological explanations of the relations between its constructs, their dimensions, and components. Taken together, these set the scene for possible empirical verification.
Refining and testing the NPM
As an applied theoretical model, the NPM was restricted to a specific field of activity: the operationalization of complex healthcare interventions. To develop it further we sought to define and stabilise the way that we conceptualized theory itself. We assigned to theory three kinds of work:
Accurate description: A theory must provide a taxonomy or set of definitions that enable the identification, differentiation, and codification of the qualities and properties of cases and classes of phenomena.
Systematic explanation: A theory must provide an explanation of the form and significance of the causal and relational mechanisms at work in cases or classes of the phenomena defined by the theory, and should propose their relation to other phenomena.
Knowledge claims: A theory must lead to knowledge claims. These may take the form of abstract explanations, analytic propositions, or experimental hypotheses.
Further development of the NPM involved applying it empirically, in a process of 'road testing' the theoretical model. An important critique of theory building is that it is sometimes precipitate, proceeding before the generalizability of the phenomena it is concerned with is properly established. A second critique is that theory-builders focus too early on the problem of defining and measuring variables supposed to be relevant, without sufficient consideration of the coherence and robustness of basic concepts and constructs of the theory itself. 'Road testing' the NPM enabled us to work through these problems and provided a context in which to make rational decisions about face validity, and to ask whether the NPM merited formal testing. This consisted of two main pieces of work: qualitative data analysis and research synthesis.
Qualitative data analysis
We integrated the NPM in qualitative data analysis in three large studies (of the implementation of e-health technologies, the integration of telecare systems [29, 30], and the operationalization of a large randomized controlled trial). As we did this, we sceptically sought evidence for the adequacy of the NPM to perform the three functions of theory that we had previously claimed for it: to define phenomena, explain mechanisms, and form knowledge claims. It is important to be clear that this was not formal testing, because we did not at this stage seek to falsify the NPM. Instead, we practically tested its usefulness as an analytic tool.
Elwyn et al. undertook a parallel critical analysis of the NPM by applying it to the problems of operationalizing shared decision-making tools in medical consultations. Participants in that process mapped the constructs of the NPM against data from evaluation and other literature, including primary studies and systematic reviews, and produced a set of attributions about the conclusions of these studies. The NPM was then applied to these attributions to determine whether it usefully explained them. Elwyn et al. concluded that the NPM offered stable explanations of the collective work involved in shared decision-making processes and operationalizing decision-making tools.
By the end of 2006, the NPM in its published form was sufficient as a set of conceptual tools to analyse specific processes, and it has been successfully applied to this purpose [32–36]. 'Road testing' showed that it had utility in explaining factors that promoted and inhibited collective action in operationalizing practices. It did not, however, explain how practices were formed in ways that held together, how actors were enrolled into them, or how they were appraised. These were three domains in which the NPM could usefully be expanded. This recognition informed the next stage of theory building.
Phase three: Making a formal theory
After 2006, we worked to solve these problems. Between 2006 and 2009, the applied theoretical model of the NPM was extended, with new constructs and generative mechanisms defined, so that it formed a formal middle-range theory – NPT.
The production of a formal theory is a quite different enterprise from the work that goes into the identification of empirical generalizations or applied explanations. The goal of theory-building at this level is to isolate the generic properties of phenomena and understand their operation. To do this, we had to reformulate the healthcare-specific constructs of the NPM as generic or abstract propositions, and then extend the theory by writing three constructs for the domains that we had previously established were absent. We called these coherence, cognitive participation, and reflexive monitoring. Although at this stage we still regarded our work as framing an extended NPM, we had embarked on a process that would lead to a generalizable, middle-range, formal theory:
We had defined NPM constructs as factors that promoted or inhibited collective action leading to the routine embedding of some intervention. We used additional analyses to identify macro-level analogues of the constructs of the model [30, 38]. These took the form shown in Appendix 3. We then constructed full definitions of the macro-level analogues of the NPM constructs and tested them against already collected data.
We operationalized macro-level constructs in a way that mapped on to the existing constructs of the NPM (see Appendix 4). For example, we construed collective action as a macro-level construct (with micro-level constructs of interactional workability, relational integration, skill-set workability, and contextual integration).
As we worked through macro-level constructs, we also began to use a much more structured model of theory-building in which generative mechanisms and relations required definition [39, 40]. In this context, we shifted attention to coherence work not as a macro-level abstraction of contextual integration, but rather as a generative mechanism through which an intervention was subjected to sense-making procedures by its users.
We drew maps of the processes with which we were concerned. This method for identifying the constituents of conceptual models is called analytical theorizing by Turner. This led to a map of the expanded NPM at work. We then followed Lieberson and Lynn in reframing macro-level constructs derived from the NPM as descriptors of 'generative principles'.
The extended NPM that was derived from this work now had a general character, and the generative mechanisms and components to which it referred were not exclusive to complex interventions or even healthcare. They referred instead to generic properties of implementation processes and offered an explanation of them without reference to specific social contexts. We therefore presented it as a general, and generalizable, middle-range theory, NPT [5, 41], that seeks to explain the processes of implementation, embedding, and integration of material practices in formally defined contexts, relates these processes to causal social mechanisms, identifies components of those mechanisms, and defines the investments that are required to energize them. The mechanisms of the NPT are described in detail elsewhere, but synopses are provided in Appendix 4 and Appendix 5.
Road testing the NPT
Just as development of the NPM involved a process of 'road testing' to decide whether it was sufficiently plausible and robust to merit formal testing, so did the NPT. We accomplished this using multiple methods. It is important to emphasise that the purpose of this work was not to formally test the theory, but rather to demonstrate that it was fit to be tested:
Assessing the stability of NPT constructs: Researchers working in very different contexts and on very different studies (including studies of e-health implementation and reconfiguration of primary care mental health services in the State of Victoria, Australia) worked with the constructs of the NPT to develop analyses of implementation and embedding processes [43, 44]. The criteria for stability were that the generic constructs could be translated into specific contexts without the addition of ad hoc conditions, and that sceptical researchers were able to use them in practice with minimal support.
Critical comparison of NPM and NPT constructs: A key question was whether or not expanding the scope of normalization process analysis to the higher-level constructs of the NPT had practical value. In other words, we wanted to be clear that there was an advantage to using the NPT. To this end, we coded two sets of data (interview transcripts from a study of e-health implementation processes, and qualitative data collected in a systematic review of e-health implementation studies) using both the NPM and NPT.
To summarise, 'road testing' NPT required that we establish that its constructs actually defined mechanisms, components, and investments that could all be prospectively revealed by empirical research, and that these could be characterised in a stable way. We then had to demonstrate that these constructs could be operationalized in a way that conferred an analytic advantage. We sought confidence that NPT covered the ground we claimed for it, and that propositions could be derived from it that could effectively test the data and explain phenomena. This process was important because it paralleled the final revisions of the NPT as subsequently accepted for publication.
Relationship between the NPM and NPT
The formal theory (NPT) does not conflict with the applied model (NPM) from which it was drawn. In fact, it extends it. The constructs of the NPM are central to the formal theory and constitute its collective action component. The NPM is unchanged by this, and researchers can continue to successfully use the NPM in settings where only those factors that promote or inhibit collective action are at issue [32–35, 45, 46]. The NPT, however, extends the applied theoretical model to include the ways in which actors make sense of a set of practices (coherence), the means by which they participate in them (cognitive participation), and the forms of appraisal that they apply (reflexive monitoring).
NPT is a middle-range theory
Although it has been developed through a series of multi-disciplinary collaborations, NPT is a sociological theory in that it takes as its focus the contribution of social action to implementation, embedding, and integration. It is also a middle-range theory [47, 48]. Following Merton, we use this term to mean the following: the theory is 'sufficiently abstract to be applied to different spheres of social behaviour and structure' but does not offer a set of general laws about behaviour and structure at a societal level; the scope of the theory is defined by a limited set of assumptions from which can be derived hypotheses that may be confirmed or disconfirmed by empirical investigation; the limited scope of the theory leads to the 'specification of ignorance'. That is, the limits of explanation within the frame of the theory are established, and it does not 'pretend to knowledge where it is in fact absent'.
Specifying the range of the theory is important. Recent debates about theory in the social sciences [13, 50, 51] have emphasised the search for 'medium-scope patterns and mechanisms [that] distinguish between a complex social reality and an intentionally simplified analytical model of this reality'. The limited scope, conceptual range, and claims of middle-range theories are important because they are what make them practically workable in analysing practice.