
Music as a Self-Reflective System : Music as Structure

Music has been studied traditionally as a structure that is conceived outside of time. The score is a typical example. Being a collection of prescriptive symbols which make possible the realization of a sounding articulation, it can be considered a static artefact that provides a comprehensive mental view of the successive elements of its unfolding. Such a synoptic overview has the advantage of simultaneity and virtuality : all the symbols are there at once, be it at a representational or virtual level of reality. Sounding music, however, is a temporal art, and listening to music involves the consumption of time. As such, there is a distinction between the inherent static structure – the intelligible structure as conceived by the composer – that resides in the score and the dynamic structure of the music as heard – the apprehended structure as experienced by the listener – which calls forth epistemic interactions with the sounds, somewhat analogous to the distinction Lerdahl has drawn between composing and listening grammars (1988). A competent listener should be able to grasp the intelligible structure, but in many cases this does not happen. A score, in this view, can be helpful, not only as a means to secure the production of sounds but also as a means to analyze the structure of the music. It can be considered either as an a priori notation before sounding or as a post hoc symbolic lasting mark after sounding, which makes it possible to navigate mentally through the music, moving forwards and backwards and transcending the inexorability of time. As such, it calls forth primarily the concept of structure.

One should take care, however, not to equate the structure of the music with the score. Scores are merely collections of notes and notes are symbols that refer to abstract sonic entities. As such, they have a vicarious function, representing the sounds but not being the sounds themselves. A more fruitful approach, therefore, is to conceive of the musical structure not in terms of the symbols, but in terms of its sounding qualities. In order, however, to grasp them as a structure, it must be possible to represent them at a glance by relying on memory and representation. As such, there is a dynamic tension between the music considered as a static or dynamic structure. The latter conceives of music at a perceptual level, tapping the moment-to-moment history of successive now-moments in real time, the former involves a conceptual level that recollects these now-moments in memory and representation (Reybrouck 2015).

Music, therefore, can be processed “in time” and “outside of time”. It has both an intelligible and apprehended structure, with the former referring mainly to the relations and configurations between the individual elements and the latter to the unified coherent experience over time. In order, however, to grasp its overall structure, time should be considered as a modality. Music, then, can be considered a materialized idea, a network of relations or syntactic construct that founds a logic of sounding relationships (Faltin 1978), somewhat analogous to the formalist aesthetics that was advocated by Hanslick. Music, in this view, is not considered from its referential aspect, but as “sonorous moving forms”. Unlike language, with its centrifugal tendency of linguistic meaning, where attention is directed away from the text in order to grasp the meaning outside of the written text, music is characterized by a centripetal tendency with a focus on the auditory material (Kyndrup 2011). In terms of Saussurian semiotic terminology, this should mean that signifier (that which signifies) and signified (that which is signified) blend together, in the sense that musical signifieds are internal to the musical system, without any reference to something outside of the system. The signifieds, in this view, are not denotative or lexical (Imberty 1979 : 4-5) but self-reflective, in the sense that they refer mainly to themselves without any possible reference to something outside of the music. As such, it is possible to conceive of them in terms of internal semantics with meaning without relation to the external world (Cariani 1991) but relying mainly on self-reference, with the identification of sonic events and their interrelations. This involves a process of semanticity, a hermeneutic moment in defining “something as something” (e.g. the sound of a clarinet, a typical chord or cadential formula, …) with the denotation of elements being dependent on the process of recognition through identification and differentiation (Martin 204; Reybrouck 2009). What is eligible for denotation, therefore, is not reducible to extramusical reference, but refers to the sonorous articulation and its identifying qualities. Music, in this sense, is a carrier of immanent meaning, with sounding elements as recognizable entities that can be assigned some meaning or semantic weight. The (external) reference, in this view, collapses to blend with the actual sound that acquires some conceptual quality (such as the note “g” which does not refer to the vibratory sound event, but to the category that embraces the actualization of this event). The reference, therefore, is not external but internal to the musical system.

The internal semantics approach, further, challenges the distinction between syntactics and semantics. It is reminiscent of the syntactization of semantics, which began in the 1930s with the logical semantics of Carnap and the model-theoretic semantics of Tarski. Such a syntactization is accomplished by completely encoding the world so that the elements (mostly formal symbols) are seen in relation to completely logical-symbolic structures without need of specifying any set of observables and without need of verifying their truth values with respect to an outer world. If the elements do establish a relation to the outer world, however, they should be conceived no longer in terms of internal but of external or real semantics (Cariani 2001).

Music, as a sounding phenomenon, relies on internal and external semantics : the elements are referring to themselves (internal semantics) but they may trigger processes of sense-making (external semantics) as well. The internal/external distinction, therefore, is not to be seen as a dichotomy but as a dynamic continuum, which is related also to the experiential/computational dichotomy. To the extent that listeners experience a particular sound as a real sound of the external sounding world, there is an aspect of external reference. As soon, however, as they start doing mental computations, they may conceive of them not in their experiential qualities but in their symbolic form. Music, in that case, is conceived “in absentia” and not “in praesentia”, with the fullness and richness of the presentation to the senses being abandoned in favor of a more abstract level of processing. As such, there is a transition from presentational immediacy to representation, allowing the listener to deal with mental replicas of the sound rather than with their actual sounding qualities (Reybrouck 2004).

Music, in this view, can be considered as a closed system with elements that are conceived “outside of time” and which are accessible in a kind of virtual simultaneity. This is an old idea that stresses the synthetic function of the mind that goes back to Kant (1790, 1787), who claimed that imagination generates much of the connecting structure by which we have a coherent, significant experience over time : it goes through the manifold to put it together and to make knowledge of it by linking the separate impressions in an act of apprehension. The idea has been elaborated further by Husserl (1928) who, starting from Brentano’s statement that the grasping of a succession of representations involves that they are the simultaneous object of a single act of consciousness, argued for a kind of knowledge which summarizes all of them in time. This summing up is not articulated through time, as a series of successive representations, but must be considered as the experience of a temporal space or distance between distinct time moments at different time points of their unfolding. It entails a relational consciousness, which embraces at a glance a whole area of consciousness. This is the phenomenological constitution of time, which combines phenomenal and real or objective time, constituting a relational framework which goes beyond the mere description of temporal order and which offers a temporal experience that deals with actual and virtual time simultaneously. Such an experience is to be conceived as a time-constituting consciousness, which is directed to the past and to the future. It involves a tension to the past (retention) and to the future (protention).

The role of conceptualization must be considered here, both with respect to the units and their relations. Perception, in fact, is not limited to a succession of discrete now moments but calls forth a kind of relational continuity which is effectively experienced by us in our stream of consciousness. This idea, which was introduced by James (1976 [1912]), has been elaborated also in other conceptual frameworks that do justice to the dynamic ongoing characteristics of the sonorous articulation through time. They basically revolve around the distinction between the temporal unfolding, conceived as a concatenation of discrete time-moments and the conception of experienced time as an indivisible whole. This latter conception was advocated by Bergson (1962 [1889]) who conceived of time as pure duration without distinction between the component parts. Time, in this view, cannot be conceptualized as a spatial structure, which is divisible and homogeneous. Its real characteristic, on the contrary, is the dynamic and kinetic character of continuity and development, with past and present being intermingled in an organic unity. As such, we should consider the mental synthesis, which provides the connecting structure rather than solidifying the sensations in a kind of geometric structure. It is the basic distinction between time-as-quality (temps-qualité) and time-as-quantity (temps-quantité). The latter allows us to conceive of duration as an extensive quantity which is the projection of a fuzzy multiplicity into a distinct multiplicity which is measurable and observable. It stresses the sensational aspect of time rather than the mental synthesis, which puts the component parts together.

Music as Experience : A Processual Approach

Music can be considered either as an intelligible structure or as being structured by the listener. This structuring relies on processes of sense-making that embrace perceptual immediacy and conceptual abstraction. As such, it is possible to distinguish between several dichotomies such as (i) the continuous/discrete, (ii) the perceptual/conceptual and (iii) the bottom-up/top-down dichotomy. Each of them holds positions that are opposed to each other, but it is also possible to stress the dynamic tension between the opposite ends of each continuum, thus making the listening experience the rich experience it ideally should be. Bringing together these dichotomies should provide a semiotics for which the abstract is really material, a real semiotics of singular potential that is grounded in the real and natural experience.

A starting point is a processual and experiential approach to music (Reybrouck 2005), somewhat related to the early claims of cognitive musicology, which stated that music is above all a human experience, not merely a set of artifacts or structures. Music, in this view, is something which is heard and “enacted” upon – rather than being merely imagined or represented – with its meaning being characterized in terms of the “experience” of the human beings who are doing the cognizing. As such, there is a subjective element that highlights the tension between music as a structure and the actual experience of the sounding music.

The continuous/discrete dichotomy is a central issue of musical sense-making. The flow of sensory impressions is continuous, but a lived experience must be interrupted by the perceiver in discontinuous points or zones of focal attention in order to be meaningful (Ricoeur 1981). This basic claim of phenomenology can be translated also to the realm of music, which means that the processes of discretization of the sonorous unfolding involve a “quantal aspect” of perception (Godøy 1997). They make it possible to conceive of music in geometric terms as a kind of distributed substrate with discontinuities and focal allocations of semantic weight. Such intermittent sense-making, further, is discrete rather than continuous. Proceeding at several levels of resolution, it calls forth a dynamics of representation that describes the transformation from a flux to some kind of objectification, allowing the listener to think of a sounding flux in different temporal representations, from real time and fine-grained moment-to-moment sequential unfolding – in the range of high-frequency (10 milliseconds) or high-resolution processing of perceptual units (2-3 seconds) to low-resolution processing of temporal events that are extended over time – to concentrated overviews that represent longer stretches of time in a kind of instantaneous and synoptic overview (Godøy 1997 : 66; Wittmann 1999; Wittmann & Pöppel 1999-2000). It brings together continuous and discrete processing and does justice both to the idiosyncrasies of the sonorous unfolding (continuous) and the process of sense-making (intermittent) by applying discrete labels to slices of the temporal unfolding. Continuous processing, moreover, proceeds in real time and is time-consuming; discrete representation proceeds in a much more economic way by reducing temporal unfoldings to single representations with an all-or-none character. 
Both ways of processing are related to the distinction between categorical and acoustical perception (Handel 1989) with the former being mainly propositional in assigning a discrete meaning to an event that is evolving over time and the latter relying on acoustical listening and providing a phenomenological description of the sounds in terms of their acoustic qualities. Listeners, in the former case, do not perceive the acoustical environment in terms of its acoustic qualities but in terms of recognizable “events”, which are continuous in their unfolding but discrete in their labeling. They make it possible to “recognize” an event rather than “experiencing” it with the danger of stopping acoustical processing in favor of categorical labeling, which is the hallmark of cognitive economy (Reybrouck 2005).

The perceptual/conceptual distinction is related to the way human listeners structure the perceptual flow. It goes back to Dewey and James, who both have elaborated on having an experience. Dewey, in particular, considered an experience proper as a form of heightened vitality, signifying active and alert commerce with the world and stressing the full richness of sensory experience (1958 [1934]). James has argued on similar lines by introducing a very original epistemology (radical empiricism) that deals with the tension between concept and percept (James 1958 [1934]; McDermott 1968). It stresses the role of knowledge-by-acquaintance – as the kind of knowledge we have of a thing by its presentation to the senses (the percepts) – and states that the significance of concepts consists always in their relation to perceptual particulars. What matters is the fullness of reality – the existential particulars – which we become aware of only in the perceptual flux. Conceptual knowledge can extend this knowledge but is inadequate to the fullness of the reality to be known. It is needed only in order to manage information in a more economical way but it remains superficial by its abstractness and discreteness (1976 [1912] : 245).

Skillful listening embraces both processing strategies. It embraces perceptual immediacy by stressing the idiosyncrasies of the sonorous unfolding and conceptual abstraction by applying discrete labels to slices of the sounding flux. As such, it calls forth the bottom-up/top-down dichotomy with sensory information being presented to the senses (bottom-up) and the mind applying discrete labels to chunks of information (top-down). The bottom-up processing provides the raw perceptual material, relying on continuous sensory stimulation; the top-down processing reduces the fullness of the sounding flux to conceptual categories or single cues, which have the advantage of speed of processing. As such they are important cognitive tools that transcend perceptual bonding and allow autonomous processing without peripheral connection to the senses (Langacker 1987).

These three dichotomies are merely starting points. As theoretical constructions they do not grasp the whole complexity of listening as a real-time experience, but they stress the broadening of reductionist approaches by bringing together seemingly opposed points of view. What they have in common is the role of musical sense-making with a shift from ontological (what is music?) to epistemological questions (what is music cognition and how can it be acquired?), with the construction of meaning out of the perceptual flux as the major claim. This involves the semiotization of the sonic world with listeners not being considered passive recipients but agents that try to build up semiotic linkages with the world (Reybrouck 2001a, 2005). What matters is not merely ways of objectifying the sonorous articulation by providing means for portraying the continuous acoustic signal, but also the perceptual and cognitive processes of the perceiver, i.e. the way human listeners structure the acoustic flow.

As such, it is possible to conceive of music either in terms of sensory realia or their symbolic counterparts. Critical in this distinction is the distance the listener takes with respect to the actual unfolding of the music. From a processual point of view, it is arguable to take a “realist” position as a starting point. Making sense of music, however, must go beyond a mere acoustical description of the sound. What matters is not merely the objective description of the continuous flow of matter in the physical world but also the perceptual and cognitive processes of the listener who structures the acoustic flow. This is the basic tension between the bottom-up and top-down approach : do listeners process all sensory information that is presented to their senses in a continuous way or do they rely on cognitive mediation, with the mind applying discrete labels to this unfolding in order to schematize the perceptual experience in a more economical way (Reybrouck 2005)?

Our cognition, further, is not merely reducible to naive realism but has the mark of our cognizing with our minds. It means that knowledge is constructed as the result of an ongoing interpretation that emerges from our capacities of understanding – this is cognitive realism – that are rooted in the structures of our biological embodiment but which are lived and experienced within a domain of consensual action and cultural history (Varela, Thompson & Rosch 1991 : 150). Experience, in that view, is not only related to the richness of perception but is characterized in terms of the human beings who are doing the cognizing. It is a basic claim of cognitive semantics (Jackendoff 1987; Johnson 1987; Lakoff 1987), which claims a priority over real semantics in stating that we cannot take for granted the “real world” as the domain of entities to which language refers. Rather, the information that is conveyed must be about the construal of the external world, which is the result of an interaction between external input and the means available to internally represent it (Jackendoff 1987 : 83). Applied to music this means that we should consider the sonic environment in terms of the listener doing the cognizing.

This brings us to the enactive or experiential approach to cognition as an epistemological position that focuses on the realization of systemic cognition in the context of a living system’s interactions with the environment (Varela et al. 1991). Cognition, according to Varela et al., “is not the representation of a pregiven world by a pregiven mind but is rather the enactment of a world and a mind on the basis of a history of the variety of actions that a being in the world performs” (1991 : 9). Crucial in this approach is the grounding of cognitive activity in the embodiment of the actor and the specific context of activity. As such, it is related to the embodied approach to cognition (Johnson 1987) with embodied action being dependent upon the kinds of experience that come from having a body with various sensorimotor capacities which are embedded in a more encompassing biological, psychological, and cultural context (Varela et al. 1991 : 173). Embodied cognition, therefore, is a typical example of “non-objectivist” semantics that accounts for what meaning is to human beings, rather than trying to replace it by reference to an account of a reality which is external to the human experience (Lakoff 1987 : 120). It has received growing attention in music scholarship and is likely to foster a lot of challenging research (Godøy 2006; Krueger 2011; Leman 2007; Schiavio, Menin and Matyja 2014; Reybrouck 2005).

Analysis and Beyond : Dynamics of Representation

The processual approach to musical sense-making holds a dynamic tension between the perception of now-moments and the grasping of a more synoptic overall structure with a lot of subjectivity in the scope of the perceptual focus of each individual listener. The question can be raised, therefore, whether it is possible to provide an operational description of the listener’s attentional strategies. The latter, in fact, are mostly not gratuitous but are ecologically and psychologically constrained. It is up to the listener, however, to comply with these constraints or to go beyond their limitations.

A first group of constraints embraces the psychological operations of grouping and segmenting, which can be at the level of conscious and deliberate control but which can occur at lower levels of psychophysical processing as well. The latter are well known and can be summarized as principles of perceptual organization (Deutsch 1999; Bregman 1993) with a major distinction between first order grouping of perceptual elements at a local scale and higher order grouping as in musical phrasing and segmenting (Deliège 1987; Clarke and Krumhansl 1990). This field of study has proven to be fruitful in showing a lot of overlap between theoretical claims and empirical findings, many of them from the domain of Gestalt psychology. From a semiotic point of view, however, it is possible to generalize still further and to conceive of basic thetic (grouping) and lytic (segmenting) operations, relying not only on the acoustic features of the sounds but on structural features that remain invariant under transformation. As such, they are related to the elementary logico-mathematical operations (addition, subtraction, multiplication, division, and looking for equality or difference), as described already by Piaget (1967 : 15, 25). Taken as a whole, they challenge the traditional concept of analysis – which stresses only the lytic part of the operations – by arguing for a broader set of mental computations that may be used to make sense of the music as structure.

What matters in this computational approach is the distinction between focal attention and synoptic overview, which are both related to the scope of predication. Music has temporal as well as atemporal aspects of organization, which may be processed either as a succession of now moments or in a simultaneous way, somewhat analogous to the distinction Langacker has drawn between summary and sequential scanning :

Summary scanning is basically additive, and the processing of conceptual components proceeds roughly in parallel. All the facets of the complex scene are simultaneously available, and through their coactivation [...]. Sequential scanning [...] involves the successive transformations of one configuration into another.

1987 : 248

Both modes of processing illustrate the conceptual flexibility to experience a complex scene successively with the passage of processing time or to activate the component states simultaneously and to superimpose them so as to form a single gestalt. They can be described also in terms of processual predication and episodic nominalization : processual predications follow the temporal evolution of a situation and represent different phases of the process as occupying a continuous series of points in conceived time; episodic nominalizations refer to just a single instance of the process which can be characterized as a bounded region in some domain (Langacker 1987 : 191, 244).

This brings us to what Godøy has called the dynamics of representation. In describing the transformation from a flux to some kind of object he considers the possibility of

... thinking a musical object in different temporal representations, from “real time” versions to extremely compressed, i.e. “instantaneous” or “synoptic” kinds of representations, which have also been called “outside time” representations of musical objects.

Godøy 1997 : 11

Each velocity of representation can provide a different kind of perspective and a different kind of knowledge of the musical substance. The synoptic representations represent high-speed or broad-band types of representations, showing larger temporal unfoldings at a glance, but lacking sensory resolution. They gain, however, in abstract and conceptual autonomy. The sequential representations represent slower frame-by-frame overviews with, of course, higher sensory resolution (Godøy 1997 : 66). Listeners, therefore, can focus attention on individual sounds, but also on groups of sounds, and even on larger spans of time that may extend over several minutes or longer. As such, there are two major mechanisms of attentional strategies : the temporal extension or the scope of representation and the fine-grainedness or resolution of the distinctive elements (Godøy 1997; Reybrouck 2004). The difference in scope of representation has implications for the actual way of listening, with a major distinction between focal versus synoptic listening, somewhat related to the distinction Kramer (1988) has drawn between the linear or active and non-linear or still-spectator mode of listening. The latter considers the listener as a still spectator while the music is moving; the former conceives of the listener as the mover and the music as a static structure. Both modes of listening have been coined also as “in time” and “out-of-time” representations (Xenakis 1992) with the epistemic interactions with the sounds relying on presentation to the senses (in time) or on representations in a kind of symbolic space (out-of-time).

These different modes of representation clearly suggest a lot of freedom with respect to the focus of attention. Much depends here on the listener’s perceptual learning histories and attentional strategies, which may be deliberate and consciously mediated. It is the listener, in fact, who selects at will and focuses attention on things and events which he or she considers to be meaningful. This means that perception is not totally constrained and that there is a lot of epistemic autonomy in the way the listener builds up semantic relations with the sonic world, which can be determined empirically for each individual listener. There are, however, constraints which considerably reduce this autonomy, such as the limitations of psychophysics and psychoacoustics for lower level perceptual processing (fusion of spectral components such as harmonicity and synchronous onset) (Bregman 1990; Huron 1993), the Gestalt principles of perception for the delimitation of larger structural units (Deutsch 1982; McAdams 1984; Bregman 1981 and Reybrouck 1997 for an overview) and the constraints by ecological perception (Clarke 2005; Gaver 1993a, 1993b; Reybrouck 2005, 2012; Windsor 2004). The latter, in fact, entail a whole machinery of “semanticity” and “semiotization” of the surrounding world, which reduces its complexity to major categories. Or put in other terms : what we are listening to are not sounding things, but things as signs which shape our world. As such, the search for information is a major claim of ecological listening. It means that observers do not perceive the environment in terms of phenomenological descriptions – as in purely acoustical or auditory listening – but in terms of ecological events (Balzano 1986; Handel 1989; Lombardo 1987), which can be defined in an operational way as sequences of stimuli which are extended in time and which can be described in terms of structural (e.g. the sound of a clarinet) or transformational (e.g. the way the sound of the clarinet is articulated over time) invariants. These events can be defined in an intuitive way as “things that happen”, involving “changes in objects or collections of objects” (Michaels & Carello 1981) with their invariants acting as a kind of glue that “unitizes” sequences of stimulus information into coherent events (Bartlett 1984). They make it possible to describe events either at a glance or in their temporal unfolding, providing both a discrete and a continuous description of invariant patterns over time. As such, they behave as basic building blocks which function as units in perception and memory. This is in a nutshell the event perception hypothesis (Gibson 1966, 1979; Bransford and McCarrell 1977) which states that there is no clear dividing line between the traditional domains of perception and memory, and that the units of memory or perception can be greatly extended in time. Events, in this view, are the appropriate units of analysis, whether they are fast – as in perception – or slow – as in memory (Bartlett 1984).

From Experience to Sense-Making

According to the ecological approach to perception the way observers make sense out of the perceptual flux is not gratuitous but ecologically constrained. Starting from Haeckel’s definition of ecology as “the science of the relations between the organisms and the environmental outer world” (1988 [1866] : 286), it can be argued that sense-making is related to how organisms interact with their environment. The idea has been elaborated in depth by Gibson (1966, 1979, 1982) who provided a wealth of conceptual tools for giving an operational description of perceptual sense-making. In what he coins as direct perception (see Michaels & Carello 1981), he conceives of perception as occurring immediately without the mind intervening in this process, relying on direct contact with the sensory stimuli, and with reactions being elicited in a kind of lock-and-key approach. Information, in this view, is processed in an “all-or-none way” as a “discrete” reaction to stimuli which are continuous. Direct perception thus involves presentation to the senses (presentational immediacy) and direct reactivity to the solicitations of the environment, stressing the role of information pickup rather than information processing. The speed of processing provokes a quick response but at the cost of the richness of the sensory experience.

The concept of direct perception, however, is a somewhat ill-defined category (Reybrouck 2005). It calls forth direct reactivity to the environment, but is dependent on processes of learning and development as well. As Gibson himself has stated, perceivers “search out” information which then becomes “obtained” information. They pick up information which is already part of the environment and which affords perceptual significance for the organism. In order to do so, they must lean on “perceptual systems” which are tuned to the information that is considered to be useful (Gibson 1966 : 47).

The lock-and-key approach, further, might suggest a kind of causality between stimulus and reaction, though the concept of direct perception does not claim any linearity in the stimulus-reaction chain. It can be argued, therefore, that one should go beyond the level of psychophysical and ecological constraints and distinguish several levels of processing which are rooted in our biological functioning (Reybrouck 2001a). At the lowest level, there is mere reactivity to the sounds without any cognitive mediation by the mind. This is the case in lower animals but also in modes of listening that process the sounds at very low levels of processing (reflexes, brain stem). There is, in fact, a closed system of causal chains with wired-in and closed programs of behavior that trigger reactions in a quasi-automatic way with specific stimuli eliciting specific reactions. As soon, however, as environmental stimuli become more challenging, this closed system must open up, “giving way increasingly to choice responses, to modifiability and plasticity of behavior, and to an increasing trend toward learning through individual experience” (Werner & Kaplan 1963 : 12-13). The stimulus-reaction chain, then, goes beyond causality by introducing intermediate variables between stimulus and reaction (Paillard 1994; Reybrouck 2001b).

This cognitive mediation or cognitive penetration – to borrow Pylyshyn’s (1985) term – allows listeners to deal with music also at a higher level of processing. There are, however, several possibilities for such mediation, depending on whether the listener uses a top-down or bottom-up approach : the former relies on pre-existing cognitive schemata that interfere with the perceptual input; the latter relies on the building up of knowledge-as-acquainted as the result of interactions with the sounds.

As such, this calls forth the early claims of biosemiotics, which provides important insights about signification processes which are typical for living organisms in general. As an area of knowledge which describes the biological bases of the interaction between an organism and its environment (Hoffmeyer 1997, 1998; Sebeok and Umiker-Sebeok 1992), it focuses on the study of the behavior of living systems in their interaction with the environment. Music, in this view, can be considered as a challenging “environment” and the listener as an “organism” that must adapt itself in order to cope with this environment (Reybrouck 2001a, 2005).

Crucial in this approach is the role of circularity, as advocated already in the early writings of von Uexküll and Piaget. von Uexküll introduced the concepts of functional tone and functional cycle (1957 [1934]), which both stress the importance of functional and semantic relations that biological organisms establish with their environment by perceiving the world through a network of functional relationships. This network of relations constitutes the organism’s own phenomenal world, which can be considered as the sum total of perceptual cues among the stimuli in the environment. They act as trigger mechanisms – perceptual cue bearers – that select a number of objects, which then receive a special relevance to act as functional cue bearers. Both perceptual and functional cue bearers are related to each other in a circular way with interactions that consist principally of “perception” and “operation”, harpooning, so to say, neutral objects from the environment as meaning-carriers by a perceiving organ in order to be modified by an effector organ (as meaning-utilizer). The actions that are elicited thus change the functional meaning of what is perceived, which means that the functional qualities affect the perceptual ones – hence the concept of functional cycle – by transforming the object of perception by giving it a functional tone.

Our relation to the world, in this view, is not merely representational, but functional, which means that the number of objects which an animal can distinguish in its own world equals the number of functions it can carry out (von Uexküll 1957 [1934] : 49). The objects, which an organism or animal confronts, therefore, are not neutral objects but objects that are transformed into meaning-carriers so that a situation as it is perceived leads to an activity that is evaluated in terms of its beneficial or expected results. What matters, in this view, are not merely the actions proper but their results. The “circularity” of stimulus and reaction, therefore, is a central topic in the epistemic interactions with the world.

Piaget’s claims about reflexive action (1967) should be mentioned here as well. Reflexive action, as he conceives of it, essentially consists of three parts : a pattern of sensory signals; an activity which is triggered by the particular pattern of sensory signals; and the experience of some change which is registered as the consequence of this activity and which turns out to be beneficial for the actor. The parts, taken together, build up an action schema which increases the internal organization of the organism, allowing it to act in the face of perturbation. As such, it supersedes the traditional concept of the reflex arc – as a linear stimulus-reaction chain – in favor of a basic principle of sensorimotor learning that goes beyond pure reactivity.

The concepts of circularity and reflexivity have proven to be fruitful. They have been elaborated in the perception-action cycle (Cutsuridis, Hussain & Taylor 2011) that conceives of the mind as a central processing mechanism that co-ordinates sensory input with motor output (input-output mappings) and which is of primary importance in sensory-motor learning. It is an important conceptual tool, in particular for continuous perception that proceeds in real time such as the musical experience. The latter is time-consuming with mechanisms of sense-making that keep step with the actual unfolding in time, relying on continuous interactions with the sounds, either at the actual level of real sounding music or at the virtual level of imagery and representation.

Experience and Computation : The Concept of Symbolic Play

Nature and life are continuous in their appearance. They are not segmented but come to us in ranges, shades and gliding scales. In order to make sense, however, there is need of discontinuity and differentiation. It is the human mind that introduces moments of semanticity, as a primary mode of consciousness, to allow a transition from “qualifying” to “generalizing” in order to make distinctions and observables. This semiotization of the surrounding world entails a transition from an analog-continuous to a digital-discrete way of perceiving. The former is more suitable for exploring and perceiving as it is more sensitive by working beyond the limitations of fixed thresholds for distinctions; the latter is more suitable for labeling and measuring, by constraining the real world from a relatively large or continuous set of values to a relatively small set of discrete and quantized values, which have the advantage of distinctness and communicability. They allow observers to share an experience without actually living it and illustrate dramatically the economy of abstraction as against the subtlety of experience. Or to state it in another way : they highlight the difference between an analog image system and a languagelike or propositional system (Watkins and Dyson 1985 : 72). In passing from the sensory to the cognitive representation, there is, in fact, a systematic stripping away of components of information which reduces the experience of the phenomenally rich thing to only one or some of its components (Dretske 1985). This is a digitalization or conceptualization with a piece of information being taken from a richer matrix of information in the sensory-analog representation and featured to the exclusion of all else.
Both processes focus on generic features that group together the maximum of information with the least cognitive effort by considering as equivalent a number of things that can be distinguished from each other but which can be subsumed under the same conceptual category. As such, they neglect the idiosyncrasies in order to allow discrimination at a more abstract level of similarity and to “recognize” things rather than to “experience” them.
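
The transition from an analog-continuous to a digital-discrete representation can be illustrated with a minimal sketch. It is not part of the original argument: the linear loudness contour, the five dynamic labels and the function names are all assumptions chosen only for illustration. The sketch shows how many distinguishable values are subsumed under a few categories, at the cost of the differences between them.

```python
# Assumed discrete categories (hypothetical; any small label set would do).
LABELS = ["pp", "p", "mf", "f", "ff"]


def contour(t: float) -> float:
    """Hypothetical continuous loudness contour on [0, 1]: a linear crescendo."""
    return t


def label(value: float) -> str:
    """Assign a continuous value in [0, 1] to one of a few discrete categories."""
    index = min(int(value * len(LABELS)), len(LABELS) - 1)
    return LABELS[index]


# Analog-like description: 101 finely graded, mutually distinguishable values.
samples = [contour(i / 100) for i in range(101)]

# Digital-discrete description: the same moments reduced to five categories.
labels = [label(v) for v in samples]
```

Here `samples` retains the fine gradations of the contour, whereas `labels` keeps only category membership, which is what makes the description compact and communicable.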

Musical “sense-making”, accordingly, may proceed at higher levels of abstraction than the sensory “experience” of the sounds. At a cognitive-conceptual level listeners do not process the concrete-sounding sonorous events as physical data but as data that are disengaged from their existential dependency on the particular thing they are referring to. Such a way of processing involves a discretization of the sonorous flux, allowing a computational approach that deals with symbols that function as mental replicas of the sounds and which can be considered as musical denotata – to denote all the sounding material that can acquire some conceptual quality (Reybrouck 1999). Conceptualization, in fact, holds a symbolic approach to cognition. As a means for conceiving of something that is not physically present, it relies on signs in the scholastic conception of reference “aliquid stat pro aliquo”. By keeping distance from the perceptual flux it provides a representational mode that reflects the influence of human linguistic capacity on music cognition, allowing listeners to “share” experiences. Musical denotata, in this view, are considered in terms of recognizability, prototypicality and communicability. They stress the shared experience rather than its perceptual qualities and can be considered as structural units that are describable in a formal way.

One of the tasks of future research, therefore, will be to develop fully a formal, artificial, explicit language which can take into account all the units one can find in music and their combinations. This is an analytical methodology that reduces structural units to a purely formal level, stressing the more essential parts and eliminating nonessential aspects. The way of doing this is using signs and symbols instead of real things, representing objects at a reduced level of cues, in the sense that the sign will not call forth all the responses that the object itself will call forth. This is the price we pay for the transposability of the sign system that is used instead of the less transposable original. The advantages, on the other hand, are numerous. Symbolization, in fact, is a means for conceiving of things or events that are not physically present. The symbolic or semiotic function, therefore, takes an important place in developmental psychology, and is at the core of human representation and communication of knowledge. Representation, however, distances itself from reality, in the sense that to focus on something, one sometimes has to move back. This is the main idea of Cassirer’s conception of “concept” : one has to remove the presence (Präsenz) in order to come to representation (Repräsentation) (1954).

Musical denotata thus imply a generalized reflection of sonic reality, which can be formalized as assigning attributes to the sensory material. This can be done in a logical sense as a proposition with the sounding material as a subject and the attributes as a predicate. The attributes, however, should not refer to something external to the music, but to some generality to be assigned to the entity which is denoted. As such, the act of denotation is a primary modus of conscious experience, and a primitive marking system for singling out the noteworthy. As an act of mental pointing it begins with the emergence of a kind of quality in combination with an insistent particularity (e.g. “this is important”, “that is difficult”) (Whitehead 1968). This fusion of a large generality with an insistent particularity is related to the distinction between direct or immediate knowledge, which relies on particularity, and symbolic knowledge, which relies on generality (Whitehead 1927 : 13). It calls forth the role of presentational immediacy which is important in setting out the borders between perception and consciousness, with the latter involving complex mediated processes that supplant the immediacy of natural perception (Vygotsky 1978).

This holds true also for dealing with music where the listener is the critical factor in the delimitation of the denotata. He or she must take a symbolic stance to the sounding stimuli, which means that something is selected as a subject of focal attention and that it is assigned some semantic weight, both relying on the innate dispositional machinery to cope with sounds and on learned and acquired cognitive constructs. It is the listener, finally, who decides which distinctions will be made in order to enhance the grip on the observables by choosing, selecting and delimiting some of them and raising them to the status of things which can be denoted deliberately (Reybrouck 2004).

The attended elements, further, can be focal points or temporal zones with a certain extension in time, somewhat analogous to the distinction between snapshots of a movement and continuous gestures that make up this movement. The latter involve the consummation of the sounding flux by keeping track of the music as it unfolds over time. This sound tracking has a temporal extension and is perceptually bound, which means that it is dependent upon what is presented to the senses. The snapshots, on the other hand, involve a level of abstraction. In freezing a continuous perceptual image at a particular focal point in time, they summarize and collect a lot of information that can be related to one thing-as-signified and that can be labeled also at a discrete-symbolic level. As such, they present a heuristic guide for sense-making which allows the listener to single out focal points of attention and to conceive of these perceptual elements in terms of salience, value, valence and semantic weight, somewhat related to the mechanism of cue abstraction (Deliège 2001) which focuses attention on salient elements that are prominent at the musical surface and summarize the sequences from which they arise. As such they provide key structures that play a foreground role in the musical work and help to grasp its design. Two questions can be raised here : how does a listener delimit these denotable things as signified? and what are the relations between these entities? There is, in fact, a distinction between a mere collection of selected elements and their putting together in a more encompassing structure.

This brings us to the concept of mental computations and symbolic play. Rather than relying on the “online” perceptual mode of sound perception which is dependent upon presentational immediacy, the symbolic mode takes distance from the perceptual flux, by relying on an “off-line” representational mode that is to be considered as a mode that proceeds outside of time (Bickerton 2009). The latter supposes the ability to operate on abstract mental representations when being detached from the immediate environment, and allowing the thinker to elaborate on these representations in a kind of virtual symbolic space, where all elements can be interrelated infinitely with the imagination providing the connecting structure.

Applied to music, this should mean that we conceive of listening in computational terms (Mazzola 2002), allowing us to lean upon the conceptual framework and tools of mathematics, not in terms of tunings and temperaments – with mathematical models of musical scales – or working with note values (adding, ratios, fractions), but in terms of mathematical activities such as counting, measuring, classifying, comparing, matching, ordering, grouping, patterning, sorting and labeling, inferring, modeling and symbolic representation. What is meant is an approach to mathematics which stresses the mathematical experience and the cognitive approach to mathematics rather than conceiving of it in terms of ciphering and arithmetic. Translated to the domain of music, this should mean that we can conceive of musical “objects” and “processes” in terms of formal and syntactic operations which take place at the level of imagery. It brings us to the concept of thinking as a kind of mental arithmetic or computation, in its broad definition of embracing the whole range of mental operations that can be performed on symbolic representations of the sounds, and which finds its philosophical roots in the writings of Hobbes, who claimed that reasoning is nothing more than reckoning. He took the calculating activity itself as his model of the mechanisms of the mental operations and conceived of thought as symbolic computation, as a kind of rule-governed manipulation of symbols inside the head.

Conclusions and Perspectives : Music Shaped in Time

The symbolic approach to music cognition has many advantages. It reduces temporal unfoldings to single representations with an all-or-none character which lend themselves to symbolic computations which can be carried out on them. Music as a sounding art, however, is continuous in its unfolding. Music, therefore, can be dealt with in a mixed analog-discrete approach with two representational modes which are complementary rather than opposed. Dealing with music, in this view, is dependent upon the continuous sonorous display, which proceeds in linear time, as well as on its symbolic and discrete counterparts. The latter can be conceived outside of time, with the symbols being stored at an abstract level of imagery in a representational format. As such they can receive a discrete and static character. To the extent, however, that these symbolic counterparts are perceived also in real time, it is possible to adjust their semantic weight in a moment-to-moment description, allowing a description of each of them as functions of time. They then receive an analog/continuous signature and it is up to the listener to update and adjust continuously the information that is provided to the senses.

Listeners, in this view, are the final arbiters as to what is attended to. They may go beyond their dispositional biases and perceptual constraints, challenging the primitive concept of reactivity. What really matters are the continuous manifest and/or epistemic interactions between the listener and the music. Dealing with music, in fact, relies on sensorimotor and computational activity, with the combination of both modalities making the process of dealing with music a richer experience that allows the listener to process music in a perceptual and conceptual way. It does justice to both the subtleties of the sonorous articulation and the more abstract and internal dialogues that allow the listener to simulate the actual unfolding through time. Rather than relying merely on symbolic representations as perceptual sensations in the absence of corresponding sensory input, it may be argued that musical sense-making should be co-perceptual as well, which means that the conceptual processing is added to the actual experience over time.

As such, it is important to stress the real-time listening experience. Making sense of music, in fact, involves an act of imagination that grasps the sonorous unfolding as a processual figure that unfolds through time. What is meant is merely a path of becoming, a kind of continuous transformation that is not restricted to a single state (Reybrouck 2001b), or put in other terms : music is shaped in time with the process of musical sense-making moving back and forth between an analog and discrete approach.

Translated to an actual real-time experience this should mean that the listener can make distinctions in the sounding flux, which is continuous. This discretization of a continuous phenomenon can be so fine-grained that it even reflects the idiosyncrasies of the particular experience. It is possible, however, to go beyond the particularities of concrete experiences as well and to generalize from mere particulars to broader and more encompassing categories. As such, there is need of a combined analog/discrete approach that describes the musical experience as a time-consuming experience with processes of sense-making that conceive of musical elements as functions of time. Much is to be expected here from the dynamic systems approach, as a rather young field of research that describes behavior that unfolds in real time, with the nervous system, the body and the environment continuously evolving and simultaneously influencing one another. To quote Port & van Gelder :

The cognitive system does not interact with the body and the external world by means of periodic symbolic inputs and outputs; rather, inner and outer processes are coupled, so that both sets of processes are continually influencing each other.

(Port & van Gelder 1995 : 13)

The question can be raised, therefore, whether a more “dynamic” definition of musical sense-making should be conceivable that does justice to a conception of musical events as higher-order variables that can be defined as functions of time (Reybrouck 2004, 2015). Such a definition argues for a broadening of the scope of the concept of symbol from a discrete to a continuous kind of representation that combines a discrete/symbolic with an analog/continuous approach.