Research Ethics in Internet-Enabled Research:
Human Subjects Issues and Methodological Myopia

Joseph B. Walther
Department of Communication
Cornell University
336 Kennedy Hall
Ithaca, NY 14853-4203 USA


The widespread use of the Internet provides new vantage points from which to observe conventional behavior, views of new kinds of behavior, and new tools with which to observe it all. Accompanying these opportunities come two specific concerns about research approaches: how new research methods using the Internet may or may not affect the ethical protections to which human subjects are entitled, and the validity of data collected using the Internet. In some cases, these issues converge: Presuming that research must hold promise of advancing knowledge in order to justify any intrusion on human subjects, Dr. Jeffrey Cohen of the U.S. National Institutes of Health’s (NIH) former Office for Protection from Research Risks is quoted as saying, "Research that is invalid has no benefit…(a)nd if there’s no benefit at all, any inconvenience to subjects isn’t worth it" (Azar, 2000, p. 51).

The issues, and debates over them, are beginning to take public form. They occupy sessions in many fields and association meetings (e.g. The Association of Internet Researchers), and are starting to appear in some journals (e.g. The Information Society, among others). Perhaps the most prominent statement addressing these issues is a report (Frankel & Siang, 1999) of a workshop sponsored by the National Institutes of Health (NIH) and the American Association for the Advancement of Science (AAAS) that was convened to articulate these very issues and to provide tentative recommendations to Institutional Review Boards (IRBs), the committees at universities and research organizations in the United States that must consider approval and oversee research protocols. This report has been disseminated via the World Wide Web, and cited since its publication in outlets ranging from disciplinary journals and magazines, to US government publications such as the National Bioethics Advisory Board (2001) report that was sent to IRB directors (and addressed to the President of the United States). The recommendations of the NIH/AAAS report request that IRBs and researchers consider a number of complex issues pertaining to the observation and reporting of online behavior. These issues fall under concerns over protecting privacy, mitigating harm, and validating data collected from subjects, using the Internet.

IRBs in the United States must take their guidance from the Code of Federal Regulations Title 45, Part 46, Protection of Human Subjects (hereafter, CFR), which in turn was created in the spirit of the Belmont Report (National Commission for the Protection of Human Subjects of Biomedical and Behavioral Research, 1979), a document outlining that research on humans must take care to respect autonomy (free will), beneficence (minimizing harm and preserving privacy), and justice toward human subjects. The Belmont Report and the CFR both recognize limitations and offer rationales for exceptions from their general requirements (e.g., when it is understandable and allowable not to seek informed consent from subjects prior to data collection). Although both documents were created prior to the advent of the Internet, they both offer general principles and practices, rather than instructions about specific media (i.e., no distinctions are made between face-to-face versus telephone-based research; although archival research is treated differently from interactive research, no distinction between paper or other archives is made).

There are numerous problems, however, with the recommendations developed in the workshop report (hereafter, the Report) as a whole. That is, while each specific observation has some merit in one specific research context or another, the problem is that the admonitions are not confined by context. Rather, the Report tends to characterize "Internet research" in a more or less monolithic way, as though the issues it considers pertain to most kinds of research conducted online. The recommendations, however, do not pertain to all types of research. In some cases, the warnings in the Report pertain to research that does not, literally (by NIH definition), involve human subjects. The ultimate effect is a report that is unfortunately tangled, undifferentiated, and methodologically myopic.

For IRBs, these problems with the Report may put a great chilling effect on the review and approval of research that involves the Internet in any form. Taking the report seriously may lead an IRB to require assurances from investigators that are impertinent, irrelevant, impossible, and unwieldy, depending on the nature and methodology of the specific study being proposed. Conscientious IRBs that attend to these suggestions, yet which may not themselves have the expertise to sort out the applicability of the report’s admonitions, will find themselves bogged down in unclear recommendations; and conscientious investigators may be required to take steps and give assurances that may have nothing to do with their methods. The following discussion identifies some problems with the Report, and attempts to offer some realistic parameters with regard to its concerns. Additionally, some historical and comparative methodological paradigms are mentioned, in an effort to demonstrate that some issues raised by the Report with respect to the Internet are, in actuality, rather old topics that are commonplace in research using alternative media. While they should not be forgotten, they are not as dramatically new or uniquely problematic in the Internet context relative to other contexts for research.

It is important to remember that the Belmont Report and Federal codes guiding IRBs are designed to protect human subjects. Ironically, some of the research methods that the AAAS Report identified as presenting problems actually do not constitute research on human subjects. There are several issues here, some of which are reflected in the Report, but with little resolution: Whether the analysis of posted messages constitutes an intervention of any kind; whether posted messages constitute public or private behavior; and whether the specific research methods provide or occlude the subjects’ identity. As background to these issues, a review of the systems and the Panel’s arguments follows.

Many messages that Internet users have posted for the readership and response of specific online groups remain available for retrieval by others, including researchers, for a long time afterward. These may be postings in virtual communities, i.e., on archived, Internet-based discussions (such as Usenet News or its archives, stored electronic distribution lists, bulletin boards, or asynchronous discussion archives that accrue within other venues such as real-time chats and similar systems). Such message archives are stored on one or many computers, and are accessible through the Internet. The Report points out that when participants post messages to these systems, they often do not expect public inspection of their comments, and do not post them with any nascent permission for them to be reproduced or analyzed. They have, it is argued, an expectation of privacy, and the right to be protected from harm to reputation that may arise by being quoted in whole or in part in the public dissemination by researchers of what they expected to have been private behavior. Moreover, even if a researcher publishes the text that a subject posted, without the subject’s name or even with a pseudonym, the dynamic searchability of Internet-accessible archives creates the possibility that a third-hand reader of the research can trace the comments back to their original writer. The Report makes frequent mention of the notion of "virtual communities," and suggests that message posting on Internet venues is experienced by users as an activity intended to be confined to a specific audience.

Human Subjects Research and Archival Research

To its credit, the Report acknowledges that the CFR offers careful definitions of what qualifies as human subjects research. Human subjects research is that in which there is any intervention or interaction with another person for the purpose of gathering information, or when information is recorded by the researcher in such a way that a person can be identified directly or indirectly with it. Even asking someone a question may constitute interaction, and when such interaction with a subject takes place an investigator may be required to seek exemption, if not full review approval, from an IRB. This is important, since it classifies what appear to be benign opinion surveys and other simple requests for honest, unmanipulated responses as subject to consideration of the need for further protections. However, the research use of spontaneous conversations, if gathered in a publicly accessible venue, is not human subjects research by this definition. An offline parallel might be the analysis of public meetings, although in that setting there is common knowledge that records may be made. Some argue that a better analogy might be the recording of conversations in a public park (see Jacobson, 1999a); people do not expect to be recorded or observed although they understand that the potential to do so exists. Behavior in public settings is in fact not protected from recording for research; the only communication outside one’s personal space that is protected by the CFR due to an expectation of privacy is that which occurs within very restricted contexts such as one’s physician’s, therapist’s, or attorney’s office.

With regard to identification between the data and its human source, if the researcher does not link the records to the subject, the work again fails to qualify as human subjects research. It is also very clearly the case that if the research involves the collection or analysis of existing data (documents or records), and these sources are publicly available, the research qualifies for human subjects exemption (CFR Subpart A, §46.101 [b] [4]).

It can fairly be argued, then, that since the analysis of Internet archives does not constitute an interaction with a human subject, and since it avails itself of existing records, it may be no different for IRB purposes than research using old newspaper stories, broadcasts, the Congressional Record, or other archival data.

This issue is extremely paradoxical: Whereas many participants in online venues do not expect that their remarks are likely to be read by others outside the virtual community, they take umbrage at the thought that their words might appear in a research publication (see Bakardjieva & Feenberg, 2001; King, 1996; cf. McArthur, 2001). The aforementioned arguments are not intended to suggest that these sentiments should not be taken seriously. However, it is important to recognize that any person who uses publicly-available communication systems on the Internet must be aware that these systems are, at their foundation and by definition, mechanisms for the storage, transmission, and retrieval of comments. While some participants have an expectation of privacy, it is extremely misplaced. More fruitful efforts might be made in educating the public about the vulnerability of Internet postings to scrutiny–an inherent aspect of many Internet venues–than by debating whether or not such scrutiny should be sanctioned in research. As far as an IRB is concerned, while there may be abstract ethical issues to consider, a definitional issue seems pretty clear: The analysis of Internet archives is not human subjects research, if a researcher does not record the identity of the message poster, and if the researcher can legally and easily access such archives. No further review or human subject protections are due such research under the CFR. Although the Report recognizes this position, it reports it as one of several, leaving IRBs in limbo.

Another aspect of law bears on the reproduction of another person’s texts. As Jacobson (1999) very thoroughly explored and argued, the creation and recording of any written message–regardless of the medium by which it is recorded–is, by definition, protected by US copyright law. However, Jacobson also notes that research is allowed to make "fair use" of copyrighted material (subject to restrictions of length of excerpt and proportion of the original work), thus overcoming in many cases the otherwise restricting effects of copyright as far as the present purposes go.

While the Report raises a number of these questions, it does little to answer them. Based on existing guidelines, for better or worse, it seems fairly clear that the analysis of publicly-available Internet-stored conversations does not constitute human subjects research, and may therefore be exempted by IRBs from human subjects regulation. Researchers must make their own individual ethical decisions with regard to activities such as quoting or reflecting names or pseudonyms in their ultimate publications, and should indeed do so mindful of some of the points that the Report raises. But users’ sense of fairness and researchers’ personal ethics must not be confused with cross-media institutional guidelines that are reasonable and clear.

Methodological Myopia

Aside from the issue of whether archival research infringes real or expected privacy, there is a much more important issue that the Report obscures, which affects subjects’ identifiability in a much more fundamental way. The issue is whether the very research methods that an investigator employs rely on the identification of subjects as data are collected, analyzed, and/or reported.

The Report superficially acknowledges that there are different types of Internet research that may be done, but does not allocate its numerous concerns according to the types of research or methods that any specific study might employ. It makes mention only of survey research and observational research, although it treats archival research at length without labeling it as such. Instead, the Report refers often to research on virtual communities, and characteristics of virtual community members’ expectations, fluidity of membership, anonymity and pseudonymity of identity, and potential non-representativeness of Internet subjects with respect to the larger population of global individuals. Most of the concerns in the Report seem to pertain to the privacy of and potential harm to participants through their identification directly or indirectly by quoting them. (Other concerns including obtaining informed consent and ensuring debriefing will be treated below.) Although it is not explicit, by foregrounding these particular issues the Report seems to privilege qualitative methods, the reporting of which often involves illustrative quotations as evidence for its claims. In such an approach, concern over the identifiability of tacit participants is not hard to understand. Although its authors probably intended no such partiality, they nevertheless do not identify any limits to the identification problem in "Internet research," suggesting by omission that it is a pan-methodological issue. It is not.

It is extremely important to recognize that there are numerous kinds of research methods amenable to online interactions, many of which could avail themselves of the same kinds of subjects and/or archives as we have been discussing, yet for which neither the identity of the subjects nor the representation of their original messages has any bearing on the kind of research being done or on how it may ultimately be reported. Most quantitative research would be immune from this issue. Survey or questionnaire research that is conducted among online characters, the purpose of which is to identify normative attitudes or practices (e.g., Parks & Floyd, 1996; Parks & Roberts, 1998; Utz, 2000), is a potent example of this. Another example where there is no message content and no names involved would be experimental or quasi-experimental research in which stimuli are presented and data are gathered from Internet-administered forms (e.g. Walther & D’Addario, 2001).

Even more pertinent, and nowhere alluded to in the Report, there are cases where existing messages are analyzed in such ways that the features of messages, not the messages themselves, are analyzed and reported. Content analysis, thematic analysis, or any other examination of the matter of such comments has no reason to relay or report the identities of message posters nor pieces of their messages. For instance, Herring’s (1993) analysis of the masculine/feminine linguistic styles of online characters required that the researcher make judgements about the names of message writers, for classification and analysis purposes. In this kind of research, neither the names, nor the messages, need be reported, in favor of the prevalence of the linguistic characteristics of the messages themselves. Similarly, Sherblom’s (1988) analysis of the variables affecting whether one adds one’s name to the end of organizational email messages required the researcher to see if names were used or not; but it makes no difference to the reader of that research what the names were, or what the subjects said. Sherblom’s later (1990) research on personal pronoun use online, like various studies analyzing the frequency and demographics of emoticon use (e.g. Rezabek & Cochenour, 1998; Witmer & Katzman, 1997; Wolf, 2000), shows interesting applications of conventional content-analytic methods to cyberspace records. These examples depict studies where the methods matched the research questions; the methods were no richer or leaner than the theoretical questions demanded, yet were entirely benign with respect to the identifiability of subjects or their messages. In some forms of ethnographic or anthropological research, or other discourse analytic approaches, names or messages may be useful to report–indeed Jacobson’s (1999b) research about online names and the expectations they create is a good example, and one in which subjects consented throughout the research. In such cases the mixed advice of the Report is worth considering.
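The point about feature-level analysis can be made concrete with a minimal sketch. It is an illustration only, not a procedure taken from any of the studies cited: the emoticon pattern, the field names ("author", "text"), and the function `summarize` are hypothetical, and a real coding scheme would be defined by the study at hand. The sketch tallies emoticons and signed messages across a set of stored postings and retains only the aggregate counts, so that neither names nor message text survive into what is reported.

```python
import re
from collections import Counter

# Hypothetical emoticon pattern, e.g. :-) ;) :( :D :P (an assumption for illustration).
EMOTICON = re.compile(r"[:;]-?[)(DP]")

def summarize(messages):
    """Return aggregate feature counts only; names and message text are not retained."""
    totals = Counter()
    for msg in messages:
        totals["messages"] += 1
        # Count emoticons in the message body.
        totals["emoticons"] += len(EMOTICON.findall(msg["text"]))
        # The name is consulted only to code whether the message is signed.
        totals["signed"] += int(msg["text"].rstrip().endswith(msg["author"]))
    return dict(totals)

# Invented sample records standing in for an archive of postings.
sample = [
    {"author": "pat", "text": "See you at the meeting :-)\npat"},
    {"author": "lee", "text": "Fine by me ;) :)"},
    {"author": "sam", "text": "No comment."},
]
summary = summarize(sample)
print(summary)
```

As in the signature example above, the researcher must see the names in order to code the data, but what leaves the analysis is a set of frequencies from which no individual poster can be identified.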

In other cases, however, the advice is irrelevant and as such it is potentially obstructionist. The Panel’s Report jumps in a fuzzy way and without notice among paradigms with no specification or limitation, and with vague and unjustified suggestions. For instance, the Report recommends that "IRBs should consider having members of the virtual communities studied represented in their deliberations" (p. 17) although it does not say how or why, why a member rather than an administrator might be preferred (as if a single member held representative views of the entire community), how such a representation would be or could be made (via mail? chat? phone?), nor why any greater credence should be afforded to a potentially pseudonymous individual in a role as a representative than as a subject in a study. By offering in such dire ways such cautions and vague suggestions, it is easy to imagine that an IRB could hamper an otherwise clean research effort that used Internet archives, where content characteristics or other message features were the subject of study, where there was no risk at all to message posters’ identity, regardless of the question of whether it was human subjects research or not. This presents an ironic and perilous predicament: While raising cautions about what is ethically required with regard to human subjects’ protection, the Report’s lack of clarity may lead IRBs not to authorize rather morally unambiguous research that has the potential to advance knowledge, constituting malfeasance by the IRB through ethically wrong indecision.

Concerns over Data Validity

The Report reflects the CFR’s stipulation that human subjects research must provide benefit either to participants or to scientific understanding, and it echoes the argument that there can be no benefit unless data are valid and reliable. It also contends that "conducting research on the Internet raises questions about data sampling techniques and the validity and reliability of the data collected" (Frankel & Siang, 1999, p. 3). Along with the Internet’s "access to a potentially wide geographical and diverse population" (p. 4), the Report claims that the Internet-using population distribution is skewed on gender, race, and geographical distribution relative to other samples, such that Internet research incurs "non-representative sampling" leading to "misleading findings, and perhaps misguided policy" (p. 4). Moreover, according to the Report, because it is "quite easy to mislead others about one’s geographical location, gender, or race" (p. 4), false respondents and false responses threaten the validity of conclusions from Internet-gathered data. These concerns focus on two issues: sampling and the veridicality of the participant’s identity.

Sampling

Embedded in these observations is an undifferentiated assumption about research type and sampling strategies. It is almost assuredly the case that Internet users differ in a number of ways from non-users along a variety of potentially important factors such as education and income (although the gaps seem to be narrowing in some ways, there still remains a digital divide; see Pew Internet and American Life, 2001a). However, the concern that samples obtained via the Internet are non-representative of a target population reflects assumptions that (1) random samples of Internet users are sought in any study, and (2) that an Internet sample is not generalized to other populations. The first case is only one among numerous sampling strategies; the second issue seems obvious, yet is not without parallel in other research contexts.

There is, to the dismay of some, no way to use the Internet to collect a random sample of the global population, much less the population of Internet users, and researchers know this. While it is not uncommon to post solicitations for research participation to well-trafficked websites (e.g. ABCNews.com, 1999b) or email lists, it is impossible to send an Internet-wide solicitation (although see Loundy, 1995, regarding the case of the Green Card Lawyers). Thus any solicitation must be conveyed to some specific population or populations. This does not suggest that the population’s characteristics are always known or knowable, but in some cases they can be estimated. Researchers can choose carefully where to recruit subjects; they can stratify samples across similar or different topical discussions in order to gather a more or less heterogeneous sample, as needed. It is unfortunate that some researchers extrapolate their findings to the entire Internet-using population (e.g., Greenfield concluded from the results of self-selected respondents to an Internet addiction questionnaire, that 6% of the Internet-using world, or 11 million individuals, are addicted; ABCNews.com, 1999a). But this is scientifically no better or worse than generalizing to the world on the basis of any other convenience sample (such as a study involving 40 college sophomores in a psychology course). Are such data valid or invalid? It depends entirely on the extent to which the researcher designs and describes the sampling strategy, and qualifies the degree of generalization the sample provides, so that readers of the published results can judge for themselves whether such research reflects a widespread dynamic or a first step in the research process. These qualifications do not in and of themselves impugn the Internet as a data-gathering device or the respondents to Internet methodologies as any more or less useful than any other sample.

The issue is even less pertinent in cases where the Internet is used to record responses after respondents have been solicited or selected through alternative means, using a "closed sampling" technique. The Internet can provide a method to reach and record responses from samples defined through some offline, existing organization or entity (e.g. all employees at XYZ Corporation; a stratified random sample of members of the DoReMi Division of the 123 professional association). Shoppers at a mall can be screened and qualified, offered an incentive, and asked to respond to a research web page either at the mall or at their convenience at home. Or, as the Pew Internet and American Life project has done, individuals may be called by phone using a random-digit dialing sampling frame, and screened for Internet use and other characteristics. Their responses may be gathered over the phone or online (see e.g. Pew Internet and American Life, 2002). This is research using the Internet and/or about the Internet, too, but the sampling technique has nothing to do with the Internet.

Finally, sometimes discrete sub-populations of Internet users, while tenuously generalizable to other populations, are very good samples. A Usenet group focused on a particular hobby, for instance, may be a very useful bank of respondents in which to qualitatively or quantitatively gather information about a commercial service related to the electronic provision of that hobby, such as an e-commerce site; indeed, Armstrong and Hagel (1996) suggest that savvy businesses take advantage of these self-created and knowledgeable groups, not so much in order to exploit them, but in order to maximize the business’s ability to provide for these clientele, and for clientele to get what they want. Likewise, a questionnaire posted to the Amazon.com site certainly does not describe all the nation’s or the world’s potential book buyers. But it may generalize to a good number of Internet users who, for one reason or another, surf the Amazon.com site. These individuals’ reactions to site layout, the prominence of book cover graphics, search methods, etc., may be of tremendous value to Amazon.com, an e-commerce researcher, or visual communication theorists. This kind of sampling strategy seems sensible, and shows that it may sometimes be better to have a well-defined, electronic sample than an electronic random sample, or an offline sample at all. While the Report does not impugn this context, it simply raises a flag that the Internet population is not representative–although it does not say to what or whom it is not representative–and leaves it at that.

The points here are, first, that it is almost inconceivable that a researcher would wish to survey the whole world, or even the whole Internet, even if s/he could. Second, researchers who use the Internet, just as researchers using any other venue, can and should describe their sampling strategies and parameters–not generalizing with certainty beyond the boundaries of the sample–so that others can interpret their findings in this light. Third, the Internet can be used to collect data from known, defined offline samples. These points should offer some stark limitations on the applicability of the Report’s concerns over the ethical merits and potential value of Internet research. The sampling issues challenge the Report’s concern over invalid design and research results. The previous discussion of qualitative versus quantitative data also restricts the Report’s concerns: Research that actively seeks subjects’ responses and records them in numeric form, or the quantitative analysis of stored messages, are common online research techniques that present no more threat to subjects’ privacy or identity than any equivalent offline methodology.

Identity Deception

Another issue raised in the Report is that various degrees of anonymity afforded by Internet discussion venues either provide or promote the misrepresentation of identity with respect to gender, age, geographic location, and with regard to the one-to-one correspondence between an online persona and an offline individual (so that a single physical person may be more than one virtual subject). These instances no doubt do occur, and instances in which they have taken place in non-research contexts constitute some of the most sensationalistic stories about the Internet (see e.g. Van Gelder, 1985). Researchers who have raised these concerns argue that in non-Internet research, it is easy for a researcher to determine someone’s age and gender through the physical appearance characteristics (that are apparent, one must presume, in face-to-face data collection settings), and thereby no such data-threatening misrepresentations are likely. However, two related issues should be taken into consideration when considering these possibilities: (1) that the degree to which these misrepresentations take place across Internet research contexts is (a) an empirical phenomenon as yet little explored, (b) probably highly inflated in public perception; (c) questionably linked to the motive to present dishonest responses to research questions; and (2) that it implies an ahistorical and naïve view of alternative research methods which have dealt with the same problems for many years.

Deception. Despite the fact that one can misrepresent oneself online, it is useful to ask why someone would misrepresent himself or herself online, in order to consider how widespread the phenomenon is and whether it would take place in research settings. It seems an eminently do-able research question, to find out when and what proportion of Internet users misrepresent themselves and on what characteristics. However, very little empirical research has addressed these questions. What little systematic research exists shows that (a) in online courtship, men may shave a little off their actual ages and send an online paramour an older photo (with a better hairline; Levine, 2000); (b) most Internet-using teens report that they have been approached by another user in real-time chats presenting him- or herself with a false identity (although how they know this is not indicated in the research; Pew Internet & American Life, 2001b); and (c) in online role-playing games (where users adopt certain characters in order to spontaneously collaborate in fictive interactions) some users have tried changing gender on occasion, in most cases as the need for gender-defined characters occurs in a game context (e.g. someone needs to be Captain Kirk in order for the Star Trek role playing game to continue, no matter what the biological sex of the next typist on; see Roberts & Parks, 1999). Furthermore, there is no research to date demonstrating that Internet users are any more likely to gender switch than people are in other contexts (such as the incidence of occasional or frequent cross-dressing in the larger population). Few have suggested any compelling reason why the Internet should cause individuals to gender deceive (see for exception Flanagin, Tiyaamornwong, O’Connor, & Seibold, 2002), although some philosophical texts speculate on the therapeutic aspects of doing so in ongoing real-time chat spaces (e.g. Turkle, 1995). No one has even come close to suggesting what would be served by gender-deceiving in research settings, particularly questionnaire research, or, even if a respondent did so, whether the façade would bias reports of the typist’s attitudes or behaviors.

More Methodological Myopia. The Report contends that because of the uncertainty of respondents’ identity in Internet research, data may not be valid, and if not, research should not be conducted. It is well worth considering that historically there are other venues for research that offer every bit as much opportunity for deception: mail surveys, telephone interviews, questionnaires passed out in large classrooms, and other approaches. It is sobering to think that an investigator preparing to send a thousand surveys by surface mail may be asked by an IRB how he will demonstrate the true gender (or race or location) of his respondents; or how a researcher preparing to use a random-digit dialing sample protocol may be challenged that her research cannot produce benefit because not everyone has a telephone or because she cannot guarantee that a 17-year-old absolutely will not pretend to be 18. It is easy to imagine cases in which a questionnaire with explicit instructions to be completed by the "male head of the household" is in fact completed by the spouse. While these anomalies are unfortunate, and are to be avoided, they are not terminal threats to a large study, in which a certain degree of error is to be expected, analyzed, and reported. It is unfortunate that the Report has failed to consider these non-Internet methodologies, both in terms of the similar problems they present, and in terms of the history of research about research on how to deal with them.

Consent and Debriefing

Another set of problems identified by the Report as pertaining to research via the Internet has to do with the administration of pre-observational informed consent procedures, and post-observational debriefings. The consent issue itself is two-pronged. The simplest aspect has to do with the logistical difficulty or impossibility of obtaining a signed informed consent document from a subject, on paper, which is the traditional manner in face-to-face experimental settings. Informed consent documents generally ask the subject to attest that he or she has been informed about the nature of the study, his or her participation in it, the potential risks or lack thereof, that s/he is free to discontinue participation at any time with no penalty, and other standard warnings as apply in the respective instance.

However, the Report identifies secondary problems that may impact the informed consent process, resting again on the uncertainties of knowing who respondents really are online, as compared (only) to face-to-face research: The "ease of anonymity and pseudonymity of Internet communications also poses logistical difficulties for implementing the informed consent process" (Frankel & Siang, 1999, p. 8). And reflecting the CFR’s special requirements that informed consent cannot be given by children or by those with diminished mental capacity, the Report argues that since Internet interaction does not involve physical co-presence (as does face-to-face research), a researcher recruiting subjects via the Internet cannot know the age or mental competency of prospective respondents. "Minors," it observes, "could respond to a study involving inappropriate materials for their age" (p. 8). Without certain knowledge about age and competence, and without the "face-to-face dialogue that can help to ascertain whether the subject adequately comprehends," researchers cannot be sure if subjects truly understand the risks and procedures to which they are explicitly or tacitly agreeing.

Consent Processes

That the Report presents these concerns as inherent to the Internet and unique to the Internet is, once again, troubling, and taken at face value could place an undue or impossible burden on researchers using the Internet in their research. The Report fails to consider several issues about informed consent and the parallel problems in other research settings. First, in addition to leaving hanging the argument that non-experimental observation and analysis of archived messages are not human subjects research, it fails to acknowledge that many kinds of human subjects social research that do involve some kind of interaction or intervention may also be exempt from IRB concern (that is, may apply for and be granted exemption from further review and oversight) due to the lack of harm the research presents (aside from the respondent’s time and effort). These categories of research involve no risk, and include surveys of a non-sensitive nature, research where there is no activity atypical of normal day-to-day behavior, the administration of standard psychological tests, tests of individual or group behavior in a non-manipulated way, and where research does not involve the collection of identity in association with response data. Many data collection efforts involving the Internet may fall under these classifications. Some that do not may be allowed to use an "implied consent" procedure.

Implied consent is invoked when, prior to the collection of research responses online, the prospective participant is presented informed consent information in electronic, written form. By agreeing to continue in the study (usually by clicking an "accept" button on a web page), and providing data, the subject’s consent is reasonably inferred (see King, 1996). This approach seems to have become an acceptable substitute in many cases, meeting the functional requirements of informed consent through logistics adapted to the Internet. The Report, however, asks whether such an approach can be valid without certain knowledge "of the age, competency, or comprehension of the subject" (Frankel & Siang, 1999, p. 10).
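
As a rough illustration of the implied consent logic just described, the sketch below stores questionnaire responses only after the participant has been shown the consent text and has clicked "accept." The class and field names (ConsentGate, consent_text_version, and so on) are hypothetical and are not drawn from King (1996) or from the Report; an actual study would follow whatever procedure its IRB approves.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, Optional


@dataclass
class ConsentGate:
    """Hypothetical gate: no responses are stored until consent is implied."""
    consent_text_version: str
    shown_at: Optional[datetime] = None
    accepted_at: Optional[datetime] = None
    responses: Dict[str, str] = field(default_factory=dict)

    def show_consent(self) -> None:
        # The consent information is presented in written, electronic form.
        self.shown_at = datetime.now(timezone.utc)

    def click_accept(self) -> None:
        # An "accept" click only counts if the consent text was presented first.
        if self.shown_at is None:
            raise RuntimeError("consent text has not been shown")
        self.accepted_at = datetime.now(timezone.utc)

    def record(self, item: str, answer: str) -> None:
        # Data are collected only after the participant has agreed to continue.
        if self.accepted_at is None:
            raise PermissionError("no implied consent on record")
        self.responses[item] = answer
```

Keeping the version of the consent text and the two timestamps alongside the responses is one possible way to document what the participant saw and when; it does nothing, of course, to verify age or comprehension, which is the Report's separate concern.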

Where questions of legal age may arise, two responses pertain. First, there is nothing new about them. While the Report acknowledges that telephone surveys may rely on verbal consent, it does not consider that mail surveys and methods using other traditional media face the same issues with respect to the uncertain knowledge about whether people really are who they say they are. The question "are you at least 18 years of age?" is a stock screener in many face-to-face research studies, as well as telephone surveys. Does the face-to-face encounter provide that much better a forum than any other medium? One would think not, if we can look for example to contemporary practices in some retail stores, where anyone who appears younger than 40 years old will be asked to provide proof of age before being sold tobacco products. Apparently the eyeballing method has not been a satisfactory method of discerning 16- and 17-year-olds from 18-, 25-, and 30-year-olds in the face-to-face business. Secondly, for a scrupulous researcher to expose subjects to material that could be inappropriate for children, in the context of research via the Internet, seems very remote. It is difficult to examine this further given the hypothetical and decontextualized nature of the admonition that is so frequent in the Report. Exposure to or inquiry about adult-oriented topics is an uncommon research activity and the steps required to exchange such information with subjects face-to-face would also be extensive. Most IRBs would pay careful attention to such cases without need for additional alarms and warnings.

As far as knowing whether the subject really understands the information presented as part of the informed consent process, the Report privileges face-to-face communication in an unreasonable way. Its contention assumes that in a face-to-face setting, visual cues provide a researcher with adequate information to judge whether a subject can or did understand. Can, in assuming that mental incompetence is visually obvious; did, in assuming that something along the lines of raised eyebrows, blank stares, or stupid looks are always produced when a receiver does not understand something exactly the way the sender intended.

It is already a requirement of the CFR that informed consent material must be written in simple and understandable language, and need not be read aloud. IRBs in my experience have always felt free to request simplification in the prose of informed consent forms, and with grade-level writing analysis built into some contemporary word processors, it is a simple matter to inspect one’s prose for simplicity. In all, there is no reason to place greater burdens on researchers using the Internet unless some aspect of their study–other than that it happens to be using the Internet–calls for it. Highly risky Internet research should probably avoid strictly online sampling. But to require all Internet researchers either to acquire signed informed consent, or to be asked to demonstrate the real ages and competence levels of subjects, especially in a minimal-to-zero harm project, in ways that are not required of alternative methodologies, seems unreasonable.
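
To make the grade-level check mentioned above concrete, the sketch below estimates a reading grade for a passage of consent text. It assumes the Flesch-Kincaid grade level formula, which is one widely used measure; the text does not say which measure any particular word processor applies. The syllable counter is a crude vowel-group heuristic introduced here for illustration, so the result should be read as an approximation.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption, not exact): count vowel groups."""
    lowered = word.lower()
    count = len(re.findall(r"[aeiouy]+", lowered))
    # Treat a trailing silent "e" as non-syllabic, except in "-le" endings.
    if lowered.endswith("e") and not lowered.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text: str) -> float:
    """Approximate U.S. school grade level needed to read the text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)


if __name__ == "__main__":
    draft = "You may stop at any time. We will not ask your name."
    print(round(flesch_kincaid_grade(draft), 1))
```

A researcher might run each paragraph of a draft consent form through such a function and revise any passage that scores well above the reading level expected of the intended respondents.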

Debriefing Processes

Debriefing is explaining to subjects the true nature of a research study, after their behavior is observed, if the nature of the research was withheld from them prior to their participation (i.e. withheld from informed consent instructions). This process is usually carried out when the research involves some experimentation, manipulation of the subjects, or deception of some kind. The concern of the Report is that membership in virtual communities is "very fluid" (p. 4), and that participants may not happen to come back to an online venue by the time a researcher makes available some kind of debriefing. Acknowledging that this issue pertains to a limited, experimental context (and not the passive observation of virtual communities), Nosek, Banaji, and Greenwald (2002) offer several suggestions for researchers doing online research who believe that a debriefing is needed:

    1. Participants could be required to enter an e-mail address at the beginning of a study. Debriefing statements could later be e-mailed to the participant.
    2. A "leave the study" button, made available on every study (web) page, would allow participants to leave the study and still direct them to a debriefing page.
    3. The program driving the experiment could automatically present a debriefing page if the participant prematurely closes the browser window. (p. 163)
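
The three suggestions above can be sketched as a small model of a participant's session. This is only an illustration under stated assumptions: the names StudySession and pending_email_debriefings are hypothetical, and the model sets aside the practical question of whether a given browser will reliably display a page when its window is closed.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class StudySession:
    """Hypothetical model of one participant's session in an online experiment."""
    email: Optional[str] = None  # Suggestion 1: collected at the start of the study
    debriefed: bool = False
    pages_shown: List[str] = field(default_factory=list)

    def _show_debriefing(self) -> None:
        self.pages_shown.append("debriefing")
        self.debriefed = True

    def finish(self) -> None:
        # Normal completion ends with the debriefing page.
        self._show_debriefing()

    def leave_study(self) -> None:
        # Suggestion 2: the "leave the study" button still routes to debriefing.
        self._show_debriefing()

    def window_closed(self) -> None:
        # Suggestion 3: a premature close triggers the debriefing page once.
        if not self.debriefed:
            self._show_debriefing()


def pending_email_debriefings(sessions: List[StudySession]) -> List[str]:
    """Suggestion 1: addresses of participants not yet debriefed on screen."""
    return [s.email for s in sessions if s.email and not s.debriefed]
```

In this sketch the e-mail route serves as a fallback for anyone who never reached a debriefing page, which is one way of combining the suggestions rather than something the cited authors prescribe.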

We should not lose sight, however, of the fact that this problem is, again, nothing new. Participants in any research, including face-to-face experiments, are told that they are free to leave at any time without penalty. They need not stay for debriefing. While most do, it is not clear how many pay attention (especially in the case of college students, who in some cases would just as soon be awarded their five dollars or extra credit, and go off to do whatever else they would rather do, than to listen carefully to more lab talk). Researchers must find clever ways to hold subjects’ attention, online or off, and that is quite a challenge. With no existing empirical evidence that such efforts to debrief subjects are any less effective online than they are offline, no further burdens of proof seem appropriate.

Caveats and Conclusions

While this article has raised a number of criticisms of the Frankel and Siang (1999) AAAS/NIH Report, there are some points in the Report worth praising. Those most especially useful occur when the Report considers some truly unique aspect of Internet communication, and the implication of that aspect for human subjects protections. For instance, in conducting e-mail surveys over sensitive topics, the Report urges concern over privacy when people share their computers. Indeed, the "out box" of a respondent’s e-mail software may keep a copy of the questionnaire response. If a household or office shares computers, these traces may be uncovered by other users. E-mail respondents should be advised of this by researchers and instructed that, if they do not know how to remove such traces or are unwilling to bother taking such steps, they should not participate. Using web forms may mitigate this somewhat, but in this case respondents must completely close their web browsers, not just click away to some other page, and researchers should advise as much. Other specific and pragmatic suggestions should be cultivated from the Report, so that researchers have problem-solving techniques at their disposal, and IRBs can make better decisions without the paralysis of analysis currently confronting them.

The preceding discussion takes an admittedly limited perspective: How research that involves the Internet as a tool of its protocol or as its focus may be evaluated for adequate protection of human subjects. It also takes a limited ethical code as its reference point: The U.S. Code of Federal Regulations pertaining to this issue. While these limitations may be relatively restrictive in light of the larger dialogue about ethics and research, they are nevertheless vast and influential in their own domain. IRBs at numerous research universities and institutions draw on these procedures and guidelines many times a year in adjudicating whether specific research protocols satisfy widely-held and formally-encoded standards. To the extent that the Frankel and Siang (1999) Report may influence such boards, this essay is intended to counteract some of the extreme and decontextualized barriers that the Report might otherwise encourage IRBs to raise against otherwise good research designs with adequate protections for subjects. Relatedly, these ideas may facilitate discussion about problematic IRB proposals that miss some aspects of human subjects protections, but do so as a result of specific and repairable design problems, rather than because they involve the Internet as such. This discussion is intended to mediate the unfortunate conundrum identified by Bakardjieva and Feenberg: "The early online data rush which treated every content found on the Net as open to downloading, analyzing and quoting has been countered by an ethical perfectionism leaving almost no space for research on virtual forums" (2000, p. 233).

At the same time, the approach offered here avoids some issues entirely, and conscientious researchers will want to take note of concerns expressed by other authors. For instance, just because the analysis of stored comments of electronic conversations may not constitute human subjects research the way it has been defined here, does not mean that people will not have strong feelings about the matter; nor does it imply that there are no concerns to be had about the larger scale questions of data-mining into people’s past conversations, online buying habits, or other electronically-traced commercial behavior (see Nissenbaum, 1998). These issues do warrant discussion and consideration. Whether they warrant the suspension of scientifically designed and theoretically-motivated research is another question. By analogy, in my home state of New York, one may now register with a state database in order not to receive commercial telephone solicitations; violators may be fined. The self-imposed exile does not protect one from calls from polling organizations or academic research, however. While a dinnertime phone call might be annoying no matter who makes it, the public appears to differentiate between research activity and commercial activity, and affords some latitude to those whose motives appear to be beneficent. This should not be latitude to violate identity or cause harm, but in terms of gathering information, some purposes appear to be more appreciated than others. Whether the CFR is up to the challenges of the Internet is a completely different question than whether the CFR as it currently exists proscribes as much of Internet research as the Report would suggest. These existing rules, however, considered in the context of the specific research method proposed by a researcher, viewed historically with respect to alternative methods, compared to alternative media, and applied judiciously, may well suffice in most cases.

References

ABCNews.com (1999a). Can’t Resist the Online Pull: 6 Percent of Net Users Addicted. Retrieved Sept. 1, 1999, from http://abcnews.go.com/

ABCNews.com (1999b). Internet Addiction Survey. Retrieved Sept. 1, 1999, from http://abcnews.go.com/

Arthur Armstrong and John Hagel. The Real Value of On-Line Communities. Harvard Business Review, 74(3): 134-141, 1996.

Beth Azar. Online Experiments: Ethically Fair or Foul? Monitor on Psychology, 31(4): 50-51, 2000.

Maria Bakardjieva and Andrew Feenberg. Involving the Virtual Subject. Ethics and Information Technology, 2: 233-240, 2001.

Code of Federal Regulations Title 45, Department of Health and Human Services, National Institutes of Health, Office For Protection From Research Risks, Part 46, Protection Of Human Subjects. Nov. 13, 2001. Retrieved May 19, 2002 from http://ohrp.osophs.dhhs.gov/humansubjects/guidance/45cfr46.htm

Andrew J. Flanagin, Vanessa Tiyaamornwong, Joan C. O’Connor, and David R. Seibold. Computer-Mediated Group Work: The Interaction of Member Sex and Anonymity. Communication Research, 29: 66-93, 2002.

Mark S. Frankel and Sanyin Siang. Ethical and Legal Aspects of Human Subjects Research on the Internet: A Report of a Workshop June 10-11, 1999. Nov., 1999. Retrieved May 15, 2002 from http://www.aaas.org/spp/dspp/sfrl/projects/intres/report.pdf

Susan C. Herring. Gender and Democracy in Computer-Mediated Communication. Electronic Journal of Communication, 3 (2), 1993. Retrieved October 19, 1999, from http://www.cios.org/getfile/HERRING_V3N293

David Jacobson. Doing Research in Cyberspace. Field Methods, 11: 127-145, 1999a.

David Jacobson. Impression Formation in Cyberspace: Online Expectations and Offline Experiences in Text-Based Virtual Communities. Journal of Computer-Mediated Communication, 5 (1): 1999b. Retrieved May 18, 2000, from http://www.ascusc.org/jcmc/vol5/issue1/jacobson.html

Storm King. Researching Internet Communities: Proposed Ethical Guidelines for the Reporting of Results. The Information Society, 12: 119-128, 1996.

Deb Levine. Virtual Attraction: What Rocks Your Boat. CyberPsychology & Behavior, 3: 565-573, 2000.

David Loundy. Lawyers' Electronic Ads Leave Bad Taste. Chicago Daily Law Bulletin, Mar. 9, 1995: 6; Rpt. retrieved May 12, 2002 from http://www.loundy.com/CDLB/Spam.html

Robert L. McArthur. Reasonable Expectations of Privacy. Ethics and Information Technology, 3: 123-128, 2001.

National Bioethics Advisory Commission. Ethical and Policy Issues in Research Involving Human Participants. Volume 1: Report and Recommendations of the National Bioethics Advisory Commission. National Bioethics Advisory Commission, Bethesda MD, 2001.

National Commission for the Protection of Human Subjects of Biomedical and Behavioral Research. The Belmont report: Ethical principles and guidelines for the protection of human subjects of research, 1979. Retrieved May 19, 2002 from http://ohrp.osophs.dhhs.gov/humansubjects/guidance/belmont.htm

Helen Nissenbaum. Protecting Privacy in an Information Age: The Problem of Privacy in Public. Law and Philosophy, 17: 559-596, 1998.

Brian A. Nosek, Mahzarin R. Banaji, and Anthony G. Greenwald. E-Research: Ethics, Security, Design, and Control in Psychological Research on the Internet. Journal of Social Issues, 58(1): 161-176, 2002.

Malcolm R. Parks and Kory Floyd. Making Friends in Cyberspace. Journal of Communication, 46: 80-97, 1996.

Malcolm R. Parks and Lynne D. Roberts. "Making MOOsic": The Development of Personal Relationships On Line and a Comparison to their Off-line Counterparts. Journal of Social and Personal Relationships, 15: 517-537, 1998.

Pew Internet & American Life. (2001a). More Online, Doing More: 16 Million Newcomers Gain Internet Access in the Last Half of 2000 as Women, Minorities, and Families with Modest Incomes Continue to Surge Online. Retrieved Sept. 1, 2001, from http://www.pewinternet.org/reports/

Pew Internet & American Life. (2001b). Teenage Life Online: The Rise of the Instant-Message Generation and the Internet’s Impact on Friendships and Family Relationships. Retrieved Sept. 1, 2001, from http://www.pewinternet.org/reports/

Pew Internet & American Life (2002). Methodology. In Vital decisions: How Internet Users Decide What Information to Trust When They or Their Loved Ones Are Sick. Retrieved May 29, 2002, from http://www.pewinternet.org/reports/

Landra L. Rezabek and John J. Cochenour. Visual Cues in Computer-Mediated Communication: Supplementing Text with Emoticons. Journal of Visual Literacy, 18: 210-215, 1998.

Lynne D. Roberts and Malcolm R. Parks. The Social Geography of Gender-Switching in Virtual Environments on the Internet. Information, Communication, and Society, 2: 521-540, 1999.

John Sherblom. Direction, Function, and Signature in Electronic Mail. The Journal of Business Communication, 25: 39-54, 1988.

John C. Sherblom. Organizational Involvement Expressed through Pronoun Use in Computer Mediated Communication. Communication Research Reports, 7(1): 45-50, 1990.

Sherry Turkle. Life on the Screen: Identity in the Age of the Internet. Simon & Schuster, New York, 1995.

Sonja Utz. Social Information Processing in MUDs: The Development of Friendships in Virtual Worlds. Journal of Online Behavior, 1(1): 2000. Retrieved April 7, 2000, from http://www.behavior.net/

Lindsy Van Gelder. The Strange Case of the Electronic Lover. Ms. Magazine, 14(4): 94, 99, 101-104, 117, 123, 124; Oct. 1985. Rpt. in Charles Dunlop and Rob Kling, editors, Computerization and Controversy: Value Conflicts and Social Choices, pages 364-375. Academic Press, Boston, 1991.

Joseph B. Walther and Kyle P. D’Addario. The Impacts of Emoticons on Message Interpretation in Computer-Mediated Communication. Social Science Computer Review, 19: 323-345, 2001.

Diane F. Witmer and Sandra Lee Katzman. On-Line Smiles: Does Gender Make a Difference in the Use of Graphic Accents? Journal of Computer-Mediated Communication, 2(4): 1997. Retrieved May 23, 2000, from http://www.ascusc.org/jcmc/vol2/issue4/witmer1.html

Alecia Wolf. Emotional Expression Online: Gender Differences in Emoticon Use. CyberPsychology and Behavior, 3: 827-833, 2000.

