
Approaching Assessment: Perspectives

November 21-22, 2003

Moderator: Ted Marchese, Senior Consultant, Academic Search Consultation Services; former Vice President of the American Association for Higher Education and former Executive Editor of Change magazine.

Panelists: Mary Brabeck, Dean, The Steinhardt School of Education, New York University; Margaret “Peg” Miller, Professor, Curry School of Education, University of Virginia, and President Emerita of the American Association for Higher Education; and Sean Fanelli, President, Nassau Community College.

Both the Democrats and the Republicans have bills out right now for incentive programs for minority-serving institutions. But when you get deeper into them you start to see some of the differences, which is what Jane was identifying earlier: how they are reintroducing some of the old proprietary and for-profit schools into the system. But the Democrats aren't holding us completely harmless. They are not being as liberal as one would assume, and for a while we were wondering where they were. The higher education community had to defend itself against the Republicans, and so it wasn't until October that the Democrats emerged. So we're in the middle of a kind of match, though not strictly liberal versus conservative. We're just at a point where we've got both ends of the spectrum filling in, and we're going to end somewhere in the middle, hopefully where we get more funding into the system and fewer rules.

Marchese: Let me tell you about our panelists. Peg Miller is the former Vice President for Academic Affairs at the State Council of Higher Education for Virginia. She was the President of the American Association for Higher Education in the late 1990s, and is now a faculty member at the University of Virginia. Peg is associated with Patrick Callan's National Center for Public Policy and Higher Education in San Jose, California, and is director of a project called Measuring Up, run by the National Forum on College-Level Learning.

Sean Fanelli is President of Nassau Community College, the largest community college in the State University of New York system. Before that he was Academic Vice President at Westchester Community College. He is a biologist by training. Sean is also president of the National Institute for Leadership Development, which identifies and prepares women for leadership positions in higher education. And he is the recipient of the Alexander Meiklejohn Award from the AAUP for his defense of academic freedom, as well as the William Brandon Award for his defense of free speech.

Mary Brabeck is Dean of New York University’s Steinhardt School of Education. She was trained as a psychologist at the University of Minnesota, and her field is professional ethics. She spent 23 years at Boston College, the last seven of those as dean. She has been paying attention to what has been going on with testing, among other things, throughout the whole of higher education, so we are going to hear from her about that.

Fanelli: You have probably heard the adage that we assess what we value, but I think what we are hearing out of Washington is that we are going to have to value what they assess, and that's a concern of mine. Part of the Washington perspective is that somehow or other we are not holding ourselves accountable to our stakeholders, and we are not doing anything in the way of assessment. So I am going to show what one institution is doing with respect to assessment, and I think we're representative of every institution in this room. Since you assess what you value, you have to start with your mission statement. At Nassau Community College, we began with a mission review, which, to its credit, the State University of New York asked us to undertake. As a result of that review process, we came up with what is called an institutional report card. The report card compares us to peer community colleges across the nation that, like us, enroll 20,000 or more students, as well as to local institutions from the metropolitan area down through southern New Jersey and up into northern New York State. Finally, we compared ourselves to sister institutions in the State University of New York.

We then shared this report with our stakeholders. We placed it in every local library, and sent it to every local legislator, state legislator, and our congressional delegation to let them know what Nassau is all about. In it, we looked at first-time full-time students—3,500 students in the fall and perhaps another 1,500 in the spring. We compared ourselves in a number of different ways, gave ourselves a report card, and asked ourselves where we needed to improve. Here’s the report card.

We actually gave ourselves mixed rankings. Many of them, I'm proud to say, are A's, but we even gave ourselves some D's in certain areas, and that helped us in a lot of different ways, because we have many different activities going on across campus, and one of the ways you assess what you are doing is with respect to planning. If planning and assessment are not connected, then something is wrong. And so we have a strategic planning committee, and every three years we come up with what we call strategic themes. We go through these themes and assess how well we are doing against the themes we have established for ourselves. Beyond that, we established a process for academic program review. We require every department in the college to go through an academic program review that, at its conclusion, goes to our assessment committee. We have a committee that assesses these, and this is just one program review document that must be completed and then given to an external set of reviewers.

We also do assessment with respect to costs. Every department must assess each of its courses according to a matrix, and departments have to go through this matrix periodically and submit it to the college's assessment committee. Beyond that, we look carefully at our assessment processes themselves. We recently went through our Middle States accreditation/reaccreditation process, and in that effort you do a lot of soul searching. We look at ourselves very carefully in that review process. I was pleased to note the work of the Middle States in changing its standards. The first set of standards deals with institutional assessment and the second set with student learning outcomes assessment. I think that's appropriate. Accountability is more than dollars and cents. It is a fine evaluation document, and I wish that some other entities would use it, and accept the findings in the Middle States reports we've furnished as their method of assessing us. But that has yet to come.

I say that because one of the things that occurred in SUNY several years ago was the imposition, and I use that word advisedly, of a knowledge and skills general education requirement. There were 12 areas: 10 knowledge areas and two skills, critical thinking and information management. We had to look at every course we offer and see if it met one of these knowledge-based or skill-based requirements. Once that was done, a report was sent to a SUNY committee for review. The committee began to expect that we would send periodic reports on how we were assessing courses in general education. This led to what is known as the General Education Assessment Review (GEAR) process.

So we go through that, independently of our own assessment. We said to SUNY, "Why can't we use what we've been doing if it's such a model for others?" The answer was no. So now here's another layer of assessment placed on us. And in regard to student development outcomes, we go through it again: the same kinds of assessment that we do with our courses, programs, student activities, and student personnel services. We also participate in what SUNY calls the Student Opinion Survey, a random survey of students about how they feel they are being educated at Nassau Community College. That is sent to SUNY, and SUNY publishes a report, which is sent to each individual campus, showing comparative data without naming the campuses.

We have recently been informed that SUNY will require a value-added assessment, so now we will be required to do pre-tests and post-tests. Now, I think we have an excellent college. Some 70 percent of our students go on to earn baccalaureate degrees. But what really troubled almost everyone was that no thought was given to what faculty thought should be assessed. It's assessment for the sake of assessment. Two governance bodies in the state university rejected it: the faculty council, which represents the community colleges, and the faculty senate, which represents the four-year colleges. The presidents' association of the community colleges has written a letter to the chancellor saying no, as have individual governance groups throughout the state. Not so much no to assessment itself, but no to the process, and no to being told that we have to do something when there hasn't been sufficient thought about what we're going to do.

Brabeck: I have a short story about what happens when you have externally imposed standards. I was a new dean at Boston College in 1997 when the results of the first teacher test were released in Massachusetts, in what has been called "the flunk heard around the world." Fifty-eight percent of teacher education candidates failed this state test of teacher quality. The speaker of the house referred to the teacher candidates who failed as "idiots." Now, these students had not yet received their own test results at the time they were being called idiots. They didn't know if they were in the idiot category or the not-idiot category.

There were problems with the psychometrics of the test. It was originally to be given as a pilot, because it had not been piloted before. Two weeks before the test was to be administered, the Department of Education changed its stance and said that the test would count toward certification; it was no longer a pilot. There were problems in the administration of the test. Part of it was to test verbal ability, so students were asked to take dictation from the Federalist Papers and correctly spell the entire passage. Now, the Federalist Papers were of course produced in the 1780s, and a different spelling was in use at the time. Public, for example, was spelled p-u-b-l-i-c-k. And there were problems with accommodations provided during the exam. Because this was supposed to be a pilot, no one had taken the necessary steps to accommodate students with special learning needs.

And there is a political consideration: with testing, there are lots of different agendas that the test results will serve, and institutions cannot always control the message. The Chronicle of Higher Education carried a story in which John Silber, president of the other Boston-named institution, expressed himself to be horrified by the results and publicly mocked all of the teachers. I wrote a letter to the editor saying, "Watch out; we all may be lacking in teacher education, but this is the canary in the coal mine, and it's coming your way."

Of course, teacher education is particularly vulnerable in a number of ways. It is still a cash cow in most institutions, and it is not central to the mission of research universities. It is also vulnerable because there is tremendous variability among institutions that prepare teachers, variability in admission requirements, in course content, in quality of the instructors, and in commencement requirements. But when people think about teacher education institutions, they tend to think of the worst cases and the most problematic situations. It’s vulnerable because it’s a profession that has failed to police itself, unlike our English departments and our sociology departments! It is also vulnerable because faculty at schools of education do not welcome assessment that includes standardization.

Professional bodies and organizations have been using research to describe what it takes to produce a high-quality teacher, but the definition of teacher quality is something for us all to take a look at, because there is no consensus about what makes a highly qualified teacher. There are attempts now in Congress to identify what a high-quality teacher is. The good news is that this task of defining a high-quality teacher is being given to the National Research Council, which in my judgment is a great body to wrestle with this complex issue; it understands the strengths and the limitations of assessment. But the whole issue of defining teacher quality is an important one right now, because, of course, we do need high-quality teachers in our classrooms.

When you think of all that contributes to student success, you can think about it as a pie. About seven percent of that pie, or seven percent of the variance, is attributable to small class size, and the rest of the variance splits in half: half is out-of-classroom factors, and the other half is teacher quality. We know teacher quality makes a difference. However, the reauthorization of the Higher Education Act poses a dilemma, because Congress has centered on one definition of a highly qualified teacher: a person who has been able to pass a standardized test of content knowledge. Right now, the most popular test before Congress is one being produced by a group called the American Board for Certification of Teacher Excellence (ABCTE). This group has been given $35 million by the federal government to develop the test, and No Child Left Behind defines a qualified teacher as someone who has a baccalaureate degree, passes this test, and passes a criminal background check.
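In rough numbers, and taking the even split of the remaining variance as the speaker's approximation, the pie works out as:

$$\underbrace{7\%}_{\text{class size}} \;+\; \underbrace{\tfrac{100\% - 7\%}{2} = 46.5\%}_{\text{out-of-classroom factors}} \;+\; \underbrace{\tfrac{100\% - 7\%}{2} = 46.5\%}_{\text{teacher quality}} \;=\; 100\%.$$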

So this is a lesson for higher education: it begins and ends with a test, and we have to watch the political agendas here. The ABCTE test will do a number of things. It will resolve a problem of teacher shortage. It will also create a fast track for new teachers to get into the classroom. It will solve a political problem. However, it will create a series of problems for us, one of which is the question of who is going into teacher preparation. Tests tend to discourage minority candidates in particular from entering a profession, and we're already seeing a drop in the number of minorities entering teaching, which is another thing for all of us in higher education to watch carefully.

Miller: I became involved in assessment when I was the Chief Academic Officer at the State Council of Higher Education for Virginia, and it was a trial by fire. The state legislature had mandated campus-based assessment, which was very enlightened of them. Every campus was to go about assessment in its own way. We had hoped that accountability and improvement, the double agenda of assessment, could both be satisfied by a campus-based process, and for about 10 years it worked. The legislature seemed satisfied that higher education was tending to business, and we felt that those institutions that took it seriously got some real value out of it.

However, for 10 years I read one of these reports from every campus in the state, and at the end of the day I couldn't answer the legislative question, "So how are we doing?" I also couldn't answer the question, "What does it mean to get a baccalaureate degree; what does that signify?" In the early '90s those questions started to surface again. I think they tend to surface when economic times are tough, when dollars are tight and everybody wants to know that they are spending their money wisely. The agreement sort of came unraveled at about that point, and the state started to put standardized testing accountability measures into place.

Fast forward 10 years to 2000 and the press release for the first issue of Measuring Up, the first national report card on higher education. Measuring Up is a very interesting exercise, because what it did was look at the quality of higher education differently from the way it had been looked at before. It took a state-level view and asked, "If you are a student in a given state, what are your chances of getting a good education in that state?" That is really thinking about higher education in the larger context of education, and the whole thing in the larger context of state policy. The report card graded every state on the effectiveness of its preparation of students for college; the participation rate of students in college; how affordable it made college for students, particularly low-income students; the degree to which students in that state completed college; and, finally, the benefits that accrued to the state from having a college-educated population.

There was a sixth category: learning. That's the category in which all the states received an incomplete. It wasn't because nobody had been doing anything. We said over and over that individual states are collecting information about learning, and individual campuses are collecting information about learning, and those that are taking it seriously are improving their practices. But we had no nationally comparable data. One of the things I learned in Virginia was that you make meaning out of documents like that by having something to compare them to. If you have an 80 percent pass rate on a homegrown exam in a given department, you have no idea if that is good news or bad news. I learned that the hard way, because at first I thought comparisons were pernicious, and then I found I couldn't make any sense of the reports without them.

I was asked to figure out what to do about this incomplete. I took the job on for a couple of reasons. One, because I had had this experience of trying to make meaning out of campus-based assessment reports, and I really did think there was a legitimate question behind all the public brouhaha. We really ought to be paying attention to our major results in some systematic way. The other thing I thought was, if we don’t do it somebody will do it to us. The Higher Education reauthorization thing is my worst fear—the idea that somebody who has not thought about this long and hard and who hasn’t learned the lessons that need to be learned from No Child Left Behind will come in and do the same thing all over again.

I spent six months talking to people about this and posing three questions. First, should we be doing this at all? Is it too dangerous to do it, or too dangerous not to do it? Which is worse? Second, if we do it, what kinds of questions are we trying to answer? And third, if we are going to do it, how do we do it? After talking to higher education leaders and a lot of technical testing and measurement people, this whole conversation culminated in a meeting of national CEOs, governors, former governors, and higher education leaders in which we asked the same three questions. The answers were yes, we should be doing it, and it's not too soon to begin.

This was ten years after the National Education Goals had already set an objective for higher education and identified what it is. Do you all know what the objective is? National Goal Six is the only education goal directed at adults. It says, basically, that adults need to know a lot, because we have to be globally competitive and they have to be able to exercise the rights and responsibilities of citizenship. Under that goal is an objective for college education: that by the year 2000, the proportion of college students who are able to think critically, solve problems, and communicate effectively should increase substantially. But then the question became this: Why do we have to have this information? What are the questions behind the question, "Can college graduates solve problems, think critically, and communicate effectively?" There were two things that people at that meeting wanted to know. First, what kind of intellectual capital for a state is embodied in the college-educated residents of that state? And second, how does the collective system of higher education in that state, two-year and four-year, public and private, contribute to that social capital, that educational capital?

Those were the two questions. Then we came up with a model that this group thought useful, called Measuring Up. Any category in which a state gets a grade is made up of a set of indicators. For instance, in the category of preparation, it is information about the college preparatory course patterns of students in the eighth grade and the twelfth grade, their NAEP scores, their SAT scores, and so forth, and we have national data on that. One of the rules of Measuring Up was that we always went to existing data sources, and one of the things we discovered is all the data that doesn't exist. A lot of data by socioeconomic status and race/ethnicity that should be there is not there. With the learning project we realized that we were going to have to break that rule: we were actually going to have to collect data that didn't already exist. So in five pilot states we are collecting statewide averages on graduate admissions and licensing exams. We are collecting information from the national assessment of adult literacy in those states, broken out by educational level. And we have a couple of indirect measures of effective practice within colleges. We also have an alumni survey, which asks college graduates how comfortable they feel with the intellectual skills they have.

Those are the measures we have. Then we decided we needed a test of the general intellectual skills of a representative sample of students in each state, and we picked two different instruments. For the four-year institutions, we are in the process of administering something called the Collegiate Learning Assessment. It's a performance-based test that was developed by faculty in New Jersey years ago and has been further developed by the RAND Corporation. Basically, it presents students with a set of documents in a given area, for instance in the humanities or social sciences or sciences, and asks them to solve a set of problems based on the information they've just received. We are giving it to a random sample of students at a representative sample of institutions in each of these states. The institutions participate voluntarily, and the students participate voluntarily. This is not a test that you could give as a high-stakes involuntary test, because it is designed to be a sample-based test.

The one we are using for the two-year colleges is called WorkKeys, which looks at the kinds of skills that people need in the workforce; thousands of jobs were profiled for it, and it measures students on those scales. The WorkKeys tests we are giving are applied math, reading for information, locating information, and writing. You can see how these line up with the national goals of problem solving, critical thinking, and communication skills.

We are aware that this doesn't begin to touch everything students get out of college, but it is a core set of competency areas in which most everybody agrees we ought to increase student capacity. There are all kinds of issues here, but I think we have a solid beginning for thinking about how to do this. We ought to be able to provide an alternative to the really bad ideas that are out there about how we will account for our primary results in higher education. This is one way to do it. We are going to learn a lot of lessons from this project about how to do it and how not to do it, but I am interested in finding out whether or not we are contributing to our students' intellectual development, and if so, to what degree.

Marchese: Back in 1972, the State of Georgia instituted zero-based budgeting. The image that remains from that is that every program, department, and office in the entire state government of Georgia, including the University of Georgia and all the state colleges, had to imagine that it had never existed before and start from zero to design a program and a budget to go with it. The University of Georgia sent its report to the state capitol in a trailer truck. The second image is that of John Folger, a professor at Vanderbilt who had been in state government and had taken part in the first accountability plan in the State of Tennessee around 1979-1980. After he retired from state government and was working at Vanderbilt, he decided that he would be like Diogenes, who went around looking for an honest man. He decided he would go to Nashville, the state capital, and see if he could find anybody who had either seen or read these reports. All these reports had taken tens of thousands of person-hours to compile, and of course he couldn't find anybody at all. And these are not free goods. Every time we do a bad assessment there are financial costs, costs to morale, and opportunity costs. Is bad assessment driving out good assessment?

Miller: The kind of assessment I am talking about is not meant to replace campus-based assessment. It can't do the kind of fine-grained improvement work that needs to be done program by program. What it can do is give a state a snapshot of where it's doing well and where it isn't, and where it's doing better than expected and where it's doing worse than expected. As a state, you could then home in on the things that need working on.

Marchese: That is an argument then for pushing it back more fully to the campus level?

Miller: I would say let’s not break it out by institution. Let’s break it out by other factors—by areas of the state or populations in the state. Where can we as a higher education community in the state target our efforts?

Marchese: One of the most useless things you can do in assessment is document an outcome when you don't know where it came from, because then you can't fix it. If it is a general statistical phenomenon or outcome for the state, you don't know where the problems are, because what you have in front of you is averages.

Fanelli: Whenever I hear of accountability budgeting, I wonder if they have it right. I've seen examples where, for instance, universities look at graduation rates. They decided that colleges with high graduation rates would be allowed to hire more faculty, while those with lower graduation rates would hire fewer. What worries me is whether this information is being used correctly. Another thing: I've always been troubled that teachers have been given a bad rap for not doing the job, but there's another aspect to this. A family comes before the school. The family as a factor in student performance is frequently left out of the equation. It's all left on the backs of the teachers.

Miller: There’s a distinction that has to be made when you think about how you evaluate a program or an institution or a school or a college. That is the distinction between what the standards are that you are using for evaluation, and what the processes are that you are using for it. Then there’s another consideration, and that is what you are going to do with the results. When it comes to standards, to begin with you have standards that a profession or a community of scholars have agreed upon. It’s the old adage that if you ask the wrong question, the answer doesn’t matter. You have to get the right questions about what standards are for evaluating an institution. And that is difficult, because we’re not all the same institution. Even if you say higher education, institutions can be community colleges, private or public, colleges within larger institutions, or professional schools.

Once you get the standards, then you have to think about the process. The problem with the current process for evaluating teacher education, and not only teacher education but most professions, is that the process for assessing quality is being reduced to a single measure. That is psychometrically flawed. I don't think you can take a complex construct like teacher quality, or quality anything, and reduce it to a single measure.

After standards and process, the third problem is how the results are used. Results were originally intended to be fed back into the curriculum, to inform those responsible for revising it, but they are being used for political purposes. They are being used to judge units, to rank them, and then to shame them and make budget decisions about them. You have to look at what it is you are trying to measure, you have to look at the quality of the tool you are using to measure it, and then you have to look at the purpose the measure serves; and we switch purposes around even while we keep the same instrument.

Marchese: I'm much more concerned than I was a year ago about college costs, not least because the charts show big increases since 1980. What is on the lips of people in Washington is that this past year public tuition went up 14 percent, and private tuition has gone up six percent for the third year in a row. Many presidents of leading institutions are figuring that this is not sustainable. But is the kind of assessment that we are in favor of, and that we claim we are doing, the answer to the right question?

Miller: It isn’t the answer to that question. The kind of dilemma that Sean finds himself in—where he has layer upon layer upon layer of regulations, while no one has actually thought through whether or not they make sense—that’s a real cost question.

Brabeck: Unfortunately good assessment is costly. Assessment is not going to solve the escalating cost of higher education; it’s only going to add to it.

Marchese: I visit a lot of campuses in the work that I do, and there is an amazing amount of good, useful, local assessment going on. What I want to do is keep watering that garden and not get overwhelmed by a big weed. There is a lot going on that would make you pleased and proud of our faculty colleagues.

Brabeck: We have to remember that we're in a particular political moment, and the coin of the realm right now is standardized testing and rankings. This pendulum will swing, but rankings are so important for pushing forward a particular agenda. Let me give you an example: the Progress in International Reading Literacy Study (PIRLS), run out of Boston College. It's an international comparison of 28 countries on the reading scores of fourth graders, and the results were released last year. How did the United States do? Everybody is always talking about how badly our kids read, but the United States was in the top five countries. No word came out of Washington, D.C. on this; there was not even a press conference on the results. You remember when TIMSS came out, the Third International Mathematics and Science Study, and the United States was at the mean, a dismal result, and it was all over the press in the United States: why are our kids not performing in math and science? But they do well on a test and the story is deep-sixed. What's that about?

One of the other things you have to do is get a third, disinterested party to preach the good news, because higher education's own accrediting bodies are suspect. Let me give you an example, again from teacher preparation. One of the most encouraging things that happened in teacher preparation was that the Carnegie Corporation said that it was going to put millions, along with other foundations, into enhancing teacher preparation at 13 institutions. That's a credible voice saying that teacher preparation in colleges and universities counts. Art Wise, head of the National Council for Accreditation of Teacher Education, tried valiantly to get the good news out about accredited institutions, but wasn't heard. I suspect the same is true for organizations like yours. In this political moment, such an authority is perceived to have too much self-interest to be a credible voice.