Inside DFAT's independent evaluation experiment

By Stephen Easton

June 5, 2015

The key to a successful evaluation unit is the right balance between accountability and learning, according to the bureaucrat who keeps an eye on the performance of Australia’s foreign aid program.

Dereck Rooken-Smith

Setting up a panel of independent evaluation experts as advisors isn’t a bad idea either, according to Dereck Rooken-Smith, an assistant secretary with the Department of Foreign Affairs and Trade who heads up the Office of Development Effectiveness (ODE).

Unique in the federal bureaucracy, ODE runs all its evaluation and performance assessment work past an Independent Evaluation Committee, which has played a highly beneficial role over the past three years.

“For us it’s really proved its worth and it’s been a very positive and constructive approach to evaluation in the old AusAID, and now DFAT,” Rooken-Smith told the audience at a recent Canberra Evaluation Forum event. He believes the existence of the IEC is a key reason his office survived the 2013 merger.

Since then, DFAT secretary Peter Varghese has expressed interest in ODE eventually looking beyond aid and evaluating other things, such as the merger process itself and the new innovationXchange project. There’s no reason why not, said Rooken-Smith, who offered plenty of lessons on evaluation that could apply to other government agencies.

As well as being clear about how evaluation can serve the agency internally through learning, and fulfil its external duty to accountability, he listed several other areas where the balance needs to be just right.

“You need balance between what you want to make mandatory, and what should be voluntary. You’ve got to think through the balance between what’s going to be centralised and what’s going to be decentralised. You’ve got to think of the degree of independence that you want a unit like ODE to have. You’ve got to be very clear about your policies on transparency.”

When DFAT absorbed AusAID in 2013, it was the smaller agency that had the stronger evaluation culture, but this came as no surprise. Foreign aid is one of the most closely watched areas of public expenditure in every donor country, and aid agencies have a long tradition of evaluation, which Rooken-Smith credits to the leadership of the World Bank and the Organisation for Economic Co-operation and Development.

As a central policy department, DFAT was a different world.

“There were parts of it that had some evaluation, but the key issue for DFAT as a policy department really was feedback from the minister,” he told the audience. “And this, from all accounts, makes a lot of sense.”

A big part of the IEC’s success is the experts who were brought in to sit on it. Committee chair Jim Adams, the World Bank’s former vice-president for East Asia and the Pacific, brought “terrific experience in big bureaucracies” and knowledge of Australian aid programs, according to Rooken-Smith.

He was joined by Wendy Jarvie, a public policy professor and early childhood education expert who has also worked in monitoring and evaluation with the World Bank and in the Australian Government, where she rose to the level of deputy secretary.

“I think the IEC has actually worked much better than we expected in many ways and I think partly it was because of the choice of chair, I think that’s really important,” said Jarvie, during the CEF discussion.

“I’ve headed up evaluation units within Commonwealth departments and if you put up the idea of ‘let’s have an independent evaluation committee’, everyone would run a mile. … I think the choice of the chair here was quite critical because Jim Adams is so well respected. He’s really, really robust, he knows a huge amount and he’ll take on anyone but he knows how business works, how politics works, how bureaucracy works.”

The third external member was Patricia Rogers, a professor of public sector evaluation and one of Australia’s leading academics in the field. Her replacement, former mining boss Stephen Creese, was recently appointed to a three-year term on the IEC by Foreign Minister Julie Bishop.

“The minister really wanted to shake up things a little bit and she wanted a private sector representative on the IEC,” said Rooken-Smith, who paid tribute to Creese’s “terrific credentials” as former managing director of Rio Tinto Australia and former CEO of Newcrest Mining, experience that allowed him to see development “from a very different perspective” while working in Chile, Indonesia and Papua New Guinea.

“The IEC I think has turned out to be terrifically effective, we’ve been very lucky with the skill-match of these individuals.”

Only one committee member comes from the department, and there is an observer from the Department of Finance.

Building an evaluation culture

According to Rooken-Smith, “the number one thing that really makes a difference” in building a culture of evaluation is leadership. In the previous CEF presentation, Department of Infrastructure and Regional Development secretary Mike Mrdak detailed his attempts to inculcate “evaluative thinking” from the top level.

One point emphasised by Rooken-Smith was the need for senior managers to “give evidence that evidence is being used” so that staff can see the results of their own contributions.

“You really do need that top-down leadership and that’s the first thing you need,” he said. “It’s also important that staff feel enabled and motivated to keep the system working. … It’s incumbent on leaders to let staff know that the evidence they are providing is making a difference in decision-making. Because if staff feel that all the work and effort they’re going to is to no use at all, it really deflates the enthusiasm, if you like, for what can be quite complicated systems.”

A staff survey run by ODE six months ago revealed a pleasing level of enthusiasm, he added later.

“They felt they were supported, the guidelines and everything else were very helpful, and more importantly … they could see, through this hierarchy of results reporting, how the work that they were doing, even at the activity level, really was making a difference, and that they were providing information that could be used for management decision-making.”

Monitoring versus evaluation

At DFAT, aid monitoring refers to self-assessment of individual activities for internal purposes. Each of the hundreds of individual aid projects has its own activity manager and any that cost over $3 million or are considered “sensitive” must run an annual self-assessment process called an Aid Quality Check (AQC), which is not reported publicly but forms the basis of the internal performance management system.

Evaluation, on the other hand, must involve some element of independence — provided by an external academic expert or consultant — and is always made public. In the aid industry, “M&E” are “inseparable” terms, and monitoring is the “bedrock of the DFAT system,” according to Rooken-Smith.

“A personal observation: don’t confuse independence with correctness,” he warned the audience.

“I don’t think independent is always right. In fact, I’ve seen many cases where it’s wrong. The importance of independence really is the contestability process. It’s the importance of bringing someone outside in to really ask the hard questions, to ask you: is there logic in what you’re doing, and can you provide evidence it’s working?

“It’s the dialogue it starts. It’s actually that negotiation process or contestability, I think, that is the really important part of independence.”

When aid activities reach $10 million, their managers are required to provide a more comprehensive AQC that includes a narrative explaining the reasoning behind the scoring. Those feed into an annual Aid Program Performance Report covering all the activities in a particular country.

Managers of activities over the $10 million mark also have to commission outside experts to conduct an independent Operational Evaluation at least once in the life of the project, and they generally try to pick a time when it will be most useful.

“So quite often it’s like a mid-term review, or they’re having a project that is presenting particular problems, or they think it needs a redirection,” explained Rooken-Smith. “That’s normally when they call in an independent evaluation, to give them a bit more confidence, a bit more support if you like, for major changes to an activity, or even for confirmation that what they’re doing is correct.”

Operational Evaluations are focused on learning, he explained, but they still must be published. Publication often takes a long time, either because the team is not happy with the quality of the evaluation or because the report is stuck in a long, slow approval chain.

But “just getting them out” is a positive thing for the agency, in Rooken-Smith’s view, and could help others working elsewhere in a similar space. “Every evaluation’s going to have negative things in it,” he pointed out. “Otherwise, why would you have an evaluation?”

Independent Evaluation Committee

The role of the ODE, which was originally set up in 2006, was overhauled following 2011’s Independent Review of Aid Effectiveness, which also recommended the establishment of the Independent Evaluation Committee.

The review acknowledged the value of Rooken-Smith’s unique office and its then key piece of work, the Annual Review of Development Effectiveness (ARDE), but also noted a key weakness:

“No other bilateral donor has an equivalent to the ARDE. Overall, however, the ARDE has been a limited success, being released with increasing delay.”

The holdup was in the then minister’s office.

“I wouldn’t say it became politicised, but it was administratively very difficult to get the [document] through the minister’s office, and eventually we were releasing the report for the year before in about November of the year afterwards, so it wasn’t looking very good,” Rooken-Smith told the CEF.

The review concluded that ODE would have more influence and a stronger focus on learning if it remained within AusAID, rather than sitting outside the agency as in the UK model, and that an independent committee would provide enough independence. “And I think that really has shown to be true,” said Rooken-Smith.

“One thing we can do … is access all the internal electronic documents. We can actually look at every AQC from every project, we can mine that data at the beginning of every evaluation to see what the self-assessment is telling us, before we actually start the main evaluation. So that I think has been very important.

“And having links and networks across the department has also been important. Because it’s impossible to do an evaluation without the support of … the people you are evaluating.”

For legal reasons the committee plays only an advisory role, but it has oversight of everything ODE does, and Rooken-Smith said it would be “very foolish” for him to ignore its advice.

“Another key point for ODE’s work is credibility, and the IEC has been very important in allowing us to establish credibility,” he added.

Part of the office’s role is to monitor the aid monitoring, by randomly reviewing the quality of the self-assessments behind individual Aid Quality Checks and by reviewing each of the 26 whole-of-country Aid Program Performance Reports. This all informs a new report called the Performance of Australian Aid (PAA), published for the first time this February.

The PAA is the main public report on DFAT’s newly acquired aid program and includes a two-page ODE analysis signed off by IEC chair Jim Adams. Likewise, ODE regularly does an overall analysis of all the independent evaluations that have been commissioned.

But wait, there’s more…

ODE also runs its own large-scale evaluations, which usually take more than a year, and sometimes even look at activities of other agencies that administer parts of the aid budget, like the Attorney-General’s Department or the Australian Federal Police.

“They’re large, complex and cost a lot of money, but they’re strategic and they’re thematic,” said Rooken-Smith.

“We are able to look across aid activities across the whole of the operations of the department, so unlike the operational evaluations which look at specific activities, we might look at a theme or sector such as rural development [or] child nutrition, and what are we doing in Africa versus what we’re doing in the Pacific.”

These evaluations could look at why activities in one particular country are working out better than most, or how multiple programs empower women or promote law and justice. The ODE also tries to work with people who want to work with it.

“Technically we can evaluate whenever we like … but that’s not a very strategic approach and we do rely on working with people who want to work with us, and I was pleasantly surprised when I joined the office that a lot of areas actually wanted us to come and help them, [and] look at the particular issues in their programs,” said Rooken-Smith.

He said public sector evaluators could expect to run into common issues like the difficulty of measuring value for money — which isn’t always the same as efficiency or effectiveness — and a lack of quantitative data, and left the audience with a final thought:

“If you are a program designer, think at the beginning what success will look like and how you’re going to measure it. Engage with your peers, with your stakeholders, make that clear and that will make the evaluator’s job easier.”
